Title: A Computational Intelligence Scheme for the Prediction of the Daily Peak Load


Accepted Manuscript

Title: A Computational Intelligence Scheme for the Prediction of the Daily Peak Load

Authors: Jawad Nagi, Keem Siah Yap, Farrukh Nagi, Sieh Kiong Tiong, Syed Khaleel Ahmed

PII: S (11)
DOI: doi: /j.asoc
Reference: ASOC 1252
To appear in: Applied Soft Computing
Received date:
Revised date:
Accepted date:

Please cite this article as: J. Nagi, K.S. Yap, F. Nagi, S.K. Tiong, S.K. Ahmed, A Computational Intelligence Scheme for the Prediction of the Daily Peak Load, Applied Soft Computing Journal (2010), doi: /j.asoc

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A Computational Intelligence Scheme for the Prediction of the Daily Peak Load

Jawad Nagi a, Keem Siah Yap b, Farrukh Nagi c, Sieh Kiong Tiong b, Syed Khaleel Ahmed b

a Dalle Molle Institute for Artificial Intelligence (IDSIA), CH-6928 Manno Lugano, Ticino, Switzerland
b Department of Electronics and Communication Engineering, University Tenaga Nasional, Kajang, Selangor, Malaysia
c Department of Mechanical Engineering, University Tenaga Nasional, Kajang, Selangor, Malaysia

Abstract

Forecasting of future electricity demand is very important for decision making in power system operation and planning. In recent years, due to privatization and deregulation of the power industry, accurate electricity forecasting has become an important research area for efficient electricity production. This paper presents a time series approach for mid-term load forecasting (MTLF) in order to predict the daily peak load for the next month. The proposed method employs a computational intelligence scheme based on the self-organizing map (SOM) and support vector machine (SVM). According to the similarity degree of the time series load data, the SOM is used as a clustering tool to cluster the training data into two subsets, using the Kohonen rule. As a novel machine learning technique, support vector regression (SVR) is used to fit the testing data based on the clustered subsets, for predicting the daily peak load. Our proposed SOM-SVR load forecasting model is evaluated in MATLAB on the electricity load dataset provided by the Eastern Slovakian Electricity Corporation, which was used in the 2001 European Network on Intelligent Technologies (EUNITE) load forecasting competition. Power load data obtained from (i) Tenaga Nasional Berhad (TNB) for peninsular Malaysia and (ii) PJM for the eastern interconnection grid of the United States of America is used to benchmark the performance of our proposed model. Experimental results indicate that our proposed SOM-SVR technique gives good prediction accuracy for MTLF compared to previously reported findings on the EUNITE, Malaysian and PJM electricity load datasets.

Keywords: Computational intelligence; Mid-term load forecasting; Daily peak load; Self-organizing map; Support vector machine.

Corresponding author. Tel.: (+41) (0)
E-mail addresses: jawad@idsia.ch, yapkeem@uniten.edu.my, farrukh@uniten.edu.my, siehkiong@uniten.edu.my, syedkhaleel@uniten.edu.my.

1. Introduction

Load forecasting is a key instrument in power system operation and planning. Many operational decisions in power systems, such as unit commitment, economic dispatch, automatic generation control, security assessment, maintenance scheduling, and energy commercialization, depend on the future behavior of loads. In particular, with the rise of deregulation and free competition in the electric power industry all around the world, load forecasting has become more important than ever before [1]. As power systems have been privatized and deregulated, the issue of accurately forecasting electricity load has received more attention in recent years.

Errors in electricity load forecasting increase operational costs [2–4]. Overestimation of future load results in excess supply, which is not welcome to the international energy network. In contrast, underestimation of load leads to failure in providing enough reserve and implies high costs. Thus, adequate electricity production requires each member of the global co-operation to be able to forecast its demands accurately [5].

During the last four decades, a wide variety of techniques have been used for the problem of load forecasting [6–8]. This long experience in dealing with the load forecasting problem has produced time series modeling approaches based on statistical methods and artificial neural networks (ANNs). Statistical models include moving average and exponential smoothing methods such as: multi-linear regression models, stochastic processes, data mining approaches, autoregressive moving average (ARMA) models, Box-Jenkins methods, and Kalman filtering-based methods [9–15]. Since load time series are usually nonlinear functions of exogenous variables, ANNs, which can incorporate non-linearity, have received much attention in solving problems of load forecasting [16–19]. ANN based methods have reported fairly good performances in forecasting. However, two major risks in using ANN models are the possibilities of insufficient or excessive approximation of the training data, i.e. under-fitting and over-fitting, which increase the out-of-sample forecasting errors. Hence, due to the empirical nature of ANNs, their application is cumbersome and time consuming.

Recently, new machine learning techniques such as support vector machines (SVMs) have been used for load prediction and electricity price forecasting, and have achieved good performances [20],[21]. The SVM for regression, namely support vector regression (SVR), is a powerful machine learning technique based on recent advances in statistical learning theory [22]. Established on the structural risk minimization (SRM) principle (estimate a function by minimizing an upper bound of the generalization error), SVMs have been shown to be very resistant to the under-fitting and over-fitting problems encountered with ANNs [20].

In recent times, a Multi-Layer Perceptron SVM (MLP-SVM) was introduced in the literature [23]. Compared to the standard SVM, the MLP-SVM or Hidden Space Support Vector Machine (HSSVM) [24] can adopt more kinds of kernel functions, including functions that do not satisfy Mercer's condition, because the positive definite property of the kernel function is not a necessary condition [24]. From the viewpoint of Mercer's condition, MLP-SVMs are less attractive because it is not sufficiently understood for which values of the hidden layer parameters the condition is satisfied [23]. Moreover, the drawbacks of the MLP-SVM in comparison to a standard SVM approach are its higher computational complexity, the problem of tuning a large number of parameters in the hidden layer, and the problem of selecting the optimum number of hidden units [23]. For these reasons, the standard SVM, namely SVR, is considerably more suitable for the problem of MTLF.

In this paper, we present our approach to the problem of MTLF in order to predict the daily peak load for the next month. Our study develops a computational intelligence scheme of the Self Organizing Map (SOM) and Support Vector Regression (SVR) using the electricity load data from the 2001 European Network on Intelligent Technologies (EUNITE) competition [25]. In order to evaluate the performance of our forecasting technique, power load data obtained from (i) Tenaga Nasional Berhad (TNB) for peninsular Malaysia and (ii) PJM for the eastern interconnection grid of the United States of America is used to benchmark the performance of the proposed SOM-SVR model. In our proposed model the SOM is applied to cluster the training data into separate subsets using the Kohonen rule. According to the similarity of the time series samples, the SOM clustered data is further used to fit the SVR model. Comparisons with recently reported MTLF results on the EUNITE dataset have been conducted.

The theoretical parts are addressed in Sections 2 to 5. Section 6 presents the development and implementation of the SOM-SVR load forecasting model. Section 7 shows experimental results and Section 8 presents concluding remarks.

2. Recent Approaches to Load Forecasting

In the last few decades, there have been widespread efforts to improve the accuracy of forecasting methods. One of these methods is a weather-insensitive approach which uses historical load data. It is well known as the Box-Jenkins Autoregressive Integrated Moving Average (ARIMA) method, discussed in [14] and [26–28]. Christiaanse [29] and Park et al. [30] proposed exponential smoothing models using Fourier series transformation to forecast electricity load. Douglas et al. [31] considered verifying the impacts of a forecasting model in terms of temperature. To avoid variable selection problems, Azadeh et al. [32] employed a fuzzy system to provide an ideal rule-base to determine the type of ARMA models that can be used. Wang et al. [33] proposed a hybrid Autoregressive and Moving Average with Exogenous variables (ARMAX) model with Particle Swarm Optimization (PSO) to efficiently solve the problem of trapping into local minima caused by exogenous variables.

To achieve better accuracy of load forecasting, state space and Kalman filtering methods have been developed to reduce the difference between the actual load and the predicted load [34–36]. Moghram and Rahman [37] proposed a model based on historical data to construct the periodic load. Recently, Al-Hamadi and Soliman [38] employed fuzzy rule-based logic, utilizing a moving window of current values of weather data and historical load data to estimate the optimal fuzzy parameters for the hourly load of the day. Amjady [39] proposed a hybrid model of the Forecast-Aided State Estimator (FASE) and the Multi-Layer Perceptron Neural Network (MLPNN) to forecast the short-term bus load of power systems.

Regression models construct a cause-effect relationship between electricity load and independent variables. The most popular model is linear regression, proposed by Asbury [40], who considered the weather variable in his model. Papalexopoulos and Hesterberg [10] added holiday and temperature factors into their proposed model. Soliman et al. [41] proposed a multivariate linear regression model for load forecasting, including temperature, wind and humidity factors. Mirasgedis et al. [42] incorporated meteorological variables such as relative humidity, heating and cooling to forecast the electricity demand in Greece. In contrast, Mohamed and Bodger [43] employed economic and geographic variables such as GDP, electricity price and population to forecast the electricity consumption in New Zealand. Recently, Tsekouras et al. [44] introduced a non-linear multivariable regression approach to forecast the annual load, using correlation analysis to select appropriate input variables.

In the recent decade, many researchers have tried to apply machine learning techniques to improve the accuracy of load forecasting. Rahman and Bhatnagar [45] constructed a Knowledge-Based Expert System (KBES) approach to electricity load forecasting by simulating the experiences of the system operators [46]. Recently, applications of the Fuzzy Inference System (FIS) and fuzzy set theory have received attention in load forecasting. Ying and Pan [47] introduced an Adaptive Neuro-Fuzzy Inference System (ANFIS) by looking for the mapping relation between the input and output data. In addition, Pai [48] and Pandian et al. [49] employed fuzzy approaches to obtain superior performance in terms of load forecasting accuracy.

ANNs have been applied to improve the accuracy of load forecasting. Park et al. [50] proposed a Back-Propagation Neural Network (BPNN) for daily load forecasting. Novak [51] applied the Radial Basis Function Neural Network (RBFNN) to forecast electricity load. Darbellay and Slama [52] applied ANNs to predict the regional electricity load in Czechoslovakia. Abdel-Aal [53] proposed an abductive network to conduct a one-hour ahead load forecast for five years. Hsu and Chen [54] employed an ANN model to forecast the regional electricity load in Taiwan. Recently, load forecasting applications of ANNs hybridized with statistical methods and other intelligence techniques have received a lot of attention. These include ANN models combined with: Bayesian inference [55,56], Self-Organizing Map (SOM) [57,58], Wavelet transform [59,60], PSO [61] and Dynamic mechanism [62].

Other machine learning techniques, such as Support Vector Machines (SVMs), are promising for classification problems. SVMs implement the Structural Risk Minimization (SRM) principle [63], which minimizes the training error and maximizes the confidence interval, resulting in a good generalization performance [64]. With the introduction of Vapnik's ε-insensitive loss function [65], SVMs have been extended to solving nonlinear regression estimation problems, and can be considered as successful tools for forecasting problems.

SVMs have recently been applied by researchers to solve load forecasting problems. Cao [66] used SVM experts for time series forecasting. Cao and Gu [67] proposed a Dynamic SVM (DSVM) model to deal with non-stationary time series problems. Tay and Cao [68] used SVMs for forecasting financial time series. Hong and Pai [69] applied SVMs to predict engine reliability. For electricity load forecasting, Chen et al. [20] pioneered the SVM model, which was the winning entry of the EUNITE competition [25]. Pai and Hong [70] employed the concepts of Jordan recurrent neural networks to construct a recurrent SVR model for regional long-term load forecasting (LTLF) in Taiwan. Similarly, Pai and Hong [71] proposed a hybrid model composed of SVR and the Simulated Annealing (SA) algorithm for LTLF in Taiwan. In more recent times, Yuancheng et al. [72] used the Least-Squares SVM (LS-SVM) for 24 hour ahead short-term load forecasting (STLF). Wu and Zhang [73] used LS-SVM with chaos theory for load forecasting. An SVM based on grey relation analysis was used by Niu et al. [74] for daily load forecasting. In addition, Niu et al. [75] found that a combination of SVM and ANN produced better results than either method alone. SVMs with reduced input dimensions for load forecasting were proposed by Tao et al. [76]. A comparison of SVMs versus BPNNs for STLF is given by Zhang [77], concluding the superiority of SVMs in load forecasting problems.

3. EUNITE Data Analysis

3.1. EUNITE Competition Data

In 2001, the European Network on Intelligent Technologies (EUNITE) organized a competition on the problem of electricity load forecasting [25]. The information used in the competition contains data from the Eastern Slovakian Electricity Corporation for two years, i.e., from January 1, 1997 until December 31, 1998, which is as follows:

- Half-hourly electricity load
- Daily average temperature
- Annual holidays

Fig. 1. Half-hourly load data of March 21, 1997. Half-hourly load data represents consumption over an entire day, i.e. 48 consumption values.

The load, temperature and holiday information obtained from the EUNITE competition dataset [25] is listed in Table 1. The objective of the competition is to predict the daily peak electricity load for the 31 days of January 1999 using the given historical data for the preceding two years, i.e. 1997 and 1998. For representation of the EUNITE load data, the half-hourly load of March 21, 1997 is indicated in Figure 1. Since the 48 half-hourly consumption values do not contribute effectively to the prediction of the daily peak load, the maximum half-hourly load of each day was selected as the daily peak load, as indicated in Table 1.

3.2. Data Analysis

The EUNITE competition data were investigated to determine the relationship between the load data and other information such as temperature and annual holidays. The following observations are concluded for the given data.

3.2.1. Load Data

Table 1. Data from the 2001 EUNITE competition [25]. Data was provided by the Eastern Slovakian Electricity Corporation.
[Table body not recoverable. Columns: Data; Content and Format Description. Rows: Train (Load) and Predict (Load), each giving the date (year, month, day), the half-hourly loads (00:30, 01:30, etc.) and the daily peak load; Train (Temperature) and Predict (Temperature), each giving the date (year, month, day) and the temperature (°C); Holidays, giving the holiday dates for the training and prediction periods.]

Through simple analysis of the graphs representing the yearly load data, it is observed that the electricity load follows seasonal patterns, i.e. high demand for electricity in the winter (September through March) and low demand in the summer (April through August) [20]. This pattern implies a relationship between electricity usage and weather conditions in different seasons, as indicated in Figure 2 and Figure 3. Secondly, another load pattern is observed, where load periodicity exists in the profile of every week, i.e., the load demand on the weekend (Saturday and Sunday) is usually lower than that on weekdays (Monday through Friday) [20], as shown in Figure 4. In addition, electricity demand on Saturday is a little higher than that on Sunday, and the peak load usually occurs in the middle of the week, i.e., on Wednesday.

Fig. 2. Daily peak load from January 1, 1997 until December 31, 1998. Load data represents a period of two years. EUNITE data.

3.2.2. Temperature Data

By analyzing the temperature data, it is observed that the load data has seasonal variation, which indicates a strong influence of climatic conditions. A negative correlation between the daily peak load and the daily temperature is observed from the two year historical data. The correlation coefficient, as reported in [20], is indicated in Figure 5. This observation confirms that electricity consumption in the summer is lower than in the winter (due to the use of heaters in the winter), as lower temperature causes higher electricity demand.

3.2.3. Holiday Effects

Local events, including holidays and festivities, also affect the load demand. These events may lead to a higher load demand due to the extra usage of electricity. From the two year historical load data, however, it is observed that the load demand usually reduces on holidays. Further analysis reveals that the load demand also depends on the type of holiday. On some major public holidays such as Christmas or New Year, the demand for electricity is affected more compared to other holidays [20].

Fig. 3. Daily temperature from January 1, 1997 until December 31, 1998. Temperature data represents a period of two years. EUNITE data.

3.3. Mean Absolute Percentage Error

The accuracy of load forecasting is measured using the Mean Absolute Percentage Error (MAPE) of the predicted result. MAPE is defined by the following expression [20]:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| L_{a_i} - L_{p_i} \right|}{L_{a_i}} \times 100 \tag{1}$$

where $L_{a_i}$ and $L_{p_i}$ are the actual and the predicted values of the daily peak load on the $i$th day of January 1999 respectively, and $N$ is the number of days in January 1999.

3.4. Magnitude of Maximum Error

The Magnitude of Maximum Error (MAX) is an additional error metric used in the EUNITE competition for comparison of the prediction accuracy. MAX is defined by the following expression [25]:

$$\mathrm{MAX} = \max\left( \left| L_{a_i} - L_{p_i} \right| \right) \tag{2}$$

where $L_{a_i}$ and $L_{p_i}$ are the actual and the predicted values of the daily peak load on the $i$th day of January 1999 respectively for $i = 1, 2, 3, \ldots, 31$, and max represents the maximum value.
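The two error metrics in (1) and (2) can be computed as in the following minimal Python sketch. The function names `mape` and `max_error` and the sample values in the usage note are illustrative assumptions, not part of the competition material.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, following expression (1)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative absolute error of each day, averaged and expressed in percent.
    total = sum(abs(a - p) / a for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def max_error(actual, predicted):
    """Magnitude of Maximum Error, following expression (2)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Largest absolute deviation over all forecast days.
    return max(abs(a - p) for a, p in zip(actual, predicted))
```

For example, with hypothetical actual peak loads `[800, 750, 700]` and predictions `[780, 765, 700]`, `mape` averages the relative errors 2.5%, 2% and 0%, while `max_error` returns the largest absolute deviation, 20.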

Fig. 4. Daily peak load for the month of January 1997 and January 1998. Load data for January represents a period of 31 days. EUNITE data.

Fig. 5. Correlation between the daily peak load and the daily temperature from January 1, 1997 until December 31, 1998. Correlation coefficient as reported in [20]. EUNITE data.

4. Self-Organizing Map

The self-organizing map (SOM), introduced by Kohonen in the 1980s, is also referred to as a Kohonen Map [78]. The SOM is an unsupervised artificial neural network (ANN), which learns the distribution of a set of patterns without any class information. SOMs nonlinearly project high-dimensional data onto a low-dimensional grid [79–82]. SOMs have been used in numerous applications since they were initially proposed. SOM initially focused on applications such as: engineering [81], image processing [82–85], process monitoring and control [86,87], speech recognition [88,89], and flaw detection in machinery [90]. Recently, SOM applications in other fields have emerged, including business and management, such as: information retrieval [91,92], medical diagnosis [93,94], time series prediction [95,96], optimization [97], as well as financial forecasting and management [98–100].

The SOM projected data preserves the topological order of the input space; hence, similar data patterns in the input space are assigned to the same map unit or nearby units on the trained map. The core process of the projection is first, for each input pattern, to determine its best matching unit (BMU) from the map units. The BMU is the unit that is most similar to the input pattern. In short, the two key steps in training an SOM are: (1) determining the BMU and (2) updating the BMU and its neighbours [101].

Specifically, consider the SOM algorithm [102] as a nonlinear, ordered and smooth mapping of an n-dimensional input space to an m-dimensional output space. In the output space, each neuron or unit $i$ of the SOM is represented by a codebook vector, $m_i = [m_{i1}, m_{i2}, \ldots, m_{in}]^T \in \mathbb{R}^n$, where $m_{ik}$ is the value of the $k$th component. The projection of an input pattern $x = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^n$ is achieved by assigning $x$ to the closest codebook vector $m_c$ with respect to a general distance measure $d(x, m_i)$ [101], where $m_c$ is the winning neuron [103].

The method for determining the BMU, $c$, with respect to an input pattern $x$ is to identify the unit that is most similar to $x$. The distance function is employed to measure the similarity: the smaller the distance, the more similar the unit is to the input. Formally, the BMU of an input $x$ is defined as [101]:

$$c = \arg\min_{i}\, d(x, m_i) \tag{3}$$

where $m_i$ is a unit on the map. The typical method for computing the distance $d(x, m_i)$ is by using the Euclidean distance function, defined by:

$$d(x, m_i) = \lVert x - m_i \rVert_2 = \sqrt{\sum_{k=1}^{n} (x_k - m_{ik})^2} \tag{4}$$

During SOM training, the winning neuron and all the neurons in its neighbourhood may adapt their codebook vectors. In the sequential SOM algorithm, the rate of adaptation is steered by a monotonically decreasing and time-dependent function $\alpha(t)$, which is known as the incremental learning rate [103]. The neighbourhood function $h_{ci}(t)$ is centered on the best-matching neuron. Both $\sigma(t)$ and $h_{ci}(t)$ are monotonically decreasing functions of time. A typical smooth neighbourhood function is the Gaussian function, defined by:

$$h_{ci}(t) = \exp\left( -\frac{\lVert r_c - r_i \rVert^2}{2\sigma^2(t)} \right) \tag{5}$$

where $0 < \sigma(t) < 1$ is the width of the Gaussian kernel representing the size of the neighbourhood and $\lVert r_c - r_i \rVert^2$ is the distance between the winning neuron and the neuron $i$, with $r_c$ and $r_i$ representing the 2D positions of neurons $c$ and $i$ on the SOM grid. Thus, the main steps of the sequential SOM training are as follows [103]:

1. Initialize the codebook vectors $m_i(0)$ of all neurons.
2. Determine the winning neuron $m_c(t)$ for the input $x$, using the Euclidean distance measure $d(x, m_i)$.
3. Update the codebook vectors:

$$m_i(t+1) = m_i(t) + \alpha(t)\, h_{ci}(t)\, [x(t) - m_i(t)] \tag{6}$$

4. Repeat the last two steps until a predefined number of steps is reached.

The sequential SOM training is usually performed in two phases. The first phase is characterized by choosing large learning rate and neighbourhood radius parameters, and the second phase is used to fine-tune the codebook vectors by setting smaller starting values for $\sigma(t)$ and $h_{ci}(t)$.

5. Support Vector Regression

Support vector machines (SVMs), introduced by Vapnik in the 1960s, are based on the foundation of statistical learning theory [63]. SVMs are a set of related supervised machine learning methods used for classification, which have recently become an area of intense research with extensions to regression and density estimation [64]. SVMs employ the structural risk minimization (SRM) principle to overcome intrinsic limitations of ANNs. Support vector regression (SVR) can be used for time series prediction, and is useful for problems characterized by non-linearity, high dimension and local minima. SVR has been successfully employed to solve regression problems such as: time series modelling [66,67], financial forecasting [68], electricity load forecasting [70–77] and non-linear control systems [104].

The basic concept of SVR is to map the input data, $x$, non-linearly into a higher dimensional feature space. Hence, given a set of data $G = \{(x_i, d_i)\}$ for $i = \{1, \ldots, N\}$, where $x_i$ is the $i$th input vector to the SVR, $d_i$ is the actual $i$th output value, and $N$ represents the total number of data patterns, the SVR function is defined by:

$$y = f(x) = w \cdot \phi(x) + b \tag{7}$$

where $\phi(x)$ is the feature that is non-linearly mapped from the input space, $x$. The coefficients $w$ and $b$ are the support vector (SV) weight and bias respectively, which are calculated by minimizing the following regularized risk function:

$$R(C) = \frac{C}{N} \sum_{i=1}^{N} L_{\varepsilon}(d_i, y_i) + \frac{\lVert w \rVert^2}{2} \tag{8}$$

where,

$$L_{\varepsilon}(d, y) = \begin{cases} |d - y| - \varepsilon, & |d - y| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

Here, $C$ and $\varepsilon$ are prescribed parameters. In (8), $L_{\varepsilon}(d, y)$ is the ε-insensitive loss function, as illustrated in Figure 6. The loss equals zero if the forecasted value is within the ε-tube [105,106], as indicated in (9). The second term, $\lVert w \rVert^2 / 2$ in (8), measures the flatness of the function. Therefore, $C$ specifies the trade-off between the empirical risk and the model flatness. Both $C$ and $\varepsilon$ are user-determined parameters.

Fig. 6. The linear ε-insensitive loss function of ε-SVR.

Two positive slack variables $\xi_i$ and $\xi_i^*$, which represent the distance from the actual values to the corresponding boundary values of the ε-tube, are introduced as shown in Figure 7. Then the regularized risk function in (8) is transformed into the following constrained form (with the constant factor $1/N$ absorbed into $C$):

$$R(w, \xi, \xi^*) = \frac{\lVert w \rVert^2}{2} + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \tag{10}$$

subject to the constraints,

$$\begin{aligned} w \cdot \phi(x_i) + b - d_i &\le \varepsilon + \xi_i^*, & i &= 1, 2, \ldots, N \\ d_i - w \cdot \phi(x_i) - b &\le \varepsilon + \xi_i, & i &= 1, 2, \ldots, N \\ \xi_i,\ \xi_i^* &\ge 0, & i &= 1, 2, \ldots, N \end{aligned}$$
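To make the roles of C and ε concrete, the ε-insensitive loss in (9) and the regularized risk in (8) can be evaluated directly. The following is a minimal Python sketch for illustration; the function names and the explicit weight list `w` are assumptions, and the experiments in this paper rely on a library solver rather than on code of this kind.

```python
def eps_insensitive_loss(d, y, eps):
    """Loss (9): zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(d - y) - eps)


def regularized_risk(w, targets, outputs, C, eps):
    """Regularized risk (8): (C / N) * sum of losses + ||w||^2 / 2."""
    n = len(targets)
    if n == 0 or n != len(outputs):
        raise ValueError("targets and outputs must be non-empty and of equal length")
    empirical = sum(eps_insensitive_loss(d, y, eps)
                    for d, y in zip(targets, outputs)) / n
    flatness = 0.5 * sum(wk * wk for wk in w)  # ||w||^2 / 2
    return C * empirical + flatness
```

A larger `C` weights the empirical term more heavily relative to the flatness term, while a larger `eps` widens the tube inside which deviations cost nothing.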

Fig. 7. The insensitive band ±ε and slack variables $\xi_i$ and $\xi_i^*$ for ε-SVR.

The constrained optimization problem in (10) is solved using the following primal Lagrangian form:

$$\begin{aligned} L(w, b, \xi, \xi^*, \alpha, \alpha^*, \beta, \beta^*) = {} & \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \\ & - \sum_{i=1}^{N} \beta_i \left[ w \cdot \phi(x_i) + b - d_i + \varepsilon + \xi_i \right] \\ & - \sum_{i=1}^{N} \beta_i^* \left[ d_i - w \cdot \phi(x_i) - b + \varepsilon + \xi_i^* \right] \\ & - \sum_{i=1}^{N} (\alpha_i \xi_i + \alpha_i^* \xi_i^*) \end{aligned} \tag{11}$$

where (11) is minimized with respect to the primal variables $w$, $b$, $\xi$ and $\xi^*$ and maximized with respect to the non-negative Lagrange multipliers $\alpha$, $\alpha^*$, $\beta$ and $\beta^*$. Therefore, (12–15) are obtained:

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{N} (\beta_i - \beta_i^*)\, \phi(x_i) = 0 \tag{12}$$

$$\frac{\partial L}{\partial b} = \sum_{i=1}^{N} (\beta_i^* - \beta_i) = 0 \tag{13}$$

$$\frac{\partial L}{\partial \xi_i} = C - \beta_i - \alpha_i = 0 \tag{14}$$

$$\frac{\partial L}{\partial \xi_i^*} = C - \beta_i^* - \alpha_i^* = 0 \tag{15}$$

Finally, the Karush-Kuhn-Tucker (KKT) conditions are applied to the regression, and (10) yields the dual Lagrangian by substituting (12–15) into (11). The dual Lagrangian is then given by the following expression:

$$\vartheta(\beta, \beta^*) = \sum_{i=1}^{N} d_i (\beta_i - \beta_i^*) - \varepsilon \sum_{i=1}^{N} (\beta_i + \beta_i^*) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\beta_i - \beta_i^*)(\beta_j - \beta_j^*)\, K(x_i, x_j) \tag{16}$$

subject to the constraints,

$$\sum_{i=1}^{N} (\beta_i - \beta_i^*) = 0, \qquad 0 \le \beta_i \le C, \qquad 0 \le \beta_i^* \le C, \qquad i = 1, 2, \ldots, N$$

The Lagrange multipliers in (16) satisfy the equality $\beta_i \beta_i^* = 0$. The Lagrange multipliers $\beta_i$ and $\beta_i^*$ are calculated, and the optimal weight vector of the regression hyperplane is given by the following expression:

$$w^* = \sum_{i=1}^{N} (\beta_i - \beta_i^*)\, \phi(x_i) \tag{17}$$

Hence the regression function is expressed by:

$$f(x, \beta, \beta^*) = \sum_{i=1}^{N} (\beta_i - \beta_i^*)\, K(x, x_i) + b \tag{18}$$

In (18), $K(x, x_i)$ is called the kernel function. The value of the kernel is equal to the inner product of the two vectors $x_i$ and $x_j$ in the feature space, $\phi(x_i)$ and $\phi(x_j)$, i.e., $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$. Four of the popular SVM kernel functions used in this experiment are as follows [107]:

1. Linear kernel:

$$K(x_i, x_j) = x_i^T x_j \tag{19}$$

2. Polynomial kernel:

$$K(x_i, x_j) = (\gamma\, x_i^T x_j + r)^d, \quad \gamma > 0 \tag{20}$$

3. Radial basis function (RBF Gaussian) kernel:

$$K(x_i, x_j) = \exp\left( -\gamma \lVert x_i - x_j \rVert^2 \right), \quad \gamma > 0 \tag{21}$$

4. Sigmoidal kernel:

$$K(x_i, x_j) = \tanh(\gamma\, x_i^T x_j + r), \quad \gamma > 0 \tag{22}$$

In general, it is difficult to determine the type of kernel function to use for specific data patterns [106,108]. However, any function that satisfies Mercer's condition, as given by Vapnik [63], can be used as a kernel function in SVMs.

The selection of the three parameters ε, C and γ of ε-SVR is important to the forecasting accuracy. As an example, if C is chosen too large (approaching infinity), then the objective is to minimize the empirical risk, $L_{\varepsilon}(d, y)$, without regard to the model flatness in the optimization formulation in (10). The parameter ε controls the width of the ε-insensitive loss function, which is used to fit the training data. Large ε values result in a flatter estimated regression function. The parameter γ controls the width of the Gaussian function, which reflects the distribution range of the x values in the training data.

There are many existing practical approaches for the selection of ε, C and γ, such as user-defined values based on prior knowledge and experience, cross-validation (CV), asymptotical optimization [109] and grid search [107]. For the experiments in this paper, LIBSVM [110], a library for support vector machines, is used. The selection of the optimal ε-SVR parameters ε, C and γ is achieved using the Grid Search method suggested by Hsu et al. in [107], which is discussed later in this paper.

6. Methodology

6.1. Attribute Selection

Our proposed load forecasting model is based on the past daily peak loads (historical consumption data) as one of the candidate input variables. The best input features for our forecasting model are those which have the highest correlation with the output variable (i.e. the peak load of the next day) and the highest degree of linear independency. Thus, the most effective candidate inputs with minimum redundancy are selected as the model attributes. For the purpose of attribute selection, the two step correlation analysis employed in [114] is used in this research study, which is described as follows:

a) In the first step, the correlation between each candidate input and the output variable is computed, where higher correlation indicates more effective candidates. If the correlation index between a candidate variable and the output feature is greater than a prespecified value then this candidate is retained for the next step; else, it is not considered any further [114].

b) In the second step, for the retained candidates, a cross-correlation analysis is performed. If the correlation index between any two candidate variables is smaller than a prespecified value then both variables are retained; else, only the variable with the largest correlation with respect to the output is retained, while the other is not considered any further [114].

This correlation analysis is applied to the candidate inputs, i.e. the past daily peak loads and daily temperatures, excluding the major calendar indicators: type of day and annual holidays. The type of day and annual holidays are excluded from the two step correlation analysis, since these indicators are composed using binary variables and have low correlation with the output feature, even in the normalized form. However, these indicators can still be useful to separate different kinds of days.
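The two-step correlation analysis described above can be sketched in Python as follows. This is only one possible reading of the procedure: the use of the Pearson coefficient, the use of absolute values (so that a strong negative correlation, such as that of temperature, counts as relevant), the greedy ordering in step two, and the names `relevance_min` and `redundancy_max` for the two prespecified values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_a * var_b)


def select_attributes(candidates, target, relevance_min, redundancy_max):
    """candidates: dict mapping a name to its column of values."""
    # Step 1: keep candidates that are sufficiently correlated with the output.
    relevance = {name: abs(pearson(col, target))
                 for name, col in candidates.items()}
    kept = [name for name in candidates if relevance[name] > relevance_min]

    # Step 2: visit candidates from most to least relevant; of any pair that
    # is too strongly cross-correlated, only the more relevant one survives.
    kept.sort(key=lambda name: relevance[name], reverse=True)
    selected = []
    for name in kept:
        if all(abs(pearson(candidates[name], candidates[s])) < redundancy_max
               for s in selected):
            selected.append(name)
    return selected
```

In this sketch a candidate such as a lagged load that is an exact multiple of another lagged load would be dropped in step two, since the pair has a cross-correlation of one.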

Retained candidates after the two steps of the correlation analysis are selected as the input features of the forecasting method. The proposed two-step correlation analysis resulted in the best feature selection and forecast accuracies, because it considers the linear independence of the candidate inputs in addition to their correlation. The four attributes used in the SOM-SVR modelling process are described below.

Daily Peak Load

Since past load demand affects and implies the future load demand, including the past daily peak load as a key attribute greatly improves the forecasting performance.

Daily Temperature

As previously indicated through our data analysis in Figure 5, the electricity load and the temperature have a causal relationship (high correlation) with each other. Therefore, the daily temperature is used as an attribute in the forecasting model.

Type of Day

Weekly periodicity of the electricity load is noticed through the load data analysis shown in Figure 4. As the electricity demand on holidays is observed to be lower than on non-holidays, encoding the type of day (calendar indicator) into the forecasting model benefits the performance of the model.

Annual Holidays

As annual holidays, local events and festivities contribute significantly to the electricity load demand, including annual holidays as a calendar indicator in the forecasting model has a significant impact on the prediction accuracy.

Data Normalization

The daily peak power load and the daily temperature data need to be represented on a normalized scale for SVR training and prediction. Thus, all feature data are linearly scaled to the range [0, 1] using the following expression:

$$X_N = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \qquad (23)$$

where X is the load or temperature feature data, X_min represents the smallest value in the feature data X, and X_max represents the largest value in the feature data X.

Time Series Modelling for Weekly Periodicity

The daily peak electricity load for the preceding two years is introduced into the forecasting model with the concept of a time series. If x(i) for i = 1, 2, ..., 730 is the historical peak load data of two years from January 1, 1997 until December 31, 1998 and y is the respective target value for training, then the target vector y includes several next load values based on weekly periodicity. Therefore, x can be represented as a training load matrix R, in the form:

$$R = \begin{bmatrix} x(1) & x(2) & x(3) & \cdots & x(7) \\ x(2) & x(3) & x(4) & \cdots & x(8) \\ \vdots & \vdots & \vdots & & \vdots \\ x(722) & x(723) & x(724) & \cdots & x(728) \\ x(723) & x(724) & x(725) & \cdots & x(729) \end{bmatrix} \qquad (24)$$

where R consists of 723 × 7 load values. The first column of R corresponds to the historical load data for the preceding two years, excluding the last week. The second column is shifted one day ahead with respect to the first column, the third column is shifted two days ahead with respect to the first column, and so on, with the last column being shifted six days ahead, such that all 7 columns contain 723 load values. Similarly, the target vector y is represented by the vector S, in the form:

$$S = \begin{bmatrix} x(8) & x(9) & \cdots & x(729) & x(730) \end{bmatrix}^{T} \qquad (25)$$

where S consists of 723 load data values. The target vector S is shifted one day ahead with respect to the last column of the training load matrix R, i.e., S is a continuation of the one-day-ahead shifting sequence of R. Therefore, S represents a weekly periodicity consumption pattern in accordance with R, for the last 723 days of the preceding two years.

6.4. Data Representation

After formulation of the training matrix R and target vector S, all feature attributes are selected in a proper combination to prepare the training dataset. In the training dataset, a training sample [i.e., x_i in (10)] encoded for a particular ith day is in the following format:

$$x_i = [\,\text{peak load},\ \text{temperature},\ \text{type of day},\ \text{annual holidays}\,] \qquad (26)$$

where peak load represents the training matrix R, temperature represents the daily temperature, type of day represents the day of the week, and annual holidays represents the public holidays. To represent the calendar information, seven binary attributes are used to encode the day type from Monday through Sunday, where the current day is represented by 1 and all other digits are set to 0. Similarly, annual holidays are represented using one binary digit, where 1 represents a public holiday and 0 represents no public holiday.

6.5. Load Forecasting Model

A computational intelligence scheme based on the SOM and SVR is applied to reconstruct the dynamics of electricity load forecasting using a time series approach. The proposed SOM-SVR load forecasting model is shown in Figure 8.
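Returning to the data preparation of eqs. (23)-(26), the normalization, the day-shifted load matrix and the calendar encoding can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and the weekday numbering (0 = Monday) is an assumption.

```python
def min_max_scale(values):
    """Linearly scale a sequence to [0, 1], as in eq. (23)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # degenerate case: constant series
    return [(v - lo) / (hi - lo) for v in values]


def lagged_matrix(series, lags=7):
    """Build R (rows of `lags` consecutive loads) and S (the next load).

    For 730 daily values and lags=7 this yields 723 rows, matching
    eqs. (24) and (25).
    """
    n_rows = len(series) - lags
    rows = [list(series[i:i + lags]) for i in range(n_rows)]
    targets = [series[i + lags] for i in range(n_rows)]
    return rows, targets


def encode_sample(load_window, temperature, weekday, is_holiday):
    """Concatenate 7 loads, 1 temperature, 7 day bits and 1 holiday bit.

    weekday: 0 = Monday ... 6 = Sunday (assumed ordering).
    Returns the 16 input features; a training sample adds the target
    from S as a 17th value.
    """
    day_bits = [1 if d == weekday else 0 for d in range(7)]
    holiday_bit = 1 if is_holiday else 0
    return list(load_window) + [temperature] + day_bits + [holiday_bit]
```

For example, `lagged_matrix(list(range(1, 731)))` returns 723 rows whose first row is `[1, ..., 7]` with target `8`, and whose last row is `[723, ..., 729]` with target `730`.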

[Figure 8: flowchart. Training matrix (historical data of the preceding two years, excluding the last week: L_train, T_train, D_train, H_train, V_train) → self-organizing map (two cluster centres selected randomly based on the training data) → training matrix clustered into Subset 1 (L_1, T_1, D_1, H_1, V_1) and Subset 2 (L_2, T_2, D_2, H_2, V_2) → ε-SVR1 trained using Subset 1 and ε-SVR2 trained using Subset 2. Testing matrix (historical and/or currently predicted data: L_test, T_test, D_test, H_test, V_test) → Euclidean distance comparison between the cluster centres of the training data and the testing data → ε-SVR1 prediction if d_1 < d_2, ε-SVR2 prediction if d_2 <= d_1 → predicted electricity load for the 31 days of January.]

Fig. 8. Flowchart of the proposed SOM-SVR load forecasting model. Parameter subscript train denotes the training data and subscript test denotes the testing data.

In Figure 8, the subscripts train and test represent the training and testing data respectively. The parameter L represents the daily peak load matrix, T represents the daily temperature, D represents the type of day, H represents the annual holidays, and V represents the target vector in correspondence with L.

The load forecasting model proposed in Figure 8 is based on a hybrid architecture of the SOM and the ε-SVR. The SOM is used as a clustering tool to cluster the training data into two subsets, using the Kohonen rule. Two subsets are used because two years of historical data are used to construct the forecasting model. After clustering is complete, two ε-SVRs (one for each training subset) are employed to fit the clustered data. The Euclidean distance between the cluster centres and the testing samples is used as a measure to select the appropriate ε-SVR model for predicting the daily peak load.

The forecasting model presented in this research study is developed using MATLAB R2009b. The computer used for testing has a 2.40 GHz quad-core processor with 4 GB of RAM. The SOM is implemented using the MATLAB Neural Network Toolbox and the SVR is implemented using LIBSVM [110].

6.6. SOM Clustering

As indicated in Figure 8, in the first stage the SOM is employed as a gating network to identify the switching or piece-wise stationary characteristics of the training data. Firstly, two cluster centres for the two-year historical training data are randomly selected based on the training data space. The SOM is then applied to cluster the space of the two-year historical training data into two different subsets with similar properties, as shown in Figure 8. This process decomposes the normalized training data into two subgroups, as indicated in Figure 9. Each clustered subset is considered as a stationary time series. The SOM parameters selected for the clustering process are indicated in Table 2.

Table 2
SOM parameters selected for the clustering of training data. Parameters are selected by iterating different combinations using trial and error.

| Parameter | Description | Value |
|---|---|---|
| α | Initial learning rate | 0.96 |
| w_p | Winning neuron criteria | 10exp(10) |
| E | Training iterations | 55 |

As the incremental learning rate, α_f, of a SOM (based on the size of the neighbourhood radius) calculates the winning neuron, the following expression is used to calculate α_f on each iteration:

( 0.02e) ( e) (27) f
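The clustering stage can be illustrated with a two-unit competitive map trained with the Kohonen rule. This is a minimal sketch under stated assumptions, not the MATLAB Neural Network Toolbox implementation used by the authors: the exponential decay of the learning rate (factor 0.02 per iteration) is an assumed schedule and is not taken from eq. (27), there is no neighbourhood function because only two units exist, and all names are hypothetical.

```python
import math
import random


def nearest(centres, x):
    """Index of the closer of two centres; a tie goes to the second one."""
    d1, d2 = math.dist(centres[0], x), math.dist(centres[1], x)
    return 0 if d1 < d2 else 1


def train_two_unit_som(samples, initial_rate=0.96, epochs=55, decay=0.02, seed=0):
    """Cluster `samples` into two groups with winner-take-all updates."""
    rng = random.Random(seed)
    # Initialise the two weight vectors from two distinct training samples.
    centres = [list(s) for s in rng.sample(samples, 2)]
    for epoch in range(epochs):
        rate = initial_rate * math.exp(-decay * epoch)  # assumed schedule
        order = list(range(len(samples)))
        rng.shuffle(order)
        for idx in order:
            x = samples[idx]
            winner = nearest(centres, x)
            # Kohonen rule: move the winning weight vector towards the sample.
            centres[winner] = [
                w + rate * (xi - w) for w, xi in zip(centres[winner], x)
            ]
    return centres
```

After training, `nearest(centres, sample)` assigns each training sample to one of the two subsets, and the same distance comparison can later be applied to testing samples.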


Fig. 9. Two-year historical data clustered into two subsets in a supervised manner using the Kohonen rule (EUNITE data).

6.7. Model Training

Each training sample consists of 17 features, which include the following: 8 normalized peak load values (7 normalized peak load values from the training load matrix R and 1 normalized peak load value from the target vector S), 1 normalized temperature value, 7 binary digits representing the type of day and 1 binary digit representing the annual holiday, in the format illustrated in eq. (26). Thus, the training matrix is composed of 723 × 17 normalized feature values (17 features from the first 723 days of the preceding two years).

SVR training involves training two ε-SVR models (ε-SVR1 and ε-SVR2) with the training matrix. Two ε-SVR models are used because the historical data is clustered into two subsets. The clustered training data is divided between the two ε-SVRs for training, based on the two cluster centres. Training data for the ε-SVRs is indicated in Table 3.

Optimal SVR parameters (C, γ) were selected using the Grid Search method proposed by Hsu et al. in [107]. Grid Search is implemented by generating exponentially growing sequences of the parameters (C, γ) and iterating over all possible combinations to find the optimal SVR hyper-parameter set. The best ε parameter for the SVR models was determined using the highest 10-fold cross-validation (CV) accuracy obtained. CV is used to evaluate the fit provided by each parameter set during grid search, in order to avoid over-fitting.
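The grid search with k-fold cross-validation described above can be sketched generically as follows. This sketch makes several assumptions: `fit_predict` is a hypothetical callback standing in for an ε-SVR trainer (for instance a wrapper around LIBSVM), folds are contiguous rather than shuffled, and each (C, γ) pair is scored by its cross-validated MAPE instead of the CV accuracy reported in the paper.

```python
def exponential_grid(low_exp, high_exp, base=2.0):
    """Exponentially growing sequence base**low_exp ... base**high_exp."""
    return [base ** e for e in range(low_exp, high_exp + 1)]


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search(X, y, c_values, gamma_values, fit_predict, k=10):
    """Return (best_score, best_C, best_gamma) with the lowest CV MAPE.

    fit_predict(X_train, y_train, X_val, C, gamma) -> list of predictions
    (hypothetical interface; any regressor can be plugged in).
    """
    folds = k_fold_indices(len(X), k)
    best = None
    for C in c_values:
        for gamma in gamma_values:
            errors = []
            for fold in folds:
                held_out = set(fold)
                train = [i for i in range(len(X)) if i not in held_out]
                preds = fit_predict(
                    [X[i] for i in train], [y[i] for i in train],
                    [X[i] for i in fold], C, gamma,
                )
                errors += [abs(y[i] - p) / abs(y[i]) for i, p in zip(fold, preds)]
            score = 100.0 * sum(errors) / len(errors)  # MAPE in percent
            if best is None or score < best[0]:
                best = (score, C, gamma)
    return best
```

In this scheme one search would be run per clustered subset, giving a separate (C, γ) pair for each of the two ε-SVR models.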

Table 3
Training data for the ε-SVR. Training data corresponds to the training matrix, represented in the format illustrated in eq. (26).

| Input | Parameter | Detailed description |
|---|---|---|
| 1-7 | L_train | Daily peak load data matrix R in eq. (24), obtained using the historical load data of the preceding two years. |
| 8 | T_train | Daily temperature vector obtained using the temperature data of the preceding two years. |
| 9-15 | D_train | Type of day binary matrix representing calendar information for the preceding two years with seven binary attributes. |
| 16 | H_train | Annual holiday vector for the preceding two years representing the public holidays with one binary attribute. |
| 17 | V_train | Target vector S in eq. (25), obtained using the historical load data of the preceding two years. |

6.8. Model Testing

During the prediction phase, model attributes for January 1999 such as temperature data, type of day and annual holidays need to be encoded in the testing data. The type of day information is obtained from a calendar, and the annual/public holidays for January 1999 are provided in the 2001 EUNITE competition data. The temperature information for January 1999 is not provided. In order to encode the temperature data in our testing entries, we need to predict or estimate the temperature of January 1999. Temperature forecasting is not easy, especially with limited data. We employ a straightforward idea for temperature prediction taken from [20], which uses the average of the past temperature data as the estimate. The daily temperature of the past four years for the 2001 EUNITE competition is provided by the Eastern Slovakian Electricity Corporation. Hence, the temperature of each of the 31 days in January 1999 is estimated by averaging the past January daily temperature data from 1995 to 1998.

Each testing sample consists of 16 features, which include the following: 7 normalized peak load values (the normalized peak load data of the last 7 days of the preceding two years), 1 normalized temperature value, 7 binary digits representing the type of day and 1 binary digit representing the annual holiday, in the format illustrated in eq. (26). Thus, the testing matrix is composed of 7 × 16 normalized feature values. Feature information in the testing matrix includes:

1. Initially, for predicting the first day, January 1, 1999, the daily peak load data consists of the last week's load of the preceding two years, i.e. from December 25, 1998 until December 31, 1998.
2. The daily temperature data consists of the January 1999 temperature data from January 1, 1999 until January 31, 1999.
3. The calendar information comprises the type of day for each of the 31 days of January 1999.
4. The annual holidays include all public holidays for the month of January 1999.
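The temperature estimate used for the testing data, i.e. the day-by-day average of past January temperatures, can be sketched as below. The function name and the dictionary layout are hypothetical; the sketch assumes every year supplies the same number of daily values.

```python
def average_daily_temperature(january_by_year):
    """Average each calendar day of January across the supplied years.

    january_by_year: dict mapping a year to its list of daily temperatures.
    Returns one estimated temperature per day.
    """
    years = sorted(january_by_year)
    n_days = len(january_by_year[years[0]])
    if any(len(january_by_year[y]) != n_days for y in years):
        raise ValueError("every year must provide the same number of days")
    return [
        sum(january_by_year[y][d] for y in years) / len(years)
        for d in range(n_days)
    ]
```

Called with the January readings of the four past years, this returns the 31 estimated values, which would then be scaled with the same minimum and maximum as the training temperatures before being placed in the testing samples.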

The training load matrix R includes the training vector S for SVR training (see Table 3). For SVR testing, the testing vector in the testing matrix was set to 0, i.e. V_test = 0 (see Table 4). This is because no likely prediction values for the testing data are currently known or can be estimated. Testing data for the ε-SVRs is indicated in Table 4.

Table 4
Testing data for the ε-SVR. Testing data corresponds to the testing matrix represented in the format illustrated in eq. (26).

| Input | Parameter | Detailed description |
|---|---|---|
| 1-7 | L_test | Daily peak historical and/or currently predicted load data vector for one week of data. |
| 8 | T_test | Daily temperature for the current day of January 1999 to be predicted. |
| 9-15 | D_test | Type of day vector representing calendar information for the current day of January 1999 to be predicted. |
| 16 | H_test | Annual holiday vector representing public holidays for the current day of January 1999 to be predicted. |
| - | V_test | Target vector for the testing matrix, set to 0 (no current values for the testing data are known). |

6.9. Daily Peak Load Prediction

The steps involved in predicting the electricity load of the 31 days of January 1999 with the testing data (outlined in Table 4) are as follows:

1. Two Euclidean distances (d1, d2) are computed between the two cluster centres and each testing sample (see Figure 8).
2. The logical comparison of the Euclidean distances decides which trained ε-SVR model is selected to fit the testing sample.
3. If d1 < d2, then ε-SVR1 is selected for prediction of the daily peak load; otherwise (d2 <= d1), ε-SVR2 is selected for prediction.
4. After selection of the appropriate ε-SVR, the electricity load for the 31 days of January 1999 is predicted on a daily basis.
5. For the first prediction, January 1, 1999, the last week's load data from the preceding two years (December 25, 1998 until December 31, 1998) is used with the temperature and the calendar information of January 1, 1999. This is the first SVR testing sample.
6. To predict the peak load of January 2, 1999, the load data from December 26, 1998 until December 31, 1998 and the predicted peak load of January 1, 1999 are used with the temperature and the calendar information of January 2, 1999. This is the second SVR testing sample.
7. The peak load of January 3, 1999 is predicted in the same way as for the first two days of January. The historical load data from December 27, 1998 until December 31, 1998 and the predicted peak loads of the first two days of January 1999 are used with the temperature and the calendar information of January 3, 1999. This is the third SVR testing sample.


8. The one-day shifting sequence in the load data, as indicated in steps (5), (6) and (7), follows the same procedure for predicting the peak load of the remaining days of January 1999.
9. The last prediction, January 31, 1999, is made using the last seven predicted days of January 1999. This means that the peak load data from January 24, 1999 until January 30, 1999 is used with the temperature and the calendar information of January 31, 1999. This is the last SVR testing sample.
10. The 31 daily peak loads predicted by the two ε-SVRs constitute the predicted electricity load for the month of January 1999.

7. Experimental Results

7.1. SVR Kernel Selection

The effect of four different SVM kernels, namely linear, polynomial, radial basis function (RBF) and sigmoidal, on the prediction accuracy (MAPE) is observed using 10-fold CV. Experimentally, by iterating over various values of the parameter ε, the best value obtaining the highest CV accuracy was found to be ε = .

Fig. 10. Comparison of the predicted electricity load for January 1999 using different SVM kernels (EUNITE data).

In this experiment, default parameter values for all SVM kernels defined in (19)-(22) were used and the results obtained are given in Table 5. The results indicate that the best prediction accuracy was obtained using the RBF kernel, resulting in a MAPE of 1.32%. From this point onwards, all experiments were performed using the RBF kernel. The comparison plot of the predicted electricity load of January 1999 using different kernels is shown in Figure 10.
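The day-shifting prediction procedure of steps 1-10 in the daily peak load prediction section can be sketched as a recursive loop. This is a sketch under stated assumptions: the two trained ε-SVRs are represented by hypothetical callables in `models`, the cluster centres are assumed to live in the same 16-dimensional input space as the testing samples, and the function names are illustrative.

```python
import math


def select_model(sample, centre_1, centre_2):
    """Return 0 (first model) if d1 < d2, else 1 (second model, d2 <= d1)."""
    d1 = math.dist(sample, centre_1)
    d2 = math.dist(sample, centre_2)
    return 0 if d1 < d2 else 1


def forecast_month(last_week, exogenous, centres, models):
    """Predict one value per day, feeding each prediction back as input.

    last_week: the 7 most recent normalized peak loads.
    exogenous: per-day lists of the 9 non-load inputs
               (temperature, 7 day-type bits, holiday bit).
    centres:   pair of cluster centres.
    models:    pair of callables mapping a sample to a predicted load.
    """
    window = list(last_week)
    predictions = []
    for extras in exogenous:
        sample = window + list(extras)
        chosen = models[select_model(sample, centres[0], centres[1])]
        load = chosen(sample)
        predictions.append(load)
        # Shift the load window one day ahead: drop the oldest value and
        # append the value just predicted.
        window = window[1:] + [load]
    return predictions
```

Because each prediction is fed back into the load window, the last day of the month is predicted entirely from previously predicted loads, as in step 9.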

Table 5
Prediction accuracy of the forecasting model using different SVM kernels.

| Kernel type | Kernel parameters | MAPE |
|---|---|---|
| Linear | No parameters | 2.19% |
| Polynomial | C = 100, γ = 0.001, r = 0, d = | |
| RBF | C = 100, γ = | 1.32% |
| Sigmoidal | C = 100, γ = 0.001, r = | |

7.2. SVR Parameter Optimization

Using the RBF kernel, two parameters (C, γ) need to be determined in the SVM. For this study, the Grid Search method proposed by Hsu et al. in [107] is used with LIBSVM [110] for SVR hyper-parameter optimization. In the Grid Search, exponentially growing sequences of the parameters (C, γ) were used to identify the SVR parameters obtaining the highest prediction accuracy (lowest MAPE). Sequences of parameters, C = [2^1, 2^2, 2^3, ..., 2^20] and γ = [2^-1, 2^-2, 2^-3, ..., 2^-20], were used, for a total of 10,000 combinations. For each pair (C, γ), the validation performance was measured using 10-fold CV accuracy. Experimentally it was found that C1 = , γ1 = , C2 = and γ2 = for ε-SVR1 and ε-SVR2 respectively obtained the highest 10-fold CV accuracy of 96.47%, for a MAPE of 0.97% and MAX of MW.

7.3. Model Evaluation Using EUNITE Data With Other Techniques

A comparative study of our proposed model with other machine learning techniques was performed using the EUNITE data. Our proposed SOM-SVR method was compared with two different prediction techniques: (1) SVR and (2) ML-BPNN. The prediction accuracy of our proposed model compared with these techniques for predicting the electricity load of January 1999 is shown in Table 6. Figure 11 shows the comparison plot of the predicted electricity load of January 1999 using the SOM-SVR, SVR and ML-BPNN techniques.

The results in Table 6 indicate that the prediction accuracy of the ML-BPNN is not satisfactory. This is due to the problems of local minima and over-fitting associated with ANNs, which tend to decrease the generalization performance on unseen data. Our proposed SOM-SVR model proves to be superior in terms of MAPE compared to the SVR model. This is due to the presence of the SOM in our model, which assigns the clustered training data to the appropriate ε-SVRs based on the Euclidean distance.

Table 6
Comparison of the forecasting accuracy using different prediction techniques.

| Method | MAPE | Type |
|---|---|---|
| ML-BPNN | 3.31% | Comparison technique |
| SVR | 1.89% | Comparison technique |
| SOM-SVR | 0.97% | Proposed technique |
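The two error measures reported in the comparisons, MAPE and MAX, can be computed as sketched below. The definitions used here (mean absolute percentage error in percent, and the largest absolute error in the unit of the load) are the customary ones and are assumed to match those used in the paper; the function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def max_error(actual, predicted):
    """Largest absolute deviation between actual and predicted values."""
    return max(abs(a - p) for a, p in zip(actual, predicted))
```

For a 31-day forecast, `actual` and `predicted` would each hold the 31 daily peak loads in their original (de-normalized) units, so that MAX is expressed in MW.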


Fig. 11. Comparison plot of the predicted electricity load for January 1999 using the SOM-SVR, SVR, and ML-BPNN techniques (EUNITE data).

To further evaluate the performance of our proposed model, its results were compared with those of the 2001 EUNITE competition. Table 7 compares the results of our proposed SOM-SVR model with the results of the winner of the competition. The results in Table 7 indicate that our proposed forecasting model outperforms the winning model of the EUNITE competition, achieving a lower MAPE and a lower MAX. The winning model was proposed by Lin et al. in [20, 111], as shown in Figure 12 [25]. Figure 12 shows the results of the 2001 EUNITE competition. Complete reports of the competition can be found in [25].

Fig. 12. Results of the 2001 EUNITE competition indicating the MAPE and MAX metrics [25]. The winning model was proposed by Lin et al. in [20].

Table 7
Comparison of our proposed SOM-SVR forecasting model with the 2001 EUNITE competition winner.

| Method | MAPE | MAX | Reference |
|---|---|---|---|
| SVR | 1.95% | | [20], [111] |
| SOM-SVR | 0.97% | | Proposed method |

For the purpose of a comparative study, recently proposed load forecasting techniques reported by other authors on the 2001 EUNITE data are listed with their results in Table 8. The tabulated results indicate that our proposed SOM-SVR load forecasting model outperforms the other load forecasting approaches in [73,76] and [112-121]. The results obtained indicate that better forecasting accuracy is possible when the training data is clustered and a distance measure is used to select a non-linear regression model that fits the testing samples based on similar properties. A comparison plot of the actual and predicted electricity load for January 1999 is given in Figure 13.

Table 8
Comparison of the proposed forecasting model with recently proposed load forecasting techniques using the EUNITE data.

| No. | Method | MAPE | Reference |
|---|---|---|---|
| 1. | Locally Linear Model Tree | 1.98% | [121] |
| 2. | SVM-GA | 1.93% | [112] |
| 3. | Autonomous ANN | 1.75% | [118] |
| 4. | Floating Search + SVM | 1.70% | [76] |
| 5. | MLP-NN + Levenberg Marquardt | 1.60% | [114] |
| 6. | Feedforward and Feedback ANN | 1.58% | [116] |
| 7. | Auto-Regressive Recurrent ANN | 1.57% | [115] |
| 8. | Local Prediction Framework + SVR | 1.52% | [117] |
| 9. | Feedforward ANN | 1.42% | [113] |
| 10. | SOFNN + Bilevel Optimization | 1.40% | [119] |
| 11. | LS-SVM + Chaos Theory | 1.10% | [73] |
| 12. | DLS-SVM | 1.08% | [120] |
| - | SOM-SVR | 0.97% | Proposed model |


Fig. 13. Comparison of the actual and predicted electricity load for January 1999 using the proposed SOM-SVR model (EUNITE data).

7.4. Model Evaluation Using Malaysian Data

To benchmark the performance of our proposed SOM-SVR load forecasting model, the load data for peninsular Malaysia was obtained from Tenaga Nasional Berhad (TNB), Malaysia's sole electric utility company. The power data provided by TNB contains the half-hourly consumption over a period of two years, i.e., from January 1, 2003 until December 31, 2004. The daily peak load from the two-year historical Malaysian data was calculated in the same way as for the EUNITE data, as illustrated in Section 3.1. The daily peak load profile of the two-year historical data is shown in Figure 14.

Fig. 14. Daily peak load from January 1, 2003 until December 31, 2004. Load data represents a period of two years (Malaysian data).

As with the EUNITE data in Section 3.2.1, load periodicity for the Malaysian data exists in the profile of every week, i.e., the load demand on weekends is usually lower than that on weekdays and the peak load usually occurs during the middle of the week, as indicated in Figure 15.

Fig. 15. Daily peak load for the month of January 2003 and January 2004. Load data for January represents a period of 31 days (Malaysian data).

Using the Malaysian power load data for the two years 2003 and 2004, we apply a similar procedure as for the EUNITE data, i.e. we predict the daily peak electricity load for the 31 days of January 2005. Our proposed SOM-SVR model requires additional attributes other than the load data, such as temperature, type of day and annual holiday information. The type of day and Malaysian annual/public holiday information for the two years was retrieved from online calendar systems. The data for the daily temperature of peninsular Malaysia for the two years 2003 and 2004 is not available. As peninsular Malaysia is positioned in the equatorial zone, it has a classic tropical climate with relative humidity levels. The weather in Malaysia is fairly hot, averaging around 30 °C (86 °F) throughout the year. Due to the unavailability of Malaysian daily temperature data, and considering the stability of the temperature (climate) in Malaysia throughout the year, we omit the temperature information from the model attributes, as it is not expected to have a significant impact on the forecasting accuracy. Hence, the remaining attributes selected to build the SOM-SVR model for the Malaysian power load data are:

$$x_i = [\,\text{peak load},\ \text{type of day},\ \text{annual holidays}\,] \qquad (28)$$

Fig. 16. Comparison of the actual and predicted electricity load for January 2005 using the proposed SOM-SVR model (Malaysian data).

Our proposed SOM-SVR model using the feature attributes in eq. (28) was evaluated on the Malaysian load data with the RBF kernel and the Grid Search method indicated in Section 6.7. The best MAPE for the proposed model on the Malaysian power data was found to be 1.04% using ε = 0.2. To obtain this lowest value of MAPE, the SVR parameters are: C1 = , γ1 = , C2 = and γ2 = for ε-SVR1 and ε-SVR2 respectively. This low MAPE supports our claim that the proposed model is capable of producing good prediction results for MTLF problems.

7.5. Model Evaluation Using PJM Data

PJM, which operates an electricity transmission and distribution system, is part of the eastern interconnection grid of the United States of America, which is the world's largest competitive wholesale electricity market [122]. PJM considers forecasts of load growth and additions of demand response as interconnection requests for new and planned retirements of existing generating plants, and solutions to mitigate congestion on the transmission and distribution system.

The load data for the PJM Mid-Atlantic Region (PJM-E) was obtained online from the PJM website [123]. The online PJM power data contains the hourly consumption load from 1998 until the present day. For the experiments in this paper, we acquired the latest PJM-E power data over a period of two years, i.e. from January 1, 2009 until December 31, 2010. The daily peak load from the two-year historical PJM-E data was calculated in the same way as for the EUNITE data, as indicated in Section 3.1. The daily peak load profile of the two-year historical data is shown in Figure 17. As observed from Figure 17, the daily peak load for the year 2010 indicates a significant growth from the year 2009.


Fig. 17. Daily peak load from January 1, 2009 until December 31, 2010. Load data represents a period of two years (PJM-E data).

As with the EUNITE data (see Figure 4) and the Malaysian data (see Figure 15), a form of load periodicity for the PJM-E data exists in the profile of every week, as shown in Figure 18 and as discussed in an earlier section. The PJM-E power load data in Figure 17 for the two years 2009 and 2010 is used with the proposed SOM-SVR model in Figure 8 to predict the daily peak electricity load for the 31 days of January 2011, similar to the case of the Malaysian and EUNITE datasets. Since our SOM-SVR model requires additional attributes such as temperature, type of day and annual holiday information, the type of day and annual/public United States holiday information for the two years was retrieved from online calendar systems.

Fig. 18. Daily peak load for the month of January 2009 and January 2010. Load data for January represents a period of 31 days (PJM-E data).
