Short-Term Load Forecasting Using Semigroup-Based System-Type Neural Network

Kwang Y. Lee, Fellow, IEEE, and Shu Du

Abstract — This paper presents a methodology for short-term load forecasting using a semigroup-based system-type neural network. A technique referred to as algebraic decomposition is proposed for the neural network architecture, where the network is decomposed into a semigroup channel and a function channel. The semigroup channel, made of the coefficient vector, is shown to exhibit the dependency of the load on temperature, and the function channel extracts the basis vector to represent the fundamental characteristics of daily load cycles. Regression and rearrangement methods are applied to handle the roughness of the load data surface, and interpolation and extrapolation of the coefficient vector are achieved based upon the hourly temperature. The recombination of the basis vector and the coefficient vector at each hour gives the load forecast. The methodology is verified by testing on load data from the New England Independent System Operator (ISO) and achieves satisfactory results.

Index Terms — Algebraic decomposition, load forecasting, neural network, system-type architecture.

I. INTRODUCTION

FORECASTING load accurately is very important for electric utilities in the competitive environment created by electric industry deregulation. In order to supply high-quality electric energy to the customer in a secure and economic manner, an electric company faces many economic and technical problems in the operation, planning, and control of an electric energy system []. Load forecasting helps an electric utility make important decisions, including decisions on purchasing and generating electric power, load switching, and infrastructure development. Load forecasting is also significant for energy suppliers, financial institutions, and other participants in electric energy generation, transmission, distribution, and markets [].
Based on time intervals, load forecasting can be divided into three main categories: short-term, medium-term, and long-term forecasting. Short-term forecasting usually covers one hour to one week, medium-term forecasting concerns the future load from a week to a month, and long-term forecasting is often for periods longer than a year. Long-term and medium-term forecasts are applied to determine the capacity of the generation, transmission, or distribution system and the type of facilities required in transmission expansion planning, annual hydrothermal maintenance scheduling, etc. The short-term forecast is used for controlling and scheduling the power system, and also as an input to load flow studies or contingency analysis []. Additionally, short-term load forecasting can assist in estimating load flows and in making decisions to prevent overloading.

This work was supported in part by the National Science Foundation under grant ECCS 7838. K. Y. Lee and S. Du are with the Department of Electrical and Computer Engineering, Baylor University, Waco, TX 76798 (e-mail: Kwang_Y_Lee@baylor.edu; Shu_Du@baylor.edu).

The load can be regarded as a dynamic system which is affected by two main factors: time of day and weather conditions. The time dependence of the load reflects the existence of a daily load pattern, which may vary for different weekdays and seasons. Among weather variables, temperature is usually the dominant factor influencing the load. Humidity and wind speed may also impact power consumption. In general, the load has two distinct patterns: weekday and weekend patterns. In addition, holiday patterns differ from non-holiday patterns.

Section II describes the proposed approach. In Section III, the proposed approach is applied to the forecasting problem. Finally, some conclusions are drawn and presented in Section IV.

II. THE PROPOSED APPROACH

A.
Failures/Shortcomings of Conventional Neural Networks

Recently, a shift has occurred in the overall architecture of neural networks, from simple or component-type networks to system-type architectures. The most popular architecture seems to be the Modular Connectionist Architecture advocated by Jacobs and Jordan [3]. The most serious drawback in the design of system-type neural networks is the lack of a cohesive discipline in the architectural design and in the design of the learning algorithm. Virtually the entire design rests on an intuitive basis. To illustrate the lack of a cohesive discipline: in [4], the partitioning of components corresponds to separation of variables, which works if the variables are separable and fails if they are not [5]-[7].

B. System-Type Neural Network Method

In previous papers []-[], a system-type neural network which implemented extrapolation was proposed. In this method, the distributed parameter system (DPS) surface determined by a given data set was expanded by extrapolating along one axis.

In general the load is a function, Load = f(Day, Hour, Weather, Customer classes), and is often considered as Load = f(Day, Hour), which is
parameterized by weather and customer classes. In this paper, however, the load is modeled as Load = f(Temperature, Hour) due to the importance of temperature among the many factors that affect the load. In the following, it will be shown that the load can be represented in the form L(Temperature, Hour) = C(T)E(H), where E(H) is a set of linearly independent orthonormal basis functions and C(T) is a preliminary coefficient vector as a function of the hourly temperature.

In order to obtain E(H) and C(T), a technique referred to as algebraic decomposition, which aims to approximate and model the load data set, is applied. This process first requires the load profile L(T, H) to be parameterized as {L_Ti(H)}, i = 1, 2, ..., then converts n chosen members using the Gram-Schmidt process to obtain an orthonormal basis set E(H) = [e1(H), e2(H), ..., en(H)]. Finally, the least-squares method is performed to determine the coefficient vector C(T), i.e.,

L(T, H) = C(T)E(H)    (1)

where E(H) = [e1(H), e2(H), ..., en(H)]^T and C(T) = [c1(T), c2(T), ..., cn(T)].

Neural networks are being used for systems described by partial differential equations (PDEs) [11]. The system-type architecture is shown in Fig. 1, which implements the load function L(T, H). The proposed architecture realizes a system-type structure using two neural network channels, a Function Channel and a Semigroup Channel.

Fig. 1. System-type architecture.

The first component, the Function Channel, outputs the vector of basis functions E(H), while the Semigroup Channel supplies the Function Channel with a coefficient vector C(T) as a function of the index T. These two channels realize a semigroup-based implementation of the mapping L(T, H). The Function Channel can have a Radial Basis Function (RBF) architecture [3]. It consists of n RBF networks, each of which operates as one of the n orthonormal basis functions in E(H). The outputs of the two channels are (internally) linearly summed so that the channels span an n-dimensional function space.

The Semigroup Channel can be adapted from the Diagonal Recurrent Neural Network (DRNN) or the Elman architecture [], in which the input is separated into a dynamic scalar component T and one static vector component, the initial vector C(0). The output of the Semigroup Channel is a vector C(T), which is related to the dynamic input T and to the static input C(0) by the semigroup property:

C(T) = Φ(T)C(0)    (2)

where Φ(T1 + T2) = Φ(T1)Φ(T2).

C. Learning Algorithm of the Proposed System-Type Neural Network

Since it uses RBF networks, the first component of the system, namely the Function Channel, can be designed rather than trained. The second component, the Semigroup Channel, is trained in a successive manner as illustrated in Fig. 2. In each of the trainings, the Semigroup Channel receives a preliminary coefficient vector C(T) as input, composed of the dynamic scalar component T and the static initial vector component, and gives a smoothened coefficient vector as output. Therefore the primary objective of training is to replicate (and, if necessary, to smoothen) the preliminary coefficient vector with an output vector that has the semigroup property. However, there is a secondary objective in training the Semigroup Channel: the channel must also replicate the semigroup property of the trajectory by gradually acquiring a semigroup property in its own weight space. The acquisition of this semigroup property in the weight space becomes the basis for extrapolation. In order to obtain this property gradually, it is necessary that the training in this second step (semigroup tracking) occur in a gradual manner, as shown in Fig. 2. The Semigroup Channel is trained along each sliced sub-trajectory until convergence of the overall weight pattern is found.

Fig. 2. Overview of the new training method.

D.
Regression Method

The regression method is one of the most widely used approaches for load forecasting []. However, it is not the
main tool to perform load forecasting in this paper. Because the electric load is a complex system influenced by many factors, it can be formulated in the following form:

L_total = L_base + L_weather + L_other    (3)

where L_base is the base load, which is caused by the time factor; L_weather is the weather-sensitive load, which is due to weather variables, among which temperature is usually dominant; and L_other is the component of the load resulting from other factors. The objective of applying regression here is to reduce, or even eliminate, the effects caused by other unknown factors as much as possible. Therefore, the output of the regression method is supposed to be highly correlated with time and weather variables.

E. Rearrangement of Load

Since load and temperature are already tightly correlated, the load consumption of a certain time interval fluctuates because of different temperatures. Assuming the load changes smoothly with temperature, the regression load is rearranged according to the magnitudes of the hourly temperature instead of the original time sequence. If the data surface of the rearranged load is smooth, extrapolation along a single coordinate (temperature) can be performed. Because the temperature changes hourly, it is necessary to rearrange the load with respect to the temperature at each hour so that a smooth load surface can be obtained. Fig. 3 illustrates the process; the depth of color represents the magnitude of the temperature.

Fig. 3. Rearrangement of the regression load.

F. Interpolation and Extrapolation

Interpolation and extrapolation involve only the coefficient vector generated by the Semigroup Channel. Fig. 4 illustrates the interpolation and extrapolation procedures for a given hour. Red circles represent the historical load and green circles represent the forecast load. Because the hourly temperatures for a forecasting day are assumed to be already known, the coefficient vector can be found by interpolation if the forecast temperature at a given hour does not exceed the temperature bounds experienced in previous (historical) days, or by extrapolation if the temperature forecast exceeds those bounds. Therefore, the load forecast at this hour can be achieved by recombining the basis vector with the interpolated or the extrapolated coefficient vector. Exactly the same procedure is executed for all 24 hours so as to obtain the forecast loads of the target day.

Fig. 4. Interpolation and extrapolation of the coefficient.

III. SIMULATION RESULTS

The proposed forecasting approach is tested using historical load data obtained from the New England Independent System Operator (ISO). The hourly temperatures of each day are weighted averages over 8 weather stations in the New England area, in degrees Fahrenheit. In the simulation, load data for the year is chosen for testing. The load is classified into two groups: weekday and weekend loads. In each group, a moving window containing the previous four weeks of load is created for each target day. Actual raw load data in the moving window passes through the regression filter, and then algebraic decomposition decomposes the regression data into a basis vector and a coefficient vector of dimensionality n, where n is set to two. The results are analyzed by the following formulas:

1) Standard Deviation

σ(h) = sqrt( (1/N) Σ_{d=1..N} [L(d, h) − L̂(d, h)]² )    (4)

where L(d, h) is the empirical load data for a given day d and hour h, and L̂(d, h) is the corresponding load forecast.

2) Percent Error

Error = |L(d, h) − L̂(d, h)| / L(d, h) × 100%    (5)

Here an arbitrary weekday is chosen as the forecasting day to show the simulation results. The weekday loads of the previous four weeks are selected as historical data. Figs. 5 and 6 show the regression load before and after the rearrangement. Notice that the axis in Fig. 6 is changed from Day to Temperature by the rearrangement.
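The rearrangement step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the historical load and temperature are stored as (days × 24) arrays; the function name is ours, not the paper's:

```python
import numpy as np

def rearrange_by_temperature(load, temp):
    """Re-order each hour's historical loads by ascending temperature.

    load, temp : (days, 24) arrays of hourly load and temperature.
    Returns load and temperature surfaces whose first axis runs over
    temperature rank at each hour instead of the original day order.
    """
    order = np.argsort(temp, axis=0)                 # per-hour temperature ranking
    load_r = np.take_along_axis(load, order, axis=0)
    temp_r = np.take_along_axis(temp, order, axis=0)
    return load_r, temp_r
```

If the load varies smoothly with temperature, each column of the rearranged surface is close to monotone, which is what makes extrapolation along the single temperature coordinate feasible.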
It is obvious that the rearranged load surface is much smoother than the one before rearrangement.
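For concreteness, the algebraic decomposition of Section II — Gram-Schmidt orthonormalization of selected daily profiles followed by a least-squares fit of the coefficients, as in (1) — can be sketched as follows. This is an illustrative NumPy version under our own naming, not the authors' code:

```python
import numpy as np

def algebraic_decomposition(profiles, n):
    """Decompose daily load profiles into a basis E(H) and coefficients C(T).

    profiles : (m, 24) array, one row of hourly loads per day
    n        : number of orthonormal basis functions to keep
    """
    # Gram-Schmidt on the first n profiles -> orthonormal basis rows e_1..e_n
    basis = []
    for row in profiles[:n]:
        v = row.astype(float).copy()
        for e in basis:
            v -= (v @ e) * e              # remove components along earlier vectors
        v /= np.linalg.norm(v)            # normalize
        basis.append(v)
    E = np.vstack(basis)                  # E(H), shape (n, 24)

    # Least squares for L ≈ C E; with orthonormal rows of E this is simply L E^T
    C = profiles @ E.T                    # C(T_i) for each day, shape (m, n)
    return E, C

# Toy check: three synthetic daily cycles, n = 2
hours = np.arange(24)
days = np.array([100 + 20 * np.sin(2 * np.pi * hours / 24),
                 110 + 25 * np.sin(2 * np.pi * hours / 24 + 0.3),
                 105 + 22 * np.sin(2 * np.pi * hours / 24 + 0.1)])
E, C = algebraic_decomposition(days, n=2)
recon = C @ E                             # L(T, H) = C(T) E(H), as in (1)
```

The profiles used to build the basis are reproduced exactly; other days are approximated by their projection onto the n-dimensional span, which is the role the n RBF networks of the Function Channel play in the paper.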
Fig. 5. Regression load before rearrangement.

Fig. 6. Regression load after rearrangement.

After the rearrangement, the load is decomposed into a basis vector and a preliminary coefficient vector. To fulfill the two training objectives of the Semigroup Channel, the preliminary coefficient vector first needs to be smoothened and replicated. Therefore the Semigroup Channel is trained with the initial coefficient vector C(0) = [c1(0), c2(0)] and the dynamic scalar component T using the proposed progressive training method. Figs. 7 and 8 show the smoothened coefficient vector plotted in its two components.

Fig. 7. Comparison of the original and smoothened coefficient c1.

Fig. 8. Comparison of the original and smoothened coefficient c2.

Secondly, it is necessary to determine whether the semigroup property arises in the weight space. Therefore an extrapolation test is performed to observe whether the weight changes of the neural network can be replaced with a weight-change model which possesses the semigroup property. Figs. 9 and 10 show that extrapolation can be carried out, based on the similarity of the two curves. When a forecast temperature exceeds the temperature bounds of the moving window, extrapolation of the coefficient vector is required, as illustrated in Figs. 11 and 12.

Fig. 9. Extrapolation test for c1.

Fig. 10. Extrapolation test for c2.
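The per-hour interpolation/extrapolation of the coefficient vector can be illustrated as below. The paper carries out extrapolation through the trained Semigroup Channel; here, purely as a stand-in, we use piecewise-linear interpolation inside the historical temperature bounds and a straight-line fit outside them, with names of our own choosing:

```python
import numpy as np

def coefficient_at_temperature(temps_hist, coeffs_hist, t_new):
    """One coefficient component at a forecast temperature, for one hour.

    temps_hist  : historical temperatures at this hour (1-D)
    coeffs_hist : matching smoothened coefficient values (1-D)
    t_new       : forecast temperature for this hour
    """
    order = np.argsort(temps_hist)
    t = np.asarray(temps_hist, dtype=float)[order]
    c = np.asarray(coeffs_hist, dtype=float)[order]
    if t[0] <= t_new <= t[-1]:
        return float(np.interp(t_new, t, c))    # inside the temperature bounds
    slope, intercept = np.polyfit(t, c, 1)      # linear trend for out-of-bounds
    return float(slope * t_new + intercept)
```

Repeating this for every component of C(T) and every hour, and recombining with the basis as in (1), yields the load forecast for the target day.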
Fig. 11. Extrapolation of c1.

Fig. 12. Extrapolation of c2.

Here three typical weeks are selected to show the performance of the proposed method: winter, spring, and summer weeks. The results include the percent error and standard deviation at each hour of each day, both calculated from the regression load and from the actual load.

In the first week, which is a typical winter week, the highest error is 7.9% at 3: on Wednesday and the lowest error is .% at 3: on Sunday for the regression load. On the other hand, for the actual load in the same week, the highest error is 9.87% at 8: on Tuesday and the lowest error is .4% at 4: on Friday. It should be noted that this Tuesday is New Year's Day, but we still treated it as a normal weekday; therefore the errors for this day are relatively large.

In the th week, which is a typical spring week, the highest error is .% at : on Friday and the lowest error is nearly .% at 6: on Tuesday for the regression load. With respect to the actual load, the highest error is 8.47% at 9: on Saturday and the lowest error is nearly % at : on Wednesday.

In the 3th week, which is a typical summer week, the highest error is 8.84% at : on Friday and the lowest error is .3% at 9: on Saturday for the regression load. For the actual load, the highest error is 8% at 8: on Tuesday and the lowest error is nearly % at 4: on Tuesday and 8: on Wednesday.

As shown above, high hourly percent errors with respect to the regression load may appear for some hours, e.g., 8.84% in the 3th week. This is due to relatively abnormal hourly temperature changes on the forecasting day, such as a sudden increase or drop, compared to the historical temperatures of previous days.

Table I shows the results for the regression load averaged over the whole year.

TABLE I
STATISTICS OF WHOLE-YEAR AVERAGE FORECASTING RESULTS (REGRESSION LOAD)

[Table I gives, for each day of the week (Mon.–Sun.) and each hour 1–24, the standard deviation of the forecast, together with per-hour and per-day averages; the numeric entries were corrupted in extraction and are omitted here.]

Friday shows the best forecasting result
and Saturday shows the worst. Based on the actual load, Thursday shows the best result and Saturday shows the worst. Large errors occur in the late afternoon on weekdays and in the morning on weekends.

IV. CONCLUSIONS

In this paper, a new neural network approach is developed for short-term load forecasting. The method is applied to the regression load instead of the actual empirical load data. Rearrangement of the regression load along the temperature axis assures the smoothness of the coefficient vector so that interpolation and extrapolation can be carried out. The whole-year results show that the proposed method achieves satisfactory forecasting results with respect to the regression load. However, the errors resulting from the use of the actual load are generally beyond the short-term load forecasting requirement. This results from the existence of a load component which depends on unknown factors in the given load data. If the given load and temperature data are highly correlated to each other, it is expected that much better results can be delivered and the regression method can be removed from the proposed approach.

V. ACKNOWLEDGMENT

The authors gratefully acknowledge the support by the National Science Foundation under Grant #7838.

VI. REFERENCES

[1] K. Y. Lee, Y. T. Cha, and J. H. Park, "Short-term load forecasting using an artificial neural network," IEEE Trans. Power Systems, vol. 7, pp. 4-3, Feb. 1992.
[2] J. H. Chow, Applied Mathematics for Restructured Electric Power Systems: Optimization, Control, and Computational Intelligence. New York: Springer-Verlag, ch. .
[3] R. Jacobs and M. Jordan, "A competitive modular connectionist architecture," Advances in Neural Information Processing Systems, vol. 3, pp. 767-773, 1991.
[4] A. Atiya, R. Aiya, and S. Shaheen, "A practical gated expert system neural network," IEEE International Joint Conference on Neural Networks, vol. , pp. 49-44, 1998.
[5] K. Y. Lee, J. P. Velas, and B. H. Kim, "Development of an intelligent monitoring system with high temperature distributed fiberoptic sensor for fossil-fuel power plants," IEEE Power Engineering Society General Meeting, pp. 3-3, Jun. 2004.
[6] B. H. Kim, J. P. Velas, and K. Y. Lee, "Development of intelligent monitoring system for fossil-fuel power plants using system-type neural networks and semigroup theory," IEEE Power Engineering Society General Meeting, pp. 949-94.
[7] B. H. Kim, J. P. Velas, and K. Y. Lee, "Semigroup based neural network architecture for extrapolation of enthalpy in a power plant," International Conference on Intelligent System Applications to Power Systems, pp. 9-96.
[8] B. H. Kim, J. P. Velas, and K. Y. Lee, "Semigroup based neural network architecture for extrapolation of mass unbalance for rotating machines in power plants," International Federation of Automatic Control (IFAC) Symposium on Power Plants and Power Systems Control, in CD, 2006.
[9] B. H. Kim, J. P. Velas, and K. Y. Lee, "Short-term load forecasting using system-type neural network architecture," International Joint Conference on Neural Networks, pp. 69-66, 2006.
[10] N. J. Hobbs, B. H. Kim, and K. Y. Lee, "Long-term load forecasting using system type neural network architecture," International Conference on Intelligent Systems Applications to Power Systems, pp. -7, Nov. 2007.
[11] R. Padhi and S. N. Balakrishnan, "Proper orthogonal decomposition based feedback optimal control synthesis of distributed parameter systems using neural networks," Proc. of the American Control Conference, vol. 6, pp. 4389-4394.
[12] I. Moghram and S. Rahman, "Analysis and evaluation of five short-term load forecasting techniques," IEEE Trans. Power Systems, vol. 4, pp. 484-49, Nov. 1989.

VII. BIOGRAPHIES

Kwang Y. Lee received his B.S. degree in Electrical Engineering from Seoul National University, Korea, in 1964, the M.S. degree in Electrical Engineering from North Dakota State University, Fargo, in 1968, and the Ph.D. degree in System Science from Michigan State University, East Lansing, in 1971. He has been with Michigan State, Oregon State, the University of Houston, Penn State, and Baylor University, where he is now a Professor and Chair of Electrical and Computer Engineering. His interests include power system control, operation, planning, and intelligent system applications to power systems.

Dr. Lee is a Fellow of the IEEE, Associate Editor of the IEEE Transactions on Neural Networks, and Editor of the IEEE Transactions on Energy Conversion. He is also a registered Professional Engineer.

Shu Du received his B.S. degree in Automation from Beijing University of Chemical Technology in 2003. He is currently enrolled in the Master's program in Electrical and Computer Engineering at Baylor University, Texas. His research interests are in neural networks and power systems.