Deep Neural Network Based Demand Side Short Term Load Forecasting

Seunghyoung Ryu, Department of Electronic Engineering, Sogang University, Seoul, Korea
Jaekoo Noh, KEPCO Research Institute, Korea Electric Power Corporation, Daejeon, Korea
Hongseok Kim, Department of Electronic Engineering, Sogang University, Seoul, Korea

Abstract: In the smart grid, one of the most important research areas is load forecasting; it spans from traditional time series analysis to recent machine learning approaches, and it mostly focuses on forecasting aggregated electricity consumption. However, demand side energy management, including individual load forecasting, is becoming critical. In this paper, we propose deep neural network (DNN) based load forecasting models and apply them to a demand side empirical load database. The DNNs are trained in two different ways: by pre-training restricted Boltzmann machines and by using rectified linear units without pre-training. The DNN forecasting models are trained on individual customers' electricity consumption data and regional meteorological elements. To verify the performance of the DNNs, the forecasting results are compared with a shallow neural network (SNN) and a double seasonal Holt-Winters (DSHW) model. The mean absolute percentage error (MAPE) and relative root mean square error (RRMSE) are used for verification. The results show that DNNs produce accurate and robust forecasts compared to the other models; e.g., MAPE and RRMSE are reduced by up to 17% and 22% compared to SNN, and 9% and 29% compared to DSHW.

I. INTRODUCTION

Electricity load forecasting is one of the main research areas in the smart grid and can be classified into several categories according to its target forecasting range. Among these, short term load forecasting (STLF), which focuses on forecasting load hours or days ahead, attracts attention in the smart grid because of its wide applicability to demand side management, energy storage operation, peak load reduction, etc. One of the main characteristics of the smart grid is bidirectional communication, which breaks down the border between electricity generation and consumption. Bidirectional communication is realized with the deployment of advanced metering infrastructure (AMI), a prerequisite for many smart grid services such as 15 minute interval load monitoring, real time pricing, demand response, and trouble detection. However, the collapse of the border between generation and consumption increases grid complexity. Consumers can now produce electricity from renewable sources such as photovoltaic cells and wind turbines. Furthermore, by using an energy storage system (ESS), consumers can actively shape their power demand. In addition to these changes, deregulation and privatization of electricity markets push utility companies to offer smart end-user services, and thus customer load forecasting becomes one of the basic elements of such services. These new characteristics imply that the role of consumers will be more significant than before, because the scope of forecasting for stable grid operation is shifting from a macroscopic to a microscopic view.

There are many research activities on forecasting models, from classical time series analysis to recent machine learning approaches. Time series analysis [1], one of the most popular load forecasting methods, includes the autoregressive integrated moving average (ARIMA), exponential smoothing, and the Kalman filter.
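For reference, such classical baselines are available in R's forecast package. The following is a hypothetical sketch, not this paper's implementation; hourly_load is a placeholder vector of past hourly consumption.

```r
library(forecast)

# Classical one-day-ahead baselines on hourly load (placeholder data).
y_msts   <- msts(hourly_load, seasonal.periods = c(24, 168))  # daily & weekly cycles
fc_dshw  <- dshw(y_msts, h = 24)                              # double seasonal Holt-Winters
fc_arima <- forecast(auto.arima(ts(hourly_load, frequency = 24)), h = 24)
```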
In [2], the authors developed a double seasonal Holt-Winters (exponential smoothing) model that showed better results than alternatives such as ARIMA or neural networks. Another branch of load forecasting uses artificial neural network (ANN) models [3], [4]. Since ANNs can model nonlinearity, they are widely used for various forecasting applications. The most common type of ANN is the multilayer perceptron (MLP), which forecasts a load profile from previous load data (e.g., [5]). ANN based STLF models are well summarized in [6]. A deep neural network (DNN) is an ANN with more layers than the typical three layered MLP; the deep structure increases the feature abstraction capability of the network. Recently, the progress of the Internet of things (IoT) and big data has enabled the adoption of DNNs in diverse research fields. However, to the best of our knowledge, DNN based electricity load forecasting models are still scarce. For estimating building energy consumption, the conditional restricted Boltzmann machine and the factored conditional restricted Boltzmann machine are used in [7], and a deep belief network (DBN) based reinforcement learning forecasting model is developed in [8]. In [9], a DNN is used for time series wind forecasting, and a DBN is used as part of ensemble learning in [10]. In [11], the author trained a DNN with a discriminative pre-training method for electricity load forecasting. In this paper, we adopt DNNs as demand side day-ahead STLF models and analyze their forecasting results for 40 dominant industrial customers in Korea. We train the DNNs in two different ways: by pre-training restricted Boltzmann machines (RBM) and by using rectified linear units (ReLU) without pre-training. We compare the DNN forecasting results with a typical three layered shallow neural network (SNN), ARIMA, and the double seasonal Holt-Winters (DSHW) model.

The rest of this paper is organized as follows. In Section II, basic ANNs and DNNs are introduced. In Section III, we propose a DNN-based framework; implementation issues of the forecasting model follow in Section IV. Experimental results and the conclusion are given in Sections V and VI, respectively.

II. METHODOLOGY

Recently, DNNs have shown tremendous progress in many research fields such as acoustic modeling, natural language processing, and image recognition. ANN architectures had long remained shallow due to the problems of training deep networks, such as the vanishing gradient, overfitting, and a lack of computational power. In 2006, Hinton et al. published a seminal paper on deep learning [12], and since then deep learning has developed rapidly. With deeper layers, DNNs can capture highly abstracted features from data. Since load profiles exhibit nonlinear relations among the various factors that shape load patterns, it is reasonable to use a DNN as a forecast model. In this paper, we adopt two DNN models to learn the complicated relations between weather variables, dates, and previous consumption for individual customers. The DNN then produces a day-ahead forecast of the 24 hour load profile from past observations.

A. Artificial neural networks

An ANN is a system of connected artificial neurons, and it can model any arbitrary nonlinear function [13]. A typical ANN forecasting model is based on the MLP. The MLP has a three-layered structure: input, hidden, and output layers. Each layer consists of neurons without intra-layer connections; between layers, however, there are fully weighted connections among neurons. The elements of the MLP are as follows. Using a superscript $l$ to denote a layer, the output vector $y^l$ of layer $l$ is

$y^l = h(z^l) = h((W^l)^T y^{l-1} + b^l)$ (1)

where $h(\cdot)$ is a nonlinear vector activation function, and $W^l = [w_{ij}^l]$ and $b^l$ are a weight matrix and a bias vector, respectively. The back propagation (BP) algorithm updates the weight matrices to minimize the error between the target value and the value produced by the MLP. Each weight $w_{ij}^l$ is updated by

$\Delta w_{ij}^l = -\eta \, \partial E / \partial w_{ij}^l$ (2)

where $\eta$ is the learning rate and $E$ is the error; the gradient $\partial E / \partial w_{ij}^l$ is derived by the chain rule [14]. In the BP algorithm, an error signal $\delta$ propagates backwards to compute $\partial E / \partial w_{ij}^l$. In an SNN, $\delta$ propagates well down to the input layer. In a DNN, however, because the gradient of typical activation functions is zero in the saturation region, $\delta$ also decays to zero as it propagates toward the input layer. This problem, called the vanishing gradient, long made DNNs hard to train.

B. Deep neural networks

We adopt two techniques to overcome the difficulties of training a deep structure. The first is to pre-train the DNN by stacking RBMs, a construction known as a DBN. Neural networks with multiple hidden layers can be trained well by the BP algorithm when the weights are initialized by stacked RBMs [12]. Moreover, pre-training leads the network to better local minima that support better generalization [15]. An RBM is a neural network with only two layers, called the visible and hidden layers. A typical RBM has binary nodes in each layer, denoted by $v = \{v_i\}$ and $h = \{h_j\}$, respectively. For every $(v_i, h_j)$ pair there is a bidirectional connection, but there are no intra-layer connections. The probability of a specific $(v, h)$ configuration is determined by its energy $E(v, h)$ in the form

$p(v, h) = \frac{1}{Z} e^{-E(v, h)}$ (3)

where $Z = \sum_{v,h} e^{-E(v,h)}$ is a normalization factor.
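Before continuing with the RBM energy, the layer computation (1) from Section II-A can be made concrete with a minimal base-R sketch; the layer sizes and the sigmoid activation are illustrative assumptions, not the paper's configuration.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# One MLP layer, eq. (1): y^l = h((W^l)^T y^{l-1} + b^l)
layer_forward <- function(y_prev, W, b, h = sigmoid) {
  h(t(W) %*% y_prev + b)
}

# Illustrative three-layer MLP: 4 inputs -> 3 sigmoid hidden units -> 2 linear outputs
set.seed(1)
W1 <- matrix(rnorm(4 * 3), nrow = 4); b1 <- rnorm(3)
W2 <- matrix(rnorm(3 * 2), nrow = 3); b2 <- rnorm(2)
x  <- rnorm(4)
y1 <- layer_forward(x,  W1, b1)            # hidden layer
y2 <- layer_forward(y1, W2, b2, identity)  # linear output layer
```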
With bias vectors $b_v$ for the visible layer and $b_h$ for the hidden layer, and a weight matrix $W$ between $v$ and $h$, the energy of the RBM is defined as [16]

$E(v, h) = -b_v^T v - b_h^T h - v^T W h.$ (4)

Since there are no intra-layer connections, the states within the visible layer (and within the hidden layer) are conditionally independent of each other given the other layer. The probability of a specific node turning on is then

$p(h_j = 1 \mid v) = \sigma(b_{h,j} + v^T W_j),$ (5)
$p(v_i = 1 \mid h) = \sigma(b_{v,i} + h^T (W^T)_i)$ (6)

where $\sigma(x) = 1/(1 + e^{-x})$ and $W_j$ is the $j$-th column vector of the matrix $W$. The RBM is trained to maximize the log probability of $v$ over the given training data. The contrastive divergence algorithm is widely used to update $W$: it performs $k$-step Gibbs sampling to generate a reconstruction of the given training data and updates $W$ with

$\Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{recon})$ (7)

where $\epsilon$ is a learning rate and the angle brackets denote expectations under the distribution indicated by the subscript [16]. Geometrically, $\Delta W$ points in the direction that lowers the energy of the training data and raises the energy of the reconstructed data.

The RBM pre-training method initializes the DNN with the weights of stacked RBMs. The stacking process is as follows. First, an RBM is trained on the training data. Then its upper RBM is trained from the hidden layer of the previous RBM; the hidden layer of the lower RBM becomes the visible layer of the new upper RBM. This learning and stacking procedure iterates until the desired network structure is reached. After unsupervised RBM pre-training, supervised fine tuning is performed with the BP algorithm. Because the network already knows the features of the given training data, pre-training helps to train the DNN.

Another method of training a DNN is much simpler than pre-training: using ReLUs instead of typical sigmoid units [17]. The ReLU uses the rectifier $h(z) = \max(0, z)$ as its activation function.
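Before turning to how the rectifier's gradient behaves, the contrastive divergence update (7) with $k = 1$ can be sketched in a few lines of base R; the matrix layout (visible units in rows of W) and the learning rate are assumptions for illustration.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# One CD-1 update for a binary RBM; W is n_v x n_h, v0 is a binary training vector.
cd1_update <- function(v0, W, bv, bh, eps = 0.1) {
  ph0 <- sigmoid(bh + as.vector(t(W) %*% v0))   # p(h_j = 1 | v), eq. (5)
  h0  <- as.numeric(runif(length(ph0)) < ph0)   # sample hidden states
  pv1 <- sigmoid(bv + as.vector(W %*% h0))      # p(v_i = 1 | h), eq. (6)
  v1  <- as.numeric(runif(length(pv1)) < pv1)   # one-step reconstruction
  ph1 <- sigmoid(bh + as.vector(t(W) %*% v1))
  W + eps * (outer(v0, ph0) - outer(v1, ph1))   # update of eq. (7)
}
```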

Since the gradient of the rectifier is 1 for $z > 0$ and 0 otherwise, the error signal stays alive as it propagates backward to the lowest layer, and the DNN can be trained well by the BP algorithm without the vanishing gradient problem. Using ReLUs also has an advantage in training time thanks to this simple gradient.

III. PROPOSED FRAMEWORK

[Fig. 1. The proposed DNN short term load forecasting framework.]

In this section, we describe the proposed load forecasting framework using DNNs, as depicted in Fig. 1. The framework is based on the knowledge discovery in databases (KDD) process and consists of three modules: a data processing module, a DNN training module, and a DNN forecasting module.

A. Data processing

The upper part of Fig. 1 shows the data processing module. First, we acquire hourly measured customer load data provided by Korea Electric Power Corporation (KEPCO). Each customer record carries information about time, power consumption, province, and industrial category in accordance with the Korea Standard Industrial Classification (KSIC). The customers are dominant power contractors in their province and industrial category. There are eight representative industrial categories, and we randomly select five customers per category for this study. We also use weather data to forecast load profiles. From the regional weather database of the National Climate Data Service System (NCDSS), we select five parameters after a correlation analysis: daily average temperature, humidity, solar radiation, cloud cover, and wind speed.

Next, the accumulated time series load and weather data go through a preprocessing step comprising data cleansing, normalization, and restructuring. The load measurements contain some defective entries, such as missing values and zero values; in the data cleansing step, these defects are replaced with the average of the previous loads. After data cleansing, all load and weather data are normalized by their maximum values. If the training data had large values, the weight matrices would have to be extremely small to keep the weighted sums within the appropriate input range of the sigmoid function, and such small weights make the BP algorithm hard to use; we therefore normalize the database first and denormalize the forecasting results to obtain valid load forecasts.

The third part of preprocessing restructures the given database into training and test data. The training data contains labels (desired outputs) and input variables. Each label is a normalized daily load profile with 24 dimensions, to be produced from the corresponding input vector. The input variables could be any parameters that affect electricity consumption; here they consist of past electricity consumption, weather parameters for the target date, and season, month, and date indicators. With a time window index $n$, the composition of a training vector is described in Table I.

TABLE I: TRAINING DATA COMPOSITION
  y_1, ..., y_24               label: 24 hour normalized load profile
  x_1, ..., x_{24n}            past n normalized load profiles
  x_{24n+1}, ..., x_{26n}      date information of the past n days (day of the week, weekday)
  x_{26n+1}, ..., x_{26n+5}    weather parameters corresponding to the label
  x_{26n+6}, ..., x_{26n+9}    date information of the label (season, month, day of the week, weekday)

The test data is used to evaluate the accuracy of the proposed forecasting model and is not used in the training step. We also construct validation data and exploit the validation error as an indicator for selecting proper parameters.
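To make Table I concrete, the sketch below assembles one training pair in R; the input dimension is 26n + 9 (139 for n = 5), with 24 outputs. The feature encodings (integer day-of-week, a 0/1 weekday flag, integer season and month) are illustrative assumptions.

```r
# loads:   matrix of normalized daily profiles, one 24-hour row per day
# weather: matrix of 5 normalized weather parameters, one row per day
# dow, weekday, season, month: per-day date features (assumed encodings)
make_example <- function(loads, weather, dow, weekday, season, month, day, n = 5) {
  past <- (day - n):(day - 1)
  x <- c(as.vector(t(loads[past, ])),             # x_1 .. x_{24n}: past n load profiles
         dow[past], weekday[past],                # x_{24n+1} .. x_{26n}: past date info
         weather[day, ],                          # x_{26n+1} .. x_{26n+5}: target-day weather
         season[day], month[day], dow[day], weekday[day])  # x_{26n+6} .. x_{26n+9}
  list(x = x, y = loads[day, ])                   # y_1 .. y_24: label profile
}
```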
In the experiments, we apply 10-fold cross validation to determine the number of epochs.

B. Training & forecasting

The DNN training module is the main part of the framework; the implementation details are given in Section IV. We first determine the structure of the DNN, e.g., the numbers of hidden layers and neurons. The numbers of neurons in the input and output layers are fixed by the dimension of the training data and the period of the forecast load profile. If 15 minute load data were used, 15 minute interval load forecasts could be obtained by adding input and output neurons corresponding to each time slot. Once the DNN is created, we select a way to train it (pre-training or ReLU), and the DNN learns the nonlinear relations between load profiles and past observations.

The last part of the framework is the forecasting module. Using the trained DNN, we obtain forecasts for the target dates, and the model accuracy is evaluated on the test data. The mean absolute percentage error (MAPE) and relative root mean square error (RRMSE) are used as error measures.

IV. IMPLEMENTATION OF DNNS

We construct DNNs for each customer using R with the darch package [18] and train them on the customers' electricity consumption data.
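Staying in R, the two error measures can be written directly. The exact normalization of RRMSE varies in the literature; dividing the RMSE by the mean actual load, as below, is an assumption.

```r
# y: actual loads, yhat: forecasts (numeric vectors of equal length)
mape  <- function(y, yhat) 100 * mean(abs((y - yhat) / y))
rrmse <- function(y, yhat) 100 * sqrt(mean((y - yhat)^2)) / mean(y)
```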

[Fig. 2. The structure of the DNN in the proposed STLF framework. The number of neurons is shown above each layer.]

A. Size of data

There could be issues in using DNNs as a forecasting model if the size of the load data is limited. DNNs show remarkable performance in other fields with huge training databases; for example, the ImageNet large scale visual recognition challenge in 2014 provided over a million training images, and with training data of that scale pre-training is not even necessary. Load data, in contrast, is currently limited in quantity because the gathering of demand side load data has started only recently. We could obtain roughly 750 daily load profiles per customer over three years, excluding holidays. The number of parameters in a DNN may thus outnumber the training examples. This size-related problem could lead to overfitting, where the network memorizes the training data rather than learning from it. Therefore, we need to consider a proper DNN structure and guard against overfitting.

B. Structure of DNNs

The structure of a DNN is mainly determined by the numbers of layers and neurons. The numbers of input and output neurons are automatically fixed by the dimension of the training data and the forecasting period; by adjusting them, the DNN can deal with different measuring intervals and forecasting periods. In [19], the authors used a heuristic method to choose the number of hidden neurons. Such heuristics may work for an SNN because there are only two possible actions, increase or decrease; for DNNs, however, it is not feasible to test all possible combinations of layers and neurons, since both the width and depth of the network must be considered. We therefore determine the structure by trial and error (adding or subtracting 50 neurons per layer) and finally select four hidden layers with 150 neurons per layer. The hidden layers have either sigmoid units (for pre-training) or ReLUs, and the output layer has linear units to construct the load profile. The proposed DNN structure is described in Fig. 2. In the experiments, we train the DNNs with mini batches and resilient backpropagation [20]. In addition, we train the DNNs without non-business days to focus on business-day load profiles, and a time window of five is used.

C. Resolving overfitting

Overfitting can be recognized by observing a learning curve that plots the error rate against the number of epochs.

[Fig. 3. An example of a customer's learning curve (MAPE vs. epochs for training and test data).]

Errors on both the training and test data gradually decrease. When overfitting occurs, there is a point where the test error starts to increase while the training error still decreases; the model memorizes the given training data but cannot forecast well in a real situation. Thus, too much training of a complex forecasting model leads to bad generalization. There are several methods to prevent overfitting, such as dropout [21] or early stopping. However, when we observe the learning curves of our DNNs, overfitting does not occur: the test error gradually decreases along with the training error in both the pre-training and ReLU models. An example of a customer's learning curve with 100 days of test data is depicted in Fig. 3. Since customers have different learning curves, we apply k-fold cross validation to determine the stopping criterion. From this study, the size of the electricity consumption data seems less critical, but this may require further verification with different network structures and/or when more data become available.
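As a sketch of this stopping rule, the epoch count can be chosen as the minimizer of the mean validation MAPE across folds; the matrix layout below is a hypothetical convention, not the darch interface.

```r
# val_mape: k x E matrix, val_mape[f, e] = validation MAPE of fold f after epoch e
best_epoch <- function(val_mape) which.min(colMeans(val_mape))
```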
V. EXPERIMENTAL RESULTS

In this section, the experimental results of STLF by DNNs are given. MAPE and RRMSE measure the accuracy of the daily forecasts, and the experiments focus on forecasting load profiles of business days. The forecasting results of the DNNs are compared with the SNN, ARIMA, and DSHW models in order to verify the DNN as a forecasting model for individual customers. The ARIMA and DSHW models are created for each daily forecast from the past 8 weeks of data.

A. Case 1: single load type

We first present results for one selected customer and analyze the forecasts across the four seasons. The customer belongs to the networking business category and has typical daily load patterns with seasonal variation. Fig. 4 shows the customer's electricity consumption from 2012 to 2014; there is a significant load increase in summer, and the average hourly electricity consumption is 1,642 kWh. We forecast load profiles for three weeks in the middle of each season. As an example, the summer forecasts are shown in Fig. 5, with one daily load profile zoomed in the box. ARIMA with daily seasonality gives the worst forecasts: the load shapes resemble the real loads, but ARIMA cannot reflect the recent trend. The DSHW model considers both daily and weekly seasonality and shows better accuracy than the ARIMA model.

[TABLE II. Seasonal error table (%) of the selected customer: MAPE and RRMSE for RBM, ReLU, SNN, DSHW, and ARIMA over spring, summer, fall, winter, and their average (numeric entries not recovered).]

[Fig. 4. An example of a customer's load pattern (MWh vs. date index and time of day). Daily load profiles can be recognized on the load and time axes, while seasonal load variations appear on the load and date index axes.]

[Fig. 5. Seasonal forecasting results of the selected customer in summer (load in kW vs. time; real load vs. DNN(RBM), DNN(ReLU), SNN, DSHW, and ARIMA). The figure includes the hourly load forecasts of the 15 days described in Table II.]

However, DSHW sometimes shows unusual patterns, depicted in the box of Fig. 5, which seem to be the imprint of past abnormal consumption. In contrast, the forecasts of the two proposed DNN models are more stable than those of the ARIMA and DSHW models. When there is a gap between the real load and the DNN forecasts, the other models exhibit even wider gaps, and it is hard to find the opposite case. In all four seasons, the DNN forecasts follow the recent trend and the real load patterns well. The average errors over all seasons are shown in Table II. As can be seen, the two DNN forecasting models are overall more accurate than the SNN, DSHW, and ARIMA models. The DNN with pre-training shows the best average accuracy, with 3.2% MAPE and 4.1% RRMSE; its MAPE is 27% lower than that of the SNN and 12% lower than that of DSHW. Due to the average gap of the ARIMA forecasts, the MAPE of the DNN is much lower than that of ARIMA, i.e., by 69%. Furthermore, the DNN models forecast well without being affected by seasonal variation.

B. Case 2: various load types

To assess the generalized performance of the proposed DNNs, we apply them to a large load database: eight industrial categories with five randomly selected customers in each. We forecast daily load profiles for 10 weeks, from Sep. 13 to Dec. 19 of 2014. Considering the inaccurate results of Case 1 and the burden of model selection, we exclude ARIMA in Case 2. As an example, the errors of customers in the public administration category and the average errors over all eight industries are shown in Table III. Due to space limitations, only one industrial category is shown in detail, with the average over the eight industries and 40 customers in the bottom line; MAPEs over 100% are not counted in the result.

[TABLE III. Error table of the public administration category and average errors of the 8 industries (40 customers): customer ID, peak load (kW), and MAPE/RRMSE for DNN(RBM), DNN(ReLU), SNN, and DSHW (numeric entries not recovered).]

As described in the table, both DNN models give accurate forecasts even though the SNN or DSHW sometimes performs better. Overall, the average MAPE of the DNN with ReLU is 17% and 9% lower than that of the SNN and DSHW, respectively; the decrease in RRMSE is more drastic, 22% and 29% lower than the SNN and DSHW. For further analysis, we compare the customers' daily errors. Fig. 6 shows the empirical cumulative distribution function (CDF) of the daily RRMSEs.

[Fig. 6. Empirical CDF of daily RRMSE for DNN(RBM), DNN(ReLU), SNN, and DSHW.]

In terms of the RRMSE CDF, the two DNN models are always better than the SNN; since DNNs have more computational power, this is not surprising. Compared with DSHW, the two DNN models and DSHW perform similarly in the small error region: roughly, when errors are under 30%, DSHW performs better than the DNNs, but beyond that the DNNs begin to outperform DSHW. The DNNs thus forecast more reliably even when loads are uncertain and forecasting errors are high, which implies the robustness of the DNN models. Between the two DNN models, there is no significant difference in accuracy; since RBM pre-training needs more computational power and time, the DNN with ReLU could be the better choice for demand side forecasting.
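The comparison in Fig. 6 can be reproduced with base R's ecdf; the per-model vectors of daily RRMSEs (rrmse_rbm and so on) are hypothetical placeholders.

```r
# Empirical CDFs of daily RRMSE, one curve per model (placeholder error vectors)
plot(ecdf(rrmse_rbm), main = "Empirical CDF of daily RRMSE", xlab = "RRMSE (%)")
plot(ecdf(rrmse_relu), add = TRUE, col = "red")
plot(ecdf(rrmse_snn),  add = TRUE, col = "blue")
plot(ecdf(rrmse_dshw), add = TRUE, col = "darkgreen")
```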
VI. CONCLUSION

In this paper, we proposed a DNN-based demand side STLF framework. The framework comprises data processing, DNN training, and DNN forecasting steps, and it forecasts the daily load profile a day ahead based on weather, date, and past electricity consumption. We found that DNNs effectively learn the characteristics of load patterns and show better performance than the other forecasting models. Our extensive experiments show that the two proposed approaches, DNN with pre-training and DNN with ReLU, can be trained well with up to three years of customer load data without overfitting. The performance of the forecasting models is evaluated with MAPE and RRMSE. Due to space limitations, we cannot provide all the test results for the 40 customers in the eight industrial categories, but we verify that overall the DNNs outperform the existing methods; e.g., MAPE and RRMSE are reduced by up to 17% and 22% compared to the SNN, and by 9% and 29% compared to DSHW.

The results of this paper can be extended in several directions. We are currently gathering nation-wide data and expect to train the DNNs better when such big data becomes available; it would then be possible to explore better DNN structures, e.g., a deep recurrent neural network as a time series forecasting model. In addition, analyzing the conditions (e.g., load type, load pattern, date) under which a DNN is accurate would help in selecting a proper forecasting model or in applying different data processing schemes to individual customers.

ACKNOWLEDGMENT

This research was partially supported by Korea Electric Power Corporation and Korea Electrical Engineering & Science Research Institute (grant number R14XA2-5) and by Korea Institute of Energy Technology Evaluation and Planning.

REFERENCES

[1] M. T. Hagan and S. M. Behr, "The time series approach to short term load forecasting," IEEE Transactions on Power Systems, vol. 2, no. 3, 1987.
[2] J. W. Taylor, "Short-term electricity demand forecasting using double seasonal exponential smoothing," Journal of the Operational Research Society, vol. 54, no. 8, 2003.
[3] D. C. Park, M. El-Sharkawi, R. Marks, L. Atlas, and M. Damborg, "Electric load forecasting using an artificial neural network," IEEE Transactions on Power Systems, vol. 6, no. 2, 1991.
[4] L. Hernandez, C. Baladrón, J. M. Aguiar, B. Carro, A. J. Sanchez-Esguevillas, and J. Lloret, "Short-term load forecasting for microgrids based on artificial neural networks," Energies, vol. 6, no. 3, 2013.
[5] A. G. Bakirtzis, V. Petridis, S. Kiartzis, and M. C. Alexiadis, "A neural network short term load forecasting model for the Greek power system," IEEE Transactions on Power Systems, vol. 11, no. 2, 1996.
[6] H. S. Hippert, C. E. Pedreira, and R. C. Souza, "Neural networks for short-term load forecasting: A review and evaluation," IEEE Transactions on Power Systems, vol. 16, no. 1, 2001.
[7] E. Mocanu, P. H. Nguyen, M. Gibescu, and W. L. Kling, "Deep learning for estimating building energy consumption," Sustainable Energy, Grids and Networks, vol. 6, 2016.
[8] E. Mocanu, P. H. Nguyen, W. L. Kling, and M. Gibescu, "Unsupervised energy prediction in a smart grid context using reinforcement cross-building transfer learning," Energy and Buildings, vol. 116, 2016.
[9] M. Dalto, J. Matusko, and M. Vasak, "Deep neural networks for ultra-short-term wind forecasting," in Industrial Technology (ICIT), 2015 IEEE International Conference on, 2015.
[10] X. Qiu, L. Zhang, Y. Ren, P. N. Suganthan, and G. Amaratunga, "Ensemble deep learning for regression and time series forecasting," in CIEL, 2014.
[11] W. He, "Deep neural network based load forecast," Computer Modeling and New Technologies, vol. 18, no. 3, 2014.
[12] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504-507, 2006.
[13] M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, "Multilayer feedforward networks with a nonpolynomial activation function can approximate any function," Neural Networks, vol. 6, no. 6, 1993.
[14] Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin, Learning from Data. AMLBook, 2012.
[15] D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?" The Journal of Machine Learning Research, vol. 11, pp. 625-660, 2010.
[16] G. Hinton, "A practical guide to training restricted Boltzmann machines," Momentum, vol. 9, no. 1, p. 926, 2010.
[17] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. ICML, vol. 30, 2013.
[18] M. Drees, "Implementierung und Analyse von tiefen Architekturen in R" (Implementation and analysis of deep architectures in R), Master's thesis, Fachhochschule Dortmund, 2013.
[19] H. Shayeghi, H. Shayanfar, and G. Azimi, "Intelligent neural network based STLF," International Journal of Computer Systems Science and Engineering, vol. 4, no. 1, 2009.
[20] M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," in Neural Networks, 1993 IEEE International Conference on, 1993.
[21] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.


Univariate versus Multivariate Models for Short-term Electricity Load Forecasting

Univariate versus Multivariate Models for Short-term Electricity Load Forecasting Univariate versus Multivariate Models for Short-term Electricity Load Forecasting Guilherme Guilhermino Neto 1, Samuel Belini Defilippo 2, Henrique S. Hippert 3 1 IFES Campus Linhares. guilherme.neto@ifes.edu.br

More information

Deep Learning Srihari. Deep Belief Nets. Sargur N. Srihari

Deep Learning Srihari. Deep Belief Nets. Sargur N. Srihari Deep Belief Nets Sargur N. Srihari srihari@cedar.buffalo.edu Topics 1. Boltzmann machines 2. Restricted Boltzmann machines 3. Deep Belief Networks 4. Deep Boltzmann machines 5. Boltzmann machines for continuous

More information

Recurrent Neural Network Based Gating for Natural Gas Load Prediction System

Recurrent Neural Network Based Gating for Natural Gas Load Prediction System Recurrent Neural Network Based Gating for Natural Gas Load Prediction System Petr Musilek, Member, IEEE, Emil Pelikán, Tomáš Brabec and Milan Šimůnek Abstract Prediction of natural gas consumption is an

More information

SHORT TERM LOAD FORECASTING

SHORT TERM LOAD FORECASTING Indian Institute of Technology Kanpur (IITK) and Indian Energy Exchange (IEX) are delighted to announce Training Program on "Power Procurement Strategy and Power Exchanges" 28-30 July, 2014 SHORT TERM

More information

Demand Forecasting in Deregulated Electricity Markets

Demand Forecasting in Deregulated Electricity Markets International Journal of Computer Applications (975 8887) Demand Forecasting in Deregulated Electricity Marets Anamia Electrical & Electronics Engineering Department National Institute of Technology Jamshedpur

More information

4. Multilayer Perceptrons

4. Multilayer Perceptrons 4. Multilayer Perceptrons This is a supervised error-correction learning algorithm. 1 4.1 Introduction A multilayer feedforward network consists of an input layer, one or more hidden layers, and an output

More information

Retrieval of Cloud Top Pressure

Retrieval of Cloud Top Pressure Master Thesis in Statistics and Data Mining Retrieval of Cloud Top Pressure Claudia Adok Division of Statistics and Machine Learning Department of Computer and Information Science Linköping University

More information

Multi-Plant Photovoltaic Energy Forecasting Challenge with Regression Tree Ensembles and Hourly Average Forecasts

Multi-Plant Photovoltaic Energy Forecasting Challenge with Regression Tree Ensembles and Hourly Average Forecasts Multi-Plant Photovoltaic Energy Forecasting Challenge with Regression Tree Ensembles and Hourly Average Forecasts Kathrin Bujna 1 and Martin Wistuba 2 1 Paderborn University 2 IBM Research Ireland Abstract.

More information

Short Term Power Load Forecasting Using Deep Neural Networks

Short Term Power Load Forecasting Using Deep Neural Networks Short Term Power Load Forecasting Using Deep Neural Networks Ghulam Mohi Ud Din Department of Computer Science Liverpool John Moores University United Kingdom g.mohiuddin@215.ljmu.ac.uk Angelos K. Marnerides

More information

Numerical Interpolation of Aircraft Aerodynamic Data through Regression Approach on MLP Network

Numerical Interpolation of Aircraft Aerodynamic Data through Regression Approach on MLP Network , pp.12-17 http://dx.doi.org/10.14257/astl.2018.151.03 Numerical Interpolation of Aircraft Aerodynamic Data through Regression Approach on MLP Network Myeong-Jae Jo 1, In-Kyum Kim 1, Won-Hyuck Choi 2 and

More information

Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6

Administration. Registration Hw3 is out. Lecture Captioning (Extra-Credit) Scribing lectures. Questions. Due on Thursday 10/6 Administration Registration Hw3 is out Due on Thursday 10/6 Questions Lecture Captioning (Extra-Credit) Look at Piazza for details Scribing lectures With pay; come talk to me/send email. 1 Projects Projects

More information

Stochastic Gradient Estimate Variance in Contrastive Divergence and Persistent Contrastive Divergence

Stochastic Gradient Estimate Variance in Contrastive Divergence and Persistent Contrastive Divergence ESANN 0 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges (Belgium), 7-9 April 0, idoc.com publ., ISBN 97-7707-. Stochastic Gradient

More information

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others) Machine Learning Neural Networks (slides from Domingos, Pardo, others) Human Brain Neurons Input-Output Transformation Input Spikes Output Spike Spike (= a brief pulse) (Excitatory Post-Synaptic Potential)

More information