Nonparametric forecasting of the French load curve

Similar documents
Predicting intraday-load curve using High-D methods

High dimensional statistical learning methods applied to energy data sets

Graph Functional Methods for Climate Partitioning

Image processing and nonparametric regression

SHORT TERM LOAD FORECASTING

A Bayesian Perspective on Residential Demand Response Using Smart Meter Data

Forecasting the electricity consumption by aggregating specialized experts

peak half-hourly South Australia

peak half-hourly Tasmania

Forecasting demand in the National Electricity Market. October 2017

peak half-hourly New South Wales

2013 WEATHER NORMALIZATION SURVEY. Industry Practices

CHAPTER 6 CONCLUSION AND FUTURE SCOPE

A Short Introduction to the Lasso Methodology

Predicting the Electricity Demand Response via Data-driven Inverse Optimization

Wind power and management of the electric system. EWEA Wind Power Forecasting 2015 Leuven, BELGIUM - 02/10/2015

WEATHER NORMALIZATION METHODS AND ISSUES. Stuart McMenamin Mark Quan David Simons

A Sparse Linear Model and Significance Test. for Individual Consumption Prediction

SOLAR POWER FORECASTING BASED ON NUMERICAL WEATHER PREDICTION, SATELLITE DATA, AND POWER MEASUREMENTS

CP:

The risk of machine learning

Advances in promotional modelling and analytics

Computational and Statistical Aspects of Statistical Machine Learning. John Lafferty Department of Statistics Retreat Gleacher Center

Linear Regression. Aarti Singh. Machine Learning / Sept 27, 2010

FORECASTING: A REVIEW OF STATUS AND CHALLENGES. Eric Grimit and Kristin Larson 3TIER, Inc. Pacific Northwest Weather Workshop March 5-6, 2010

Frequency Forecasting using Time Series ARIMA model

Electric Load Forecasting Using Wavelet Transform and Extreme Learning Machine

SYSTEM OPERATIONS. Dr. Frank A. Monforte

Probabilistic forecasting of solar radiation

Alternatives to Basis Expansions. Kernels in Density Estimation. Kernels and Bandwidth. Idea Behind Kernel Methods

Machine learning, shrinkage estimation, and economic theory

Multivariate Regression Model Results

Regression Shrinkage and Selection via the Lasso

Short-Term Electrical Load Forecasting for Iraqi Power System based on Multiple Linear Regression Method

Development of Short-term Demand Forecasting Model And its Application in Analysis of Resource Adequacy. For discussion purposes only Draft

AESO Load Forecast Application for Demand Side Participation. Eligibility Working Group September 26, 2017

Integrated Electricity Demand and Price Forecasting

22/04/2014. Economic Research

Short-Term Demand Forecasting Methodology for Scheduling and Dispatch

Defining Normal Weather for Energy and Peak Normalization

elgian energ imports are managed using forecasting software to increase overall network e 칁 cienc.

Forecasting Using Time Series Models

Systems Operations. PRAMOD JAIN, Ph.D. Consultant, USAID Power the Future. Astana, September, /6/2018

Optimization of the forecasting of wind energy production Focus on the day ahead forecast

Analysis and the methods of forecasting of the intra-hour system imbalance

MODELLING ENERGY DEMAND FORECASTING USING NEURAL NETWORKS WITH UNIVARIATE TIME SERIES

An Improved Method of Power System Short Term Load Forecasting Based on Neural Network

Regression, Ridge Regression, Lasso

Country scale solar irradiance forecasting for PV power trading

Chronological Time-Period Clustering for Optimal Capacity Expansion Planning With Storage

Weather Prediction Using Historical Data

Improved Holt Method for Irregular Time Series

Importance of Numerical Weather Prediction in Variable Renewable Energy Forecast

The BLP Method of Demand Curve Estimation in Industrial Organization

Biostatistics Advanced Methods in Biostatistics IV

NASA Products to Enhance Energy Utility Load Forecasting

Multi-Plant Photovoltaic Energy Forecasting Challenge with Regression Tree Ensembles and Hourly Average Forecasts

Electricity Demand Probabilistic Forecasting With Quantile Regression Averaging

STAT 462-Computational Data Analysis

Problem Set 2 Solution Sketches Time Series Analysis Spring 2010

ZONAL AND REGIONAL LOAD FORECASTING IN THE NEW ENGLAND WHOLESALE ELECTRICITY MARKET: A SEMIPARAMETRIC REGRESSION APPROACH

A SARIMAX coupled modelling applied to individual load curves intraday forecasting

Parking Occupancy Prediction and Pattern Analysis

About Nnergix +2, More than 2,5 GW forecasted. Forecasting in 5 countries. 4 predictive technologies. More than power facilities

Introduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones

Univariate versus Multivariate Models for Short-term Electricity Load Forecasting

SMART GRID FORECASTING

ECON/FIN 250: Forecasting in Finance and Economics: Section 4.1 Forecasting Fundamentals

A Comparison of the Forecast Performance of. Double Seasonal ARIMA and Double Seasonal. ARFIMA Models of Electricity Load Demand

Spatially-Explicit Prediction of Wholesale Electricity Prices

Predicting Future Energy Consumption CS229 Project Report

Model Selection and Geometry

UNIVERSITETET I OSLO

Reduced Overdispersion in Stochastic Weather Generators for Statistical Downscaling of Seasonal Forecasts and Climate Change Scenarios

High-dimensional regression

Short Term Load Forecasting Using Multi Layer Perceptron

Time Series and Forecasting

Week 5 Quantitative Analysis of Financial Markets Modeling and Forecasting Trend

Dennis Bricker Dept of Mechanical & Industrial Engineering The University of Iowa. Forecasting demand 02/06/03 page 1 of 34

Chapter 13: Forecasting

Short Term Load Forecasting Based Artificial Neural Network

MISO September 15 Maximum Generation Event Overview. October 11, 2018

Applied Machine Learning for Biomedical Engineering. Enrico Grisan

Identifying the Monetary Policy Shock Christiano et al. (1999)

Habilitationsvortrag: Machine learning, shrinkage estimation, and economic theory

Weather generators for studying climate change

Chapter 7 Forecasting Demand

Electricity Demand Forecasting using Multi-Task Learning

Solar irradiance forecasting for Chulalongkorn University location using time series models

Forecasting of Electric Consumption in a Semiconductor Plant using Time Series Methods

Mining Big Data Using Parsimonious Factor and Shrinkage Methods

Uncertainty in energy system models

Extending clustered point process-based rainfall models to a non-stationary climate

WEATHER DEPENENT ELECTRICITY MARKET FORECASTING WITH NEURAL NETWORKS, WAVELET AND DATA MINING TECHNIQUES. Z.Y. Dong X. Li Z. Xu K. L.

Battery Energy Storage

Robust Building Energy Load Forecasting Using Physically-Based Kernel Models

Explanatory Information Analysis for Day-Ahead Price Forecasting in the Iberian Electricity Market

Forecasting in Hierarchical Models

WIND energy has become a mature technology and has

25 : Graphical induced structured input/output models

Transcription:

An overview RTE & UPMC-ISUP ISNPS 214, Cadix June 15, 214

Table of contents 1 Introduction 2 MAVE modeling 3 IBR modeling 4 Sparse modeling

Electrical context Generation must be strictly equal to consumption at all times Electricity cannot be stored on an industrial scale, and demand fluctuates heavily. The French Electricity Transmission System Operators (RTE) shall ensure the balance of electricity flows on the network at all times, and an optimum management of electricity flows across the network. There are lot of forecasting needs: consumption, electricity losses, market prices, wind and solar generation, from short (up to 7 days) to long (few years) term, at different scales. The daily coordination is facilitated by having short term demand-supply balance predictions on a day ahead horizon.

Short term load curve forecasting The French global electricity consumption ( load ) increases in winter due to the electrical heating, and also in summer due to the air-conditioning. RTE drafts its load curve forecasting taking into account historical consumption patterns, weather forecast and daily pricing information. The dispatchers make the final load balancing decisions, taking into account the most recent information, including unexpected modifications of consumption patterns (e.g. strikes, national sporting events) and of generation (e.g. thermal generation unavailabilities, wind and solar generation variations). Modifying forecasting (decision-support) tools for a TSO is very sensitive and costly.

Statistics context Actually, RTE uses a 3 steps procedure: Step 1 RTE corrects the half hourly load curve by modeling the impact of climate (temperature and cloud cover measured at 32 weather stations) and prices, in order to work on a time series that doesn t depend on exogenous variables. This step is done by using a regression model with dependent variables based on climate and tariff. We denote the corrected series by Z t. Step 2 RTE uses a SARIMA model to forecast Z t at the horizon H: Ẑ t+h. Step 3 RTE adds the forecasts given in Step 2 with the estimation given by the regression model using prices and forecasts for the temperature and cloud cover.

Electrical and statistics challenges Modeling problematics Parametric paradigm: rigid and not adapted to the structure of some time series. Nonparametric paradigm: curse of dimensionality, interpretability. Big data... and huge variability (smart grids context)

MAVE modeling Xia et al. (22) proposed the Minimum Average (conditional) Variance Estimation: ( ) Y = g B T X + ε where : Y R is the response variable X R p are p covariates ε is a random variable in L 1 (Ω, A, P). E (ε /X ) = almost surely g : R D R is the unknown link function B = ( β 1,..., β D), with D {1,..., p}, is an unknown p D matrix such B T B = I D

MAVE estimation Given D, Effective Dimension Reduction (EDR) B is solution of : [ ( ( / )) ] 2 min E Y E Y B T X B/B T B=I D Using the conditional variance of Y given B T X, σb 2 ( B T X ), minimisation problem is equivalent to : [ ( )] min E σb 2 B T X B/B T B=I D With a local linear expansion, one can obtain finally: min n n B/B T B=I D,(a j,b j) j {1,...,n} j=1 i=1 { [ 2 Y i a j + bj T B T (X i X j )]} ωij where (ω il ) i {1,...,n} are weights between X i and X l.

MAVE conclusions Some computational (iterative algorithm for weights) and specification difficulties (correlated data). Tests of a partially linear MAVE modeling. No good enough results to continue the project. 2 alternatives: IBR & sparse regression.

IBR modeling Joint work with P-A. Cornillon (Rennes 2), N. Hengartner (LANL), Eric Matzner-Løber (Rennes 2) Suppose the data (X i, Y i ) R p R are related via the following regression model: Y i = m(x i ) + ε i, i = 1,..., n where the errors are independent of all the covariates. It can be rewritten in vector form: Y = m + ε where Y = (Y 1,..., Y n ) t, m = (m(x 1 ),..., m(x n )) t and ε = (ε 1,..., ε n ) t

Linear smoothers Linear smoothers, which depend on a tuning parameter λ (trade-off between the smoothness of the estimate and the goodness-of-fit), can be written as: m = S λ (X )Y, where S λ (X ) is an n n smoothing matrix and m = Ŷ denotes the fitted values.

The IBR procedure Step 1 The pilot smoother is applied to the data (oversmoothing the data): m 1 = S λ (X )Y. Step 2 The bias E( m 1 X ) m = (S λ (X ) I )m is estimated by replacing m by a smooth linear estimate (possibly using the same pilot smoother). The initial estimator is then corrected by removing the estimated bias: m 2 = [S λ (X ) + S λ (X )(I S λ (X ))]Y. Step k The bias estimation and bias correction steps can be iterated to generate a sequence of bias corrected smoothers, which gives at step k m k = [I (I S λ (X )) k ]Y.

The number of iterations choice The number of iterations is obtained by a Generalized Cross Validation criterium: GCV (k) = log σ 2 k 2 log (1 tr([i (I S λ(x )) k ]) ), n where σ k 2 corresponds to the estimated variance of the current residuals at step k.

Practical Implementation Based on half-hourly data (Z 1,, Z T ) (T corresponds to 12am), we use the following model to predict Z T +25,, Z T +72 : Z (T 48i)+25 = f (Z T 48i,..., Z T 48i p ) + ε i > where p is the memory (48 or 96). We might consider also different models for any H [25,, 72]: Z (T 48i)+H = f H (Z T 48i,..., Z T 48i p ) + ε H i >. (1) The IBR procedure was applied using the R package ibr (available on CRAN).

IBR conclusions Some good empirical results (e.g. in 21, the IBR MAPE is.98% against 1.12 % for SARIMA). But for the moment no exogenous variable included.

Electrical Consumption Time series Joint work with M. Mougeot (Paris 7), D. Picard (Paris 7), K. Tribouley (Paris 7), L. Maillard-Teyssier (RTE) 6 x 14 2167 1 2162 7 5.5 5 4.5 4 3.5 3 5 1 15 2 25 3 35 Figure : Two weeks of electrical consumption

Intraday load curves 6 x 14 2167 1 2162 7 5.5 5 4.5 4 3.5 3 5 1 15 2 25 3 35 Figure : Functional data, Intra day load curves

Intraday load curve 5.5 x 14 2167 1 5 4.5 4 3.5 3 5 1 15 2 25 Intra day load curve, 3 sampling (48 pts), Y R n=48 (Y t, 1 t 28)

Intra-day load curves 7 x 14 Electrical Consumption signals 6.5 6 5.5 5 4.5 4 3.5 5 1 15 2 25 Intraday load curves for some days. 23-1-27: dashed dot line, 23-8-28: solid line, 23-1-1: dot line, 23-4-1: dashed line.

Spot of Temperatures, Cloud Cover and Wind information Figure : Temp., Cloud Cover spots ( 39) and wind data ( 293)

Modeling each signal as a function We investigate the problem in a supervised learning setting. We consider each time unit signal (U i, Y i ), i = 1,..., n The generic consumption signal observed on the time unit: Y i, i = 1,..., n The design (here equi-distributed): U i = i n We want to identify f (for each signal) in such a way that the model Y i = f (U i ) + ɛ i. makes sense (has small errors ɛ i s).

Using a dictionary Consider a dictionary D of functions D = {g 1,... g p } and Assume that f can be well fitted by this dictionary f = p β l g l + h l=1 where h is a small function (in absolute value). The model is p Y i = β l g l (U i ) + h(u i ) + ɛ i, i = 1,..., n l=1 which coincides with the linear model : Y = X β + ɛ with X (n p) putting ɛ i = h(u i ) + ɛ i and G il = g l (U i ).

High dimensional framework Solution: ˆβ = Argmin Y X β 2 - More variables than observations n << p β 1 y 1 x 11...... x 1p β 2 y 2 =...... x n1... x np y n β p + ɛ Fat matrix Infinity of ˆβ solutions. Need more assumptions on β to solve the problem e.g. Lasso, Ridge

Alternative procedure Learning Out of Leaders*: based on 2 Thresholding steps, weak complexity, sparse solution, Algorithm in 3 steps (normalized X column): step compute size 1. SELECTION Find b Leaders X b (n, b) (threshold) b < n << p 2. REGRESSION on Leaders β = (X T b X b) 1 X T b Y (1, b) 3. THRESHOLD the coefficients ˆβ (1, Ŝ)

Generic Dictionary For daily load curves, a good choice happened finally to be a mixture of the Fourier basis and the Haar basis (p = 62): 1 (1:1) constant function (1) 2 (2:24) cosine functions (with increasing frequencies) (23) 3 (25:47) sine functions (with increasing frequencies)(23) 4 (48:62) Haar functions (with increasing frequencies)(15)

November 18 th 27 S = 12, MAPE =.57 =, 57%. MAPE = 1 n=48 n i Y i Ŷ i /Y i 7.2 x 14 Electrical consumption 271118 7 x 14 271118 Dictionnary coefficients (12) 7 6 C S H 6.8 5 6.6 4 3 6.4 2 6.2 1 6 5.8.1.2.3.4.5.6.7.8.9 1 1 1 2 3 4 5 6 7 Figure : 27 11 18 left: observed signal - red line, approximated signal -blue line right: S coefficients on the dictionary

June 17 th, 29 S = 5, MAPE =.147 5.6 x 14 Electrical consumption 29617 5 x 14 29617 Dictionnary coefficients (6) 5.4 5.2 4 C S H 5 4.8 3 4.6 2 4.4 4.2 1 4 3.8 3.6.1.2.3.4.5.6.7.8.9 1 1 1 2 3 4 5 6 7 Figure : 23 4 3 left: observed signal - red line, approximated signal -blue line right: S coefficients on the dictionary

Segmentation of the intra-day load curves using Sparse Approximation on a Generic Dictionary 8 years of data: T = 28 intra day load curves (n = 48) Using the sparse approximation (same support, S = 8) Using a clustering algorithm in 2 steps (k-means algorithm) Segmentation of the daily signals in clusters... From Cluster to groups using calendar interpretation

Two step Clustering results x 1 4 x 1 4 x 1 4 x 1 4 9 9 9 9 8 8 8 8 7 7 7 7 6 6 6 6 5 5 5 5 4 4 4 4 3 Cluster 1 (66) 1 2 3 Cluster 2 (173) 1 2 3 Cluster 3 (62) 1 2 3 Cluster 4 (51) 1 2 Figure : T = 28 intra day load curves of size n = 48 (clustering using S = 7 approximated coefficients)

3 2 1 4 2 2 1 4 2 1 2 3 4 5 6 7 1 2 3 4 5 6 7 1 2 3 4 5 6 7 1 2 3 4 5 6 7 1 5 2 1 2 1 3 2 1 1 2 3 4 5 6 7 8 911112 1 2 3 4 5 6 7 8 911112 1 2 3 4 5 6 7 8 911112 1 2 3 4 5 6 7 8 911112 2 1 6 4 2 2 1 2 1 1 2 3 4 5 6 7 1 2 3 4 5 6 7 1 2 3 4 5 6 7 1 2 3 4 5 6 7 3 2 1 3 2 1 4 2 4 2 1 2 3 4 5 6 7 8 911112 1 2 3 4 5 6 7 8 911112 1 2 3 4 5 6 7 8 911112 1 2 3 4 5 6 7 8 911112 From clusters to groups From clusters: Cluster 1.a Day Cluster 1.a Month Cluster 3.a Day Cluster 3.a Month Cluster 1.b Day Cluster 1.b Month Cluster 3.b Day Cluster 3.b Month Cluster 2.a Day Cluster 2.a Month Cluster 4.a Day Cluster 4.a Month Cluster 2.b Day Cluster 2.b Month Cluster 4.b Day Cluster 4.b Month To groups: calendar interpretation of the clusters Months Days 1 2 3 4 5 6 7 8 9 1 11 12 1 7 8 5 3 3 3 3 1 3 3 5 7 2 7 8 5 3 3 3 3 1 3 3 5 7 3 7 8 5 3 3 3 3 1 3 3 5 7 4 7 8 5 3 3 3 3 1 3 3 5 7 5 7 8 5 3 3 3 3 1 3 3 5 7 6 6 8 4 4 2 2 2 2 2 2 4 6 7 6 6 4 4 2 2 2 2 2 2 4 6

Intraday Specific Dictionary Each day t, Y t = X t β t + ɛ t With Dictionary of p functions D t = {g t 1,... g t p} Final model, p = 1 (p = 14) 1 2, Shape fonctions (group centroid, previous week day Y t 7 ) 2 8, Climate fonctions (Temperature and Cloud Cover Indicators computed over the 39 meteorological spots. (and Wind...(+4)) Forecast: Plug in estimated coefficients M(t), with M expert

Conclusion Universal approach for functional data (with intra day pattern) Sparse approximation using A Generic dictionary for compression and pattern extraction Intra day specific dictionaries for approximation and prediction Forecasting Various experts for prediction Agregation using exponential weights Competitive approach compared to usual time serie analysis with much less parameters.

Specialized Experts focus on 1 (2) Time depending (t-1, t-7) 2 (2) climatic configuration of the day (Temperature) 3 (2) constrained climatic configuration of the day (Temperature/Cloud Covering) 4 group constraint climatic configuration of the day (Temperature/group) 5 climatic configuration of the day constrained by the type of the day (Temperature/day) 6 climatic configuration of the day constrained by a calendar group (Temperature/calendar) 7 climatic configuration of the day (Nebulosity) 8 group constraint climatic configuration of the day (Cloud Covering/group) 9 climatic configuration of the day constrained by the type of the day (Cloud Covering/day) 1 climatic configuration of the day constrained by a calendar group (Cloud Covering/calendar)