High dimensional statistical learning methods applied to energy data sets
1 High dimensional statistical learning methods applied to energy data sets Dominique Picard Université Paris-Diderot LPMA
2 Data science challenge: post-modern rationality. Replace human logic by statistical logic. Prediction during Greek Antiquity: the oracle. Physical modeling (prediction): mathematical calculation using knowledge of the causes and mathematical equations for the consequences. Post-modern prediction: predictive algorithms using rather elementary indices of statistical correlations between phenomena on large data sets.
3 Learning in high dimension. Basic ideas: 1 Representation of the data 2 Sparsity, regularity, smoothing 3 Divide and conquer 4 Aggregation
4 Parametric-nonparametric 1 Representation of the data 2 Sparsity: only a small number of parameters matter 3 Parametric: you know which parameters matter 4 Nonparametric: you don't know which parameters matter: smoothing
5 Three examples 1 Forecasting the electrical consumption 2 Modeling-Forecasting wind production 3 Segmentation of the country into regions using meteorological data
6 Segmentation of the country into homogeneous climate regions ( ) Homogeneous climate regions using learning algorithms, 2018, submitted, M. Mougeot, D. P., V. Lefieux, M. Marchand
7 Arpege French Meteorological data. At n = 259 locations: Temperature and Wind, for 14 years, hourly sample rate. d = points for raw data. Y data matrix (n x d), n << d. [Figure: Temperature and Wind maps]
8 Objective and questions. Goals: segmentation of the country into regions using meteorological data, Temperature and/or Wind; study the between-year variability.
9 Wind and Temperature spots for 2014. [Figure: daily Temperature and Wind curves for 2014 at Chamonix, Brest and Toulouse]
10 Segmentation for 2001, 2007, 2014 daily data. [Figure: Temperature maps, Kmeans 5 with thresholding 0.05 (dicota) for 2001, 2007 and 2014; Wind maps, Kmeans 4 (2001, 2007) and Kmeans 5 (2014) with the same thresholding]
11 Segmentation results : stability Number of regions stability Geographical stability (no GPS coordinates in the data) Between years stability Wind-temperature stability
12 Intraday load curve forecasting, here 48 h. [Figure: observed curve Y, sparse approximation Yapx and forecast beyond tpred]
13 Forecasting procedure. Construction of an encyclopedia of past scenarios on a learning set of days ( ). For each day: explanatory variables, exogenous and endogenous. Sparse approximation of the consumption of the day. Prediction procedure: build a set of prediction experts, then aggregate the prediction experts. ( ) Forecasting Intra Day Load Curves Using Sparse Functional Regression (2015), Lecture Notes in Statistics, vol 217, M. Mougeot, D. P., V. Lefieux, L. Maillard-Teyssier
14 Wind power with machine learning. [Figure: turbine builder's power curve, electrical power [kW] versus wind speed [m/s]; cp = 0.44]
15 Wind power with machine learning. Models inspired by physics: Persistence: Y_t = Y_{t-1}; turbine builder's power curve. Parametric models: Linear regression: Y_t = a_0 + a_1 W_t. Logistic regression: Y_t = C / (1 + exp(a_0 + a_1 W_t)). Polynomial logistic regression: Y_t = C / (1 + exp(a_0 + a_1 W_t + a_2 W_t^2 + a_3 W_t^3)). LASSO. Statistical learning models: CART, Bagging+CART, Random Forest, Support Vector Machine, k-Nearest Neighbors.
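To make the parametric side concrete, a minimal sketch of fitting the logistic power curve by nonlinear least squares; the data and parameter values below are synthetic and illustrative, not those of the study.

```python
import numpy as np
from scipy.optimize import curve_fit

# Logistic power curve Y = C / (1 + exp(a0 + a1 * W)), as on the slide.
def logistic_curve(w, C, a0, a1):
    return C / (1.0 + np.exp(a0 + a1 * w))

rng = np.random.default_rng(0)
w = rng.uniform(0, 25, 500)                                   # wind speed [m/s]
y = logistic_curve(w, 2000.0, 6.0, -0.8) + rng.normal(0, 30, 500)

# Nonlinear least-squares fit of (C, a0, a1) from a rough initial guess.
params, _ = curve_fit(logistic_curve, w, y, p0=[1500.0, 5.0, -0.5])
C_hat, a0_hat, a1_hat = params
```

The same synthetic data can serve to compare the statistical learning models listed above.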
16 Aim: statistical learning for wind power. Can a statistical model be of interest compared to a physically based model? In a forecast view, what is the effect on the modeling of the lack of precision on the meteorological variables? ( ) Statistical learning for wind power: A modeling and stability study towards forecasting, Wind Energy, 2017, A. Fisher, L. Montuelle, M. Mougeot, D. P. [Slide footer: Dominique Picard (Paris-Diderot-LPMA), High dimensional statistical learning methods applied to energy data sets, February, /64]
17 Forecasting procedure ( ) Forecasting Intra Day Load Curves Using Sparse Functional Regression (2015) Lecture Notes in Statistics, vol 217 M. Mougeot, D. P., V. Lefieux, L. Maillard-Teyssier
18 Forecasting pipeline 1 Construction of a smart encyclopedia of past scenarios out of a database, using learning algorithms. 2 Build a set of prediction experts consulting the encyclopedia. 3 Aggregate the prediction experts.
19 Database. The past database: electrical consumption of the past; meteorological input; endogenous variables: calendar data, functional bases.
20 Electrical consumption of the past. Recorded every half hour from January 1st, 2003 to August 31st. For this period of time, the global consumption signal is split into N = 2800 sub-signals (Y_1, ..., Y_t, ..., Y_N). Y_t in R^n defines the intraday load curve for the t-th day, of size n = 48.
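The splitting of the half-hourly record into daily curves amounts to a reshape; a minimal sketch with a synthetic stand-in for the recorded signal:

```python
import numpy as np

# Half-hourly record split into N = 2800 daily curves of length n = 48.
n, N = 48, 2800
raw = np.arange(N * n, dtype=float)   # stand-in for the recorded consumption

Y = raw.reshape(N, n)                 # Y[t] is the intraday load curve of day t
```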
21 Intraday load curve for seven days, Monday January 25th to Sunday January 31st. [Figure]
22 Calendar and functional effects. [Figure: typical load curves in autumn, winter, spring and summer]
23 Functional aspect : dictionary Figure: Functions of the dictionary. Constant (black-solid line), cosine (blue-dotted line), sine (blue-dashdot line), Haar (red-dashed line) and temperature (green-solid line with points) functions.
24 Meteorological inputs. A total of 371 (= 2x39 + 293) meteorological variables recorded each day half-hourly over the 2800 days of the same period of time. Temperature: T^k for k = 1, ..., 39, measured in 39 weather stations scattered all over the French territory. Cloud cover: N^k for k = 1, ..., 39, measured in the same 39 weather stations. Wind: W^k for k = 1, ..., 293, available at 293 network points scattered all over the territory.
25 Weather stations Figure: Temperature and Cloud covering measurement stations. Wind stations
26 Explanatory variables. For each index t of the day of interest, we register the daily electrical consumption signal Y_t, and Z_t = [P_t M_t] = [C_t B_t [T]_t [N]_t [W]_t] (1) is the concatenation of the calendar and functional variables and climate variables previously described.
27 Sparse approximation on the learning set. Sparse approximation of each consumption day on a learning set of days ( ), using exogenous and endogenous variables. For each day t of the learning set, we build an approximation Yhat_t of the (observed) signal Y_t with the help of the endogenous and exogenous variables Z_t: Yhat_t = G_t(Z_t).
28 Forecasting procedure SL-E-A Sparse learning of each consumption day on the past set. Construction of a set of forecasting experts. Aggregation of the experts.
29 Forecasting experts: expert associated to the strategy M. Strategy: M is a purely non-anticipative function, data dependent or not, from N to N such that for any d in N, M(d) < d. Plug-in: to the strategy M we associate the expert Ytilde_t^M, the prediction of the signal of day t using forecasting strategy M: Ytilde_t^M = G_{M(t)}(Z_t).
30 Examples of experts: time depending. tm1: refer to the day before (the coefficients used for prediction are those calculated the previous day): M(d) = d - 1. tm7: refer to one week before: M(d) = d - 7.
31 Experts introducing climatic patterns of the day. T: find the day having the closest temperature indicators (min, max, med, std) regarding the sup distance (over the days and over the indicators): M(d) = ArgMin_t sup_{k in {1,...,6}, i in {1,...,48}} |T_d^k(i) - T_t^k(i)|. Tm: find the day having the closest median temperature with the sup distance (over the day): M(d) = ArgMin_t sup_{i in {1,...,48}} |T_d^3(i) - T_t^3(i)|.
32 Experts introducing a climatic configuration of the day constrained by the type of the day. s/j: closest cloud cover indicators (min, max, med, std) regarding the sup distance, restricted to days of the same type as Y_t: M(d) = ArgMin_{t : J(t) = J(d)} sup_{i,k} |N_d^k(i) - N_t^k(i)|.
33 MAPE error. For day t, the prediction MAPE error over the interval [0,T] is defined by: MAPE(Y, Ytilde_t^M)(T) = (1/T) sum_{i=1}^T |Ytilde_t^M(i) - Y_t(i)| / |Y_t(i)|, and MISE(Y, Ytilde_t^M)(T) = (1/T) sum_{i=1}^T |Ytilde_t^M(i) - Y_t(i)|^2.
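The time-depending strategies are just non-anticipative shifts of the day index; a minimal sketch:

```python
# The two time-shift strategies as non-anticipative maps with M(d) < d.
strategies = {
    "tm1": lambda d: d - 1,   # refer to the day before
    "tm7": lambda d: d - 7,   # refer to the same weekday one week before
}

for name, M in strategies.items():
    assert M(100) < 100       # non-anticipative: M(d) < d
```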
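The temperature expert's search can be sketched as an argmin over past days of the sup distance between indicator curves; the array shapes and data below are illustrative, not the study's.

```python
import numpy as np

# Temperature indicators per day: (days, indicators, half-hours), synthetic.
rng = np.random.default_rng(1)
T_ind = rng.normal(15.0, 5.0, size=(200, 6, 48))

def closest_day(d, T_ind):
    past = T_ind[:d]                                      # strictly earlier days only
    dist = np.max(np.abs(past - T_ind[d]), axis=(1, 2))   # sup over k and i
    return int(np.argmin(dist))

m = closest_day(150, T_ind)
```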
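The two error measures above translate directly; a minimal sketch with a toy day of three points:

```python
import numpy as np

def mape(y, yhat):
    # mean absolute percentage error over the day
    return np.mean(np.abs(yhat - y) / np.abs(y))

def mise(y, yhat):
    # mean squared error over the day
    return np.mean((yhat - y) ** 2)

y = np.array([100.0, 200.0, 400.0])
yhat = np.array([110.0, 190.0, 400.0])
# mape(y, yhat) = (0.10 + 0.05 + 0.0) / 3 = 0.05
```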
34 Prediction evaluation. [Table: MAPE errors (columns mean, med, min, max) for the different strategies for intraday forecasting (M, L5): Naive, Apx, tm1, tm7, T, Tm, T/N, Tm/N, T/G, T/d, T/c, Ns/G, N/d, N/c]
35 Aggregation of predictors: exponential weights (inspired by various theoretical results, see Lecue, Rigollet, Stoltz, Tsybakov, ...): Ytilde_d^wgt = sum_{m=1}^M w_d^m Ytilde_d^m / sum_{m=1}^M w_d^m, with w_d^M = exp(-(1/(T theta)) sum_{i=1}^T |Ytilde_d^M(i) - Y_t(i)|^2). theta is a parameter (often called temperature in physical applications), T = Tpred.
36 Forecasting (MAPE = 0.7%). [Figure: observed curve Y, sparse approximation Yapx and forecast beyond tpred]
37 Sparse approximation on the learning set. For each day t of the learning set, Z_t = [P_t M_t] = [C_t B_t [T]_t [N]_t [W]_t] is the concatenation of calendar variables and climate variables. We build an approximation Yhat_t of the (observed) signal Y_t using Z_t: Yhat_t = G_t(Z_t) = Z_t betahat_t.
38 High dimensional linear models: Y = X beta + epsilon. beta in R^p is the unknown parameter (to be estimated). epsilon = (epsilon_1, ..., epsilon_n) is a (non observed) vector of random errors, assumed to be i.i.d. N(0, sigma^2) variables. X is a known n x p matrix. High dimension: p >> n.
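The exponential-weight aggregation can be sketched in a few lines; the expert forecasts and errors below are toy values, and the weighting follows the formula above (weight decaying with past squared error, temperature theta).

```python
import numpy as np

def aggregate(preds, sq_errors, theta=1.0):
    # preds: (M, n) expert forecasts; sq_errors: (M,) past mean squared errors
    w = np.exp(-sq_errors / theta)      # exponential weights
    w = w / w.sum()                     # normalize
    return w @ preds

preds = np.array([[1.0, 2.0], [3.0, 4.0]])
sq_errors = np.array([0.0, 100.0])      # second expert performed much worse
agg = aggregate(preds, sq_errors)       # essentially the first expert's forecast
```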
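One way to compute the sparse per-day coefficients betahat_t is an l1 (lasso) penalty, one of the penalizations the talk lists; a minimal sketch with a synthetic design standing in for Z_t:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p = 48, 371                         # 48 half-hours, 371 candidate regressors
Z = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[[3, 40, 200]] = [2.0, -1.5, 1.0]  # only three variables truly matter
Y = Z @ beta + rng.normal(0.0, 0.1, n)

# l1-penalized least squares selects a sparse beta_hat for this "day".
model = Lasso(alpha=0.1).fit(Z, Y)
n_active = int(np.sum(model.coef_ != 0))
```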
39 Sparsity conditions
40 Penalization for sparsity. Many penalizations have been introduced historically in the regression framework (to put identification constraints on beta). Ridge: E(beta, lambda) = ||Y - X beta||^2 + lambda sum_j beta_j^2. Lasso: E(beta, lambda) = ||Y - X beta||^2 + lambda sum_j |beta_j|. SCAD: E(beta, lambda) = ||Y - X beta||^2 + lambda sum_j w_j g(beta_j). Two-step thresholding procedures. Many others. The goal is to choose the relevant variables, for each day.
41 Sparse approximation on the learning set. Sparse approximation of each consumption day on a learning set of days ( ), using the set of potentially explanatory variables. For each day t of the learning set, we build an approximation Yhat_t of the (observed) signal Y_t with the help of the set of explanatory variables Z_t: Yhat_t = G_t(Z_t) = Z_t betahat_t. ( ) Sparse Approximation and Knowledge Extraction for Electrical Consumption Signals, 2012, M. Mougeot, D. P., K. Tribouley, V. Lefieux, L. Teyssier-Maillard
42 Forecasting (MAPE = 0.7%). [Figure: observed curve Y, sparse approximation Yapx and forecast beyond tpred]
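For an orthonormal design the ridge and lasso penalties act coordinatewise on the least-squares coefficients, which makes their contrast concrete: ridge shrinks every coefficient proportionally, the lasso soft-thresholds and sets small ones exactly to zero. A minimal numpy sketch of the generic coordinatewise formulas (not tied to the load data):

```python
import numpy as np

def ridge_coord(b, lam):
    # argmin over beta of (b - beta)^2 + lam * beta^2  ->  proportional shrinkage
    return b / (1.0 + lam)

def lasso_coord(b, lam):
    # argmin over beta of (b - beta)^2 + lam * |beta|  ->  soft thresholding
    return np.sign(b) * np.maximum(np.abs(b) - lam / 2.0, 0.0)

b = np.array([3.0, 0.2, -1.0])
r = ridge_coord(b, 1.0)        # [1.5, 0.1, -0.5]: everything shrinks
l = lasso_coord(b, 1.0)        # [2.5, 0.0, -0.5]: small coefficient set to zero
```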
43 Winter forecast. Figure: forecast (solid blue line) and observed (dashed dark line) electrical consumption for a winter week, from Monday February 1st to Sunday February 7th.
44 Spring forecast. Figure: forecast (solid blue line) and observed (dashed dark line) electrical consumption for a spring week, from Monday June 14th to Sunday June 21st.
45 Wind farm modeling ( ) Statistical learning for wind power : A modeling and stability study towards forecasting, Wind Energy, 2017, A. Fisher, L. Montuelle, M. Mougeot, D. P.
46 Wind farm modeling: data. 3 farms in the North and East of France; 4 to 6 turbines per farm; 8.2 to 12.3 MW per farm. On each turbine, we measure: power; 2 wind speed measures plus the average; wind direction, coded with cos and sin; temperature; state of the turbine (on/off/starting). Three additional variables: variance of the wind speed; variance of the wind direction.
47 Aim: statistical learning for wind power. Can a statistical model be of interest compared to a physically based model? In a forecast view, what is the effect on the modeling of the lack of precision on the meteorological variables?
48 Modeling. Models inspired by physics: Persistence: Y_t = Y_{t-1}; turbine builder's power curve. Parametric models: Linear regression: Y_t = a_0 + a_1 W_t. Logistic regression: Y_t = C / (1 + exp(a_0 + a_1 W_t)). Polynomial logistic regression: Y_t = C / (1 + exp(a_0 + a_1 W_t + a_2 W_t^2 + a_3 W_t^3)). LASSO. Statistical learning models: CART, Bagging+CART, Random Forest, Support Vector Machine, k-Nearest Neighbors.
49 CART (Breiman et al. 1984). Grow a binary tree: choose the variable that provides the best cut; criterion: intra-leaf variance. Prune the tree to avoid over-fitting: cross-validation, minimize MSE. Prediction: mean of the leaf.
50 Random Forest (Breiman 2001). Bootstrap one sample per tree to be grown. Grow a binary tree: choose randomly m variables among the variables, calculate the best split; criterion: minimize the variance of the children. Prediction of each tree: mean of the leaf. Average the trees.
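A minimal sketch of the statistical-learning side: a single regression tree and a random forest fitted on a synthetic logistic power-curve relation (the data and hyperparameters are illustrative, not the study's).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
W = rng.uniform(0, 25, size=(1000, 1))                       # wind speed [m/s]
y = 2000.0 / (1.0 + np.exp(6.0 - 0.8 * W[:, 0])) + rng.normal(0, 20, 1000)

cart = DecisionTreeRegressor(max_depth=5).fit(W, y)          # single depth-limited tree
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(W, y)

pred = forest.predict([[20.0]])[0]                           # near the rated power
```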
51 Real time modeling One model per turbine Prediction of the farm power Training on around 8000 instant-points Test on 10 data sets of 724 points (around 15 days)
52 [Table: RMSE for Persistence, Linear Reg, Logistic Reg, Poly Log Reg, Lasso, CART, CART+Bagging, RF, SVM, kNN]
53 [Figure: boxplots of errors for Linear Reg, Logistic Reg, Poly Log Reg, Lasso, CART, CART+Bagging, RF, SVM, kNN]
54 Segmentation of French territory ( ) Homogeneous climate regions using learning algorithms 2018, submitted, M. Mougeot, D. P., V. Lefieux, M. Marchand
55 Segmentation for 2001, 2007, 2014 daily data. [Figure: Temperature maps, Kmeans 5 with thresholding 0.05 (dicota) for 2001, 2007 and 2014; Wind maps, Kmeans 4 (2001, 2007) and Kmeans 5 (2014) with the same thresholding]
56 3 steps. There are generally 3 important steps (in fact linked): 1 Representation of the data 2 Smoothing 3 Selection procedure
57 Representation of the data Ad-hoc representations PCA Functional representation Kernel clustering Spectral clustering...
58 Natural-time aggregation smoothing. The data are observed hourly. It is commonly admitted to take: 1 the average over a day: daily observed data, T = 365 for one year; 2 the average over a week: weekly observed data, T = 52 for one year; 3 the average over a month: monthly observed data, T = 12 for one year. [Figure: daily temperature at Brest, 2014]
59 PCA reduction. Projection of the observations using a data-driven orthonormal basis. X: centered data matrix (n, d), n = 259, d >> n large. The feature matrix (n, T) is computed by projection, T << d: Z = X U_T, where U_T is the matrix defined by the first T eigenvectors of Sigma, the variance-covariance matrix. T is chosen so that (lambda_1 + ... + lambda_T) / sum_j lambda_j >= kappa_pca (0.95). Global linear method involving all the n = 259 spots to compute U_T. Is U_T similar between years?
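The three natural-time aggregations can be sketched with pandas on a synthetic hourly series (dates and values are illustrative):

```python
import numpy as np
import pandas as pd

# One year of hourly observations (synthetic values).
idx = pd.date_range("2014-01-01", periods=365 * 24, freq="h")
hourly = pd.Series(np.arange(365 * 24, dtype=float), index=idx)

daily = hourly.resample("D").mean()            # T = 365 values for one year
weekly = hourly.resample("W").mean()           # T ~ 52 values
monthly = hourly.groupby(idx.month).mean()     # T = 12 values
```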
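The PCA reduction with the 95% variance rule can be sketched via an SVD of the centered data matrix; the data below are synthetic and the dimension d is an illustrative stand-in.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(259, 2000))            # n = 259 spots, d = 2000 (illustrative)
X = X - X.mean(axis=0)                      # center the data matrix

# Singular values squared are the eigenvalues of the variance-covariance matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
T = int(np.searchsorted(ratio, 0.95)) + 1   # smallest T reaching 95% of the variance

Z = X @ Vt[:T].T                            # feature matrix (n, T), T << d
```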
61 Functional smoothing. Data are (in fact) functions of time, regularly spaced: X_t^i = f^i(t/d) + epsilon_t^i, with f^i unknown, epsilon_t^i ~ N(0, sigma^2), t = 1, ..., d. Nonparametric estimation of f^i: fhat^i = sum_{l=1}^T betahat_l^i g_l, with D = {g_1, ..., g_p} a dictionary of functions. With the coefficients ordered so that |betahat^i_(1)| >= ... >= |betahat^i_(n)|, we note Xhat^i_{j0} = sum_{j=1}^{j0} betahat^i_(j) g_(j). How to choose T? Take j0 such that ||Xhat^i(j0)||^2 / ||X^i||^2 >= tau_NP (= 0.95). [Figure: temperature in Paris, May]
62 Selection procedure: k-means clustering. Choose k, the number of clusters. Find the argmin (in C_1, ..., C_k) of sum_{r=1}^k sum_{j in C_r} ||Y_j - Ybar_r||^2, where Ybar_r = (1/|C_r|) sum_{j in C_r} Y_j; equivalently, sum_{r=1}^k (1/(2|C_r|)) sum_{j, j' in C_r} ||Y_j - Y_{j'}||^2. Other methods exist.
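The dictionary fit with the 95% energy rule can be sketched with an orthonormal cosine dictionary; the dictionary choice, grid size and signal below are illustrative assumptions, not the study's dictionary.

```python
import numpy as np

d = 96
k = np.arange(1, 20)[:, None]
n_grid = np.arange(d)[None, :]
# Orthonormal DCT-II style cosine atoms plus the constant function.
G = np.vstack([np.ones((1, d)) / np.sqrt(d),
               np.sqrt(2.0 / d) * np.cos(np.pi * k * (2 * n_grid + 1) / (2 * d))])

rng = np.random.default_rng(5)
x = 5.0 * G[1] + 2.0 * G[3] + rng.normal(0.0, 0.05, d)   # two atoms + small noise

beta = G @ x                                  # coefficients (rows are orthonormal)
order = np.argsort(-np.abs(beta))             # sort |beta| in decreasing order
energy = np.cumsum(beta[order] ** 2) / np.sum(x ** 2)
j0 = int(np.searchsorted(energy, 0.95)) + 1   # keep atoms until 95% of the energy
x_hat = beta[order[:j0]] @ G[order[:j0]]      # sparse reconstruction
```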
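The k-means step on the reduced feature matrix can be sketched as follows; the two well-separated synthetic groups stand in for climate regions, and k = 2 is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
# Two well-separated synthetic groups of "stations" in feature space.
A = rng.normal(0.0, 0.3, size=(130, 10))
B = rng.normal(5.0, 0.3, size=(129, 10))
Z = np.vstack([A, B])                          # (259, T) feature matrix, T = 10

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(Z)
labels = km.labels_                            # cluster label of each station
```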
63 Segmentation results : stability Number of regions stability Geographical stability Between years stability Wind-temperature stability
64 Thank you for your attention
More informationSolar irradiance forecasting for Chulalongkorn University location using time series models
Senior Project Proposal 2102499 Year 2016 Solar irradiance forecasting for Chulalongkorn University location using time series models Vichaya Layanun ID 5630550721 Advisor: Assist. Prof. Jitkomut Songsiri
More informationSystems Operations. PRAMOD JAIN, Ph.D. Consultant, USAID Power the Future. Astana, September, /6/2018
Systems Operations PRAMOD JAIN, Ph.D. Consultant, USAID Power the Future Astana, September, 26 2018 7/6/2018 Economics of Grid Integration of Variable Power FOOTER GOES HERE 2 Net Load = Load Wind Production
More informationStats 170A: Project in Data Science Predictive Modeling: Regression
Stats 170A: Project in Data Science Predictive Modeling: Regression Padhraic Smyth Department of Computer Science Bren School of Information and Computer Sciences University of California, Irvine Reading,
More informationThat s Hot: Predicting Daily Temperature for Different Locations
That s Hot: Predicting Daily Temperature for Different Locations Alborz Bejnood, Max Chang, Edward Zhu Stanford University Computer Science 229: Machine Learning December 14, 2012 1 Abstract. The problem
More informationThe prediction of house price
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More informationA Modern Look at Classical Multivariate Techniques
A Modern Look at Classical Multivariate Techniques Yoonkyung Lee Department of Statistics The Ohio State University March 16-20, 2015 The 13th School of Probability and Statistics CIMAT, Guanajuato, Mexico
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationMachine Learning and Computational Statistics, Spring 2017 Homework 2: Lasso Regression
Machine Learning and Computational Statistics, Spring 2017 Homework 2: Lasso Regression Due: Monday, February 13, 2017, at 10pm (Submit via Gradescope) Instructions: Your answers to the questions below,
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationMinwise hashing for large-scale regression and classification with sparse data
Minwise hashing for large-scale regression and classification with sparse data Nicolai Meinshausen (Seminar für Statistik, ETH Zürich) joint work with Rajen Shah (Statslab, University of Cambridge) Simons
More informationL11: Pattern recognition principles
L11: Pattern recognition principles Bayesian decision theory Statistical classifiers Dimensionality reduction Clustering This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4] Introduction
More informationINTRODUCTION TO DATA SCIENCE
INTRODUCTION TO DATA SCIENCE JOHN P DICKERSON Lecture #13 3/9/2017 CMSC320 Tuesdays & Thursdays 3:30pm 4:45pm ANNOUNCEMENTS Mini-Project #1 is due Saturday night (3/11): Seems like people are able to do
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Undirected Graphical Models Mark Schmidt University of British Columbia Winter 2016 Admin Assignment 3: 2 late days to hand it in today, Thursday is final day. Assignment 4:
More informationDavid John Gagne II, NCAR
The Performance Impacts of Machine Learning Design Choices for Gridded Solar Irradiance Forecasting Features work from Evaluating Statistical Learning Configurations for Gridded Solar Irradiance Forecasting,
More informationRemote Ground based observations Merging Method For Visibility and Cloud Ceiling Assessment During the Night Using Data Mining Algorithms
Remote Ground based observations Merging Method For Visibility and Cloud Ceiling Assessment During the Night Using Data Mining Algorithms Driss BARI Direction de la Météorologie Nationale Casablanca, Morocco
More informationl 1 and l 2 Regularization
David Rosenberg New York University February 5, 2015 David Rosenberg (New York University) DS-GA 1003 February 5, 2015 1 / 32 Tikhonov and Ivanov Regularization Hypothesis Spaces We ve spoken vaguely about
More informationMachine Learning. Regression basics. Marc Toussaint University of Stuttgart Summer 2015
Machine Learning Regression basics Linear regression, non-linear features (polynomial, RBFs, piece-wise), regularization, cross validation, Ridge/Lasso, kernel trick Marc Toussaint University of Stuttgart
More information1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as
ST 51, Summer, Dr. Jason A. Osborne Homework assignment # - Solutions 1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available
More informationAnalysis of Fast Input Selection: Application in Time Series Prediction
Analysis of Fast Input Selection: Application in Time Series Prediction Jarkko Tikka, Amaury Lendasse, and Jaakko Hollmén Helsinki University of Technology, Laboratory of Computer and Information Science,
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationFORECASTING: A REVIEW OF STATUS AND CHALLENGES. Eric Grimit and Kristin Larson 3TIER, Inc. Pacific Northwest Weather Workshop March 5-6, 2010
SHORT-TERM TERM WIND POWER FORECASTING: A REVIEW OF STATUS AND CHALLENGES Eric Grimit and Kristin Larson 3TIER, Inc. Pacific Northwest Weather Workshop March 5-6, 2010 Integrating Renewable Energy» Variable
More informationSmoothed Prediction of the Onset of Tree Stem Radius Increase Based on Temperature Patterns
Smoothed Prediction of the Onset of Tree Stem Radius Increase Based on Temperature Patterns Mikko Korpela 1 Harri Mäkinen 2 Mika Sulkava 1 Pekka Nöjd 2 Jaakko Hollmén 1 1 Helsinki
More informationCOMS 4771 Regression. Nakul Verma
COMS 4771 Regression Nakul Verma Last time Support Vector Machines Maximum Margin formulation Constrained Optimization Lagrange Duality Theory Convex Optimization SVM dual and Interpretation How get the
More informationBayesian Grouped Horseshoe Regression with Application to Additive Models
Bayesian Grouped Horseshoe Regression with Application to Additive Models Zemei Xu, Daniel F. Schmidt, Enes Makalic, Guoqi Qian, and John L. Hopper Centre for Epidemiology and Biostatistics, Melbourne
More informationCHAPTER 6 CONCLUSION AND FUTURE SCOPE
CHAPTER 6 CONCLUSION AND FUTURE SCOPE 146 CHAPTER 6 CONCLUSION AND FUTURE SCOPE 6.1 SUMMARY The first chapter of the thesis highlighted the need of accurate wind forecasting models in order to transform
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationDimension Reduction Methods
Dimension Reduction Methods And Bayesian Machine Learning Marek Petrik 2/28 Previously in Machine Learning How to choose the right features if we have (too) many options Methods: 1. Subset selection 2.
More informationGreene, Econometric Analysis (7th ed, 2012)
EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.
More informationLinear Models for Regression CS534
Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict
More informationpeak half-hourly South Australia
Forecasting long-term peak half-hourly electricity demand for South Australia Dr Shu Fan B.S., M.S., Ph.D. Professor Rob J Hyndman B.Sc. (Hons), Ph.D., A.Stat. Business & Economic Forecasting Unit Report
More informationIs the test error unbiased for these programs? 2017 Kevin Jamieson
Is the test error unbiased for these programs? 2017 Kevin Jamieson 1 Is the test error unbiased for this program? 2017 Kevin Jamieson 2 Simple Variable Selection LASSO: Sparse Regression Machine Learning
More informationSparse Gaussian conditional random fields
Sparse Gaussian conditional random fields Matt Wytock, J. ico Kolter School of Computer Science Carnegie Mellon University Pittsburgh, PA 53 {mwytock, zkolter}@cs.cmu.edu Abstract We propose sparse Gaussian
More informationCourse in Data Science
Course in Data Science About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst. The course gives an
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Matrix Notation Mark Schmidt University of British Columbia Winter 2017 Admin Auditting/registration forms: Submit them at end of class, pick them up end of next class. I need
More informationBayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson
Bayesian variable selection via penalized credible regions Brian Reich, NC State Joint work with Howard Bondell and Ander Wilson Brian Reich, NCSU Penalized credible regions 1 Motivation big p, small n
More informationLectures in AstroStatistics: Topics in Machine Learning for Astronomers
Lectures in AstroStatistics: Topics in Machine Learning for Astronomers Jessi Cisewski Yale University American Astronomical Society Meeting Wednesday, January 6, 2016 1 Statistical Learning - learning
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More information1 Handling of Continuous Attributes in C4.5. Algorithm
.. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Potpourri Contents 1. C4.5. and continuous attributes: incorporating continuous
More information