Electrical Energy Modeling In Y2E2 Building Based On Distributed Sensors Information

Mahmoud Saadat, Saman Ghili

1 Introduction

Close to 40% of the primary energy consumption in the U.S. comes from commercial and residential buildings [1]. Reducing this energy consumption is therefore important both economically and environmentally (due to the amount of CO2 emitted in generating the electricity). Predicting the energy consumption of buildings (energy modeling) is a key component in reducing building energy consumption. The two traditional types of energy modeling are forward modeling (using purely physical simulations) and inverse modeling (using statistical methods, along with expert knowledge, to relate the energy consumption to a set of general inputs such as the outdoor temperature) [2]. If we have access to sensor data (both inside and outside the building), however, a better alternative is sensor-based energy modeling using statistical machine learning techniques. [2] uses seven different machine learning algorithms to predict the total hourly energy consumption of three residential buildings (called the Campbell Creek Homes) designed to evaluate the effectiveness of residential construction and efficiency technologies in a controlled environment. In this project, we apply three of these algorithms (simple linear regression, Support Vector Regression or SVR, and Least Squares Support Vector Machines or LS-SVM) to the energy consumption and sensor data from the Y2E2 building on the Stanford campus. This rich data set allows us to model different parts of the total consumption (e.g., lighting, AC, and plug loads) separately, which could lead to more accurate results than modeling the total consumption as a whole.

2 The Y2E2 data

The Yang and Yamazaki Environment & Energy Building (Y2E2) is a good case study for energy performance modeling. A number of electrical power sensors are installed throughout the building; they measure lighting, plug-load, and overall power on different floors and in different portions of the building. As an example, Figure 1a below shows that lighting is measured on every floor, plug loads are measured separately on the west and east portions of each floor, and a number of other power sensors (for the server room, cafe, etc.) are also available. The sensor data is collected every minute in the SEE-IT database and can be accessed, plotted, and exported using the SEE-IT application. An example of a plot generated by SEE-IT is shown in Figure 1b, which shows the electrical power of the cafe on the first floor and the outside lighting over a one-week period, from 3/1/2013 to 3/8/2013.
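Since these minute-level SEE-IT exports are the raw input to everything that follows, here is a minimal MATLAB sketch of how such an export might be loaded and downsampled; the file name and column names are hypothetical assumptions, as the report does not specify the export format.

```matlab
% Hypothetical SEE-IT CSV export with columns Time and Power_kW, sampled
% once per minute; downsample to 15-minute averages before modeling.
T = readtable('y2e2_power_export.csv');     % assumed file name
T.Time = datetime(T.Time);                  % parse the timestamp column
TT = table2timetable(T, 'RowTimes', 'Time');
TT15 = retime(TT, 'regular', 'mean', 'TimeStep', minutes(15));
plot(TT15.Time, TT15.Power_kW)              % quick sanity check of the series
```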

Figure 1: (a) The location of the sensors. (b) A sample power consumption time series plotted by SEE-IT.

3 Algorithms, attributes, and features

3.1 Algorithms

Let x be a vector of attributes, φ(x) a feature mapping, and {(x^{(i)}, y^{(i)})}_{i=1}^{m} our training set. We are looking for a hypothesis of the form h_{w,b}(x) = w^T φ(x) + b. In linear regression, w and b are the minimizers of \sum_{i=1}^{m} ( h_{w,b}(x^{(i)}) - y^{(i)} )^2. In SVR, w and b are obtained from the following quadratic program [3]:

$$
\begin{aligned}
\min_{w,b}\quad & \tfrac{1}{2} w^T w + C \sum_{i=1}^{m} \left( \xi_i + \xi_i^* \right) \qquad (3.1)\\
\text{subject to}\quad & y^{(i)} - w^T \phi(x^{(i)}) - b \le \epsilon + \xi_i, && 1 \le i \le m\\
& w^T \phi(x^{(i)}) + b - y^{(i)} \le \epsilon + \xi_i^*, && 1 \le i \le m\\
& \xi_i,\ \xi_i^* \ge 0, && 1 \le i \le m
\end{aligned}
$$

and in LS-SVM we have to solve the following optimization problem [4]:

$$
\begin{aligned}
\min_{w,b}\quad & \tfrac{1}{2} w^T w + C \sum_{i=1}^{m} \xi_i^2 \qquad (3.2)\\
\text{subject to}\quad & w^T \phi(x^{(i)}) + b + \xi_i = y^{(i)}, && 1 \le i \le m
\end{aligned}
$$

where C and ε are tuning parameters, and the ξ_i and ξ_i^* (1 ≤ i ≤ m) are slack variables.
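To make problem (3.2) concrete: eliminating w and the slack variables via the KKT conditions reduces LS-SVM training to a single linear system in the dual variables [4]. The following minimal MATLAB sketch solves that system with a Gaussian kernel; the synthetic daily-load data and the values of C and σ are illustrative stand-ins for the Y2E2 measurements, not anything used in the project.

```matlab
% LS-SVM regression via the dual linear system of problem (3.2):
%   [ 0   1'          ] [ b     ]   [ 0 ]
%   [ 1   K + (1/C)*I ] [ alpha ] = [ y ],   K(i,j) = k(x_i, x_j).
m = 200;
X = linspace(0, 24, m)';                      % hour-of-day attribute
y = 5 + 2*sin(2*pi*X/24) + 0.3*randn(m, 1);   % noisy daily load pattern
C = 10; sigma = 2;                            % tuning parameters

K = exp(-(X - X').^2 / (2*sigma^2));          % Gaussian kernel matrix
A = [0, ones(1, m); ones(m, 1), K + eye(m)/C];
sol = A \ [0; y];                             % solve the KKT system
b = sol(1); alpha = sol(2:end);

% Prediction at new points: yhat(x) = sum_i alpha_i * k(x_i, x) + b
Xnew = (0:0.5:24)';
Knew = exp(-(Xnew - X').^2 / (2*sigma^2));    % cross-kernel matrix
yhat = Knew * alpha + b;
```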

3.2 Attributes

First, we note that time (measured in hours) should be included in the model for any part of the total consumption. If everything else were fixed, one would expect the consumption to be a periodic function of time (period = 24 h). Therefore we include sin(2πt/24) and cos(2πt/24) among our attributes. Besides time, we include different attributes when modeling each data set. In modeling the lighting power, we add the illumination L (measured in lux). In modeling the power consumed by the AC system, however, we add two other attributes: the outside dry-bulb temperature T, and a binary variable s indicating whether the main window on each floor is open or closed. Note that some of these attributes differ from floor to floor, which means that some of the models have to be fit separately for each floor. When modeling the total power consumption of the building, we include all of these attributes. In order to distinguish weekends and holidays from weekdays, we can either add a binary attribute d that is 1 on weekdays and 0 otherwise, or model them separately.

3.3 Features

If we use the kernel trick for training LS-SVM and SVR, we do not need to write down the features explicitly. However, in LS, or when using a linear kernel in LS-SVM and SVR, we need to state the features explicitly. Here we briefly explain the features we use in those cases. For features that are functions of time, because of the periodicity, we use the 2n_t features {sin(2πkt/24), cos(2πkt/24)}, k = 1, ..., n_t, where n_t is determined by a feature selection scheme. For the outside temperature and the illumination, we use polynomials of degrees n_T and n_L, respectively. Although the degrees of these polynomials could also be determined by a feature selection scheme, they are easier to tune intuitively. For example, it makes sense to use a polynomial of degree 3 in the outside temperature: a cubic polynomial can capture the facts that (i) the needed power is positive whether the outside temperature is below or above the set-point temperature, and (ii) the power needed for cooling differs from that needed for heating. Finally, interactions of these features can also be used. In particular, we should include the interaction of the holiday/weekend indicator d with all the other features, which is equivalent to modeling those days and the weekdays separately, as sketched below.
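The following minimal sketch builds this explicit feature map for the linear models; the function name and the attribute ordering are assumptions for illustration, not the project's actual code.

```matlab
% buildFeatures.m -- Fourier features of the hour t (period 24 h),
% polynomials in the outside temperature Tout (degree n_T) and the
% illumination L (degree n_L), the weekday indicator d (1 on weekdays,
% 0 otherwise), and the interactions of d with all other features.
% t, Tout, L, d are m-by-1 column vectors; n_t, n_T, n_L are scalars.
function Phi = buildFeatures(t, Tout, L, d, n_t, n_T, n_L)
    Phi = [];
    for k = 1:n_t                                        % periodic time features
        Phi = [Phi, sin(2*pi*k*t/24), cos(2*pi*k*t/24)]; %#ok<AGROW>
    end
    for p = 1:n_T                                        % temperature polynomial
        Phi = [Phi, Tout.^p];                            %#ok<AGROW>
    end
    for p = 1:n_L                                        % illumination polynomial
        Phi = [Phi, L.^p];                               %#ok<AGROW>
    end
    Phi = [Phi, d, d .* Phi];                            % d and its interactions
end
```

With these features, the linear least squares fit is simply theta = [Phi, ones(size(Phi, 1), 1)] \ y, and the columns involving d reproduce the separate weekend/weekday models mentioned above.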

4 Implementation and results

We train LS and LS-SVM on power consumption data gathered over nine months (from February 1st to October 1st, 2013), one sample every 15 minutes, and assess the accuracy of the trained models via the Coefficient of Variation (CV) on a test data set of power consumption collected over an (almost) three-week period (from October 4th to October 23rd, 2013), one data point per hour. (The Y2E2 sensor data is gathered every minute; however, using all of it would hugely increase the training set size without adding much information, due to the auto-correlation in the time series.) The Coefficient of Variation is defined as follows:

$$
\mathrm{CV} = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y^{(i)} - \hat{y}^{(i)} \right)^2}}{\bar{y}} \times 100 \qquad (4.1)
$$

where N is the size of the test set, the y^{(i)} are the actual consumption values, and the ŷ^{(i)} are the ones predicted by the model. Before training, a simple preprocessing step scaled all the attributes so that they are of the same order. We then fit models to the following data sets: the AC power for each of the three floors, the lighting power of the building, the plug load for the whole building, and the total power consumption. We trained linear least squares, and LS-SVM with three different kernels: linear, polynomial, and Gaussian. The tuning parameters were found by 4-fold cross-validation with a grid search, as suggested in [5]. We implemented both methods in MATLAB. We also tried implementing SVR; however, the slow running time of the SVR optimization problem due to its inequality constraints (even when using LIBSVM), together with the facts that there is one more regularization parameter (ε) and that we use a brute-force grid search to pick the cross-validation parameters, made the feasible size of the data set on which we could train the model very small, so we decided to include results only for LS and LS-SVM.

We used forward feature selection to choose n_t, n_L, and n_T. In all cases, n_t was either 2 or 3, n_L was 1 or 2, and n_T was 2 or 3. When using a Gaussian kernel K_G(x, z) = exp(−‖x − z‖² / (2σ²)), the parameter σ was included in the cross-validation grid search. For a polynomial kernel K_P(x, z) = (x^T z + c_P)^{D_P}, c_P was found by grid search, while the degree D_P was incremented until the validation error started to increase (similar to forward selection).

As a final note before presenting the results: in building energy modeling, CV errors are usually larger than in modeling the energy consumption of a city, a state, or a country because, roughly speaking, the power consumption time series of a single building is noisier than that of a whole city or state. For example, the CV errors obtained in [2] are of the order of 10%, and sometimes as high as 40%, for almost all the methods tried. Table 1 summarizes the CV errors (in percent, rounded to the nearest integer) for the different methods applied to the different data sets. Although we modeled the AC power of each floor separately (because the window attribute differs between floors), we report the CV error for the sum of these powers here.

                              Total Consumption   Plug Loads   Lighting   AC Power
LS                                   9%               5%          11%        7%
LS-SVM, linear kernel                9%               4%          11%        5%
LS-SVM, polynomial kernel           11%              20%          33%        7%
LS-SVM, Gaussian kernel              6%               4%          25%        5%

Table 1: The CV errors in the prediction of the different hourly power consumptions by the different methods.

We can see that LS-SVM with a linear kernel is at least as good as LS, and that the two are especially close to each other when the error is large. The first part is expected from the definitions of these methods: one can show that, with the same feature mapping, the solution of LS-SVM converges to that of LS as the regularization parameter C goes to infinity, and in the cases where the two models are close, our cross-validation indeed chose a high value of C. LS-SVM with a Gaussian kernel seems to be the most accurate learning method overall, while LS-SVM with a polynomial kernel has the worst accuracy. We suspect that the sub-par performance of the polynomial kernel might be due to the effect of the parameter c_P. A large c_P gives higher weight to the interactions between different attributes. Physical intuition (and the results for LS-SVM with a linear kernel) suggests that monomial functions of the attributes T, L, and t are more important predictors of the power consumption than their interactions, which suggests that a small c_P should be used. However, as explained before, the interaction between the holiday indicator d and the other variables helps us capture the difference between the power consumption on weekends and weekdays, which we may not be able to capture very well with a small c_P.
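For concreteness, the following minimal sketch shows the kind of 4-fold grid search over (C, σ) described above for the Gaussian-kernel LS-SVM, scoring each pair with the CV error of Eq. (4.1); the grids and the synthetic data are illustrative assumptions, not the values used in the project.

```matlab
% 4-fold grid search for (C, sigma) of a Gaussian-kernel LS-SVM,
% reusing the dual-system solver from the sketch in Section 3.1.
m = 200; X = linspace(0, 24*7, m)';             % one synthetic week
y = 5 + 2*sin(2*pi*X/24) + 0.3*randn(m, 1);     % noisy daily load pattern
Cs = 10.^(0:3); sigmas = 2.^(-1:3);             % illustrative grids
folds = 4; fold = mod((0:m-1)', folds) + 1;     % interleaved fold labels
best = inf;
for C = Cs
    for sigma = sigmas
        errs = zeros(folds, 1);
        for f = 1:folds
            tr = (fold ~= f); te = ~tr; n = nnz(tr);
            Ktr = exp(-(X(tr) - X(tr)').^2 / (2*sigma^2));
            sol = [0, ones(1, n); ones(n, 1), Ktr + eye(n)/C] \ [0; y(tr)];
            Kte = exp(-(X(te) - X(tr)').^2 / (2*sigma^2));
            yhat = Kte * sol(2:end) + sol(1);
            errs(f) = 100 * sqrt(mean((y(te) - yhat).^2)) / mean(y(te)); % Eq. (4.1)
        end
        if mean(errs) < best
            best = mean(errs); bestC = C; bestSigma = sigma;
        end
    end
end
fprintf('best C = %g, sigma = %g, CV = %.1f%%\n', bestC, bestSigma, best)
```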

Although CV is a good indicator of the performance of our methods, there are qualitative differences between the methods that are not reflected in CV. Comparing the predicted time series with the actual one is useful for a qualitative assessment of the algorithms. Here we look at a data set that all four methods approximated relatively well: the AC power consumption. Figure 2 shows the actual and predicted AC power consumption on the second floor for each method over our 20-day test period.

Figure 2: The predicted AC power consumption on the 2nd floor for (a) linear regression (LS), (b) LS-SVM with a linear kernel, (c) LS-SVM with a Gaussian kernel, and (d) LS-SVM with a polynomial kernel. In all plots, the red curve is the predicted time series and the blue curve is the actual power data.

It can be seen that all methods are more accurate on the weekdays than on the weekends and holidays, likely because for those days (i) there are more training examples, and (ii) the consumption patterns are more regular. Comparing the methods qualitatively, LS-SVM with a Gaussian kernel again seems to be the best; in particular, it is more accurate on the weekends. LS-SVM with a linear kernel is a close second: its performance on the weekends is close to (or maybe even slightly better than) that of the Gaussian kernel, but it shows a slight undershoot on the weekdays. Simple linear regression has good accuracy on the weekdays, but a large undershoot on the weekends. Finally, LS-SVM with a polynomial kernel is less accurate than all the other methods on the weekdays; on the weekends its accuracy is close to that of LS, while worse than that of the other two.

5 Conclusion

In this project, we implemented the linear least squares and least squares support vector machine algorithms with various kernel functions to predict hourly power consumption using the sensor measurement data in the Y2E2 building, and compared, qualitatively and quantitatively, the accuracy of the results obtained with these methods.

References

[1] U.S. Department of Energy, Buildings Energy Data Book, D&R International, Ltd., 2010. Available at: http://buildingsdatabook.eren.doe.gov/.

[2] Richard E. Edwards, Joshua New, and Lynne E. Parker, Predicting future hourly residential electrical consumption: A machine learning case study, Energy and Buildings 49 (2012) 591-603.

[3] A. J. Smola and B. Schölkopf, A tutorial on support vector regression, Statistics and Computing 14(3) (2004) 199-222.

[4] J. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, and J. Vandewalle, Least Squares Support Vector Machines, World Scientific Publishing Co., 2002.

[5] Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin, A Practical Guide to Support Vector Classification, 2010.