ClassCast: A Tool for Class-Based Forecasting

Similar documents
Load Forecasting Using Artificial Neural Networks and Support Vector Regression

Electric Load Forecasting Using Wavelet Transform and Extreme Learning Machine

Assignment 3. Introduction to Machine Learning Prof. B. Ravindran

A Hybrid Method of CART and Artificial Neural Network for Short-term term Load Forecasting in Power Systems

Section 7: Discriminant Analysis.

The Dayton Power and Light Company Load Profiling Methodology Revised 7/1/2017

Peak Load Forecasting

Implementation of Artificial Neural Network for Short Term Load Forecasting

Package TSPred. April 5, 2017

Abram Gross Yafeng Peng Jedidiah Shirey

ANN based techniques for prediction of wind speed of 67 sites of India

Electricity Demand Forecasting using Multi-Task Learning

A Hybrid Model of Wavelet and Neural Network for Short Term Load Forecasting

Calendarization & Normalization. Steve Heinz, PE, CEM, CMVP Founder & CEO EnergyCAP, Inc.

BV4.1 Methodology and User-friendly Software for Decomposing Economic Time Series

Short Term Load Forecasting Using Multi Layer Perceptron

Multi-Plant Photovoltaic Energy Forecasting Challenge with Regression Tree Ensembles and Hourly Average Forecasts

LOADS, CUSTOMERS AND REVENUE

ANN and Statistical Theory Based Forecasting and Analysis of Power System Variables

Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe

Fuzzy Systems. Introduction

Integrated Electricity Demand and Price Forecasting

Supervisor: Prof. Stefano Spaccapietra Dr. Fabio Porto Student: Yuanjian Wang Zufferey. EPFL - Computer Science - LBD 1

Short Term Load Forecasting Based Artificial Neural Network

A Support Vector Regression Model for Forecasting Rainfall

A Fuzzy Logic Based Short Term Load Forecast for the Holidays

Fuzzy Systems. Introduction

Sparse Gaussian conditional random fields

Tutorial. Getting started. Sample to Insight. March 31, 2016

Mermaid. The most sophisticated marine operations planning software available.

Appendix 4 Weather. Weather Providers

Context-based Reasoning in Ambient Intelligence - CoReAmI -

That s Hot: Predicting Daily Temperature for Different Locations

GIS Visualization Support to the C4.5 Classification Algorithm of KDD

Weighted Fuzzy Time Series Model for Load Forecasting

FEATURE REDUCTION FOR NEURAL NETWORK BASED SMALL-SIGNAL STABILITY ASSESSMENT

Forecast solutions for the energy sector

MDSS Functional Prototype Display System Preview April 2002

Reading, UK 1 2 Abstract

HOMEWORK #4: LOGISTIC REGRESSION

JDemetra testing Statistics Finland. Henri Luomaranta / Lauri Saarinen

Probabilistic Graphical Models for Image Analysis - Lecture 1

GIS Functions and Integration. Tyler Pauley Associate Consultant

LOAD POCKET MODELING. KEY WORDS Load Pocket Modeling, Load Forecasting

Demand Forecasting. for. Microsoft Dynamics 365 for Operations. User Guide. Release 7.1. April 2018

Search for the Gulf of Carpentaria in the remap search bar:

Influence of knn-based Load Forecasting Errors on Optimal Energy Production

Bloomsburg University Weather Viewer Quick Start Guide. Software Version 1.2 Date 4/7/2014

CityGML XFM Application Template Documentation. Bentley Map V8i (SELECTseries 2)

Design of a Weather- Normalization Forecasting Model

Lecture 5. Symbolization and Classification MAP DESIGN: PART I. A picture is worth a thousand words

Algorithmic probability, Part 1 of n. A presentation to the Maths Study Group at London South Bank University 09/09/2015

Electricity Demand Probabilistic Forecasting With Quantile Regression Averaging

Radio observations of the Milky Way from the classroom

A Comparison Between Polynomial and Locally Weighted Regression for Fault Detection and Diagnosis of HVAC Equipment

Exploratory Data Analysis and Decision Making with Descartes and CommonGIS

Internet Engineering Jacek Mazurkiewicz, PhD

ADRIANA.Code and SONNIA. Tutorial

Using SPSS for One Way Analysis of Variance

Copernicus Data Driven Services for Regional & Local Government in Greece

One-way ANOVA (Single-Factor CRD)

You w i ll f ol l ow these st eps : Before opening files, the S c e n e panel is active.

LOAD FORECASTING APPLICATIONS for THE ENERGY SECTOR

CWMS Modeling for Real-Time Water Management

Short-Term Load Forecasting Using ARIMA Model For Karnataka State Electrical Load

Real Estate Price Prediction with Regression and Classification CS 229 Autumn 2016 Project Final Report

Materials Science Forum Online: ISSN: , Vols , pp doi: /

Demand Forecasting Models

Defining Normal Weather for Energy and Peak Normalization

Data Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 4 of Data Mining by I. H. Witten, E. Frank and M. A.

The Development of a Quality Control and Analysis Application for the ThermoFluor High Throughput Screening Assay

LEC1: Instance-based classifiers

Distributed Clustering and Local Regression for Knowledge Discovery in Multiple Spatial Databases

How to Model Stream Temperature Using ArcMap

Modelling the Electric Power Consumption in Germany

Multivariate Regression Model Results

FORECASTING OF ECONOMIC QUANTITIES USING FUZZY AUTOREGRESSIVE MODEL AND FUZZY NEURAL NETWORK

MODELLING ENERGY DEMAND FORECASTING USING NEURAL NETWORKS WITH UNIVARIATE TIME SERIES

CS 188: Artificial Intelligence Spring Announcements

Fuzzy Logic Approach for Short Term Electrical Load Forecasting

A BASE SYSTEM FOR MICRO TRAFFIC SIMULATION USING THE GEOGRAPHICAL INFORMATION DATABASE

On Computational Limitations of Neural Network Architectures

Orbital Insight Energy: Oil Storage v5.1 Methodologies & Data Documentation

Matlab Software to Determine the Saving in Parallel Pumps Optimal Operation Systems, by Using Variable Speed.

EasySDM: A Spatial Data Mining Platform

STATISTICAL LOAD MODELING

Machine Learning for Physicists Lecture 1

Application of Artificial Neural Networks in Evaluation and Identification of Electrical Loss in Transformers According to the Energy Consumption

PP - Work Centers HELP.PPBDWKC. Release 4.6C

An Improved Method of Power System Short Term Load Forecasting Based on Neural Network

About Nnergix +2, More than 2,5 GW forecasted. Forecasting in 5 countries. 4 predictive technologies. More than power facilities

THE CRYSTAL BALL SCATTER CHART

Bayesian Networks Basic and simple graphs

Wind Power Capacity Assessment

Short Term Load Forecasting via a Hierarchical Neural Model

News Debate: Snowed Out!

Machine learning for pervasive systems Classification in high-dimensional spaces

Dynamic Time Warping and FFT: A Data Preprocessing Method for Electrical Load Forecasting

CSC 4510 Machine Learning

DATA MINING WITH DIFFERENT TYPES OF X-RAY DATA

Transcription:

c Springer, 19th International GI/ITG Conference on Measurement, Modelling and Evaluation of Computing Systems (MMB). This is an author s version of the work with permission of Springer. Not for redistribution. ClassCast: A Tool for Class-Based Forecasting Florian Heimgaertner, Thomas Sachs, and Michael Menth University of Tuebingen, Chair of Communication Networks, Sand 13, 72076 Tuebingen, Germany {florian.heimgaertner,menth}@uni-tuebingen.de, thomas.sachs@student.uni-tuebingen.de Abstract. We present ClassCast, a tool for class-based forecasting. It partitions an input time series into class-specific time series. It uses conventional prediction methods for each of these time series and maps class-specific values to classified future time indices. It is a simple means to account for non-linear factors in time-series for the purpose of prediction. Keywords: Forecasting, prediction, tool 1 Introduction Various approaches have been discussed for load forecasting in the domain of electrical energy [1, 2]. Especially methods based on machine learning, such as artificial neural networks [3] or support vector machines [4], have attracted interest. However, when looking at time series y j of energy consumption of administrative buildings, a clear dependency on working and non-working days and on the weather can be observed. Therefore, a simple forecast method accounting for these influencing factors can outperform sophisticated forecast methods neglecting that influence. This basic forecasting concept is known as similardays method [5]. A very simple forecast is the average energy consumption specific for working and non-working days. The average prediction may be further adapted to cold and warm working and non-working days. We propose classbased forecasting as a generalization of the similar-days method. Cold/warm and working/non-working are an example for a classification c j of the time index j. Besides basic calendar information, additional data in the time series that is known for both the past and the future can be used for classification, e.g., scheduled operating times of specific devices. Essentially, we compute classspecific means Y (c) with c C and C being the class range. These class-specific means may be used for prediction y j = Y (c j ) of future time indices j which are classified as c j. c Springer, 19 th International GI/ITG Conference on Measurement, Modelling and Evaluation of Computing Systems (MMB), Erlangen, Germany, February 2018

2 Fig. 1. Operation of ClassCast. This class-based forecast method can be adapted to more complex forecasting methods. We exemplify this for linear regression. It predicts future values y j with a dependence on a regressor x j : y j = β 0 + β 1 x j. Class-specific forecast implies that linear regression is applied to a class-specific subsets of the time series with (yj c) = (y j) j:cj=c for c C. Based on those time series, class-specific forecast models {(β0, c β1) c : c C} are derived for each class c C which predict y j = β cj 0 + βcj 1 x j for j : c j = c. We have developed ClassCast [6], a tool for class-based forecast supporting prediction based on mean values and linear regression. This work is structured as follows. Section 2 provides a high-level description of ClassCast. Section 3 explains the concept of the tool and describes the classification and forecasting mechanism. In Section 4, we present implementation details and the user interface of ClassCast. Section 6 concludes the paper. 2 Tool Description The input of ClassCast is an annotated time series y j. The data series consists of past values of the dependent variable, i.e., the variable to be forecasted. Annotations are values of one or multiple independent variables x i j, 0 i n that are supposed to influence the values of the dependent variables. An example for a dependent variable is the total energy consumption while independent variables could be day of week, time of day, outside temperature, or the state of a monitored device. The second part of the input is the future time series. The future time series consists of values for the independent variables but does not contain values for the dependent variable. The independent variables of the future time series are used to forecast values of the dependent variables using the approaches presented in Section 3. Figure 1 depicts the operation of ClassCast. Past time series y j x i j and future time series x i j are supplied to the tool. For each independent variable of the input time series, the user selects whether it is discarded, used as classifier, or (if applicable) as regressor. This selection determines the forecasting method. If one or multiple independent variables are selected as regressors, linear regression is used for forecasting. Otherwise, class-specific averages are used for forecasting.

3 The future values for the dependent variables y j are computed based on the combination of values for the independent variables x i j of the future time series. The forecast can be saved to a file or visualized for further analysis. 3 Concept The past data set for the forecast consists of the time series (y j ) j=s,..., 1 where y j are the dependent variables and j are the time series indices. The index s < 0 denotes the start index of the time series and is negative as it relates to the past. Additionally, the input data set contains n series of independent variables ( ) x i i=0,...,n 1 j. The future data set starts at the index 0 and contains values j=s,..., 1 ) i=0,...,n 1 for the independent variables ( x i j where m denotes the forecasting j=0,...,m horizon. ClassCast uses information from the past and future data set to compute the forecast time series (y j ) j=0,...,m. The basic concept of ClassCast is classifying time series entries according to the values of the independent variables. The independent variables are elements of an n-dimensional annotation space X = X 0... X n 1. A function f : X C maps the annotations to a q-dimensional class space C = C 0... C q 1. The mapping function f of our tool is limited in the way that any annotation dimension is mapped to at most one class dimension or not used for the mapping. Furthermore, any class dimension is determined by exactly one annotation dimension so that q n holds. We use the mapping function f to calculate class-specific model parameters and to determine the class of future independent variables x j as a base for class-based forecasting. Class-specific means are calculated by Y (c) = 1 J c y j, for c C, J c = {j : f(x j ) = c} j J c If at least one independent variable is selected as regressor, the past time series is partitioned into class-specific subsets (y c j ) = (y j) j:f(xj)=c. Linear regression can be used to compute regression parameters β 0, β 1 for each class-specific partition. Independent variables can be given as symbolic or numeric values. The scale of measure [7] for the independent variables is derived from the input data type. Symbolic values (e.g., Monday, Tuesday,... ) are interpreted as nominal scale, integers as ordinal scale, and real numbers as interval scale. By default, ClassCast uses independent variables on nominal and ordinal scale as classifiers and independent variables on interval scale as regressors. If an independent variable containing real numbers is manually selected as classifier, classes need to be defined based on intervals. 4 Implementation ClassCast is implemented in Java and features a graphical user interface (GUI) based on the JavaFX [8] framework. Input data series are processed in comma-

4 Fig. 2. Validation of an electrical load forecast. separated values (CSV) format. The tool automatically detects separators (comma, semicolon, tab), line break encoding, and input data types. Values for independent variables can be given as strings, integer numbers, or floating point numbers. Strings are interpreted as nominal scale, integers as ordinal scale, and floating point numbers as interval scale. The first line of the input data file is interpreted as heading and is used for visualization only. If the past data set is insufficient for forecasting, ClassCast presents a dialogue to the user for selecting an independent variable to discard. E.g., if the past data series is a time series recorded in winter, a forecast for summer cannot be computed using class-specific means without discarding the independent variables representing the weather. ClassCast can be used both as an interactive tool with a GUI and as a non-interactive command line tool. In non-interactive mode, a CSV file is given as command line argument. Configuration like interval size for floating point classifiers can be given as option. The forecast data series is written to standard output in the same CSV format as used for the input data series. 5 Forecast Validation ClassCast implements a very generic forecast approach. The forecast quality depends on the availability of sufficient data for the input time series, the existence of a strong dependence of dependent variables y j on independent variables x i j, and the selection of appropriate classifiers. Therefore, ClassCast includes an interactive validation feature. For validation, the input data series is split in two parts. The first part is used to forecast the second part of the data series. The forecast can be visually compared to the actual measurement values and the mean square error can be computed to quantify the forecast quality.

5 Figure 2 shows a screenshot of the interactive validation, with an electrical load time series sampled from a school building in Germany. The forecast is computed as class-specific means with day of week and time of day as classifiers. The measured loads are plotted in blue, the forecast for the second part of the time series is plotted in green. 6 Conclusion In this paper, we presented ClassCast, a simple tool for class-based forecasting. ClassCast classifies entries of time series according to the values of the independent variables and predicts future values for dependent variables using class-specific means or linear regression. The conception of ClassCast was motivated by energy load forecasting. However, ClassCast is agnostic to input data semantics. Therefore, it can be applied for forecasting time series in other domains under the following two conditions. First, time indices can be classified based on annotated information (independent variables). Second, this annotated information has a major impact on the values of interest (dependent variables). Acknowledgement. The research leading to these results received funding from the German Ministry for Economic Affairs and Energy under the ZIM programme (Zentrales Innovationsprogramm Mittelstand), grant reference no. 16KN039521. The authors alone are responsible for the content of this paper. The authors thank Prof. Joachim Grammig for fruitful discussions and helpful advice. References 1. Feinberg, E.A., Genethliou, D.: Applied Mathematics for Restructured Electric Power Systems: Optimization, Control, and Computational Intelligence pp. 269 285 (2005) 2. Campbell, P.R., Adamson, K.: Methodologies for Load Forecasting. In: International IEEE Conference on Intelligent Systems (2006) 3. Park, D.C., El-Sharkawi, M., Marks, R., Atlas, L., Damborg, M.: Electric Load Forecasting Using an Artificial Neural Network. IEEE Transactions on Power Systems 6(2), 442 449 (1991) 4. Chen, B.J., Chang, M.W., et al.: Load Forecasting Using Support Vector Machines: A Study on EUNITE Competition 2001. IEEE Transactions on Power Systems 19(4), 1821 1830 (2004) 5. Mu, Q., Wu, Y., Pan, X., Huang, L., Li, X.: Short-Term Load Forecasting Using Improved Similar Days Method. In: IEEE PES Power and Energy Engineering Conference (APPEEC) (2010) 6. Heimgaertner, F., Sachs, T., Menth, M.: ClassCast. https://github.com/uni-tuekn/classcast (2018) 7. Stevens, S.S.: On the theory of scales of measurement. Science 103(2684), 677 680 (1946) 8. Oracle Corporation: JavaFX API Documentation. https://docs.oracle.com/javase/8/javafx/api/ (2015)