Stochastic optimal control of a domestic microgrid equipped with solar panel and battery

François Pacaud, Pierre Carpentier, Jean-Philippe Chancelier, Michel De Lara
Efficacity, ENSTA, CERMICS-ENPC
arXiv:1801.06479v1 [math.OC] 19 Jan 2018

Abstract

Microgrids are integrated systems that gather and operate energy production units to satisfy consumers' demands. This paper details different mathematical methods to design the Energy Management System (EMS) of domestic microgrids. We consider different stocks coupled together (a battery and a domestic hot water tank) and decentralized energy production with a solar panel. The main challenge of the EMS is to ensure, at least cost, that supply matches demand at all times, while accounting for the inherent uncertainties of such systems. We benchmark two optimization algorithms to manage the EMS, and compare them with a heuristic. Model Predictive Control (MPC) is a well-known algorithm which models the future uncertainties with a deterministic forecast. By contrast, Stochastic Dual Dynamic Programming (SDDP) models the future uncertainties as probability distributions to compute optimal policies. We present a fair comparison of these two algorithms to control a microgrid. A comprehensive numerical study shows that i) optimization algorithms achieve significant gains compared to the heuristic, ii) SDDP outperforms MPC by a few percent, with a reasonable computational overhead.

I. INTRODUCTION

A. Context

A microgrid is a local energy network that produces part of its energy and controls its own demand. Such systems are complex to control, because of the different stocks and interconnections. Furthermore, at local scale, electrical demands and weather conditions (heat demand and renewable energy production) are highly variable and hard to predict; their stochastic nature adds uncertainty to the system.

We consider here a domestic microgrid (see Figure 1), equipped with a battery, an electrical hot water tank and a solar panel. We use the battery to store energy when prices are low or when the production of the solar panel is above the electrical demand.
The microgrid is connected to an external grid to import electricity when needed. Furthermore, we model the building's envelope to take advantage of the thermal inertia of the building. Hence, the system has four stocks to store energy: a battery, a hot water tank, and two passive stocks, namely the building's walls and inner rooms. Two kinds of uncertainties affect the system. First, the electrical and domestic hot water demands are not known in advance. Second, the production of the solar panel is heavily perturbed by the varying cloud cover.

[Figure 1. Electrical microgrid. Components: battery, network, hot water tank, solar panel, electrical demand, thermal demand, domestic hot water. Flows: $f^b$, $f^t$, $f^{ne}$, $f^h$, $\phi^{pv}$, $d^{el}$, $d^{hw}$.]

We aim to compare two classes of algorithms to tackle the uncertainty in microgrid Energy Management Systems (EMS). The renowned Model Predictive Control (MPC) algorithm views the future uncertainties through a deterministic forecast. Then, MPC relies on deterministic optimization algorithms to compute optimal decisions. The contender, Stochastic Dual Dynamic Programming (SDDP), is an algorithm based on the Dynamic Programming principle. Such an algorithm computes offline a set of value functions by backward induction; optimal decisions are computed online as time goes on, using the value functions. We present a balanced comparison of these two algorithms, and highlight the advantages and drawbacks of both methods.

B. Literature

1) Optimization and energy management systems: Energy Management Systems (EMS) are integrated automated tools used to monitor and control energy systems. The design of EMS for buildings has raised interest in recent years. In [1], the authors give an overview of the application of optimization methods in designing EMS. The well-known Model Predictive Control (MPC) [2] has been widely used to control EMS. We refer notably to [3], [4], [5] for applications of MPC in buildings. Different solutions are investigated to tackle uncertainties, such as Stochastic MPC [3] or robust optimization [6].
2) Stochastic Optimization: as we said, at local scale, electrical demand and production are highly variable, especially as microgrids are expected to absorb renewable energies. This calls for stochastic optimization approaches. Apart from microgrid management, stochastic optimization has found some applications in energy systems (see [7] for an overview). Historically, stochastic optimization has been widely applied to the management of hydrovalleys [8]. Other applications have arisen recently, such as the integration of wind energy and storage [9] or the management of isolated microgrids [10].

Stochastic Dynamic Programming (SDP) [11] is a general method to solve stochastic optimal control problems. In energy applications, a variant of SDP, Stochastic Dual Dynamic Programming (SDDP), has proved its adequacy for large scale applications. SDDP was first described in the seminal paper [8]. We refer to [12] for a generic description of the algorithm and its application to the management of hydrovalleys. A proof of convergence in the linear case is given in [13], and in the convex case in [14].

With the growing adoption of stochastic optimization methods, new research aims to compare algorithms such as SDP and SDDP with MPC. We refer to the recent paper [15].

C. Structure of the paper

We detail a model of the microgrid in Sec. II, then formulate an optimization problem in Sec. III. We outline the different optimization algorithms in Sec. IV. Finally, we provide in Sec. V numerical results concerning the management of the microgrid.

II. ENERGY SYSTEM MODEL

In this section, we present the physical equations of the energy system model described in Figure 1. These equations are naturally written in continuous time. We model the battery and the hot water tank with stock dynamics, and describe the dynamics of the building's temperatures with an electrical analogy. Such a physical model fulfills two purposes: it will be used to assess different control policies; it will be the basis of a discrete time model used to design optimal control policies.

A. Load balance

Based on Figure 1, the load balance equation of the microgrid writes, at each time $t$:

$$\phi^{pv}(t) + f^{ne}(t) = f^{b}(t) + f^{t}(t) + f^{h}(t) + d^{el}(t). \tag{1}$$

We now comment on the different terms.
In the left hand side of Equation (1), the load produced consists of
- the production of the solar panel $\phi^{pv}(t)$,
- the import from the network $f^{ne}(t)$, supposed nonnegative (we do not export electricity to the network).

In the right hand side of Equation (1), the electrical demand is the sum of
- the power exchanged with the battery $f^{b}(t)$,
- the power injected in the electrical heater $f^{t}(t)$,
- the power injected in the electrical hot water tank $f^{h}(t)$,
- the inflexible demands (lighting, cooking...), aggregated in a single demand $d^{el}(t)$.

B. Energy storage

We consider a lithium-ion battery, whose state of charge at time $t$ is denoted by $b(t)$. The state of charge is bounded:

$$\underline{b} \le b(t) \le \overline{b}. \tag{2}$$

Usually, we set $\underline{b} = 30\% \times \overline{b}$ so as to avoid an empty state of charge, which proves to be stressful for the battery. The battery dynamics is given by the differential equation

$$\frac{db}{dt} = \rho_c \big(f^{b}(t)\big)^{+} - \frac{1}{\rho_d} \big(f^{b}(t)\big)^{-}, \tag{3}$$

with $\rho_c$ and $\rho_d$ being the charge and discharge efficiencies and $f^{b}(t)$ denoting the power exchanged with the battery. We use the convention $f^{+} = \max(0, f)$ and $f^{-} = \max(0, -f)$. As we cannot withdraw an infinite power from the battery at time $t$, we bound the power exchanged with the battery:

$$\underline{f}^{b} \le f^{b}(t) \le \overline{f}^{b}. \tag{4}$$

C. Electrical hot water tank

We use a simple linear model for the electrical hot water tank dynamics. At time $t$, we denote by $T^{h}(t)$ the temperature inside the hot water tank. We suppose that this temperature is homogeneous, that is, that no stratification occurs inside the tank. At time $t$, we define the energy $h(t)$ stored inside the tank from the difference between the tank's temperature $T^{h}(t)$ and a reference temperature $T^{ref}$:

$$h(t) = \rho V_h c_p \big(T^{h}(t) - T^{ref}\big), \tag{5}$$

where $V_h$ is the tank's volume, $c_p$ the calorific capacity of water and $\rho$ the density of water. The energy $h(t)$ is bounded:

$$0 \le h(t) \le \overline{h}. \tag{6}$$

The enthalpy balance equation writes

$$\frac{dh}{dt} = \beta_h f^{h}(t) - d^{hw}(t), \tag{7}$$

where
- $f^{h}(t)$ is the electrical power used to heat the tank, satisfying
$$0 \le f^{h}(t) \le \overline{f}^{h}, \tag{8}$$
- $d^{hw}(t)$ is the domestic hot water demand,
- $\beta_h$ is a conversion yield.

A more accurate representation would model the stratification inside the hot water tank. However, this would greatly increase the number of states in the system, rendering the numerical resolution more cumbersome. We refer to [16] and [17] for discussions about the impact of the tank's modeling on the performance of the control algorithms.
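As a minimal sketch of the stock dynamics (3) and (7), the following Python functions advance the battery and the tank by one time step with a forward Euler update (the scheme the paper adopts in Section III-D). The function names, the default efficiencies and the unit convention (energies in kWh, powers in kW, time step in hours) are assumptions made for illustration; they are not taken from the paper.

```python
def battery_step(b, f_b, dt, rho_c=0.95, rho_d=0.95):
    """One forward Euler step of Equation (3).

    b: state of charge (kWh), f_b: power exchanged with the battery (kW),
    positive when charging; dt: time step (h).
    rho_c and rho_d are assumed charge/discharge efficiencies.
    """
    charge = max(0.0, f_b)       # (f_b)^+
    discharge = max(0.0, -f_b)   # (f_b)^-
    return b + dt * (rho_c * charge - discharge / rho_d)


def tank_step(h, f_h, d_hw, dt, beta_h=1.0):
    """One forward Euler step of Equation (7).

    h: energy stored in the tank (kWh), f_h: heating power (kW),
    d_hw: hot water demand (kW), beta_h: assumed conversion yield.
    """
    return h + dt * (beta_h * f_h - d_hw)


def within_bounds(value, lower, upper):
    """Check a bound constraint such as (2) or (6)."""
    return lower <= value <= upper
```

Because of the efficiencies, charging 1 kWh and then withdrawing it does not bring the battery back to its initial state: the two branches of (3) are not symmetric, which is why the positive and negative parts are handled separately.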

D. Thermal envelope

We model the evolution of the temperatures inside the building with an electrical analogy: we view temperatures as voltages, walls as capacitors, and thermal flows as currents. A model with 6 resistances and 2 capacitors (R6C2) proves to be accurate to describe small buildings [18]. The model takes into account two temperatures:
- the wall's temperature $\theta^{w}(t)$,
- the inner temperature $\theta^{i}(t)$.

Their evolution is governed by the two following differential equations

$$c_m \frac{d\theta^{w}}{dt} = \underbrace{\frac{\theta^{i}(t) - \theta^{w}(t)}{R_i + R_s}}_{\text{Exchange Indoor/Wall}} + \underbrace{\frac{\theta^{e}(t) - \theta^{w}(t)}{R_m + R_e}}_{\text{Exchange Outdoor/Wall}} + \underbrace{\gamma f^{t}(t)}_{\text{Heater}} + \underbrace{\frac{R_i}{R_i + R_s} \Phi^{int}(t)}_{\text{Radiation through windows}} + \underbrace{\frac{R_e}{R_e + R_m} \Phi^{ext}(t)}_{\text{Radiation through wall}}, \tag{9a}$$

$$c_i \frac{d\theta^{i}}{dt} = \underbrace{\frac{\theta^{w}(t) - \theta^{i}(t)}{R_i + R_s}}_{\text{Exchange Indoor/Wall}} + \underbrace{\frac{\theta^{e}(t) - \theta^{i}(t)}{R_v}}_{\text{Ventilation}} + \underbrace{\frac{\theta^{e}(t) - \theta^{i}(t)}{R_f}}_{\text{Windows}} + \underbrace{(1 - \gamma) f^{t}(t)}_{\text{Heater}} + \underbrace{\frac{R_s}{R_i + R_s} \Phi^{int}(t)}_{\text{Radiation through windows}}, \tag{9b}$$

where we denote
- the power injected in the heater by $f^{t}(t)$,
- the external temperature by $\theta^{e}(t)$,
- the radiation on the wall by $\Phi^{ext}(t)$,
- the radiation through the windows by $\Phi^{int}(t)$.

The time-varying quantities $\theta^{e}(t)$, $\Phi^{int}(t)$ and $\Phi^{ext}(t)$ are exogenous. We denote by $R_i, R_s, R_m, R_e, R_v, R_f$ the different resistances of the R6C2 model, and by $c_i, c_m$ the capacities of the inner rooms and the walls. We denote by $\gamma$ the proportion of heating dissipated in the wall through conduction, and by $(1 - \gamma)$ the proportion of heating dissipated in the inner room through convection.

E. Continuous time state equation

We denote by $x = (b, h, \theta^{w}, \theta^{i})$ the state, $u = (f^{b}, f^{t}, f^{h})$ the control, and $w = (d^{el}, d^{hw}, \phi^{pv})$ the uncertainties. The continuous state equation writes

$$\dot{x} = F(t, x, u, w), \tag{10}$$

where the function $F$ is defined by Equations (3)-(7)-(9).

III. OPTIMIZATION PROBLEM STATEMENT

Now that we have described the physical model, we turn to the formulation of a decision problem. We aim to compute optimal decisions that minimize a daily operational cost, by solving a stochastic optimization problem.

A. Decisions are taken at discrete times

The EMS takes decisions every 15 minutes to control the system.
Thus, we have to provide decisions in discrete time. We set $\Delta = 15$ mn, and we consider a horizon $T_0 = 24$ h. We adopt the following convention for discrete processes: for $t \in \{0, 1, \dots, T = T_0 / \Delta\}$, we set $x_t = x(t\Delta)$. That is, $x_t$ denotes the value of the variable $x$ at the beginning of the interval $[t\Delta, (t+1)\Delta[$. Otherwise stated, we will denote by $[t, t+1[$ the continuous time interval $[t\Delta, (t+1)\Delta[$.

B. Modeling uncertainties as random variables

Because of their unpredictable nature, we cannot anticipate the realizations of the electrical and the thermal demands. A similar reasoning applies to the production of the solar panel. We choose to model these quantities as random variables (over a sample space $\Omega$). We adopt the following convention: a random variable will be denoted by an uppercase bold letter $\mathbf{Z}$ and its realization will be denoted in lowercase $z = \mathbf{Z}(\omega)$. For each $t = 1, \dots, T$, we define the uncertainty vector

$$\mathbf{W}_t = (\mathbf{D}^{el}_t, \mathbf{D}^{hw}_t, \mathbf{\Phi}^{pv}_t), \tag{11}$$

modeled as a random variable. The uncertainty $\mathbf{W}_t$ takes values in the set $\mathbb{W}_t = \mathbb{R}^3$.

C. Modeling controls as random variables

As decisions depend on the previous uncertainties, the control is a random variable. We recall that, at each discrete time $t$, the EMS takes three decisions:
- how much energy to charge/discharge the battery $\mathbf{F}^{b}_t$,
- how much energy to store in the electrical hot water tank $\mathbf{F}^{h}_t$,
- how much energy to inject in the electrical heater $\mathbf{F}^{t}_t$.

We write the decision vector (random variable)

$$\mathbf{U}_t = (\mathbf{F}^{b}_t, \mathbf{F}^{h}_t, \mathbf{F}^{t}_t), \tag{12}$$

taking values in $\mathbb{U}_t = \mathbb{R}^3$.

Then, between two discrete time indexes $t$ and $t+1$, the EMS imports an energy $\mathbf{F}^{ne}_{t+1}$ from the external network. The EMS must fulfill the load balance equation (1) whatever the demand $\mathbf{D}^{el}_{t+1}$ and the production of the solar panel $\mathbf{\Phi}^{pv}_{t+1}$, both unknown at time $t$. Hence $\mathbf{F}^{ne}_{t+1}$ is a recourse decision taken at time $t+1$. The load balance equation (1) now writes as

$$\mathbf{F}^{ne}_{t+1} = \mathbf{F}^{b}_t + \mathbf{F}^{t}_t + \mathbf{F}^{h}_t + \mathbf{D}^{el}_{t+1} - \mathbf{\Phi}^{pv}_{t+1} \quad \mathbb{P}\text{-a.s.}, \tag{13}$$

where $\mathbb{P}$-a.s. indicates that the constraint is fulfilled in the almost sure sense. Later, we will aggregate the solar panel production $\mathbf{\Phi}^{pv}_{t+1}$ with the demand $\mathbf{D}^{el}_{t+1}$ in Equation (13), as these two quantities appear only through their difference.
The need for a recourse variable is a consequence of stochasticity in the supply-demand equation. The choice of the recourse variable depends on the modeling. Here, we choose the recourse $\mathbf{F}^{ne}_{t+1}$ to be provided by the external network, that is, the external network mitigates the uncertainties in the system.
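The role of the recourse variable in Equation (13) can be illustrated with a short Python sketch: the controls are fixed at time $t$, and the import from the network is computed afterwards, once the demand and the solar production are revealed. The function names and the numerical values in the usage note are hypothetical.

```python
def grid_import(f_b, f_t, f_h, d_el, phi_pv):
    """Recourse of Equation (13): energy imported from the network
    once the demand d_el and the solar production phi_pv are observed."""
    return f_b + f_t + f_h + d_el - phi_pv


def recourse_over_scenarios(u, scenarios):
    """Evaluate the recourse for a fixed control u = (f_b, f_t, f_h)
    over a list of realizations (d_el, phi_pv)."""
    f_b, f_t, f_h = u
    return [grid_import(f_b, f_t, f_h, d_el, phi_pv)
            for d_el, phi_pv in scenarios]


def exports_to_network(imports):
    """Flag realizations where the import would be negative, which the
    model excludes since no electricity is exported to the network."""
    return [f_ne < 0.0 for f_ne in imports]
```

For instance, with `u = (0.5, 1.0, 0.0)` and the two realizations `[(1.0, 0.5), (2.0, 0.0)]`, the same decision leads to two different imports, one per realization: the decision is taken before the uncertainty is known, the recourse after.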

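D. States and dynamics

The state also becomes a random variable

$$\mathbf{X}_t = (\mathbf{B}_t, \mathbf{H}_t, \mathbf{\theta}^{w}_t, \mathbf{\theta}^{i}_t). \tag{14}$$

It gathers the stocks in the battery $\mathbf{B}_t$ and in the electrical hot water tank $\mathbf{H}_t$, plus the two temperatures of the thermal envelope $(\mathbf{\theta}^{i}_t, \mathbf{\theta}^{w}_t)$. Thus, the state vector $\mathbf{X}_t$ takes values in $\mathbb{X}_t = \mathbb{R}^4$. The discrete dynamics writes

$$x_{t+1} = f_t(x_t, u_t, w_{t+1}), \tag{15}$$

where $f_t$ corresponds to the discretization of the continuous dynamics (10) using a forward Euler scheme, that is, $x_{t+1} = x_t + \Delta \times F(t, x_t, u_t, w_{t+1})$. By doing so, we suppose that the control $u_t$ and the uncertainty $w_{t+1}$ are constant over the interval $[t, t + \Delta[$. The dynamics (15) rewrites as an almost-sure constraint:

$$\mathbf{X}_{t+1} = f_t(\mathbf{X}_t, \mathbf{U}_t, \mathbf{W}_{t+1}) \quad \mathbb{P}\text{-a.s.} \tag{16}$$

We suppose that we start from a given position $x_0$, thus adding a new initial constraint: $\mathbf{X}_0 = x_0$.

E. Non-anticipativity constraints

The future realizations of uncertainties are unpredictable. Thus, decisions are functions of previous history only, that is, the information collected between time 0 and time $t$. Such a constraint is encoded as an algebraic constraint, using the tools of Probability theory [19]. The so-called non-anticipativity constraint writes

$$\sigma(\mathbf{U}_t) \subset \mathcal{F}_t, \tag{17}$$

where $\sigma(\mathbf{U}_t)$ is the $\sigma$-algebra generated by $\mathbf{U}_t$ and $\mathcal{F}_t = \sigma(\mathbf{W}_1, \dots, \mathbf{W}_t)$ the $\sigma$-algebra associated with the previous history $(\mathbf{W}_1, \dots, \mathbf{W}_t)$. If Constraint (17) holds true, the Doob lemma [19] ensures that there exists a function $\pi_t$ such that

$$\mathbf{U}_t = \pi_t(\mathbf{X}_0, \mathbf{W}_1, \dots, \mathbf{W}_t). \tag{18}$$

This is how we turn an (abstract) algebraic constraint into a more practical functional constraint. The function $\pi_t$ is an example of a policy.

F. Bounds constraints

By Equations (2) and (6), the stocks in the battery $\mathbf{B}_t$ and in the tank $\mathbf{H}_t$ are bounded. At time $t$, the control $\mathbf{F}^{b}_t$ must ensure that the next state $\mathbf{B}_{t+1}$ is admissible, that is, $\underline{b} \le \mathbf{B}_{t+1} \le \overline{b}$ by Equation (2), which rewrites

$$\underline{b} \le \mathbf{B}_t + \Delta \Big[ \rho_c (\mathbf{F}^{b}_t)^{+} - \frac{1}{\rho_d} (\mathbf{F}^{b}_t)^{-} \Big] \le \overline{b}. \tag{19}$$

Thus, the constraints on $\mathbf{F}^{b}_t$ depend on the stock $\mathbf{B}_t$. The same reasoning applies to the tank power $\mathbf{F}^{h}_t$. Furthermore, we set bound constraints on controls, that is,

$$\underline{f}^{b} \le \mathbf{F}^{b}_t \le \overline{f}^{b}, \quad 0 \le \mathbf{F}^{h}_t \le \overline{f}^{h}, \quad 0 \le \mathbf{F}^{t}_t \le \overline{f}^{t}. \tag{20}$$

Finally, the load-balance equation (13) also acts as a constraint on the controls.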
We gather all these constraints in an admissible set on the control $\mathbf{U}_t$ depending on the current state $\mathbf{X}_t$:

$$\mathbf{U}_t \in \mathcal{U}^{ad}_t(\mathbf{X}_t) \quad \mathbb{P}\text{-a.s.} \tag{21}$$

G. Objective

At time $t$, the operational cost $L_t : \mathbb{X}_t \times \mathbb{U}_t \times \mathbb{W}_{t+1} \to \mathbb{R}$ aggregates two different costs:

$$L_t(x_t, u_t, w_{t+1}) = p^{e}_t \times f^{ne}_{t+1} + p^{d}_t \times \max(0, \overline{\theta}^{i}_t - \theta^{i}_t). \tag{22}$$

First, we pay a price $p^{e}_t$ to import electricity from the network between time $t$ and $t+1$. Hence, the electricity cost is equal to $p^{e}_t \times \mathbf{F}^{ne}_{t+1}$. Second, if the indoor temperature is below a given threshold, we penalize the induced discomfort with a cost $p^{d}_t \times \max(0, \overline{\theta}^{i}_t - \theta^{i}_t)$, where $p^{d}_t$ is a virtual price of discomfort. The cost $L_t$ is a convex piecewise linear function, which will prove important for the SDDP algorithm.

We add a final cost $K : \mathbb{X}_T \to \mathbb{R}$ to ensure that stocks are non-empty at the final time $T$:

$$K(x_T) = \kappa \times \max(0, x_0 - x_T), \tag{23}$$

where $\kappa$ is a positive penalization coefficient calibrated by trial and error.

As decisions $\mathbf{U}_t$ and states $\mathbf{X}_t$ are random, the costs $L_t(\mathbf{X}_t, \mathbf{U}_t, \mathbf{W}_{t+1})$ also become random variables. We choose to minimize the expected value of the daily operational cost, yielding the criterion

$$\mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(\mathbf{X}_t, \mathbf{U}_t, \mathbf{W}_{t+1}) + K(\mathbf{X}_T) \Big], \tag{24}$$

which is the expected value of a convex piecewise linear cost.

H. Stochastic optimal control formulation

Finally, the EMS problem translates to a generic Stochastic Optimal Control (SOC) problem

$$\min_{\mathbf{X}, \mathbf{U}} \ \mathbb{E}\Big[ \sum_{t=0}^{T-1} L_t(\mathbf{X}_t, \mathbf{U}_t, \mathbf{W}_{t+1}) + K(\mathbf{X}_T) \Big], \tag{25a}$$
$$\mathbf{X}_0 = x_0, \tag{25b}$$
$$\mathbf{X}_{t+1} = f_t(\mathbf{X}_t, \mathbf{U}_t, \mathbf{W}_{t+1}) \quad \mathbb{P}\text{-a.s.}, \tag{25c}$$
$$\mathbf{U}_t \in \mathcal{U}^{ad}_t(\mathbf{X}_t) \quad \mathbb{P}\text{-a.s.}, \tag{25d}$$
$$\sigma(\mathbf{U}_t) \subset \mathcal{F}_t. \tag{25e}$$

Problem (25) states that we want to minimize the expected value of the costs while satisfying the dynamics, the control bounds and the non-anticipativity constraints.

IV. RESOLUTION METHODS

The exact resolution of Problem (25) is out of reach in general. We propose two different algorithms that provide policies $\pi_t : \mathbb{X}_0 \times \mathbb{W}_1 \times \dots \times \mathbb{W}_t \to \mathbb{U}_t$ that map the information $x_0, w_1, \dots, w_t$ available at time $t$ to a decision $u_t$.

A. Model Predictive Control (MPC)

MPC is a classical algorithm commonly used to handle uncertainties in energy systems. At time $t$, it considers a deterministic forecast $(\overline{w}_{t+1}, \dots, \overline{w}_T)$ of the future uncertainties $(\mathbf{W}_{t+1}, \dots, \mathbf{W}_T)$ and solves the deterministic problem

$$\min_{(u_t, \dots, u_{T-1})} \ \sum_{j=t}^{T-1} L_j(x_j, u_j, \overline{w}_{j+1}) + K(x_T), \tag{26a}$$
$$x_{j+1} = f_j(x_j, u_j, \overline{w}_{j+1}), \tag{26b}$$
$$u_j \in \mathcal{U}^{ad}_j(x_j). \tag{26c}$$

At time $t$, we solve Problem (26), retrieve the optimal decisions $(u^{\sharp}_t, \dots, u^{\sharp}_{T-1})$ and only keep the first decision $u^{\sharp}_t$ to control the system between time $t$ and $t+1$. Then, we restart the procedure at time $t+1$. As Problem (26) is linear and the number of time steps remains moderate, we are able to solve it exactly for every $t$.

B. Stochastic Dual Dynamic Programming (SDDP)

1) Dynamic Programming and Bellman principle: the Dynamic Programming method [20] looks for solutions of Problem (25) as state-feedbacks $\pi_t : \mathbb{X}_t \to \mathbb{U}_t$. Dynamic Programming computes a series of value functions backward in time by setting $V_T(x_T) = K(x_T)$ and solving the recursive equations

$$V_t(x_t) = \min_{u_t \in \mathcal{U}^{ad}_t(x_t)} \int_{\mathbb{W}_{t+1}} \Big[ L_t(x_t, u_t, w_{t+1}) + V_{t+1}\big(f_t(x_t, u_t, w_{t+1})\big) \Big] \mu^{of}_{t+1}(dw_{t+1}), \tag{27}$$

where $\mu^{of}_{t+1}$ is an (offline) probability distribution on $\mathbb{W}_{t+1}$. Once these functions are computed, we compute a decision at time $t$ as a state-feedback:

$$\pi_t(x_t) \in \arg\min_{u_t \in \mathcal{U}^{ad}_t(x_t)} \int_{\mathbb{W}_{t+1}} \Big[ L_t(x_t, u_t, w_{t+1}) + V_{t+1}\big(f_t(x_t, u_t, w_{t+1})\big) \Big] \mu^{on}_{t+1}(dw_{t+1}), \tag{28}$$

where $\mu^{on}_{t+1}$ is an online probability distribution on $\mathbb{W}_{t+1}$. This method proves to be optimal when the uncertainties $\mathbf{W}_1, \dots, \mathbf{W}_T$ are stagewise independent and when $\mu^{on}_t = \mu^{of}_t$ is the probability distribution of $\mathbf{W}_t$ in (27).

2) Description of Stochastic Dual Dynamic Programming: Dynamic Programming suffers from the well-known curse of dimensionality [11]: its numerical resolution fails for state dimensions typically greater than 4 when value functions are computed on a discretized grid. As the state $\mathbf{X}_t$ in III-D has 4 dimensions, SDP would be too slow to solve Problem (25) numerically. Stochastic Dual Dynamic Programming (SDDP) can bypass the curse of dimensionality by approximating value functions by polyhedral functions. It is optimal for solving Problem (25) when uncertainties are stagewise independent, costs $L_t$ and $K$ are convex and dynamics $f_t$ are linear [14].
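SDDP provides an outer approximation $\underline{V}^{k}_t$ of the value function $V_t$ in (27) with a set of supporting hyperplanes $\{(\lambda^{j}_t, \beta^{j}_t)\}_{j=1,\dots,k}$ by

$$\underline{V}^{k}_t(x_t) = \min_{\theta_t \in \mathbb{R}} \theta_t, \tag{29a}$$
$$\langle \lambda^{j}_t, x_t \rangle + \beta^{j}_t \le \theta_t, \quad \forall j = 1, \dots, k. \tag{29b}$$

Each iteration $k$ of SDDP encompasses two passes.

During the forward pass, we draw a scenario $x_0, w^{k}_1, \dots, w^{k}_T$ of uncertainties, and compute a state trajectory $\{x^{k}_t\}_{t=0,\dots,T}$ along this scenario. Starting from position $x_0$, we compute $x^{k}_{t+1}$ in an iterative fashion: i) we compute the optimal control at time $t$ using the available function $\underline{V}^{k}_{t+1}$,

$$u^{k}_t \in \arg\min_{u_t \in \mathcal{U}^{ad}_t(x^{k}_t)} \int_{\mathbb{W}_{t+1}} \Big[ L_t(x^{k}_t, u_t, w_{t+1}) + \underline{V}^{k}_{t+1}\big(f_t(x^{k}_t, u_t, w_{t+1})\big) \Big] \mu^{of}_{t+1}(dw_{t+1}), \tag{30}$$

and ii) we set $x^{k}_{t+1} = f_t(x^{k}_t, u^{k}_t, w^{k}_{t+1})$, where $f_t$ is given by (15).

During the backward pass, we update the approximated value functions $\{\underline{V}^{k}_t\}_{t=0,\dots,T}$ backward in time along the trajectory $\{x^{k}_t\}_{t=0,\dots,T}$. At time $t$, we solve the problem

$$\theta^{k+1}_t = \min_{u_t \in \mathcal{U}^{ad}_t(x^{k}_t)} \int_{\mathbb{W}_{t+1}} \Big[ L_t(x^{k}_t, u_t, w_{t+1}) + \underline{V}^{k+1}_{t+1}\big(f_t(x^{k}_t, u_t, w_{t+1})\big) \Big] \mu^{of}_{t+1}(dw_{t+1}), \tag{31}$$

and we obtain a new cut $(\lambda^{k+1}_t, \beta^{k+1}_t)$, where $\lambda^{k+1}_t$ is a subgradient of the optimal cost (31) evaluated at point $x_t = x^{k}_t$ and $\beta^{k+1}_t = \theta^{k+1}_t - \langle \lambda^{k+1}_t, x^{k}_t \rangle$. This new cut allows us to update the function $\underline{V}^{k+1}_t$: $\underline{V}^{k+1}_t = \max\{\underline{V}^{k}_t, \langle \lambda^{k+1}_t, \cdot \rangle + \beta^{k+1}_t\}$.

Otherwise stated, SDDP only explores the state space around interesting trajectories (those computed during the forward passes) and refines the value functions only in the corresponding space regions (backward passes).

3) Obtaining online controls with SDDP: in order to compute implementable decisions, we use the following procedure.
- Approximated value functions $\{\underline{V}_t\}$ are computed with the SDDP algorithm (see IV-B). These computations are done offline.
- The approximated value functions $\{\underline{V}_t\}$ are then used to compute online a decision at any time $t$ for any state $x_t$.

More precisely, we compute the SDDP policy $\pi^{sddp}_t$ by

$$\pi^{sddp}_t(x_t) \in \arg\min_{u_t \in \mathcal{U}^{ad}_t(x_t)} \int_{\mathbb{W}_{t+1}} \Big[ L_t(x_t, u_t, w_{t+1}) + \underline{V}_{t+1}\big(f_t(x_t, u_t, w_{t+1})\big) \Big] \mu^{on}_{t+1}(dw_{t+1}), \tag{32}$$

which corresponds to replacing the value function $V_{t+1}$ in Equation (28) with its approximation $\underline{V}_{t+1}$. The decision $\pi^{sddp}_t(x_t)$ is used to control the system between time $t$ and $t+1$. Then, we solve Problem (32) again at time $t+1$.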
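The polyhedral approximation (29) and the cut update of the backward pass can be sketched in a few lines of Python. Since (29) minimizes $\theta_t$ above every hyperplane, its value is simply the maximum of the affine cuts. The class name, the method names and the default lower bound are assumptions for illustration only; the lower bound of 0 relies on the costs $L_t$ and $K$ being nonnegative.

```python
class CutApproximation:
    """Polyhedral lower approximation of a value function:
    V(x) = max_j ( <lam_j, x> + beta_j ), as in Equation (29)."""

    def __init__(self, lower_bound=0.0):
        # Assumed a priori lower bound, used before any cut is available.
        self.lower_bound = lower_bound
        self.cuts = []  # list of (lam, beta) pairs

    def add_cut(self, theta, lam, x_k):
        """Add the cut obtained in the backward pass: theta is the optimal
        cost of (31) at x_k, lam a subgradient of that cost at x_k."""
        beta = theta - sum(l * x for l, x in zip(lam, x_k))
        self.cuts.append((list(lam), beta))

    def __call__(self, x):
        """Evaluate the approximation at state x (a sequence of floats)."""
        values = [sum(l * xi for l, xi in zip(lam, x)) + beta
                  for lam, beta in self.cuts]
        return max([self.lower_bound] + values)
```

By construction, a new cut is tight at the point $x^{k}_t$ where it was computed, and adding cuts can only raise the approximation, which matches the update $\underline{V}^{k+1}_t = \max\{\underline{V}^{k}_t, \cdot\}$.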
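To solve numerically problems (30)-(31)-(32) at time $t$, we will consider distributions with finite support $w^{1}, \dots, w^{S}$. The offline distribution $\mu^{of}_{t+1}$ now writes: $\mu^{of}_{t+1} = \sum_{s=1}^{S} p_s \delta_{w^{s}_{t+1}}$, where $\delta_{w^{s}_{t+1}}$ is the Dirac measure at $w^{s}_{t+1}$ and $(p_1, \dots, p_S)$ are probability weights. The same reasoning applies to the online distribution $\mu^{on}_{t+1}$. For instance, Problem (32) reformulates as

$$\pi^{sddp}_t(x_t) \in \arg\min_{u_t \in \mathcal{U}^{ad}_t(x_t)} \sum_{s=1}^{S} p_s \Big[ L_t(x_t, u_t, w^{s}_{t+1}) + \underline{V}_{t+1}\big(f_t(x_t, u_t, w^{s}_{t+1})\big) \Big]. \tag{33}$$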
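To make the finite-support formulation (33) concrete, here is a hedged Python sketch of the one-step problem. In the paper this problem is a linear program; the sketch instead enumerates a finite list of candidate controls, which is a deliberate simplification. All names (`one_step_policy`, `cost`, `dynamics`, `value_next`) are hypothetical.

```python
def one_step_policy(x, controls, scenarios, probs, cost, dynamics, value_next):
    """Brute-force version of Equation (33).

    controls: finite list of admissible candidate controls at state x,
    scenarios, probs: support points w^s and weights p_s,
    cost(x, u, w): stage cost L_t,
    dynamics(x, u, w): next state f_t(x, u, w),
    value_next(x_next): approximate value function at time t+1.
    Returns the best control and its expected cost-to-go.
    """
    best_u, best_q = None, float("inf")
    for u in controls:
        q = sum(p * (cost(x, u, w) + value_next(dynamics(x, u, w)))
                for p, w in zip(probs, scenarios))
        if q < best_q:
            best_u, best_q = u, q
    return best_u, best_q


def toy_example():
    """Scalar stock, unit purchase price, shortage penalized at 3 per unit."""
    return one_step_policy(
        x=0.0,
        controls=[0.0, 1.0, 2.0],
        scenarios=[1.0, 2.0],
        probs=[0.5, 0.5],
        cost=lambda x, u, w: 1.0 * u,
        dynamics=lambda x, u, w: x + u - w,
        value_next=lambda x_next: 3.0 * max(0.0, -x_next),
    )
```

In the toy example, buying two units is optimal because the shortage penalty exceeds the purchase price; the expectation over the two support points is what distinguishes this decision from one based on a single forecast.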

V. NUMERICAL RESULTS

A. Case study

1) Settings: we aim to solve the stochastic optimization problem (25) over one day, with 96 time steps. The battery's size is 3 kWh, and the hot water tank has a capacity of 120 l. We suppose that the house has a surface $A_p = 20$ m² of solar panel at its disposal, oriented south, with a yield of 15%. We penalize the recourse variable $\mathbf{F}^{ne}_{t+1}$ in (22) with on-peak and off-peak tariffs, corresponding to the individual tariffs of Électricité de France (EDF). The building's thermal envelope corresponds to the French RT2012 specifications [21]. Meteorological data comes from Météo France measurements corresponding to the year 2015.

2) Demands scenarios: we have scenarios of electrical and domestic hot water demands at 15-minute time steps, obtained with StROBe [22]. Figure 2 displays 100 scenarios of electrical and hot water demands. We observe that the shape of these scenarios is consistent: demands are almost null during the night, and there are peaks around midday and 8 pm. Peaks in hot water demands correspond to showers. We aggregate the production of the solar panel $\mathbf{\Phi}^{pv}$ and the electrical demands $\mathbf{D}^{el}$ in a single variable $\mathbf{D}^{el}$ to consider only two uncertainties $(\mathbf{D}^{el}_t, \mathbf{D}^{hw}_t)$.

[Figure 2. Electrical (left) and domestic hot water (right) demand scenarios. Vertical axes: electrical load (kW) and DHW consumption (kW).]

3) Out of sample assessment of strategies: to obtain a fair comparison between SDDP and MPC, we use an out-of-sample validation. We generate 2,000 scenarios of electrical and hot water demands, and we split these scenarios in two distinct parts: the first $N_{opt}$ scenarios are called optimization scenarios, and the remaining $N_{sim}$ scenarios are called assessment scenarios. We chose $N_{opt} = N_{sim} = 1{,}000$.

First, during the offline phase, we use the optimization scenarios to build models for the uncertainties, under the mathematical form required by each algorithm (see Sec. IV). Second, during the online phase, we use the assessment scenarios to compare the strategies produced by these algorithms.
At time $t$ during the assessment, the algorithms cannot use the future values of the assessment scenarios, but can take advantage of the values observed up to $t$ to update their statistical models of future uncertainties.

B. Numerical implementation

1) Implementing the algorithms: we implement MPC and SDDP in Julia 0.6, using JuMP [23] as a modeler, StochDynamicProgramming.jl as an SDDP solver, and Gurobi 7.02 [24] as an LP solver. All computations run on a Core i7 2.5 GHz processor, with 16 GB of RAM.

2) MPC procedure: electrical and thermal demands are naturally correlated in time [25]. To take into account such a dependence across the different time steps, we chose to model the process $\mathbf{W}_1, \dots, \mathbf{W}_T$ with an auto-regressive (AR) process.

a) Building offline an AR model for MPC: we fit an AR(1) model upon the optimization scenarios (we do not consider higher order lags for the sake of simplicity). For $i \in \{el, hw\}$, the AR model writes

$$d^{i}_{t+1} = \alpha^{i}_t d^{i}_t + \beta^{i}_t + \varepsilon^{i}_t, \tag{34a}$$

where the non-stationary coefficients $(\alpha^{i}_t, \beta^{i}_t)$ are, for all time $t$, solutions of the least-squares problem

$$(\alpha^{i}_t, \beta^{i}_t) = \arg\min_{a, b} \sum_{s=1}^{N_{opt}} \big\| d^{i,s}_{t+1} - a\, d^{i,s}_t - b \big\|_2^2. \tag{34b}$$

The points $(d^{i,1}_t, \dots, d^{i,N_{opt}}_t)$ correspond to the optimization scenarios. The AR residuals $(\varepsilon^{el}_t, \varepsilon^{hw}_t)$ are a white noise process.

b) Updating the forecast online: once the AR model is calibrated, we use it to update the forecast during assessment (see IV-A). The update procedure is threefold:
i) we observe the demands $w_t = (d^{el}_t, d^{hw}_t)$ between time $t-1$ and $t$,
ii) we update the forecast $\overline{w}_{t+1}$ at time $t+1$ with the AR model
$$\overline{w}_{t+1} = (\overline{d}^{el}_{t+1}, \overline{d}^{hw}_{t+1}) = (\alpha^{el}_t d^{el}_t + \beta^{el}_t, \ \alpha^{hw}_t d^{hw}_t + \beta^{hw}_t),$$
iii) we set the forecast between time $t+2$ and $T$ by using the mean values of the optimization scenarios:
$$\overline{w}_{\tau} = \frac{1}{N_{opt}} \sum_{i=1}^{N_{opt}} w^{i}_{\tau}, \quad \forall \tau = t+2, \dots, T.$$

Once the forecast $(\overline{w}_{t+1}, \dots, \overline{w}_T)$ is available, it is fed into the MPC algorithm that solves Problem (26).

3) SDDP procedure: even if electrical and thermal demands are naturally correlated in time [25], the SDDP algorithm only relies upon marginal distributions.
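Returning to the MPC forecast model, the least-squares calibration (34b) and the one-step forecast update of step ii) can be sketched in plain Python. For a single regressor with an intercept, the least-squares solution has the usual closed form used below. The function names are hypothetical, and the zero-variance fallback is an assumption added to keep the sketch well defined.

```python
def fit_ar1(d_t, d_next):
    """Solve the least-squares problem (34b) for one time step t.

    d_t: demands at time t over the optimization scenarios,
    d_next: demands at time t+1 over the same scenarios.
    Returns (alpha_t, beta_t).
    """
    n = len(d_t)
    mean_x = sum(d_t) / n
    mean_y = sum(d_next) / n
    sxx = sum((x - mean_x) ** 2 for x in d_t)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(d_t, d_next))
    # Assumed fallback: with no variation at time t, predict the mean.
    alpha = sxy / sxx if sxx > 0.0 else 0.0
    beta = mean_y - alpha * mean_x
    return alpha, beta


def forecast_next(d_observed, alpha, beta):
    """One-step forecast of step ii): alpha_t * d_t + beta_t."""
    return alpha * d_observed + beta


def mean_forecast(scenarios_at_tau):
    """Forecast of step iii): mean of the optimization scenarios."""
    return sum(scenarios_at_tau) / len(scenarios_at_tau)
```

One pair of coefficients is fitted per time step and per demand, which is what makes the model non-stationary: the relation between the demand at 7:45 and 8:00 need not be the same as between 2:00 and 2:15.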
a) Building offline probability distributions for SDDP: rather than fitting an AR model as done for MPC, we use the optimization scenarios to build marginal probability distributions $\mu^{\mathrm{of}}_{t}$ that feed the SDDP procedure in (30)-(31). We cannot directly consider the discrete empirical marginal probability derived from all $N^{\mathrm{opt}}$ scenarios, because the support size would be too large for SDDP. This is why we use optimal quantization to map the $N^{\mathrm{opt}}$ optimization scenarios to $S$ representative points. We use a Lloyd-Max quantization scheme [26] to obtain a discrete probability distribution: at each time $t$, we use the $N^{\mathrm{opt}}$ optimization scenarios to build a partition $\Xi = (\Xi_1, \ldots, \Xi_S)$, where $\Xi$ is the solution of the optimal quantization problem
$$ \min_{\Xi} \sum_{s=1}^{S} \Big( \sum_{w^{i}_{t} \in \Xi_s} \big\| w^{i}_{t} - \overline{w}^{s}_{t} \big\|_2^2 \Big) \tag{35} $$
where $\overline{w}^{s}_{t} = \frac{1}{\mathrm{card}(\Xi_s)} \sum_{w^{i}_{t} \in \Xi_s} w^{i}_{t}$ is the so-called centroid of $\Xi_s$. Then, we set for all time $t = 0, \ldots, T$ the discrete offline distributions $\mu^{\mathrm{of}}_{t} = \sum_{s=1}^{S} p_s \delta_{\overline{w}^{s}_{t}}$, where $\delta_{\overline{w}^{s}_{t}}$ is the Dirac measure at point $\overline{w}^{s}_{t}$ and $p_s = \mathrm{card}(\Xi_s)/N^{\mathrm{opt}}$ is the associated probability weight. We have chosen $S = 20$ to have enough precision.

b) Computing value functions offline: then, we use these probability distributions as an input to compute a set of value functions with the procedure described in IV-B.

c) Using the value functions online: once the value functions have been computed by SDDP, we are able to compute online decisions with Equation (33)¹. Unlike MPC, SDDP does not update the online probability distribution $\mu^{\mathrm{on}}_{t}$ during assessment to take into account the information brought by the previous observations.

4) Heuristic procedure: we choose to compare the MPC and SDDP algorithms with a basic decision rule. This heuristic is as follows: the battery is charged whenever the solar production $\Phi^{\mathrm{pv}}$ is available, and discharged to fulfill the demand if enough energy remains in the battery; the tank is charged ($F_h > 0$) if the tank energy $H$ is lower than $H_0$; the heater $F_t$ is switched on when the temperature is below the setpoint $\theta^{i}$ and switched off whenever the temperature is above the setpoint plus a given margin.

C. Results

1) Assessing on different meteorological conditions: we assess the algorithms on three different days, with different meteorological conditions (see Table I). Therefore, we use three distinct sets of $N^{\mathrm{sim}}$ assessment scenarios of demands, one for each typical day.

              Date             Temp. (°C)   PV production (kWh)
Winter day    February, 19th   3.3          8.4
Spring day    April, 1st       10.1         14.8
Summer day    May, 31st        14.1         23.3

Table I. DIFFERENT METEOROLOGICAL CONDITIONS

These three different days correspond to different heating needs. During Winter day, the heating is maximal, whereas it is medium during Spring day and null during Summer day. The production of the solar panel varies accordingly.
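Coming back to the quantization step of 3a), the construction of the centroids and of the weights $p_s$ can be illustrated with a plain Lloyd iteration. This is a minimal sketch under stated assumptions, not the scheme actually used with StochDynamicProgramming.jl: the function name `lloyd_quantization`, the random initialization from the sample and the stopping rule are choices made for the example, and points are represented as tuples of floats.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign_cells(points, centroids):
    """Partition the points by nearest centroid (ties go to the lowest index)."""
    cells = [[] for _ in centroids]
    for point in points:
        nearest = min(
            range(len(centroids)),
            key=lambda s: squared_distance(point, centroids[s]),
        )
        cells[nearest].append(point)
    return cells


def mean_point(cell):
    """Centroid of a non-empty cell."""
    return tuple(sum(coords) / len(cell) for coords in zip(*cell))


def lloyd_quantization(points, n_cells, max_iter=100, seed=0):
    """Return (centroids, weights) approximating a solution of (35).

    Lloyd's iteration only reaches a local minimum in general; the
    initial centroids are drawn at random from the sample (assumption).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, n_cells)
    for _ in range(max_iter):
        cells = assign_cells(points, centroids)
        # An empty cell keeps its previous centroid.
        updated = [
            mean_point(cell) if cell else centroids[s]
            for s, cell in enumerate(cells)
        ]
        if updated == centroids:
            break
        centroids = updated
    # Final assignment so that the weights match the returned centroids.
    cells = assign_cells(points, centroids)
    weights = [len(cell) / len(points) for cell in cells]
    return centroids, weights
```

Here `points` would hold the $N^{\mathrm{opt}}$ demand pairs observed at a given time $t$ and `n_cells` would be the quantization size $S$; the returned centroids and weights define the support and the probabilities of the discrete distribution at that time.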
2) Comparing the algorithms' performances: during assessment, we use the MPC (see (26)) and SDDP (see (32)) strategies to compute online decisions along $N^{\mathrm{sim}}$ assessment scenarios. Then, we compare the average electricity bill obtained with these two strategies and with the heuristic. The assessment results are given in Table II: means and standard deviations $\sigma$ are computed by Monte Carlo with the $N^{\mathrm{sim}}$ assessment scenarios; the notation ± corresponds to the interval $\pm 1.96\, \sigma / \sqrt{N^{\mathrm{sim}}}$, which is a 95% confidence interval. We observe that MPC and SDDP exhibit close performance, and both do better than the heuristic. If we consider mean electricity bills, SDDP achieves better savings than MPC during Summer day and Winter day, but SDDP and MPC display similar performances during Spring day.

                        SDDP           MPC            Heuristic
Offline time            50 s           0 s            0 s
Online time             1.5 ms         0.5 ms         0.005 ms
Electricity bill (€)
  Winter day            4.38 ± 0.02    4.59 ± 0.02    5.55 ± 0.02
  Spring day            1.46 ± 0.01    1.45 ± 0.01    2.83 ± 0.01
  Summer day            0.10 ± 0.01    0.18 ± 0.01    0.33 ± 0.02

Table II. COMPARISON OF MPC, SDDP AND HEURISTIC STRATEGIES

¹ In practice, the quantization size of $\mu^{\mathrm{on}}_{t}$ is bigger than that of $\mu^{\mathrm{of}}_{t}$, to have a greater accuracy online.

In addition, SDDP achieves better savings than MPC for the vast majority of scenarios. Indeed, if we compare the difference between the electricity bills scenario by scenario, we observe that SDDP is better than MPC for about 93% of the scenarios. This can be seen in Figure 3, which displays the histogram of the absolute gap savings between SDDP and MPC during Summer day. The distribution of the gap exhibits a heavy tail that favors SDDP on extreme scenarios. Similar analyses hold for Winter and Spring day. Thus, we claim that SDDP outperforms MPC for the electricity savings.

Concerning the performance on thermal comfort, temperature trajectories are above the temperature setpoints specified in III-G for both MPC and SDDP.

In terms of numerical performance, it takes less than a minute to compute a set of cuts as in IV-B with SDDP on a particular day.
Then, the online computation of a single decision takes 1.5 ms on average, compared to 0.5 ms for MPC. Indeed, MPC is favored by the linearity of the optimization Problem (26), whereas, for SDDP, the higher the quantization size $S$, the slower the resolution of Problem (32), but the more information the online probability distribution $\mu^{\mathrm{on}}_{t}$ carries.

Figure 3. Absolute gap savings between MPC and SDDP during Summer day (histogram; x-axis: absolute gap (€), y-axis: number of occurrences).

3) Analyzing the trajectories: we now analyze the trajectories of stocks in assessment, during Summer day. The heating is off, and the production of the solar panel is nominal at midday. Figure 4 displays the state of charge of the battery along a subset of assessment scenarios, for SDDP and MPC. We observe that SDDP charges the battery earlier, up to its maximum. On the contrary, MPC charges the battery later, and does not use the full potential of the battery. The two algorithms discharge the battery to fulfill the evening demands. We notice that each trajectory exhibits a single cycle of charge/discharge, which limits battery aging.

Figure 5 displays the charge of the domestic hot water tank along the same subset of assessment scenarios. We observe a behavior similar to that of the battery trajectories: SDDP makes more use of the electrical hot water tank to store the excess of PV energy, and the level of the tank is greater at the end of the day than with MPC.

Figure 4. Battery charge trajectories for SDDP and MPC during Summer day (panels: SDDP, MPC; y-axis: charge [kWh]).

Figure 5. Hot water tank trajectories for SDDP and MPC during Summer day (panels: SDDP, MPC; y-axis: charge [kWh]).

This analysis suggests that SDDP makes a better use of storage capacities than MPC.

VI. CONCLUSION

We have presented a domestic microgrid energy system, and compared different optimization algorithms to control the stocks with an Energy Management System. The results show that optimization-based strategies outperform the proposed heuristic procedure in terms of money savings. Furthermore, SDDP outperforms MPC during Winter and Summer day, achieving up to 35% cost savings, and displays performance similar to MPC during Spring day. Even if SDDP and MPC exhibit close performance, a comparison scenario by scenario shows that SDDP beats MPC most of the time (more than 90% of scenarios during Summer day). Thus, we claim that SDDP is better than MPC to manage uncertainties in such a microgrid, although MPC also gives good performance. SDDP also makes a better use of storage capacities.

Our study can be extended in different directions. First, we could mix SDDP and MPC to recover the benefits of these two algorithms. Indeed, SDDP is designed to handle the variability of the uncertainties but fails to capture the time correlation, whereas MPC ignores the variability of the uncertainties, but considers time correlation by means of a multistage optimization problem. Second, we are currently investigating the optimization of larger-scale microgrids with different interconnected buildings by decomposition methods.

REFERENCES

[1] D. E. Olivares, A. Mehrizi-Sani, A. H. Etemadi, C. Canizares, R. Iravani, M. Kazerani, A. H. Hajimiragha, O. Gomis-Bellmunt, M. Saeedifard, R. Palma-Behnke, et al., "Trends in microgrid control," IEEE Transactions on Smart Grid, vol. 5, no. 4, pp. 1905-1919, 2014.
[2] C. E. Garcia, D. M. Prett, and M. Morari, "Model predictive control: theory and practice - a survey," Automatica, vol. 25, no. 3, pp. 335-348, 1989.
[3] F. Oldewurtel, Stochastic model predictive control for energy efficient building climate control. PhD thesis, Eidgenössische Technische Hochschule ETH Zürich, 2011.
[4] P. Malisani, Pilotage dynamique de l'énergie du bâtiment par commande optimale sous contraintes utilisant la pénalisation intérieure. PhD thesis, Ecole Nationale Supérieure des Mines de Paris, 2012.
[5] M. Y. Lamoudi, Distributed Model Predictive Control for energy management in buildings. PhD thesis, Université de Grenoble, 2012.
[6] K. Paridari, A. Parisio, H. Sandberg, and K. H. Johansson, "Robust scheduling of smart appliances in active apartments with user behavior uncertainty," IEEE Transactions on Automation Science and Engineering, vol. 13, no. 1, pp. 247-259, 2016.
[7] M. De Lara, P. Carpentier, J.-P. Chancelier, and V. Leclère, Optimization Methods for the Smart Grid. Conseil Francais de l'Energie, 2014.
[8] M. V. Pereira and L. M. Pinto, "Multi-stage stochastic optimization applied to energy planning," Mathematical Programming, vol. 52, no. 1-3, pp. 359-375, 1991.
[9] P. Haessig, Dimensionnement et gestion d'un stockage d'énergie pour l'atténuation des incertitudes de production éolienne. PhD thesis, Ecole normale supérieure de Cachan, 2014.
[10] B. Heymann, J. F. Bonnans, F. Silva, and G. Jimenez, "A stochastic continuous time model for microgrid energy management," in European Control Conference (ECC), pp. 2084-2089, IEEE, 2016.
[11] D. P. Bertsekas, Dynamic programming and optimal control, vol. 1. Athena Scientific Belmont, MA, third ed., 2005.
[12] A. Shapiro, "Analysis of stochastic dual dynamic programming method," European Journal of Operational Research, vol. 209, no. 1, pp. 63-72, 2011.
[13] A. B. Philpott and Z. Guan, "On the convergence of stochastic dual dynamic programming and related methods," Operations Research Letters, vol. 36, no. 4, pp. 450-455, 2008.
[14] P. Girardeau, V. Leclère, and A. B. Philpott, "On the convergence of decomposition methods for multistage stochastic convex programs," Mathematics of Operations Research, vol. 40, no. 1, pp. 130-145, 2014.
[15] A. N. Riseth, J. N. Dewynne, and C. L. Farmer, "A comparison of control strategies applied to a pricing problem in retail," arXiv preprint arXiv:1710.0044, 2017.
[16] T. Schütz, R. Streblow, and D. Müller, "A comparison of thermal energy storage models for building energy system optimization," Energy and Buildings, vol. 93, pp. 23-31, 2015.
[17] N. Beeker, P. Malisani, and N. Petit, "Discrete-time optimal control of electric hot water tank," in DYCOPS, 2016.
[18] T. Berthou, Development of building models for load curve forecast and design of energy optimization and load shedding strategies. PhD thesis, Ecole Nationale Supérieure des Mines de Paris, Dec. 2013.
[19] O. Kallenberg, Foundations of Modern Probability. Springer-Verlag, New York, second ed., 2002.
[20] R. Bellman, Dynamic Programming. New Jersey: Princeton University Press, 1957.
[21] J. Officiel, "Arrêté du 28 décembre 2012 relatif aux caractéristiques thermiques et aux exigences de performance énergétique des bâtiments nouveaux," 2013.
[22] R. Baetens and D. Saelens, "Modelling uncertainty in district energy simulations by stochastic residential occupant behaviour," Journal of Building Performance Simulation, vol. 9, no. 4, pp. 431-447, 2016.
[23] I. Dunning, J. Huchette, and M. Lubin, "JuMP: A modeling language for mathematical optimization," SIAM Review, vol. 59, no. 2, pp. 295-320, 2017.
[24] G. O. Inc, Gurobi Optimizer Reference Manual, 2014.
[25] J. Widén and E. Wäckelgård, "A high-resolution stochastic model of domestic activity patterns and electricity demand," Applied Energy, vol. 87, no. 6, pp. 1880-1892, 2010.
[26] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129-137, 1982.