An introduction to the theory of SDDP algorithm

An introduction to the theory of the SDDP algorithm
V. Leclère (ENPC)
August 1, 2014

Introduction

Large-scale stochastic problems are hard to solve. There are two ways of attacking such problems:
- decompose the problem (spatially) and coordinate the solutions;
- construct easily solvable approximations (Linear Programming).

Behind the name SDDP (Stochastic Dual Dynamic Programming) there are three different things:
- a class of algorithms;
- a specific implementation of the algorithm;
- a software package implementing this method, developed by PSR.

The aim of this talk is to give you an idea of how this class of algorithms works and to state a convergence result.

An introductory application

A dam is seen as a stock of energy with uncertain inflows, valued at an uncertain price. The objective is to minimize the expected cumulated cost.

Contents
1. Technical Preliminaries
   - Problem Formulation and Dynamic Programming
   - Duality Approach
2. SDDP Algorithm
   - The SDDP algorithm
   - Miscellaneous
3. Convergence and Numerical Results
4. Conclusion

Problem Formulation

$$
\min_{\substack{A_0 x_0 = b_0 \\ x_0 \ge 0}} \Big\{ c_0^T x_0 + \mathbb{E}\Big[ \min_{\substack{B_1(\xi_1) x_0 + A_1(\xi_1) x_1 = b_1(\xi_1) \\ x_1 \ge 0}} \Big\{ c_1^T(\xi_1)\, x_1 + \cdots + \mathbb{E}\Big[ \min_{\substack{B_T(\xi_T) x_{T-1} + A_T(\xi_T) x_T = b_T(\xi_T) \\ x_T \ge 0}} c_T^T(\xi_T)\, x_T \Big] \Big\} \Big] \Big\}
$$

Key practical assumption: the noise $(\xi_1, \dots, \xi_T)$ is independent from one time step to the next. If this is not the case, we can often extend the state $x_t$ to obtain an independent noise.

Dynamic Programming Approach

This problem can be solved by Dynamic Programming with:

$$
Q_{T-1}(x_{T-1}, \xi_T) = \min_{\substack{B_T(\xi_T) x_{T-1} + A_T(\xi_T) x_T = b_T(\xi_T) \\ x_T \ge 0}} c_T^T(\xi_T)\, x_T,
\qquad
V_{T-1}(x_{T-1}) = \mathbb{E}\big[ Q_{T-1}(x_{T-1}, \xi_T) \big],
$$

$$
Q_{t-1}(x_{t-1}, \xi_t) = \min_{\substack{B_t(\xi_t) x_{t-1} + A_t(\xi_t) x_t = b_t(\xi_t) \\ x_t \ge 0}} c_t^T(\xi_t)\, x_t + V_t(x_t),
\qquad
V_{t-1}(x_{t-1}) = \mathbb{E}\big[ Q_{t-1}(x_{t-1}, \xi_t) \big].
$$

Note that:
- the resolution can be done backward in time, but requires an a priori discretization of the state;
- the functions $Q_t$ and $V_t$ are convex in $x$;
- the functions $Q_t$ and $V_t$ are piecewise linear, hence the above problems are linear programming problems.
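To make the cost of the a priori discretization concrete, here is a minimal sketch of backward dynamic programming on a state grid. The toy problem is an assumption made for illustration and is not taken from the talk: a stock `x` must cover a demand `xi` observed before ordering, each ordered unit costs `C`, and the names `GRID`, `DEMANDS` and `backward_dp` are hypothetical.

```python
# Toy inventory problem (assumed): order u >= 0 at unit cost C after seeing the demand xi,
# next stock x_next = x + u - xi >= 0, and V_horizon = 0.
C = 3.0
DEMANDS = [(1.0, 0.25), (2.0, 0.5), (4.0, 0.25)]   # (xi, probability)
GRID = [0.5 * i for i in range(11)]                # state grid 0, 0.5, ..., 5


def backward_dp(horizon):
    """Return [V_0, ..., V_horizon]; each V_t maps a grid point to its value."""
    values = [None] * (horizon + 1)
    values[horizon] = {x: 0.0 for x in GRID}       # nothing to pay after the last stage
    for t in range(horizon - 1, -1, -1):
        v_next = values[t + 1]
        v_t = {}
        for x in GRID:
            expected = 0.0
            for xi, p in DEMANDS:
                # Q_t(x, xi): cheapest order that lands on a feasible grid point.
                q = min(C * (x_next - x + xi) + v_next[x_next]
                        for x_next in GRID if x_next >= x - xi)
                expected += p * q
            v_t[x] = expected                      # V_t(x) = E[Q_t(x, xi)]
        values[t] = v_t
    return values


if __name__ == "__main__":
    vals = backward_dp(2)
    print(vals[0][0.0], vals[1][0.0])
```

Each stage costs on the order of `len(GRID) * len(DEMANDS) * len(GRID)` evaluations, and the grid size grows exponentially with the state dimension, which is the difficulty the cutting-plane approach is meant to avoid.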

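Subgradient of a Value Function 1/2

Consider the following problem:

$$
\begin{aligned}
Q_t(x, \xi) = \min_{x_t,\, x_{t+1}} \quad & c_{t+1}^T(\xi)\, x_{t+1} + V_{t+1}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x \qquad [\lambda_t(\xi)] \\
& B_{t+1}(\xi)\, x_t + A_{t+1}(\xi)\, x_{t+1} = b_{t+1}(\xi), \quad x_{t+1} \ge 0.
\end{aligned}
$$

Denote by $\lambda_t(\xi)$ the Lagrange multiplier associated with the constraint $x_t = x$. The marginalist interpretation of the multiplier (and the convexity of $Q_t$) implies that $\lambda_t(\xi) \in \partial Q_t(x, \xi)$ is a subgradient of $Q_t(\cdot, \xi)$ at $x$. In other words,

$$
Q_t(\cdot, \xi) \ge Q_t(x, \xi) + \big\langle \lambda_t(\xi),\, \cdot - x \big\rangle.
$$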

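Subgradient of a Value Function 2/2

We have:

$$
\forall x', \quad Q_t(x', \xi) \ge Q_t(x, \xi) + \big\langle \lambda_t(\xi),\, x' - x \big\rangle.
$$

Recall that $V_t(x) = \mathbb{E}\big[ Q_t(x, \xi_{t+1}) \big]$. By linearity of the expectation we obtain

$$
V_t(\cdot) \ge \mathbb{E}\big[ Q_t(x, \xi_{t+1}) \big] + \big\langle \mathbb{E}\big[ \lambda_t(\xi_{t+1}) \big],\, \cdot - x \big\rangle.
$$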
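The inequality can be checked numerically on a small example. The sketch below uses a hypothetical one-dimensional last-stage problem that is not part of the talk: a stock `x` must cover a demand `xi`, and any shortage is bought at unit cost `C`, so that Q(x, xi) = C * max(xi - x, 0) in closed form. The cost, the demand distribution and all function names are illustrative assumptions.

```python
# Toy last-stage problem (assumed for illustration):
#   Q(x, xi) = min_{u >= 0} C * u  s.t.  x + u >= xi   =>   Q(x, xi) = C * max(xi - x, 0)
C = 3.0
DEMANDS = [(1.0, 0.25), (2.0, 0.5), (4.0, 0.25)]   # (xi, probability)


def q_value(x, xi):
    """Optimal cost of the toy stage problem for stock x and demand xi."""
    return C * max(xi - x, 0.0)


def q_subgradient(x, xi):
    """One subgradient of Q(., xi) at x: -C while there is a shortage, else 0."""
    return -C if x < xi else 0.0


def v_value(x):
    """V(x) = E[Q(x, xi)] for the discrete demand distribution."""
    return sum(p * q_value(x, xi) for xi, p in DEMANDS)


def v_subgradient(x):
    """E[lambda(xi)]: a subgradient of V at x, by linearity of the expectation."""
    return sum(p * q_subgradient(x, xi) for xi, p in DEMANDS)


def cut_gap(x, y):
    """V(y) minus the affine minorant built at x; non-negative when the cut is valid."""
    return v_value(y) - (v_value(x) + v_subgradient(x) * (y - x))


if __name__ == "__main__":
    for y in [0.0, 1.5, 3.0, 5.0]:
        print(y, cut_gap(2.5, y))
```

In a general linear program the multiplier would be read from the solver's dual solution rather than from a closed formula; the closed form is used here only to keep the sketch dependency-free.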

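Constructing a Cut from a Lower Approximation 1/2

Assume that we have $\check V_{t+1}^{(k)} \le V_{t+1}$. Choose a point $x_t^{(k)}$.

Recall that

$$
Q_t(x, \xi) = \min_{\substack{B_{t+1}(\xi) x + A_{t+1}(\xi) x_{t+1} = b_{t+1}(\xi) \\ x_{t+1} \ge 0}} c_{t+1}^T(\xi)\, x_{t+1} + V_{t+1}(x_{t+1}).
$$

For any possible $\xi$, consider

$$
\begin{aligned}
\beta_t^{(k)}(\xi) = \min_{x_t,\, x_{t+1}} \quad & c_{t+1}^T(\xi)\, x_{t+1} + \check V_{t+1}^{(k)}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x_t^{(k)} \qquad [\lambda_t^{(k)}(\xi)] \\
& B_{t+1}(\xi)\, x_t + A_{t+1}(\xi)\, x_{t+1} = b_{t+1}(\xi), \quad x_{t+1} \ge 0.
\end{aligned}
$$

Since $\check V_{t+1}^{(k)} \le V_{t+1}$, the value function of this approximate problem lies below $Q_t(\cdot, \xi)$, and $\lambda_t^{(k)}(\xi)$ is one of its subgradients at $x_t^{(k)}$. Then we have

$$
\beta_t^{(k)}(\xi) + \big\langle \lambda_t^{(k)}(\xi),\, \cdot - x_t^{(k)} \big\rangle \le Q_t(\cdot, \xi).
$$

Hence

$$
\mathbb{E}\big[ \beta_t^{(k)}(\xi_{t+1}) \big] + \big\langle \mathbb{E}\big[ \lambda_t^{(k)}(\xi_{t+1}) \big],\, \cdot - x_t^{(k)} \big\rangle \le V_t(\cdot).
$$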

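Constructing a Cut from a Lower Approximation 2/2

Recall that

$$
V_t(x) = \mathbb{E}\big[ Q_t(x, \xi_{t+1}) \big].
$$

Hence

$$
\beta_t^{(k)}(\xi) + \big\langle \lambda_t^{(k)}(\xi),\, \cdot - x_t^{(k)} \big\rangle \le Q_t(\cdot, \xi)
$$

leads to

$$
\mathbb{E}\big[ \beta_t^{(k)}(\xi_{t+1}) \big] + \big\langle \mathbb{E}\big[ \lambda_t^{(k)}(\xi_{t+1}) \big],\, \cdot - x_t^{(k)} \big\rangle \le V_t(\cdot).
$$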

Probabilistic Assumption

The noise process $(\xi_1, \dots, \xi_T)$ is assumed to be independent in time. If the actual noise process is autoregressive, i.e.

$$
W_t = a_0 + a_1 W_{t-1} + \cdots + a_\tau W_{t-\tau} + \xi_t,
$$

we can extend the state of the problem by considering $(x_t, W_{t-1}, \dots, W_{t-\tau})$ as the new state, and $\xi_t$ as the new (independent) noise.

The noise is assumed to be discrete.

At time $t$ the noise affecting the system is assumed to be known before the decision $x_t$ is made (Hazard-Decision information structure).
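The state extension for autoregressive noise can be sketched as follows. The helper names `ar_step` and `simulate` and the numerical coefficients are hypothetical; the point is only that the lagged values travel with the state, so that the innovation `xi` is the sole random input at each step.

```python
def ar_step(lags, xi, coeffs):
    """Advance the lag part of the extended state by one time step.

    lags   : (W_{t-1}, ..., W_{t-tau})
    xi     : independent innovation xi_t
    coeffs : (a_0, a_1, ..., a_tau)
    Returns (W_t, new_lags) with new_lags = (W_t, W_{t-1}, ..., W_{t-tau+1}).
    """
    a0, a = coeffs[0], coeffs[1:]
    if len(a) != len(lags):
        raise ValueError("need exactly one coefficient per lag")
    w_t = a0 + sum(ai * wi for ai, wi in zip(a, lags)) + xi
    return w_t, (w_t,) + tuple(lags[:-1])


def simulate(lags, innovations, coeffs):
    """Roll the recursion forward; only the innovations are random inputs."""
    path = []
    for xi in innovations:
        w_t, lags = ar_step(lags, xi, coeffs)
        path.append(w_t)
    return path


if __name__ == "__main__":
    # AR(2) with a_0 = 1, a_1 = 0.5, a_2 = 0.25 (assumed values)
    print(simulate((2.0, 4.0), [0.0, 1.0], (1.0, 0.5, 0.25)))  # [3.0, 4.0]
```

The price of this reformulation is a larger state: an AR process of order tau adds tau components per noise coordinate.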

Global Scheme

Assume that we have lower approximations $\check V_t^{(k)}$ of the value functions $V_t$.

Forward Phase: replacing the actual value functions $V_t$ by their approximations $\check V_t^{(k)}$, and selecting a scenario of noise $(\xi_0^{(k)}, \dots, \xi_T^{(k)})$, we derive a state trajectory $x_0^{(k)}, \dots, x_T^{(k)}$.

Backward Phase: recursively backward in time, compute a new cut (i.e. an affine function lower than $V_t$) of $V_t$ at $x_t^{(k)}$. The new approximation is given by

$$
\check V_t^{(k+1)} = \max\Big\{ \check V_t^{(k)},\; \beta_t^{(k)} + \big\langle \lambda_t^{(k)},\, \cdot - x_t^{(k)} \big\rangle \Big\}.
$$
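The update by a maximum of affine functions can be stored very simply as a list of cuts. The class below is a minimal sketch for a one-dimensional state; the class name, the constant initial cut and the test function are assumptions made for illustration.

```python
class CutApproximation:
    """Lower approximation of a convex value function as a maximum of affine cuts.

    Each cut is stored as (beta, lam, x_k) and represents beta + lam * (x - x_k).
    The state is one-dimensional for readability (an assumption of this sketch).
    """

    def __init__(self, lower_bound=0.0):
        # A constant cut keeps the maximum well defined before any iteration.
        # lower_bound = 0 is valid only if the value function is non-negative.
        self.cuts = [(lower_bound, 0.0, 0.0)]

    def add_cut(self, beta, lam, x_k):
        """Record the cut beta + lam * (x - x_k)."""
        self.cuts.append((beta, lam, x_k))

    def __call__(self, x):
        """Evaluate the current approximation at x."""
        return max(beta + lam * (x - x_k) for beta, lam, x_k in self.cuts)


if __name__ == "__main__":
    f = lambda x: (x - 2.0) ** 2          # stand-in convex function with f >= 0
    approx = CutApproximation()
    for x_k in (0.0, 3.0):
        approx.add_cut(f(x_k), 2.0 * (x_k - 2.0), x_k)   # tangent at x_k
    print(approx(0.0), approx(1.0), approx(3.0))
```

Adding a cut can only raise the approximation, and a tangent cut makes it exact at the point where it was built, which is why the approximation becomes accurate along the trajectories actually visited.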

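Forward Phase

The forward phase of iteration $k$ finds a good state trajectory, given a sequence of lower approximations $\check V_t^{(k)}$:

1. Select a sequence of noises $(\xi_0^{(k)}, \dots, \xi_T^{(k)})$.
2. Fix $x_0^{(k)} = x_0$.
3. Solve, recursively forward in time,

$$
\begin{aligned}
\min_{x_t,\, x_{t+1}} \quad & c_{t+1}^T(\xi_{t+1}^{(k)})\, x_{t+1} + \check V_{t+1}^{(k)}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x_t^{(k)} \\
& B_{t+1}(\xi_{t+1}^{(k)})\, x_t + A_{t+1}(\xi_{t+1}^{(k)})\, x_{t+1} = b_{t+1}(\xi_{t+1}^{(k)}), \quad x_{t+1} \ge 0,
\end{aligned}
$$

where the minimizer gives $x_{t+1}^{(k)}$, to obtain a sequence $(x_0^{(k)}, \dots, x_T^{(k)})$.

This is the optimal trajectory given by the approximation of the Bellman function along the given scenario.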

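Backward Phase

For any time step $t$:

1. For any $\xi$ in the support of $\xi_{t+1}$, solve

$$
\begin{aligned}
\beta_t^{(k)}(\xi) = \min_{x_t,\, x_{t+1}} \quad & c_{t+1}^T(\xi)\, x_{t+1} + \check V_{t+1}^{(k)}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x_t^{(k)} \qquad [\lambda_t^{(k)}(\xi)] \\
& B_{t+1}(\xi)\, x_t + A_{t+1}(\xi)\, x_{t+1} = b_{t+1}(\xi), \quad x_{t+1} \ge 0.
\end{aligned}
$$

2. Compute the exact expectations

$$
\beta_t^{(k)} = \mathbb{E}\big[ \beta_t^{(k)}(\xi_{t+1}) \big], \qquad \lambda_t^{(k)} = \mathbb{E}\big[ \lambda_t^{(k)}(\xi_{t+1}) \big].
$$

3. Update the approximation at time $t$:

$$
\check V_t^{(k+1)} := \max\Big\{ \check V_t^{(k)},\; \beta_t^{(k)} + \big\langle \lambda_t^{(k)},\, \cdot - x_t^{(k)} \big\rangle \Big\}.
$$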
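The interplay of the two phases can be illustrated on the smallest possible case. The sketch below is not the multistage algorithm of the talk: it is a two-stage cutting-plane loop (essentially the Benders, or L-shaped, method), which is the mechanism SDDP repeats at every time step. The toy problem, its data and all function names are assumptions: a quantity `x` is bought now at unit cost `C0`, and after the demand `xi` is revealed any shortage is bought at unit cost `C1`. The recourse problem is solved in closed form instead of calling an LP solver.

```python
C0, C1 = 1.0, 3.0                                   # first-stage and recourse unit costs
DEMANDS = [(1.0, 0.25), (2.0, 0.5), (4.0, 0.25)]    # (xi, probability)
X_MAX = 5.0                                         # compact state set [0, X_MAX]


def scenario_cut(x_k, xi):
    """Closed-form solution of the toy recourse problem Q(x, xi) = C1 * max(xi - x, 0).

    Returns (beta(xi), lambda(xi)): the optimal value and a subgradient at x_k.
    """
    return C1 * max(xi - x_k, 0.0), (-C1 if x_k < xi else 0.0)


def expected_cut(x_k):
    """Backward-phase steps 1 and 2: exact expectations over the finite support."""
    beta = sum(p * scenario_cut(x_k, xi)[0] for xi, p in DEMANDS)
    lam = sum(p * scenario_cut(x_k, xi)[1] for xi, p in DEMANDS)
    return beta, lam


def model_value(cuts, x):
    """First-stage cost plus the current lower approximation max_j (a_j + b_j * x)."""
    return C0 * x + max(a + b * x for a, b in cuts)


def minimise_model(cuts):
    """Minimise the convex piecewise-linear model over [0, X_MAX].

    The minimum is attained at an endpoint or where two cuts intersect.
    """
    candidates = [0.0, X_MAX]
    for i, (a1, b1) in enumerate(cuts):
        for a2, b2 in cuts[i + 1:]:
            if b1 != b2:
                x = (a2 - a1) / (b1 - b2)
                if 0.0 <= x <= X_MAX:
                    candidates.append(x)
    return min(candidates, key=lambda x: model_value(cuts, x))


def cutting_plane(tol=1e-9, max_iter=50):
    """Return (x, lower bound, best upper bound, iterations used)."""
    cuts = [(0.0, 0.0)]                     # V >= 0 because costs are non-negative
    best_upper = float("inf")
    for k in range(1, max_iter + 1):
        x_k = minimise_model(cuts)                      # forward phase
        lower = model_value(cuts, x_k)                  # exact lower bound
        beta, lam = expected_cut(x_k)                   # backward phase
        best_upper = min(best_upper, C0 * x_k + beta)   # exact here, since V(x_k) = beta
        if best_upper - lower <= tol:
            return x_k, lower, best_upper, k
        cuts.append((beta - lam * x_k, lam))            # step 3: add the new cut
    return x_k, lower, best_upper, max_iter


if __name__ == "__main__":
    print(cutting_plane())
```

On this toy instance the upper bound is exact because there is a single random stage with finite support; in the multistage setting it has to be estimated by simulation.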

Bounds

Lower Bound: at any stage of the algorithm we have an exact lower bound of the problem, given by $\check V_0^{(k)}(x_0)$.

Upper Bound: from the collection of value function approximations we derive a policy $(X_0^{(k)}, \dots, X_T^{(k)})$ that can be simulated. Hence, an upper bound of the value of the problem is given by

$$
\mathbb{E}\Big[ \sum_{t=1}^{T} c_t^T X_t^{(k)} \Big],
$$

which can be estimated by the Monte Carlo method.

Stopping Rule

A number of stopping rules for the SDDP method have been proposed. In practice it is often simply a given number of iterations (or a given computation time). An interesting stopping rule is the following:

- Fix an objective gap $\varepsilon$.
- Compute the lower bound $\underline{v}^{(k)} = \check V_0^{(k)}(x_0)$.
- Estimate by Monte Carlo the upper bound $\bar v^{(k)}$. It is easy to obtain an (asymptotic) 90% confidence interval for the upper bound, $[\bar v^{(k)} - e^{(k)},\ \bar v^{(k)} + e^{(k)}]$.
- Stop if the difference $\bar v^{(k)} + e^{(k)} - \underline{v}^{(k)}$ between the upper estimate of the upper bound and the lower bound is lower than the objective gap. In that case we can certify, with 95% confidence (the two-sided 90% interval leaves 5% above its upper end), that the solution of the algorithm is less than $\varepsilon$ from the optimum.
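A minimal sketch of this stopping test, assuming the simulated policy costs are already available as a list of numbers. The function names and the sample values are hypothetical; the constant 1.645 is the standard normal quantile of order 0.95, which gives a two-sided 90% interval under the central limit approximation.

```python
import math
import statistics

Z_95 = 1.645   # standard normal quantile of order 0.95


def upper_bound_estimate(costs):
    """Return (mean, half_width) of an asymptotic two-sided 90% confidence interval."""
    mean = statistics.fmean(costs)
    half_width = Z_95 * statistics.stdev(costs) / math.sqrt(len(costs))
    return mean, half_width


def should_stop(lower_bound, costs, eps):
    """Stop when (upper end of the interval) - (exact lower bound) is at most eps."""
    mean, half_width = upper_bound_estimate(costs)
    return (mean + half_width) - lower_bound <= eps


if __name__ == "__main__":
    simulated = [10.0, 12.0, 11.0, 9.0]    # made-up simulated costs of the current policy
    print(upper_bound_estimate(simulated))
    print(should_stop(10.0, simulated, eps=2.0))
```

With very few simulations the normal approximation is crude, so in practice the sample size is chosen large enough for the half-width to be small compared with the target gap.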

A Number of Approximations

1. We fit an autoregressive process representing the noise on the available data: first error, not controllable.
2. We discretize the stochastic process to obtain a scenario tree representation: second error, controllable through Sample Average Approximation (SAA) theory.
3. We compute approximations $\check V_t^{(k)}$ of the real value functions $V_t$: third error, controllable by the bounds from the SDDP algorithm.
4. The upper bound is obtained through Monte Carlo: fourth error, controllable through the central limit theorem.

Convergence result - linear case

Assume that:
- the admissible sets of states are compact;
- we are in the relatively complete recourse case;
- the random selection in the forward phase satisfies some independence condition;
- every value function is updated an infinite number of times.

Then the upper and lower bounds converge almost surely toward the value of the problem.

Convergence result - convex case

Assume that:
- the admissible sets of states are compact and convex;
- the cost functions are convex;
- we are in the extended relatively complete recourse case;
- the random selection in the forward phase satisfies some independence condition;
- every value function is updated an infinite number of times.

Then the upper and lower bounds converge almost surely toward the value of the problem.

Numerical Results

- Widely used in the energy community (management of dams, sizing of networks, etc.).
- Efficient up to 10-15 dams.
- Numerical experiments available on A. Shapiro's website:
  - 120 time steps, 4 dams for a state of dimension 8;
  - each $\xi_t$ is discretized in 100 points, hence there are $100^{119}$ scenarios;
  - 3000 iterations run in 15 minutes;
  - 20% gap, 4% in dimension 4.
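The scenario count quoted above is easy to reproduce, and it is instructive to compare it with the work done in one backward phase. The helper names below are hypothetical, and the second figure assumes, as in the backward phase described earlier, one stage problem per time step and per support point.

```python
def scenario_count(points_per_stage, random_stages):
    """Number of noise scenarios when every random stage has the same finite support."""
    return points_per_stage ** random_stages


def backward_solves_per_iteration(points_per_stage, random_stages):
    """Stage problems solved in one backward phase: one per stage and per support point."""
    return points_per_stage * random_stages


if __name__ == "__main__":
    n = scenario_count(100, 119)
    print(len(str(n)))                               # 239 decimal digits
    print(backward_solves_per_iteration(100, 119))   # 11900
```

Enumerating the scenario tree is therefore out of reach, whereas the per-iteration effort of the cut-based scheme grows only linearly with the number of stages and support points.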

Conclusion

SDDP is an algorithm, more precisely a class of algorithms, that:
- exploits convexity of the value functions (from convexity of costs...);
- does not require discretization of the state;
- constructs outer approximations of $V_t$, those approximations being precise only "in the right places";
- gives bounds: a real lower bound $\check V_0^{(k)}(x_0)$ and an estimated (by Monte Carlo) upper bound;
- constructs linear-convex approximations, thus enabling the use of linear solvers like CPLEX;
- has proofs of asymptotic convergence.

Available Extensions

- Cut dropping methods are studied.
- Hot start, by bypassing the forward phase and selecting artificial trajectories, is numerically efficient.
- Work is being done to apply SDDP in some non-convex cases.
- Risk aversion (through CVaR, either as a constraint or as an element of the objective function) can be taken into account (by extending the state).
- Non-linear convex costs can be used (convergence result).
