An introduction to the theory of SDDP algorithm

V. Leclère (ENPC), August 1, 2014

Introduction

Large-scale stochastic problems are hard to solve. There are two ways of attacking such problems:
- decompose (spatially) the problem and coordinate the solutions;
- construct easily solvable approximations (Linear Programming).

Behind the name SDDP there are three different things:
- a class of algorithms,
- a specific implementation of the algorithm,
- a software implementing this method, developed by PSR.

The aim of this talk is to give you an idea of how this class of algorithms works and to give a convergence result.

An introductory application

A dam is seen as a stock of energy with uncertain inflows, valued at an uncertain price. The objective is to minimize the expected cumulative cost.

Contents

1. Technical Preliminaries
   - Problem Formulation and Dynamic Programming
   - Duality Approach
2. SDDP Algorithm
   - The SDDP algorithm
   - Miscellaneous
3. Convergence and Numerical Results
4. Conclusion

Problem Formulation

\[
\min_{\substack{A_0 x_0 = b_0 \\ x_0 \ge 0}} \Big\{ c_0^T x_0 + \mathbb{E}\Big[ \min_{\substack{B_1(\xi_1) x_0 + A_1(\xi_1) x_1 = b_1(\xi_1) \\ x_1 \ge 0}} \Big\{ c_1^T(\xi_1) x_1 + \mathbb{E}\Big[ \cdots + \mathbb{E}\Big[ \min_{\substack{B_T(\xi_T) x_{T-1} + A_T(\xi_T) x_T = b_T(\xi_T) \\ x_T \ge 0}} c_T^T(\xi_T) x_T \Big] \Big] \Big\} \Big] \Big\}
\]

Key practical assumption: the noise $(\xi_1, \dots, \xi_T)$ is independent from one time step to the next. If this is not the case, we can often extend the state $x_t$ to obtain an independent noise.

Dynamic Programming Approach

This problem can be solved by Dynamic Programming with:

\[
Q_{T-1}(x_{T-1}, \xi_T) = \min_{\substack{B_T(\xi_T) x_{T-1} + A_T(\xi_T) x_T = b_T(\xi_T) \\ x_T \ge 0}} c_T^T(\xi_T) x_T,
\qquad
V_{T-1}(x_{T-1}) = \mathbb{E}\big[ Q_{T-1}(x_{T-1}, \xi_T) \big],
\]

\[
Q_{t-1}(x_{t-1}, \xi_t) = \min_{\substack{B_t(\xi_t) x_{t-1} + A_t(\xi_t) x_t = b_t(\xi_t) \\ x_t \ge 0}} c_t^T(\xi_t) x_t + V_t(x_t),
\qquad
V_{t-1}(x_{t-1}) = \mathbb{E}\big[ Q_{t-1}(x_{t-1}, \xi_t) \big].
\]

Note that:
- the resolution can be done backward in time but requires an a priori discretization;
- the functions $Q_t$ and $V_t$ are convex in $x$;
- the functions $Q_t$ and $V_t$ are piecewise linear, hence the above problems are linear programming problems.
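To make the backward recursion and its a priori discretization concrete, here is a minimal Python sketch on a toy version of the dam example. Everything in it is an assumption made for illustration and not taken from the talk: the integer stock grid, the function name `backward_dp`, the constant price, the zero terminal value, and the inflow distribution.

```python
def backward_dp(horizon, capacity, inflows, price):
    """Backward dynamic programming on an integer stock grid 0..capacity.

    inflows: list of (integer inflow, probability) pairs, same at every stage.
    Returns [V_0(0), ..., V_0(capacity)], starting from V_T = 0.
    """
    value = [0.0] * (capacity + 1)          # terminal value V_T = 0
    for _ in range(horizon):                # one pass per time step, backward
        previous = []
        for stock in range(capacity + 1):
            expected = 0.0
            for inflow, prob in inflows:    # hazard-decision: inflow seen first
                available = stock + inflow
                # Q(stock, inflow): choose the turbined volume u; water above
                # capacity is spilled; the revenue enters as a negative cost.
                best = min(
                    -price * u + value[min(available - u, capacity)]
                    for u in range(available + 1)
                )
                expected += prob * best
            previous.append(expected)       # V_{t-1}(stock) = E[Q]
        value = previous
    return value


values = backward_dp(horizon=3, capacity=5, inflows=[(0, 0.5), (2, 0.5)], price=1.0)
print(values)
```

The cost of this approach is visible in the loops: every grid point of the state is visited at every time step, which is what becomes intractable when the state has several dimensions.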

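Subgradient of a Value Function 1/2

Consider the following problem:

\[
\begin{aligned}
Q_t(x, \xi) = \min_{x_t, x_{t+1}} \quad & c_{t+1}^T(\xi) x_{t+1} + V_{t+1}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x \qquad [\lambda_t(\xi)] \\
& B_{t+1}(\xi) x_t + A_{t+1}(\xi) x_{t+1} = b_{t+1}(\xi), \quad x_{t+1} \ge 0.
\end{aligned}
\]

Denote by $\lambda_t(\xi)$ the Lagrange multiplier associated with the constraint $x_t = x$. The marginalist interpretation of the multiplier (and the convexity of $Q_t$) implies that $\lambda_t(\xi) \in \partial Q_t(x, \xi)$ is a subgradient of $Q_t(\cdot, \xi)$ at $x$. In other words,

\[
Q_t(\cdot, \xi) \ge Q_t(x, \xi) + \langle \lambda_t(\xi), \cdot - x \rangle.
\]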

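Subgradient of a Value Function 2/2

We have:

\[
\forall x', \quad Q_t(x', \xi) \ge Q_t(x, \xi) + \langle \lambda_t(\xi), x' - x \rangle.
\]

Recall that $V_t(x) = \mathbb{E}\big[ Q_t(x, \xi_{t+1}) \big]$. By linearity of the expectation we obtain

\[
V_t(\cdot) \ge \mathbb{E}\big[ Q_t(x, \xi_{t+1}) \big] + \big\langle \mathbb{E}\big[ \lambda_t(\xi_{t+1}) \big], \cdot - x \big\rangle.
\]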
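The inequality can be checked numerically on a one-dimensional toy case. The sketch below is only an illustration under stated assumptions: the cost-to-go `q` (a shortage cost), the two-point noise distribution, and all function names are hypothetical and do not come from the talk.

```python
def q(x, xi):
    """Toy cost-to-go: shortage cost max(xi - x, 0), convex in x."""
    return max(xi - x, 0.0)


def q_subgradient(x, xi):
    """A subgradient of q(., xi) at x: -1 if there is a shortage, else 0."""
    return -1.0 if xi > x else 0.0


def expected_cut(x, scenarios):
    """Return (V(x), E[lambda]) for scenarios given as (xi, probability)."""
    value = sum(p * q(x, xi) for xi, p in scenarios)
    slope = sum(p * q_subgradient(x, xi) for xi, p in scenarios)
    return value, slope


scenarios = [(1.0, 0.5), (3.0, 0.5)]
x_ref = 2.0
value, slope = expected_cut(x_ref, scenarios)

# The affine function value + slope * (x - x_ref) must stay below V everywhere.
for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
    v_true = sum(p * q(x, xi) for xi, p in scenarios)
    assert value + slope * (x - x_ref) <= v_true + 1e-12
```

Averaging the per-scenario values and subgradients is all that is needed to pass from $Q_t$ to $V_t$, because the inequality is affine in both quantities.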

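Constructing a Cut from a Lower Approximation 1/2

Assume that we have $\check V_{t+1}^{(k)} \le V_{t+1}$. Choose a point $x_t^{(k)}$.

Recall that

\[
Q_t(x, \xi) = \min_{\substack{B_{t+1}(\xi) x + A_{t+1}(\xi) x_{t+1} = b_{t+1}(\xi) \\ x_{t+1} \ge 0}} c_{t+1}^T(\xi) x_{t+1} + V_{t+1}(x_{t+1}).
\]

For any possible $\xi$, consider

\[
\begin{aligned}
\beta_t^{(k)}(\xi) = \min_{x_t, x_{t+1}} \quad & c_{t+1}^T(\xi) x_{t+1} + \check V_{t+1}^{(k)}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x_t^{(k)} \qquad [\lambda_t^{(k)}(\xi)] \\
& B_{t+1}(\xi) x_t + A_{t+1}(\xi) x_{t+1} = b_{t+1}(\xi), \quad x_{t+1} \ge 0.
\end{aligned}
\]

This is the problem defining $Q_t(x_t^{(k)}, \xi)$ with $V_{t+1}$ replaced by its lower approximation, so its value function lies below $Q_t(\cdot, \xi)$ and $\lambda_t^{(k)}(\xi)$ is a subgradient of that value function at $x_t^{(k)}$. Hence we have

\[
\beta_t^{(k)}(\xi) + \big\langle \lambda_t^{(k)}(\xi), \cdot - x_t^{(k)} \big\rangle \le Q_t(\cdot, \xi),
\]

and then, taking the expectation,

\[
\mathbb{E}\big[ \beta_t^{(k)}(\xi_{t+1}) \big] + \big\langle \mathbb{E}\big[ \lambda_t^{(k)}(\xi_{t+1}) \big], \cdot - x_t^{(k)} \big\rangle \le V_t(\cdot).
\]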

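Constructing a Cut from a Lower Approximation 2/2

Recall that

\[
V_t(x) = \mathbb{E}\big[ Q_t(x, \xi_{t+1}) \big].
\]

Hence

\[
\beta_t^{(k)}(\xi) + \big\langle \lambda_t^{(k)}(\xi), \cdot - x_t^{(k)} \big\rangle \le Q_t(\cdot, \xi)
\]

leads to

\[
\mathbb{E}\big[ \beta_t^{(k)}(\xi_{t+1}) \big] + \big\langle \mathbb{E}\big[ \lambda_t^{(k)}(\xi_{t+1}) \big], \cdot - x_t^{(k)} \big\rangle \le V_t(\cdot).
\]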

Probabilistic Assumption

The noise process $(\xi_1, \dots, \xi_T)$ is assumed to be independent in time. If the actual noise process is autoregressive, i.e.

\[
W_t = a_0 + a_1 W_{t-1} + \cdots + a_\tau W_{t-\tau} + \xi_t,
\]

we can extend the state of the problem by considering $(x_t, W_{t-1}, \dots, W_{t-\tau})$ as the new state, and $\xi_t$ as the new (independent) noise.

The noise is assumed to be discrete.

At time $t$ the noise affecting the system is assumed to be known before the decision $x_t$ is made (Hazard-Decision information structure).
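The state extension for an autoregressive noise can be sketched in a few lines of Python. The function name `ar_step`, the order $\tau = 2$, and the numerical coefficients are hypothetical choices for illustration; only the recursion itself comes from the slide.

```python
def ar_step(history, xi, a0, coeffs):
    """One step of W_t = a0 + a_1 W_{t-1} + ... + a_tau W_{t-tau} + xi_t.

    history = (W_{t-1}, ..., W_{t-tau}) is the extra part of the state.
    Returns W_t and the updated history (W_t, W_{t-1}, ..., W_{t-tau+1}).
    """
    w_t = a0 + sum(a * w for a, w in zip(coeffs, history)) + xi
    return w_t, (w_t,) + tuple(history[:-1])


history = (2.0, 4.0)                 # (W_{t-1}, W_{t-2}), hypothetical values
for xi in [0.5, -1.0]:               # independent noise draws
    w, history = ar_step(history, xi, a0=1.0, coeffs=(0.5, 0.25))
    print(w, history)
```

Once the history is carried inside the state, the only randomness entering each step is the independent draw `xi`, which is what the algorithm requires. The price is a state of larger dimension.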

Global Scheme

Assume that we have lower approximations $\check V_t^{(k)}$ of the value functions $V_t$.

Forward Phase
Replacing the actual value functions $V_t$ by their approximations $\check V_t^{(k)}$, and selecting a scenario of noise $(\xi_0^{(k)}, \dots, \xi_T^{(k)})$, we derive a state trajectory $x_0^{(k)}, \dots, x_T^{(k)}$.

Backward Phase
Recursively backward in time, compute a new cut (i.e. an affine function lower than $V_t$) of $V_t$ at $x_t^{(k)}$. The new approximation is given by

\[
\check V_t^{(k+1)} = \max\Big\{ \check V_t^{(k)}, \ \beta_t^{(k)} + \big\langle \lambda_t^{(k)}, \cdot - x_t^{(k)} \big\rangle \Big\}.
\]
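One possible way to store such an approximation is as a list of cuts that is evaluated by taking a maximum. The class below is a minimal sketch and not the data structure of any particular SDDP implementation; the class name `CutApproximation`, the `lower_bound` argument, and the numerical cuts are assumptions made for illustration.

```python
class CutApproximation:
    """Lower approximation of a convex value function as a max of affine cuts."""

    def __init__(self, lower_bound):
        # A known finite lower bound keeps the approximation well defined
        # before any cut has been added (an assumption of this sketch).
        self.lower_bound = lower_bound
        self.cuts = []                      # list of (beta, lam, x_ref)

    def add_cut(self, beta, lam, x_ref):
        """Store the cut x -> beta + <lam, x - x_ref>."""
        self.cuts.append((beta, list(lam), list(x_ref)))

    def __call__(self, x):
        """Evaluate max(lower_bound, all cuts) at the point x."""
        best = self.lower_bound
        for beta, lam, x_ref in self.cuts:
            cut = beta + sum(l * (xi - ri) for l, xi, ri in zip(lam, x, x_ref))
            best = max(best, cut)
        return best


v_check = CutApproximation(lower_bound=0.0)
v_check.add_cut(beta=0.5, lam=[-0.5], x_ref=[2.0])
v_check.add_cut(beta=2.0, lam=[-1.0], x_ref=[0.0])
print(v_check([1.0]))
```

Adding a cut never lowers the approximation at any point, which mirrors the max in the update formula.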

Forward Phase

The forward phase of iteration $k$ finds a good state trajectory, given a sequence of lower approximations $\check V_t^{(k)}$:

1. Select a sequence of noise $(\xi_0^{(k)}, \dots, \xi_T^{(k)})$.
2. Fix $x_0^{(k)} = x_0$.
3. Solve, recursively forward in time,

\[
\begin{aligned}
\min_{x_t, x_{t+1}} \quad & c_{t+1}^T(\xi_{t+1}^{(k)}) x_{t+1} + \check V_{t+1}^{(k)}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x_t^{(k)} \\
& B_{t+1}(\xi_{t+1}^{(k)}) x_t + A_{t+1}(\xi_{t+1}^{(k)}) x_{t+1} = b_{t+1}(\xi_{t+1}^{(k)}), \quad x_{t+1} \ge 0,
\end{aligned}
\]

to obtain a sequence $(x_0^{(k)}, \dots, x_T^{(k)})$.

This is the optimal trajectory given by the approximation of the Bellman function along the given scenario.
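The recursion of the forward phase can be sketched as a simple loop. In the Python sketch below, `solve_stage` is a hypothetical callback standing in for the one-stage linear program with the current cuts, and `toy_stage` is a made-up rule (fixed release, spill above capacity) used only so that the example runs without an LP solver.

```python
def forward_pass(x0, scenario, solve_stage):
    """Simulate one state trajectory along a sampled noise scenario.

    solve_stage(t, x_t, xi) is a hypothetical callback that solves the
    one-stage problem with the current cuts and returns x_{t+1}.
    """
    trajectory = [x0]
    for t, xi in enumerate(scenario):
        trajectory.append(solve_stage(t, trajectory[-1], xi))
    return trajectory


def toy_stage(t, stock, inflow, capacity=5.0, release=1.0):
    """Stand-in for the stage LP: release a fixed volume, spill above capacity."""
    return min(max(stock + inflow - release, 0.0), capacity)


trajectory = forward_pass(x0=2.0, scenario=[0.0, 2.0, 3.0], solve_stage=toy_stage)
print(trajectory)
```

In an actual implementation the callback would solve the linear program of step 3 and return its minimizer, so each state depends on the cuts accumulated so far.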

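Backward Phase

For any time step $t$:

1. For any $\xi$ in the support of $\xi_{t+1}$, solve

\[
\begin{aligned}
\beta_t^{(k)}(\xi) = \min_{x_t, x_{t+1}} \quad & c_{t+1}^T(\xi) x_{t+1} + \check V_{t+1}^{(k)}(x_{t+1}) \\
\text{s.t.} \quad & x_t = x_t^{(k)} \qquad [\lambda_t^{(k)}(\xi)] \\
& B_{t+1}(\xi) x_t + A_{t+1}(\xi) x_{t+1} = b_{t+1}(\xi), \quad x_{t+1} \ge 0.
\end{aligned}
\]

2. Compute the exact expectations

\[
\beta_t^{(k)} = \mathbb{E}\big[ \beta_t^{(k)}(\xi_{t+1}) \big], \qquad \lambda_t^{(k)} = \mathbb{E}\big[ \lambda_t^{(k)}(\xi_{t+1}) \big].
\]

3. Update the approximation at time $t$:

\[
\check V_t^{(k+1)} := \max\Big\{ \check V_t^{(k)}, \ \beta_t^{(k)} + \big\langle \lambda_t^{(k)}, \cdot - x_t^{(k)} \big\rangle \Big\}.
\]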
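Steps 2 and 3 involve only finite sums, because the noise is discrete. The sketch below assumes that the per-scenario values and multipliers of step 1 have already been computed by an LP solver; the function names `average_cuts` and `update`, the list-of-tuples storage, and the numbers are hypothetical.

```python
def average_cuts(scenario_cuts):
    """Average per-scenario cut coefficients into one cut (step 2).

    scenario_cuts: list of (probability, beta, lam) with lam a list of floats.
    Returns (beta, lam) = (E[beta(xi)], E[lam(xi)]).
    """
    total = sum(p for p, _, _ in scenario_cuts)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    dim = len(scenario_cuts[0][2])
    beta = sum(p * b for p, b, _ in scenario_cuts)
    lam = [sum(p * l[i] for p, _, l in scenario_cuts) for i in range(dim)]
    return beta, lam


def update(cuts, beta, lam, x_ref):
    """Step 3: the max with the new cut is stored as a longer list of cuts."""
    return cuts + [(beta, lam, x_ref)]


beta, lam = average_cuts([(0.25, 4.0, [-1.0, 0.0]), (0.75, 8.0, [-2.0, -1.0])])
cuts = update([], beta, lam, x_ref=[1.0, 2.0])
print(cuts)
```

The expectation is exact here, in contrast with the forward phase, where a single scenario is sampled.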

Bounds

Lower Bound
At any stage of the algorithm we have an exact lower bound of the problem given by $\check V_0^{(k)}(x_0)$.

Upper Bound
From the collection of value function approximations we derive a policy $(X_0^{(k)}, \dots, X_T^{(k)})$ that can be simulated. Hence, an upper bound of the value of the problem is given by

\[
\mathbb{E}\Big[ \sum_{t=1}^{T} c_t^T X_t^{(k)} \Big],
\]

which can be estimated by the Monte-Carlo method.

Stopping Rule

A number of stopping rules for the SDDP method have been proposed. In practice the rule is often simply a given number of iterations (or a given computation time). An interesting stopping rule is the following.

- Fix an objective gap $\varepsilon$.
- Compute the lower bound $\underline{v}^{(k)} = \check V_0^{(k)}(x_0)$.
- Estimate by Monte-Carlo the upper bound $\overline{v}^{(k)}$. It is easy to obtain an (asymptotic) 90% confidence interval for the upper bound, $[\overline{v}^{(k)} - e^{(k)}, \overline{v}^{(k)} + e^{(k)}]$.
- Stop if the difference $\overline{v}^{(k)} + e^{(k)} - \underline{v}^{(k)}$ between the upper estimate of the upper bound and the lower bound is lower than the objective gap. In that case we can certify, with 95% confidence, that the solution of the algorithm is less than $\varepsilon$ from the optimum (only the upper end of the two-sided 90% interval is used, which leaves 5% on that side).
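This rule can be sketched in Python with the standard library only. The function name `should_stop`, its signature, and the simulated costs are made up for illustration; the half-width uses the usual normal approximation, which is one way (assumed here) of obtaining the asymptotic interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def should_stop(lower_bound, simulated_costs, epsilon, level=0.90):
    """Stopping test based on a Monte-Carlo confidence interval.

    simulated_costs: total costs of independent simulations of the policy.
    Returns (stop, upper_estimate, half_width).
    """
    n = len(simulated_costs)
    upper_estimate = mean(simulated_costs)
    # A two-sided interval at `level` uses the (1 + level) / 2 normal quantile.
    z = NormalDist().inv_cdf((1.0 + level) / 2.0)
    half_width = z * stdev(simulated_costs) / sqrt(n)
    gap = upper_estimate + half_width - lower_bound
    return gap < epsilon, upper_estimate, half_width


costs = [10.0, 12.0, 11.0, 9.0, 13.0]     # made-up simulation results
stop, upper, e = should_stop(lower_bound=10.5, simulated_costs=costs, epsilon=2.0)
print(stop, upper, e)
```

With a level of 0.90 the quantile is that of the 95th percentile, which is why the certificate holds with 95% confidence.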

A Number of Approximations

1. We fit an autoregressive process representing the noise to the available data. First error, uncontrollable.
2. We discretize the stochastic process to obtain a scenario tree representation. Second error, controllable through Sample Average Approximation (SAA) theory.
3. We compute approximations $\check V_t^{(k)}$ of the real value functions $V_t$. Third error, controllable by the bounds from the SDDP algorithm.
4. The upper bound is obtained through Monte Carlo. Fourth error, controllable through the central limit theorem.

Convergence result - linear case

Assume that:
- the admissible sets of states are compact,
- we are in the relatively complete recourse case,
- the random selection in the forward phase satisfies some independence condition,
- every value function is updated an infinite number of times.

Then the upper and lower bounds converge almost surely toward the value of the problem.

Convergence result - convex case

Assume that:
- the admissible sets of states are compact and convex,
- the cost functions are convex,
- we are in the extended relatively complete recourse case,
- the random selection in the forward phase satisfies some independence condition,
- every value function is updated an infinite number of times.

Then the upper and lower bounds converge almost surely toward the value of the problem.

Numerical Results

- Widely used in the energy community (management of dams, sizing of networks, etc.).
- Efficient up to 10-15 dams.
- Numerical experiments available on A. Shapiro's website:
  - 120 time steps, 4 dams, for a state of dimension 8,
  - each $\xi_t$ is discretized in 100 points, hence there are $100^{119}$ scenarios,
  - 3000 iterations run in 15 minutes,
  - 20% gap, 4% in dimension 4.
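The size of the scenario tree quoted above is easy to reproduce. In this short sketch the exponent 119 is taken as given from the slide; the variable names are hypothetical.

```python
points_per_stage = 100    # discretization of each xi_t
random_stages = 119       # exponent quoted on the slide
n_scenarios = points_per_stage ** random_stages
print(len(str(n_scenarios)))   # number of decimal digits of the scenario count
```

Enumerating that many scenarios is out of reach, which is why the forward phase samples scenarios instead of visiting the whole tree.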

Conclusion

SDDP is an algorithm, more precisely a class of algorithms, that:
- exploits the convexity of the value functions (from the convexity of the costs...);
- does not require an a priori discretization of the state;
- constructs outer approximations of $V_t$, those approximations being precise only "in the right places";
- gives bounds: a real lower bound $\check V_0^{(k)}(x_0)$ and an estimated (by Monte-Carlo) upper bound;
- constructs linear-convex approximations, thus enabling the use of linear solvers like CPLEX;
- has proofs of asymptotic convergence.

Available Extensions

- Cut dropping methods are being studied.
- Hot starts, obtained by bypassing the forward phase and selecting artificial trajectories, are numerically efficient.
- Work is being done to apply SDDP in some non-convex cases.
- Risk aversion (through CVaR, either as a constraint or as an element of the objective function) can be taken into account (by extending the state).
- Non-linear convex costs can be used (convergence result).
