Computational Stochastic Programming


1 Computational Stochastic Programming Jeff Linderoth Dept. of ISyE Dept. of CS Univ. of Wisconsin-Madison SPXIII Bergamo, Italy July 7, 2013 Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 1 / 89

2 Mission Impossible! It is impossible to discuss all of computational SP in 90 minutes. I will focus on a few basic topics, and (try to) provide references for a few others. Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 2 / 89

4 Bias Bias The bias of an estimator is the difference between the estimator's expected value E[^θ] and the true value of the parameter θ. An estimator is unbiased if E[^θ − θ] = 0. I Am Biased! But this lecture is also extremely biased, in the English sense of the word. It will cover the things that I know best. There is lots of great work in computational SP that I (unfortunately) won't mention. Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 3 / 89

6 What I WILL cover Stochastic LP w/recourse (Primarily 2-Stage) Decomposition. Benders Decomposition Lagrangian Relaxation Dual Decomposition Stochastic approximation Modern/Bundle-type methods. Trust region methods Regularized Decomposition The level method Multistage Extensions Software Tools SMPS format Some available software tools for modeling and solving Role of parallel computing Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 4 / 89

8 Other Things Covered Sampling Exterior Sampling Methods Sample Average Approximation Interior Sampling Methods Stochastic Quasi-Gradient Stochastic Decomposition Mirror-Prox Methods Stochastic Integer Programming Integer L-Shaped Dual Decomposition Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 5 / 89

11 Stochastic Programming A Stochastic Program
min_{x ∈ X} f(x) := E_ω[F(x, ω)]
2-Stage Stochastic LP w/Recourse
F(x, ω) := c^T x + Q(x, ω)
c^T x: Pay me now. Q(x, ω): Pay me later.
The Recourse Problem
Q(x, ω) := min { q(ω)^T y : W(ω) y = h(ω) − T(ω) x, y ≥ 0 }
E[F(x, ω)] = c^T x + E[Q(x, ω)] := c^T x + φ(x)
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 6 / 89

13 Decomposition Algorithms Two Ways of Thinking Algorithms are equivalent regardless of how you think about them. But thinking in different ways gives different insights. Complementary Viewpoints 1 As a large-scale problem for which you will apply decomposition techniques 2 As an oracle-based convex optimization problem Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 7 / 89

14 Decomposition: A Popular Method M E T H O D Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 8 / 89

16 Extensive Form Large Scale = Extensive Form
This is sometimes called the deterministic equivalent, but I prefer the term extensive form.
Assume Ω = {ω_1, ω_2, ..., ω_S} ⊂ R^r, P(ω = ω_s) = p_s, s = 1, 2, ..., S, and
T_s := T(ω_s), h_s := h(ω_s), q_s := q(ω_s), W_s := W(ω_s).
Then we can write the extensive form as just a large LP:
min c^T x + p_1 q_1^T y_1 + p_2 q_2^T y_2 + ... + p_S q_S^T y_S
s.t. Ax = b
 T_1 x + W_1 y_1 = h_1
 T_2 x + W_2 y_2 = h_2
 ...
 T_S x + W_S y_S = h_S
 x ∈ X, y_1 ∈ Y, y_2 ∈ Y, ..., y_S ∈ Y
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 9 / 89
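To make the block structure concrete, here is a minimal sketch (not from the slides) that assembles and solves the extensive form of the small three-scenario recourse example introduced later in these slides (slides 49-52), using numpy and SciPy's HiGHS-based linprog. The penalty weight `lam` is an assumed, illustrative value; the slides leave λ unspecified.

```python
# Minimal sketch: assemble the extensive form of a 2-stage stochastic LP with
# simple recourse and solve it as one block-structured LP.
# Instance: min x1 + x2 + lam * sum_s p_s (y1s + y2s)
#           s.t. w1s*x1 + x2 + y1s >= 7,  w2s*x1 + x2 + y2s >= 4,  all vars >= 0
import numpy as np
from scipy.optimize import linprog

omega = [(1.0, 1/3), (2.5, 2/3), (4.0, 1.0)]   # the three outcomes (w1, w2)
p = [1/3, 1/3, 1/3]                             # scenario probabilities
lam = 5.0                                       # assumed penalty weight (illustrative)

S, n1 = len(omega), 2
nvar = n1 + 2 * S                               # x1, x2, then (y1s, y2s) per scenario

cost = np.zeros(nvar)
cost[:n1] = 1.0
for s in range(S):
    cost[n1 + 2*s: n1 + 2*s + 2] = lam * p[s]

# Each ">=" row is written as A_ub @ z <= b_ub for linprog.
A_ub, b_ub = [], []
for s, (w1, w2) in enumerate(omega):
    r1 = np.zeros(nvar); r1[0], r1[1], r1[n1 + 2*s]     = -w1, -1.0, -1.0
    r2 = np.zeros(nvar); r2[0], r2[1], r2[n1 + 2*s + 1] = -w2, -1.0, -1.0
    A_ub += [r1, r2]; b_ub += [-7.0, -4.0]

res = linprog(cost, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0, None)] * nvar, method="highs")
print("extensive-form optimum:", res.fun, "first-stage x:", res.x[:n1])
```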

18 Extensive Form Small SPs are Easy!
[Figure: solution time vs. number of scenarios for CPLEX on the extensive form and for the L-shaped method]
In my experience, using a barrier/interior point method is faster than simplex/pivoting-based methods for solving extensive form LPs.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 10 / 89

21 Extensive Form The Upshot
If it is too large to solve directly, then we must exploit the structure. If I fix the first stage variables x, then the extensive form above decomposes by scenario.
Key Idea (Benders Decomposition): Characterize the solution of a scenario linear program as a function of the first stage solution x.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 11 / 89

23 Extensive Form Benders of Extensive Form
z_LP(x) = Σ_{s=1}^S z_s^LP(x), where z_s^LP(x) = inf { q_s^T y : W_s y = h_s − T_s x, y ≥ 0 }.
The dual of the LP defining z_s^LP(x) is sup { (h_s − T_s x)^T π : W_s^T π ≤ q_s }.
Set of dual feasible solutions for scenario s: Π_s = { π ∈ R^m : W_s^T π ≤ q_s }
Vertices of Π_s: V(Π_s) = {v_1s, v_2s, ..., v_{V_s,s}}
Extreme rays of Π_s: R(Π_s) = {r_1s, r_2s, ..., r_{R_s,s}}
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 12 / 89

24 Extensive Form (In Case You Forgot...) Minkowski's Theorem
There are two ways to describe polyhedra: as an intersection of halfspaces (by its facets), or by appropriate combinations of its extreme points and extreme rays.
Minkowski-Weyl Theorem
Let P = {x ∈ R^n : Ax ≤ b} have extreme points V(P) = {v_1, v_2, ..., v_V} and extreme rays R(P) = {r_1, r_2, ..., r_R}. Then
P = { x ∈ R^n : x = Σ_{j=1}^V λ_j v_j + Σ_{j=1}^R μ_j r_j, Σ_{j=1}^V λ_j = 1, λ_j ≥ 0 for j = 1, ..., V, μ_j ≥ 0 for j = 1, ..., R }
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 13 / 89

27 Extensive Form A Little LP Theory
We assume that z_s^LP(x) is bounded below.
The LP is feasible (z_s^LP(x) < +∞) if there is no direction of unboundedness in the dual LP:
∄ r ∈ R(Π_s) (with W_s^T r ≤ 0) such that (h_s − T_s x)^T r > 0.
So x is a feasible solution (z_s^LP(x) < +∞) if (h_s − T_s x)^T r_ks ≤ 0 for k = 1, ..., R_s.
If there is an optimal solution to an LP, then there is an optimal solution that occurs at an extreme point. By strong duality, the optimal solutions to the primal and dual LPs have the same objective value, so
z_s^LP(x) = max_{j=1,2,...,V_s} { (h_s − T_s x)^T v_js : r_ks^T (h_s − T_s x) ≤ 0, k = 1, 2, ..., R_s }
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 14 / 89

29 Extensive Form Developing Benders Decomposition
Scenario s second-stage feasibility set: C_s := { x : ∃ y ≥ 0 with W_s y = h_s − T_s x } = { x : h_s − T_s x ∈ pos(W_s) }
First stage feasibility set: X := { x ∈ R^n_+ : Ax = b }
Second stage feasibility set: C := ∩_{s=1}^S C_s
(2SP) min_{x ∈ X ∩ C} f(x) := c^T x + Σ_{s=1}^S p_s Q(x, ω_s)
x ∈ C_s ⟺ (h_s − T_s x)^T r_js ≤ 0, j = 1, ..., R_s
θ_s ≥ Q(x, ω_s) ⟺ θ_s ≥ (h_s − T_s x)^T v_js, j = 1, ..., V_s
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 15 / 89

31 Extensive Form Benders-LShaped
Use these results. Introduce auxiliary variables θ_s to represent the value of Q(x, ω_s). N.B. I am changing notation just a little bit.
Unaggregated: Full Multicut
min c^T x + Σ_{s∈S} p_s θ_s
s.t. r^T T_s x ≥ r^T h_s  ∀ s ∈ S, r ∈ R(Π_s)
 θ_s + v^T T_s x ≥ v^T h_s  ∀ s ∈ S, v ∈ V(Π_s)
 Ax = b, x ≥ 0
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 16 / 89

33 Extensive Form LShaped Method
Aggregate the inequalities and remove the variables θ_s for each scenario. Instead introduce a single variable
Θ ≥ Σ_{s∈S} p_s θ_s ≥ Σ_{s∈S} p_s (v_s^T h_s − v_s^T T_s x) (choosing any v_s ∈ V(Π_s)).
Fully-Aggregated: LShaped
min c^T x + Θ
s.t. r^T T_s x ≥ r^T h_s  ∀ s ∈ S, r ∈ R(Π_s)
 Θ + Σ_{s∈S} p_s v_s^T T_s x ≥ Σ_{s∈S} p_s v_s^T h_s  ∀ v_s ∈ V(Π_s), s ∈ S
 Ax = b, x ≥ 0
N.B. Different aggregations are possible.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 17 / 89

35 Extensive Form A Whole Spectrum
Complete Aggregation (Traditional LShaped): one variable for φ(x). Complete Multicut: S variables for φ(x). We can do anything in between... Partition the scenarios into C clusters S_1, S_2, ..., S_C:
φ_[S_k](x) = Σ_{s∈S_k} p_s Q(x, ω_s),  Θ_[S_k] ≥ Σ_{s∈S_k} p_s Q(x, ω_s)
References: Original multicut paper: [Birge and Louveaux, 1988]. You need not stay with one fixed aggregation: recent paper by Trukhanov et al. [2010]; Ph.D. thesis by Janjarassuk [2009].
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 18 / 89

37 Extensive Form Benders Decomposition
Regardless of aggregation, the linear program is likely to have exponentially many constraints. Benders Decomposition is a cutting plane method for solving one of these linear programs. I will describe it for the (full) multicut case, but the other algorithms are really just aggregated versions of this one.
Basic Questions: For a given x^k, θ_1^k, ..., θ_S^k, we must check for each scenario s ∈ S:
1. Is there r ∈ R(Π_s) such that r^T T_s x^k < r^T h_s?
2. Is there v ∈ V(Π_s) such that θ_s^k + v^T T_s x^k < v^T h_s?
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 19 / 89

38 Extensive Form Find a Ray?
Phase 1 LP: U_s(x^k) = min { 1^T u + 1^T v : W_s y + u − v = h_s − T_s x^k, y, u, v ≥ 0 }
Its Dual: max { π^T (h_s − T_s x^k) : W_s^T π ≤ 0, −1 ≤ π ≤ 1 }
If U_s(x^k) > 0, then by strong duality it has an optimal dual solution π^k such that [π^k]^T (h_s − T_s x^k) > 0, so if we add the inequality
(h_s − T_s x)^T π^k ≤ 0
this will exclude the solution x^k. π^k from the Phase-1 LP is the dual extreme ray!
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 20 / 89
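A minimal sketch of this feasibility check in code, with made-up data (the function name and the tiny demo instance are assumptions, not from the slides). The dual vector is read from SciPy's HiGHS marginals, which are assumed to follow the dz/d(rhs) sign convention.

```python
# Sketch of the Phase-1 check for a single scenario.  If U_s(x) > 0, the
# returned dual vector pi gives the feasibility cut  pi' T_s x >= pi' h_s.
import numpy as np
from scipy.optimize import linprog

def feasibility_cut(W, h_s, T_s, x, tol=1e-8):
    m, ny = W.shape
    rhs = h_s - T_s @ x
    # Phase-1 LP:  min 1'u + 1'v  s.t.  W y + u - v = rhs,  y, u, v >= 0
    cost = np.concatenate([np.zeros(ny), np.ones(m), np.ones(m)])
    A_eq = np.hstack([W, np.eye(m), -np.eye(m)])
    res = linprog(cost, A_eq=A_eq, b_eq=rhs,
                  bounds=[(0, None)] * (ny + 2 * m), method="highs")
    if res.fun > tol:                 # scenario LP infeasible at this x
        return res.eqlin.marginals    # dual vector (assumed dz/d(rhs) convention)
    return None                       # feasible: no cut needed

# Tiny demo: "y = -1 with y >= 0" is infeasible, so a dual ray is returned.
print(feasibility_cut(np.array([[1.0]]), np.array([-1.0]), np.zeros((1, 1)), np.zeros(1)))
```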

39 Extensive Form Find a Vertex
Question: Is there v ∈ V(Π_s) such that θ_s^k + v^T T_s x^k < v^T h_s?
So we should solve max_{v ∈ V(Π_s)} { (h_s − T_s x^k)^T v }.
Note this is the same as solving sup { (h_s − T_s x^k)^T π : W_s^T π ≤ q_s }.
And by duality, this is also the same as solving z_s^LP(x^k) = inf { q_s^T y : W_s y = h_s − T_s x^k, y ≥ 0 }, and looking at the (optimal) dual variables for the constraints W_s y = h_s − T_s x^k.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 21 / 89

41 Extensive Form Master Problem
The cutting plane algorithm will identify
R_s ⊆ R(Π_s): a subset of the extreme rays of the dual feasible set Π_s
V_s ⊆ V(Π_s): a subset of the extreme points of the dual feasible set Π_s
Full LP:
min c^T x + Σ_{s∈S} p_s θ_s
s.t. r^T T_s x ≥ r^T h_s  ∀ s ∈ S, r ∈ R(Π_s)
 θ_s + v^T T_s x ≥ v^T h_s  ∀ s ∈ S, v ∈ V(Π_s)
 Ax = b, x ≥ 0
Master Problem:
min c^T x + Σ_{s∈S} p_s θ_s
s.t. r^T T_s x ≥ r^T h_s  ∀ s ∈ S, r ∈ R_s
 θ_s + v^T T_s x ≥ v^T h_s  ∀ s ∈ S, v ∈ V_s
 Ax = b, x ≥ 0
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 22 / 89

48 Extensive Form Benders/Cutting Plane Method
1. k = 1, R_s = V_s = ∅ ∀ s ∈ S, lb = −∞, ub = +∞, x^1 ∈ X
2. Done = true. For each s ∈ S:
 Solve the Phase 1 LP to get U_s(x^k). If U_s(x^k) > 0: Q(x^k, ω_s) = +∞. Let π_s^k be the optimal dual solution to the Phase 1 LP. R_s ← R_s ∪ {π_s^k}, Done = false. Go to 5.
 Solve the Phase-2 LP for Q(x^k, ω_s), and let π_s^k be its optimal dual multiplier. If Q(x^k, ω_s) > θ_s^k, then V_s ← V_s ∪ {π_s^k}, Done = false.
3. ub = c^T x^k + Σ_{s∈S} p_s Q(x^k, ω_s)
4. If ub − lb ≤ ε or Done = true, then stop: x^k is an optimal solution.
5. Solve the master problem. Let lb be its optimal solution value, and let k ← k + 1. Let x^k be the optimal solution to the master problem. Go to 2.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 23 / 89
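The sketch below is a minimal multicut L-shaped loop for the same small instance as the extensive-form sketch earlier (same assumed penalty weight λ = 5). Because that instance has simple recourse, every subproblem is feasible and only optimality cuts are generated; scenario duals are read from SciPy's HiGHS marginals, which are assumed to report dz/d(rhs). This is an illustration of the mechanics, not the slides' own code.

```python
# Minimal multicut L-shaped (Benders) sketch with optimality cuts only.
import numpy as np
from scipy.optimize import linprog

lam, p = 5.0, [1/3, 1/3, 1/3]
T = [np.array([[1.0, 1.0], [1/3, 1.0]]),
     np.array([[2.5, 1.0], [2/3, 1.0]]),
     np.array([[4.0, 1.0], [1.0, 1.0]])]
h = np.array([7.0, 4.0])
q = np.array([lam, lam])
c = np.array([1.0, 1.0])
S, n = len(T), 2
THETA_LB = 0.0                          # valid here because Q(x, w_s) >= 0

def solve_scenario(s, x):
    """Return Q(x, w_s) and a cut (beta_s, alpha_s) with theta_s >= beta_s + alpha_s'x."""
    r = h - T[s] @ x                    # rhs of  W y >= h_s - T_s x  (W = I here)
    res = linprog(q, A_ub=-np.eye(2), b_ub=-r, bounds=[(0, None)] * 2, method="highs")
    pi = -res.ineqlin.marginals         # dual of the ">=" constraints
    return res.fun, pi @ h, -(T[s].T @ pi)

cuts = [[] for _ in range(S)]           # per-scenario cut lists
lb, ub, x = -np.inf, np.inf, np.zeros(n)
for it in range(50):
    Qs, new_cut = [], False
    for s in range(S):
        val, beta, alpha = solve_scenario(s, x)
        Qs.append(val)
        if not cuts[s] or val > max(b0 + a0 @ x for b0, a0 in cuts[s]) + 1e-7:
            cuts[s].append((beta, alpha)); new_cut = True
    ub = min(ub, c @ x + sum(ps * v for ps, v in zip(p, Qs)))
    if ub - lb <= 1e-6 or not new_cut:
        break
    # Master:  min c'x + sum_s p_s theta_s  s.t.  theta_s >= beta + alpha'x  for stored cuts
    cost = np.concatenate([c, np.array(p)])
    A_ub, b_ub = [], []
    for s in range(S):
        for beta, alpha in cuts[s]:
            row = np.zeros(n + S); row[:n] = alpha; row[n + s] = -1.0
            A_ub.append(row); b_ub.append(-beta)        # alpha'x - theta_s <= -beta
    bounds = [(0, None)] * n + [(THETA_LB, None)] * S
    res = linprog(cost, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds, method="highs")
    lb, x = res.fun, res.x[:n]
print("L-shaped: lb =", lb, "ub =", ub, "x =", x)
```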

50 Extensive Form A First Example
min x_1 + x_2
subject to ω_1 x_1 + x_2 ≥ 7
 ω_2 x_1 + x_2 ≥ 4
 x_1 ≥ 0, x_2 ≥ 0
ω = (ω_1, ω_2) ∈ Ω = {(1, 1/3), (5/2, 2/3), (4, 1)}. Each outcome has p_s = 1/3.
Huh? This problem doesn't make sense!
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 24 / 89

52 Extensive Form Recourse Formulation
min x_1 + x_2 + λ Σ_{s∈S} p_s (y_1s + y_2s)
s.t. ω_1s x_1 + x_2 + y_1s ≥ 7, s = 1, 2, 3
 ω_2s x_1 + x_2 + y_2s ≥ 4, s = 1, 2, 3
 x_1, x_2, y_1s, y_2s ≥ 0, s = 1, 2, 3
Stop! Gammer Time?
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 25 / 89

53 Oracle-Based Oracle-Based Methods Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 26 / 89

54 Oracle-Based Two-Stage Stochastic LP with Recourse
Our problem is min_{x ∈ X} f(x) := c^T x + E[Q(x, ω)], where
X = { x ∈ R^n_+ : Ax = b }
Q(x, ω) = min_{y ≥ 0} { q(ω)^T y : T(ω) x + W(ω) y = h(ω) }
In Q(x, ω), as x changes, the right hand side of the linear program changes. So, we should care very much about the value function of a linear program with respect to changes in its right-hand side:
v : R^m → R, v(z) = min_{y ∈ R^p_+} { q^T y : W y = z }
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 27 / 89

56 Oracle-Based Nice Theorems
Nice Theorem 1: Assume that Π = { π ∈ R^m : W^T π ≤ q } ≠ ∅ and that there exists z_0 ∈ R^m such that there is a y_0 ≥ 0 with W y_0 = z_0. Then v(z) is a proper, convex, polyhedral function, and ∂v(z_0) = arg max { π^T z_0 : π ∈ Π }.
Nice Theorem 2: Under similar conditions (on each scenario W_s, q_s), f(x) = c^T x + E[Q(x, ω)] = c^T x + φ(x) is proper, convex, and polyhedral. Subgradients of f come from (transformed and aggregated) optimal dual solutions of the second stage subproblems: Σ_{s=1}^S p_s (−T_s^T π_s) for optimal duals π_s.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 28 / 89

57 Oracle-Based Easy Peasy?
(2SP) min_{x ∈ X ∩ C} f(x)
We know that f(x) is a nice^1 function. It is also true that X ∩ C is a nice polyhedral set, so it should be easy to solve (2SP). What's the Problem?! f(x) is given implicitly: to evaluate f(x), we must solve S linear programs.
1 proper, convex, polyhedral
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 29 / 89

59 Oracle-Based Overarching Theme
We will approximate^2 f by ever-improving functions of the form f(x) ≈ c^T x + m_k(x), where m_k(x) is a model of our expected recourse function: m_k(x) ≈ Σ_{s=1}^S p_s Q(x, ω_s) := φ(x). We will also build ever-improving outer approximations of C: (C_k ⊇ C).
Since we know that Q(x^k, ω_s) is convex, and ∂Q(x^k, ω_s) ∋ −T_s^T arg max_{π ∈ Π_s} { π^T (h_s − T_s x^k) }, we can underapproximate Q(x^k, ω_s) using a (sub)gradient inequality.
2 often underapproximate
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 30 / 89

60 Oracle-Based Building a Model
By definition of convexity, we get
Q(x, ω_s) ≥ Q(x^k, ω_s) + y^T (x − x^k)  ∀ y ∈ ∂Q(x^k, ω_s)
 ≥ Q(x^k, ω_s) + [−T_s^T π_s^k]^T (x − x^k) for some π_s^k ∈ arg max_{π ∈ Π_s} { π^T (h_s − T_s x^k) }
 = Q(x^k, ω_s) + [π_s^k]^T T_s x^k − [π_s^k]^T T_s x
 = β_s^k + (α_s^k)^T x
We^3 aggregate these together to build a model of φ(x):
φ(x) = Σ_{s=1}^S p_s Q(x, ω_s) ≥ β̄^k + [ᾱ^k]^T x := Σ_{s=1}^S p_s β_s^k + Σ_{s=1}^S p_s [α_s^k]^T x
3 sometimes
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 31 / 89

61 Oracle-Based Our Model
Choose some different x^j ∈ X, π_s^j ∈ arg max_{π ∈ Π_s} { π^T (h_s − T_s x^j) }, j = 1, ..., k − 1.
Our Model (to minimize):
β_s^j = Q(x^j, ω_s) + [π_s^j]^T T_s x^j,  β̄^j = Σ_{s=1}^S p_s β_s^j
α_s^j = −T_s^T π_s^j,  ᾱ^j = Σ_{s=1}^S p_s α_s^j
m_k(x) = max_{j=1,...,k−1} { β̄^j + [ᾱ^j]^T x }
We model the process of minimizing the maximum using an auxiliary variable θ:
θ ≥ β̄^j + [ᾱ^j]^T x, j = 1, ..., k − 1
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 32 / 89

62 Oracle-Based An Oracle-Based Method for Convex Minimization
We assume f : X ⊆ R^n → R is a convex function given by an oracle: we can get values f(x^k) and subgradients s_k ∈ ∂f(x^k) for x^k ∈ X.
1. Find x^1 ∈ X, k ← 1, θ^1 ← −∞, ub ← ∞, I = ∅
2. Subproblem: Compute f(x^k), s_k ∈ ∂f(x^k). ub ← min{f(x^k), ub}
3. If θ^k = f(x^k): STOP, x^k is optimal.
4. Else: I ← I ∪ {k}. Solve Master: min_{θ, x ∈ X} { θ : θ ≥ f(x^i) + s_i^T (x − x^i) ∀ i ∈ I }. Let the solution be x^{k+1}, θ^{k+1}. Go to 2.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 33 / 89

63-65 Oracle-Based Worth 1000 Words
[Figure: the recourse function φ(x) and the cutting-plane model built from subgradients at successive iterates x^1, x^2, ...]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 34-36 / 89

67 Oracle-Based Regularizing A Dumb Algorithm?
x^{k+1} ∈ arg min_{x ∈ R^n_+} { c^T x + m_k(x) : Ax = b }
What happens if you start the algorithm with an initial iterate that is the optimal solution x*? Are you done? Unfortunately, no. In the first iterations, we have a very poor model m_k(·), so when we minimize this model, we may move very far away from x*. A variety of methods in stochastic programming use well-known methods from nonlinear programming/convex optimization to ensure that the iterates are well-behaved.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 37 / 89

69 Oracle-Based Regularizing Regularizing
Borrow the trust region concept from NLP (Linderoth and Wright [2003]). At iteration k:
Have an incumbent solution x^k. Impose the constraints ‖x − x^k‖ ≤ Δ_k.
Δ_k large: like LShaped. Δ_k small: stay very close. This is often called regularizing the method.
Another (Good) Idea: Penalize the length of the step you will take:
min c^T x + Σ_{j ∈ C} θ_j + 1/(2ρ) ‖x − x^k‖^2
ρ large: like LShaped. ρ small: stay very close. This is known as the regularized decomposition method, pioneered in stochastic programming by Ruszczyński [1986].
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 38 / 89
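A small sketch of the trust-region variant: with an ℓ∞ ball, the trust region ‖x − x^k‖_∞ ≤ Δ is just a box, so it enters an LP-based master only through the variable bounds. The function name and the cut-storage layout (`cuts[s]` holding (β, α) pairs) follow the L-shaped sketch earlier and are assumptions, not the slides' own code.

```python
# Sketch: L-shaped master with an l-infinity trust region around the incumbent.
import numpy as np
from scipy.optimize import linprog

def solve_master_tr(c, p, cuts, x_inc, delta, theta_lb=0.0):
    n, S = len(c), len(p)
    cost = np.concatenate([np.asarray(c, float), np.asarray(p, float)])
    A_ub, b_ub = [], []
    for s in range(S):
        for beta, alpha in cuts[s]:
            row = np.zeros(n + S); row[:n] = alpha; row[n + s] = -1.0
            A_ub.append(row); b_ub.append(-beta)          # alpha'x - theta_s <= -beta
    # Trust region ||x - x_inc||_inf <= delta, intersected with x >= 0
    bounds = [(max(0.0, xi - delta), xi + delta) for xi in x_inc] + [(theta_lb, None)] * S
    return linprog(cost,
                   A_ub=np.array(A_ub) if A_ub else None,
                   b_ub=np.array(b_ub) if b_ub else None,
                   bounds=bounds, method="highs")
```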

70 Oracle-Based Regularizing Trust Region Effect: Step Length
[Figure: step size vs. iteration]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 39 / 89

71 Oracle-Based Regularizing Trust Region Effect: Function Values
[Figure: function value and master value per iteration for the LShaped and trust region methods]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 40 / 89

72 Oracle-Based Regularizing Bundle-Trust
These ideas are known in the nondifferentiable optimization community as Bundle-Trust-Region methods.
Bundle: Build up a bundle of subgradients to better approximate your function. (Get a better model m(·).)
Trust region: Stay close (in a region you trust), until you build up a good enough bundle to model your function accurately.
Accept a new iterate if it improves the objective by a sufficient amount; potentially increase Δ_k or ρ. (Serious Step)
Otherwise, improve the estimation of φ(x^k), resolve the master problem, and potentially reduce Δ_k or ρ. (Null Step)
These methods can be shown to converge, even if cuts are deleted from the master problem.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 41 / 89

76 Oracle-Based Regularizing Vanilla Trust Region
f(x) = c^T x + φ(x),  f̂_k(x) = c^T x + m_k(x)
1. Let x^1 ∈ X, Δ > 0, μ ∈ (0, 1), k = 1, y^1 = x^1
2. Compute f(y^1) and subgradient model update information: (β̄^j, ᾱ^j) if LShaped.
3. Master: Let y^{k+1} ∈ arg min { c^T x + m_k(x) : x ∈ B(x^k, Δ) ∩ X }
4. Compute the predicted decrease: δ_k = f(x^k) − f̂_k(y^{k+1})
5. If δ_k ≤ ε: Stop, y^{k+1} is optimal.
6. Subproblems: Compute f(y^{k+1}) and subgradient information. Update m_k(x) with the subgradient information from y^{k+1}. If f(x^k) − f(y^{k+1}) ≥ μ δ_k, then Serious Step: x^{k+1} ← y^{k+1}. Else, Null Step: x^{k+1} ← x^k.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 42 / 89

81 Oracle-Based Level Method Level Method
Basic Idea: Instead of restricting the search to points in the neighborhood of the current iterate, you restrict the search to points whose objective lies in the neighborhood of the current iterate. The idea is from Lemaréchal et al. [1995].
m_k(x) = max_{i=1,...,k} { f(x^i) + s_i^T (x − x^i) }
1. Choose λ ∈ (0, 1), x^1 ∈ X, k = 1
2. Compute f(x^k), s_k ∈ ∂f(x^k), update m_k(x)
3. Minimize Model: z_k = min_{x ∈ X} m_k(x). Let ẑ_k = min_{i=1,...,k} f(x^i) be the best objective value you've seen so far.
4. Project: Let l_k = z_k + λ(ẑ_k − z_k). x^{k+1} ∈ arg min_{x ∈ X} { ‖x − x^k‖^2 : m_k(x) ≤ l_k }. k ← k + 1. Go to 2.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 43 / 89
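A minimal sketch of the projection step in step 4 (the quadratic program), with made-up cut data and a hypothetical function name; the slides do not prescribe a particular solver, so SciPy's SLSQP is used here purely for illustration.

```python
# Sketch of one level-method projection: project the current iterate onto
# {x in X : m_k(x) <= l_k} in the Euclidean norm.
import numpy as np
from scipy.optimize import minimize

def level_project(x_k, cuts, level, x_lb=0.0):
    """cuts: list of (f_i, s_i, x_i) so that m_k(x) = max_i f_i + s_i'(x - x_i)."""
    cons = [{"type": "ineq",
             "fun": lambda x, f=f, s=s, xi=xi: level - (f + s @ (x - xi))}
            for f, s, xi in cuts]
    res = minimize(lambda x: np.sum((x - x_k) ** 2), x_k, method="SLSQP",
                   bounds=[(x_lb, None)] * len(x_k), constraints=cons)
    return res.x

# Tiny demo with two made-up cuts in one dimension: m(x) = max(x, 3 - x), level 2
cuts = [(0.0, np.array([1.0]), np.array([0.0])), (1.0, np.array([-1.0]), np.array([2.0]))]
print(level_project(np.array([0.0]), cuts, level=2.0))   # projects 0 onto [1, 2] -> ~1
```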

85 Oracle-Based Level Method Convergence Rate
A function f : X → R is Lipschitz continuous over its domain X if ∃ L ∈ R such that |f(y) − f(x)| ≤ L ‖y − x‖ ∀ x, y ∈ X.
The diameter of a compact set X is diam(X) := max_{x,y ∈ X} ‖x − y‖.
Smart Guy Theorem: ẑ_k − z_k ≤ ε after at most C(λ) (L D / ε)^2 iterations, where D = diam(X) and C(λ) = 1 / (λ (1 − λ)^2 (2 − λ)).
This rate is independent of the number of variables of the problem. The minimum is C(λ*) = 4, attained at λ* = 1 − 1/√2 ≈ 0.29.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 44 / 89
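A quick numeric check of the constant (a small illustration, not from the slides):

```python
# Verify that C(lambda) = 1/(lambda (1-lambda)^2 (2-lambda)) is minimized near
# lambda = 1 - 1/sqrt(2), where it equals 4.
import numpy as np

lam = np.linspace(0.01, 0.99, 9801)
C = 1.0 / (lam * (1 - lam) ** 2 * (2 - lam))
i = np.argmin(C)
print(lam[i], C[i], 1 - 1 / np.sqrt(2))   # ~0.2929, ~4.0, 0.2928...
```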

86 Oracle-Based Level Method Papers with Computational Experience
Some computational experience in Zverovich's Ph.D. thesis: [Zverovich, 2011]. Zverovich et al. [2012] have a nice, comprehensive comparison between: solving the extensive form using the simplex method and the barrier method; the LShaped method (aggregated forms); Regularized Decomposition; the level method; the trust region method.
Who's the winner? Hard to pick. But I think the level method wins, and simplex on the extensive form is slowest.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 45 / 89

87 Oracle-Based Level Method Performance Profile [Zverovich et al., 2012] Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 46 / 89

89 Other Ideas for 2SP Dual Decomposition Methods A Dual Idea
Dual Decomposition: Create copies of the first-stage decision variables for each scenario.
minimize Σ_{s∈S} p_s (c^T x_s + q_s^T y_s)
subject to A x_s = b  ∀ s ∈ S
 T_s x_s + W_s y_s = h_s  ∀ s ∈ S
 x_s ≥ 0, y_s ≥ 0  ∀ s ∈ S
 x_1 = x_2 = ... = x_S
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 47 / 89

91 Other Ideas for 2SP Dual Decomposition Methods Relax Nonanticipativity
The constraints x_0 = x_1 = x_2 = ... = x_S are called nonanticipativity constraints. We relax the nonanticipativity constraints, so the problem decomposes by scenario, and then we do Lagrangian Relaxation:
max_{λ_1,...,λ_S} Σ_{s∈S} D_s(λ_s)
where D_s(λ_s) = min_{(x_s, y_s) ∈ F_s} { p_s (c^T x_s + q^T y_s) + λ_s^T (x_s − x_0) }
and F_s = { (x, y) : Ax = b, T_s x + W_s y = h_s, x ≥ 0, y ≥ 0 }.
Even Fancier: You can do Augmented Lagrangian or Progressive Hedging [Rockafellar and Wets, 1991] by adding a quadratic proximal term to the Lagrangian function.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 48 / 89

93 Other Ideas for 2SP Bunching Bunching
This idea is found in the works of Wets [1988] and Gassmann [1990]. If W_s = W and q_s = q for s = 1, ..., S, then to evaluate φ(x) we must solve S linear programs that differ only in their right hand side. Therefore, the dual LPs differ only in the objective function:
π_s ∈ arg max_π { π^T (h_s − T_s ^x) : π^T W ≤ q^T }
Basic Idea: π_s is feasible for all scenarios, and we have a dual feasible basis matrix W_B. For a new scenario (h_r, T_r), with new objective (h_r − T_r ^x), if all reduced costs are negative, then π_s is also optimal for scenario r. Use dual simplex to solve the scenario linear programs when evaluating φ(x).
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 49 / 89
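One way to view the bunching test in code (a sketch with made-up data, not the slides' own formulation): since the reduced costs do not depend on the right-hand side when W and q are fixed, an optimal basis found for one scenario stays optimal for another scenario exactly when the corresponding basic solution remains nonnegative.

```python
# Sketch: check whether an optimal basis for one scenario's right-hand side
# remains (primal) feasible, hence optimal, for another scenario's right-hand side.
import numpy as np

def basis_still_optimal(W, basis, rhs):
    """True if the basic solution W_B^{-1} rhs is nonnegative."""
    y_B = np.linalg.solve(W[:, basis], rhs)
    return bool(np.all(y_B >= -1e-9))

# Tiny demo: W = I with basis {0, 1}
W = np.eye(2)
print(basis_still_optimal(W, [0, 1], np.array([3.0, 1.0])),   # True
      basis_still_optimal(W, [0, 1], np.array([-1.0, 2.0])))  # False
```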

95 Other Ideas for 2SP Interior Point Methods Interior Point Methods
(The extensive form LP is as shown earlier.) Since the extensive form is highly structured, the matrices of the KKT system that must be solved (via Newton-type methods) in interior point methods can also be exploited.
He's The Expert! Definitely stick around for Jacek Gondzio's final plenary, Recent computational advances in solving very large stochastic programs, 5PM on Friday.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 50 / 89

96 Computing Computing Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 51 / 89

97 Computing SMPS SMPS Format How do we specify a stochastic programming instance to the solver? We could form the extensive form ourselves, but... for really big problems, forming the extensive form is out of the question. We need to specify just the random parts of the model. We can do this using SMPS format. There is also some recent research work on developing stochastic programming support in algebraic modeling languages (AMLs). Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 52 / 89

98 Computing SMPS Modeling Language Support
AMPL: Talk by Gautam Mitra: Formulation and solver support for optimisation under uncertainty, Thursday afternoon, Room 3.
SML: Colombo et al. [2009]. Adds keywords extending AMPL that encode problem structure.
GAMS: Uses the GAMS EMP (Extended Math Programming) framework.
LINDO: Has support for built-in sampling^4 procedures.
MPL: Has built-in decomposition solvers.
4 We'll talk about sampling shortly
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 53 / 89

100 Computing SMPS SMPS Components
Core file: Like an MPS file for the base instance
Time file: Specifies the time dependence structure
Stoch file: Specifies the randomness
SMPS References: Birge et al. [1987], Gassmann and Schweitzer [2001]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 54 / 89

102 Computing SMPS SMPS Core File
Like an MPS file specifying a base scenario. Must permute the rows and columns so that the time indexing is sequential. (Columns for stage j listed before columns for stage j + 1.)
min x_1 + x_2 + λ Σ_{s∈S} p_s (y_1s + y_2s)
s.t. ω_1s x_1 + x_2 + y_1s ≥ 7, s = 1, 2, 3
 ω_2s x_1 + x_2 + y_2s ≥ 4, s = 1, 2, 3
 x_1, x_2, y_1s, y_2s ≥ 0, s = 1, 2, 3
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 55 / 89

103 Computing SMPS little.cor NAME little ROWS G R0001 G R0002 N R0003 COLUMNS C0001 R C0001 R C0001 R C0002 R C0002 R C0002 R C0003 R C0003 R C0004 R C0004 R RHS B R B R ENDATA Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 56 / 89

104 Computing SMPS little.tim
Specify which row/column starts each time period. Must be sequential!
TIME little
PERIODS IMPLICIT
 C0001 R0001 T1
 C0003 R0001 T2
ENDATA
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 57 / 89

105 Computing SMPS Stoch File
BLOCKS: Specify a block of parameters that changes together
INDEP: Specify that all the parameters you are specifying are independent random variables
SCENARIO: Specify a base scenario; specify what things change and when...
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 58 / 89

106 Computing SMPS little.sto * STOCH little * BLOCKS DISCRETE BL BLOCK1 T C0001 R C0001 R BL BLOCK1 T C0001 R C0001 R BL BLOCK1 T C0001 R C0001 R ENDATA * STOCH little * INDEP DISCRETE C0001 R C0001 R C0001 R C0001 R ENDATA Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 59 / 89

108 Computing Utility Libraries Some Utility Libraries
If you need to read and write SMPS files and manipulate and query the instance as part of building an algorithm in a programming language, you can try the following libraries:
PySP: [Watson et al., 2012]. Based on Pyomo [Hart et al., 2011]. Also allows you to build models. Some algorithmic support, especially for progressive hedging type algorithms [Watson and Woodruff, 2011].
Coin-SMI: From the Coin-OR collection of open source code.
SUTIL: A little bit dated, but being refactored now. Has implemented methods for sampling from distributions specified in SMPS files.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 60 / 89

110 Computing Parallel Computation Parallelizing
In decomposition algorithms, the evaluation of φ(x), solving the different LPs, can be done independently. If you have K computers, send each of them |S|/K of the linear programs, and your evaluation of φ(x) will be completed K times faster.
Factors Affecting Efficiency
Synchronization: waiting for all parallel machines to complete.
Solving the master problem: worker machines wait while the master is solved.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 61 / 89
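A tiny sketch of the parallel evaluation idea (illustrative only): the scenario solves are farmed out to a process pool. The stand-in `solve_scenario` and the numbers are assumptions; in practice it would call a scenario LP solver as in the L-shaped sketch earlier.

```python
# Sketch: evaluate phi(x) by solving the scenario subproblems in parallel.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def solve_scenario(s, x):                    # stand-in for a scenario LP solve
    return float(s) + float(np.sum(x))       # pretend recourse value Q(x, w_s)

def expected_recourse(x, probs, max_workers=4):
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        vals = list(pool.map(solve_scenario, range(len(probs)), [x] * len(probs)))
    return sum(p * v for p, v in zip(probs, vals))

if __name__ == "__main__":                   # required for process pools on some platforms
    print(expected_recourse(np.array([1.0, 2.0]), [0.25, 0.25, 0.25, 0.25]))
```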

111 Computing Parallel Computation Worker Usage
[Figure: number of idle workers vs. time (sec.)]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 62 / 89

112 Computing Parallel Computation Stamp Out Synchronicity! We start a new iteration only after all LPs have been evaluated. In cloud/heterogeneous computing environments, different processors act at different speeds, so many may wait idle for the slowpoke. Even worse, in many cloud environments, machines can be reclaimed before completing their tasks. Distributed Computing Fact: Asynchronous methods are preferred for traditional parallel computing environments. They are nearly required for heterogeneous and dynamic environments! Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 63 / 89

113 Computing Parallel Computation ATR An Asynchronous Trust Region Method Keep a basket B of trial points for which we are evaluating the objective function. Make the decision on whether or not to accept a new iterate x^{k+1} only after the entire f(x^k) is computed. The convergence theory and cut deletion theory are similar to the synchronous algorithm. Populate the basket quickly by initially solving the master problem after only α% of the scenario LPs have been solved. Greatly reduces the synchronicity requirements. Might be doing some unnecessary work: the candidate points might be better if you waited for complete information from the preceding iterations. Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 64 / 89

114 Computing Parallel Computation The World's Largest LP Storm
A cargo flight scheduling problem (Mulvey and Ruszczyński). In 2000, we aimed to solve an instance with 10,000,000 scenarios.
x ∈ R^121, y(ω_s) ∈ R^1259
The deterministic equivalent is of size A ∈ R^(985,032,889 × 12,590,000,121)
Cuts/iteration: 1024. # Chunks: 1024. |B| = 4. Started from an N = ... solution, Δ_0 = 1.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 65 / 89

115 Computing Parallel Computation The Super Storm Computer
Number  Type           Location
184     Intel/Linux    Argonne
254     Intel/Linux    New Mexico
36      Intel/Linux    NCSA
265     Intel/Linux    Wisconsin
88      Intel/Solaris  Wisconsin
239     Sun/Solaris    Wisconsin
124     Intel/Linux    Georgia Tech
90      Intel/Solaris  Georgia Tech
13      Sun/Solaris    Georgia Tech
9       Intel/Linux    Columbia U.
10      Sun/Solaris    Columbia U.
33      Intel/Linux    Italy (INFN)
1345    Total
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 66 / 89

116 Computing Parallel Computation TA-DA!!!!!
Wall clock time: 31:53:37
CPU time: 1.03 years
Avg. # machines: 433
Max # machines: 556
Parallel efficiency: 67%
Master iterations: 199
CPU time solving the master problem: 1:54:37
Maximum number of rows in master problem:
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 67 / 89

117 Computing Parallel Computation Number of Workers
[Figure: number of workers vs. time (sec.)]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 68 / 89

118 Sampling and Monte Carlo Monte Carlo Methods Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 69 / 89

120 Sampling and Monte Carlo The Ugly Truth
Imagine the following (real) problem. A Telecom company wants to expand its network in a way in which to meet an unknown (random) demand. There are 86 unknown demands. Each demand is independent and may take on one of five values.
|S| = |Ω| = ∏_{k=1}^{86} 5 = 5^86 ≈ 10^60 ≈ the number of subatomic particles in the universe.
How do we solve a problem that has more variables and more constraints than the number of subatomic particles in the universe? The answer is we can't! We solve an approximating problem obtained through sampling.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 70 / 89

122 Sampling and Monte Carlo Monte Carlo Methods
(*) min_{x ∈ X} { f(x) := E_P[F(x, ω)] = ∫_Ω F(x, ω) dP(ω) }
Most of the theory presented holds for (*), a very general SP problem. Naturally it holds for our favorite SP problem:
X := { x : Ax = b, x ≥ 0 }
f(x) := c^T x + E[Q(x, ω)]
Q(x, ω) := min_{y ≥ 0} { q(ω)^T y : W y = h(ω) − T(ω) x }
The Dirty Secret: Evaluating f(x) is completely intractable!
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 71 / 89

124 Sampling and Monte Carlo Sampling Methods
Interior Sampling Methods: sample while solving. LShaped Method [Dantzig and Infanger, 1991]. Stochastic Decomposition [Higle and Sen, 1991]. Stochastic Approximation Methods: Stochastic Quasi-Gradient [Ermoliev, 1983], Mirror-Descent Stochastic Approximation [Nemirovski et al., 2009].
Exterior Sampling Methods: sample, then solve. Sample Average Approximation. Key: obtain (statistical) bounds on solution quality.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 72 / 89

125 Sampling and Monte Carlo Sample Average Approximation
Instead of solving (*), we solve an approximating problem. Let ξ^1, ..., ξ^N be N realizations of the random variable ξ:
min_{x ∈ S} { f_N(x) := N^(−1) Σ_{j=1}^N F(x, ξ^j) }
f_N(x) is just the sample average function. For any x, we consider f_N(x) a random variable, as it depends on the random sample. Since the ξ^j are drawn from P, f_N(x) is an unbiased estimator of f(x): E[f_N(x)] = f(x).
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 73 / 89
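A minimal SAA sketch (assumptions: the continuous distribution of ω and the penalty weight `lam` are illustrative, echoing the small recourse example used earlier in these slides): draw N realizations and solve the resulting sample-average extensive form.

```python
# Minimal sketch: build and solve one SAA replication of the small recourse example.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
lam = 5.0                                           # assumed penalty weight

def solve_saa(N):
    omega = np.column_stack([rng.uniform(1.0, 4.0, N), rng.uniform(1/3, 1.0, N)])
    nvar = 2 + 2 * N                                # x1, x2, then (y1j, y2j) per sample
    cost = np.concatenate([[1.0, 1.0], np.full(2 * N, lam / N)])
    A_ub, b_ub = [], []
    for j, (w1, w2) in enumerate(omega):
        r1 = np.zeros(nvar); r1[[0, 1, 2 + 2*j]]     = [-w1, -1.0, -1.0]
        r2 = np.zeros(nvar); r2[[0, 1, 2 + 2*j + 1]] = [-w2, -1.0, -1.0]
        A_ub += [r1, r2]; b_ub += [-7.0, -4.0]
    res = linprog(cost, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * nvar, method="highs")
    return res.fun, res.x[:2]

for N in (50, 200):
    v_N, x_N = solve_saa(N)
    print("N =", N, " v_N =", v_N, " x_N =", x_N)
```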

126 Sampling and Monte Carlo Lower Bounds
v* = min_{x ∈ S} { f(x) := E_P[F(x, ω)] = ∫_Ω F(x, ω) dP(ω) }
For some sample ξ^1, ..., ξ^N, let v_N = min_{x ∈ S} { f_N(x) := N^(−1) Σ_{j=1}^N F(x, ξ^j) }.
Thm: E[v_N] ≤ v*
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 74 / 89

127 Sampling and Monte Carlo Proof
min_{x ∈ X} N^(−1) Σ_{j=1}^N F(x, ξ^j) ≤ N^(−1) Σ_{j=1}^N F(x, ξ^j)  ∀ x ∈ X
⟹ E[ min_{x ∈ X} N^(−1) Σ_{j=1}^N F(x, ξ^j) ] ≤ E[ N^(−1) Σ_{j=1}^N F(x, ξ^j) ]  ∀ x ∈ X
⟹ E[v_N] ≤ min_{x ∈ X} E[ N^(−1) Σ_{j=1}^N F(x, ξ^j) ] = min_{x ∈ X} f(x) = v*
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 75 / 89

128 Sampling and Monte Carlo Next?
Now we need to somehow estimate E[v_N]. Idea: Generate M independent samples, ξ^{1,j}, ..., ξ^{N,j}, j = 1, ..., M, each of size N, and solve the corresponding SAA problems
min_{x ∈ X} { f_N^j(x) := N^(−1) Σ_{i=1}^N F(x, ξ^{i,j}) }   (1)
for each j = 1, ..., M. Let v_N^j be the optimal value of problem (1), and compute
L_{N,M} := M^(−1) Σ_{j=1}^M v_N^j
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 76 / 89

129 Sampling and Monte Carlo Lower Bounds
The estimate L_{N,M} is an unbiased estimate of E[v_N]. By our last theorem, it provides a statistical lower bound for the true optimal value v*. When the M batches ξ^{1,j}, ξ^{2,j}, ..., ξ^{N,j}, j = 1, ..., M, are i.i.d. (although the elements within each batch do not need to be i.i.d.), we have by the Central Limit Theorem that
√M [L_{N,M} − E(v_N)] ⇒ N(0, σ_L^2)
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 77 / 89

130 Sampling and Monte Carlo Confidence Intervals
I can estimate the variance of my estimate L_{N,M} as
s_L^2(M) := (M − 1)^(−1) Σ_{j=1}^M ( v_N^j − L_{N,M} )^2.
Defining z_α to satisfy P{N(0, 1) ≤ z_α} = 1 − α, and replacing σ_L by s_L(M), we can obtain an approximate (1 − α)-confidence interval for E[v_N] to be
[ L_{N,M} − z_α s_L(M)/√M, L_{N,M} + z_α s_L(M)/√M ]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 78 / 89
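A short sketch of this lower-bound interval in code. The batch of optimal values `v_batch` is made up for illustration; in practice it would come from M independent SAA solves such as `solve_saa` in the earlier sketch.

```python
# Sketch: the statistical lower bound L_{N,M} and its (1 - alpha) confidence interval.
import numpy as np
from scipy.stats import norm

def lower_bound_ci(v_batch, alpha=0.05):
    v = np.asarray(v_batch, dtype=float)
    M = len(v)
    L = v.mean()                              # L_{N,M}
    s_L = v.std(ddof=1)                       # sample std with the 1/(M-1) factor
    half = norm.ppf(1 - alpha) * s_L / np.sqrt(M)
    return L, (L - half, L + half)

# Hypothetical batch of M = 10 SAA optimal values v_N^j
print(lower_bound_ci([14.2, 13.8, 14.5, 14.0, 13.9, 14.3, 14.1, 13.7, 14.4, 14.0]))
```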

131 Sampling and Monte Carlo Upper Bounds
v* := min_{x ∈ X} { f(x) := E_P[F(x; ω)] = ∫_Ω F(x; ω) dP(ω) }
Quick, someone prove... f(^x) ≥ v* ∀ ^x ∈ X
How can we estimate f(^x)?
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 79 / 89

132 Sampling and Monte Carlo Estimating f(^x)
Generate T independent batches of samples of size N̄, denoted by ξ^{1,j}, ξ^{2,j}, ..., ξ^{N̄,j}, j = 1, 2, ..., T, where each batch has the unbiased property, namely
E[ f_N̄^j(x) := N̄^(−1) Σ_{i=1}^{N̄} F(x, ξ^{i,j}) ] = f(x), for all x ∈ X.
We can then use the average value defined by
U_{N̄,T}(^x) := T^(−1) Σ_{j=1}^T f_N̄^j(^x)
as an estimate of f(^x).
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 80 / 89

133 Sampling and Monte Carlo More Confidence Intervals
By applying the Central Limit Theorem again, we have that
√T [U_{N̄,T}(^x) − f(^x)] ⇒ N(0, σ_U^2(^x)) as T → ∞, where σ_U^2(^x) := Var[ f_N̄(^x) ].
We can estimate σ_U^2(^x) by the sample variance estimator s_U^2(^x; T) defined by
s_U^2(^x; T) := (T − 1)^(−1) Σ_{j=1}^T [ f_N̄^j(^x) − U_{N̄,T}(^x) ]^2.
By replacing σ_U^2(^x) with s_U^2(^x; T), we can proceed as above to obtain a (1 − α)-confidence interval for f(^x):
[ U_{N̄,T}(^x) − z_α s_U(^x; T)/√T, U_{N̄,T}(^x) + z_α s_U(^x; T)/√T ]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 81 / 89

134 Sampling and Monte Carlo Putting It All Together
v_N is the optimal solution value for the sample average function: v_N = min_{x ∈ S} { f_N(x) := N^(−1) Σ_{j=1}^N F(x, ω^j) }
Estimate E(v_N) as Ê(v_N) = L_{N,M} = M^(−1) Σ_{j=1}^M v_N^j: solve M stochastic LPs, each of sampled size N.
f_N̄(x) is the sample average function: draw ω^1, ..., ω^N̄ from P and set f_N̄(x) := N̄^(−1) Σ_{j=1}^{N̄} F(x, ω^j). For a stochastic LP w/recourse, solve N̄ LPs.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 82 / 89
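A sketch of the upper-bound side for a fixed candidate ^x, using a fresh large sample. The candidate x_hat, the placeholder lower bound L_NM, the distribution of ω, and the penalty weight are all illustrative assumptions (same small instance as earlier, whose recourse value has a closed form, so no LPs are needed here).

```python
# Sketch: estimate f(x_hat) by Monte Carlo and report an estimated optimality gap.
import numpy as np

rng = np.random.default_rng(1)
lam = 5.0
x_hat = np.array([1.5, 2.5])          # hypothetical candidate first-stage solution
L_NM = 0.0                            # placeholder for a lower-bound estimate L_{N,M}

Nbar = 200_000
w1 = rng.uniform(1.0, 4.0, Nbar)
w2 = rng.uniform(1/3, 1.0, Nbar)
# Closed-form recourse value for this instance: pay lam per unit of shortfall.
recourse = lam * (np.maximum(0.0, 7 - w1 * x_hat[0] - x_hat[1])
                  + np.maximum(0.0, 4 - w2 * x_hat[0] - x_hat[1]))
U = x_hat.sum() + recourse.mean()     # c'x_hat + sample-average recourse
se = recourse.std(ddof=1) / np.sqrt(Nbar)
print("upper-bound estimate:", U, "+/-", 1.96 * se, " estimated gap:", U - L_NM)
```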

135 Sampling and Monte Carlo Recapping Theorems
Thm. E(v_N) ≤ v* ≤ f(x) ∀ x
Thm. f_N̄(^x) − Ê(v_N) → f(^x) − v*, as N, M, N̄ → ∞
We are mostly interested in estimating the quality of a given solution ^x. This is f(^x) − v*. f_N̄(^x) is computed by solving N̄ independent LPs. Ê(v_N) is computed by solving M independent stochastic LPs. Independent ⟹ no synchronization ⟹ easy to do in parallel. Independent ⟹ can construct confidence intervals around the estimates.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 83 / 89

136 Sampling and Monte Carlo An Experiment
M times: solve a stochastic sampled approximation of size N (thus obtaining an estimate of E(v_N)). For each of the M solutions x^1, ..., x^M, estimate f(^x) by solving N̄ LPs.
Test Instances
Name    Application               |Ω|     (m_1, n_1)   (m_2, n_2)
LandS   HydroPower Planning       10^6    (2, 4)       (7, 12)
gbd     Fleet Routing             ?       (?, ?)       (?, ?)
storm   Cargo Flight Scheduling   ?       (185, 121)   (?, 1291)
20term  Vehicle Assignment        ?       (1, 5)       (71, 102)
ssn     Telecom. Network Design   5^86    (1, 89)      (175, 706)
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 84 / 89

137 Sampling and Monte Carlo Convergence of Optimal Solution Value
9 ≤ M ≤ 12, N̄ = 10^6
[Table: estimated optimal solution value under Monte Carlo sampling and under Latin Hypercube sampling, for sample sizes N = 50, 100, 500, 1000, ..., on instances 20term, gbd, LandS, storm, ssn]
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 85 / 89

138 Sampling and Monte Carlo 20term Convergence, Monte Carlo Sampling [Figure: lower and upper bound estimates vs. N] Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 86 / 89

139 Sampling and Monte Carlo ssn Convergence, Monte Carlo Sampling [Figure: lower and upper bound estimates vs. N] Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 87 / 89

140 Sampling and Monte Carlo storm Convergence, Monte Carlo Sampling [Figure: lower and upper bound estimates vs. N] Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 88 / 89

141 Sampling and Monte Carlo gbd Convergence, Monte Carlo Sampling [Figure: lower and upper bound estimates vs. N] Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 89 / 89

142 Bibliography Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 1 / 14

143-147 References
J. R. Birge and F. V. Louveaux. A multicut algorithm for two-stage stochastic linear programs. European Journal of Operational Research, 34, 1988.
J. R. Birge, M. A. H. Dempster, H. I. Gassmann, E. A. Gunn, and A. J. King. A standard input format for multiperiod stochastic linear programs. COAL Newsletter, 17:1-19, 1987.
J. R. Birge, C. J. Donohue, D. F. Holmes, and O. G. Svintsiski. A parallel implementation of the nested decomposition algorithm for multistage stochastic linear programs. Mathematical Programming, 75.
M. Colombo, A. Grothey, J. Hogg, K. Woodsend, and J. Gondzio. A structure-conveying modelling language for mathematical and stochastic programming. Mathematical Programming Computation, 1(4), 2009.
G. Dantzig and G. Infanger. Large-scale stochastic linear programs: Importance sampling and Benders decomposition. In C. Brezinski and U. Kulisch, editors, Computational and Applied Mathematics I (Dublin, 1991). North-Holland, Amsterdam.
Y. Ermoliev. Stochastic quasi-gradient methods and their application to systems optimization. Stochastics, 4:1-37, 1983.
H. I. Gassmann. MSLiP: A computer code for the multistage stochastic linear programming problem. Mathematical Programming, 47, 1990.
H. I. Gassmann and E. Schweitzer. A comprehensive input format for stochastic linear programs. Annals of Operations Research, 104:89-125, 2001.
A. Gaivoronski. Implementation of stochastic quasigradient methods. In Numerical Techniques for Stochastic Optimization. Springer-Verlag.
W. E. Hart, J.-P. Watson, and D. L. Woodruff. Pyomo: Modeling and solving mathematical programs in Python. Mathematical Programming Computation, 3(3), 2011.
J. L. Higle and S. Sen. Stochastic decomposition: An algorithm for two stage linear programs with recourse. Mathematics of Operations Research, 16(3), 1991.
U. Janjarassuk. Using Computational Grids for Effective Solution of Stochastic Programs. PhD thesis, Department of Industrial and Systems Engineering, Lehigh University, 2009.
G. Lan, A. Nemirovski, and A. Shapiro. Validation analysis of mirror descent stochastic approximation method. Mathematical Programming, pages 1-34.
C. Lemaréchal, A. Nemirovskii, and Y. Nesterov. New variants of bundle methods. Mathematical Programming, 69, 1995.
J. T. Linderoth and S. J. Wright. Implementing a decomposition algorithm for stochastic programming on a computational grid. Computational Optimization and Applications, 24, 2003. Special Issue on Stochastic Programming.
A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19, 2009.
H. Robbins and S. Monro. On a stochastic approximation method. Annals of Mathematical Statistics, 22, 1951.
R. T. Rockafellar and R. J.-B. Wets. Scenarios and policy aggregation in optimization under uncertainty. Mathematics of Operations Research, 16(1), 1991.
A. Ruszczyński. A regularized decomposition for minimizing a sum of polyhedral functions. Mathematical Programming, 35, 1986.
A. Ruszczyński. Parallel decomposition of multistage stochastic programming problems. Mathematical Programming, 58.
S. Trukhanov, L. Ntaimo, and A. Schaefer. Adaptive multicut aggregation for two-stage stochastic linear programs with recourse. European Journal of Operational Research, 206, 2010.
J.-P. Watson and D. L. Woodruff. Progressive hedging innovations for a class of stochastic mixed-integer resource allocation problems. Computational Management Science, 8(4), 2011.
J.-P. Watson, D. L. Woodruff, and W. E. Hart. PySP: Modeling and solving stochastic programs in Python. Mathematical Programming Computation, 4(2), 2012.
R. J. B. Wets. Large-scale linear programming techniques in stochastic programming. In Numerical Techniques for Stochastic Optimization. Springer-Verlag, 1988.
V. Zverovich. Modelling and Solution Methods for Stochastic Optimization. PhD thesis, Brunel University, 2011.
V. Zverovich, C. I. Fábián, E. F. D. Ellison, and G. Mitra. A computational study of a solver system for processing two-stage stochastic LPs with enhanced Benders decomposition. Mathematical Programming Computation, 4(3):211-238, 2012.
Jeff Linderoth (UW-Madison) Computational SP Lecture Notes

148 Bibliography Miscellaneous Topics Jeff Linderoth (UW-Madison) Computational SP Lecture Notes 2 / 14