Sample Average Approximation (SAA) for Stochastic Programs, with an Eye towards Computational SAA
Dave Morton
Industrial Engineering & Management Sciences, Northwestern University
Outline
- SAA
- Results for Monte Carlo estimators: no optimization
- What results should we want for SAA?
- Results for SAA: 1. Bias  2. Consistency  3. Central limit theorem (CLT)
- SAA Algorithm: a basic algorithm; a sequential algorithm
- Multi-Stage Problems
- What We Didn't Discuss
Stochastic Programming Models

z* = min_{x ∈ X} E f(x, ξ)

Such problems arise in statistics, simulation, and mathematical programming. Our focus: mathematical programming with X deterministic. We'll assume:
(A1) X is nonempty and compact
(A2) E f(·, ξ) is lower semicontinuous
(A3) E sup_{x ∈ X} f²(x, ξ) < ∞
Here ξ is a random vector whose distribution P_ξ does not depend on x. We can evaluate f(x, ξ(ω)) for a fixed x and realization ξ(ω). The choice of f determines the problem class.
Sample Average Approximation

True, or population, problem, with optimal solution x*:

z* = min_{x ∈ X} E f(x, ξ)    (SP)

SAA problem, with optimal solution x*_n:

z*_n = min_{x ∈ X} (1/n) Σ_{j=1}^n f(x, ξ^j) ≡ min_{x ∈ X} f̄_n(x)    (SP_n)

Here ξ^1, ξ^2, ..., ξ^n are iid as ξ, or sampled another way. View z*_n as an estimator of z* and x*_n as an estimator of x*. Want names? External sampling, sample-path optimization, stochastic counterpart, retrospective optimization, non-recursive method, and sample average approximation.
Let's start in a simpler setting, momentarily putting aside optimization...
Monte Carlo Sampling

Suppressing the (fixed) decision x:
Let z = E f(ξ), σ² = var f(ξ) < ∞, and let ξ^1, ξ^2, ..., ξ^n be iid as ξ.
Let z̄_n = (1/n) Σ_{i=1}^n f(ξ^i) be the sample-mean estimator of z.

FACT 1. E z̄_n = z: z̄_n is an unbiased estimator of z.
FACT 2. z̄_n → z, wp1 (strong LLN): z̄_n is a strongly consistent estimator of z.
FACT 3. √n (z̄_n − z) ⇒ N(0, σ²) (CLT): the rate of convergence is 1/√n, and the scaled difference is normally distributed.
FACTS 4, 5, ... law of the iterated logarithm, concentration inequalities, ...
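These facts are easy to check numerically. The sketch below uses only the standard library, with a hypothetical integrand f(ξ) = ξ² for ξ ~ N(0, 1), so that z = E f(ξ) = 1; the function names are illustrative, not from the slides.

```python
import random
import statistics

# Hypothetical integrand: f(xi) = xi^2 with xi ~ N(0,1), so z = E f(xi) = 1.
def f(xi):
    return xi * xi

def sample_mean(n, rng):
    """The sample-mean estimator (1/n) sum f(xi^i) of z = E f(xi) (FACTs 1-2)."""
    return statistics.fmean(f(rng.gauss(0.0, 1.0)) for _ in range(n))

rng = random.Random(0)
z = 1.0  # E xi^2 for xi ~ N(0,1)
est_small = sample_mean(100, rng)
est_large = sample_mean(100_000, rng)
# Consistency (FACT 2): the larger sample should land closer to z,
# at the 1/sqrt(n) rate suggested by FACT 3.
print(abs(est_small - z), abs(est_large - z))
```

With n = 100,000 the error is typically on the order of √(var f(ξ)/n) ≈ 0.004, in line with FACT 3.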
Do such results carry over to SAA?
SAA

Population problem, with optimal solution x*:

z* = min_{x ∈ X} E f(x, ξ)    (SP)

SAA problem, with optimal solution x*_n:

z*_n = min_{x ∈ X} (1/n) Σ_{j=1}^n f(x, ξ^j) ≡ min_{x ∈ X} f̄_n(x)    (SP_n)

View z*_n as an estimator of z* and x*_n as an estimator of x*.
What can we say about z*_n and x*_n as n → ∞?
What should we want to say about z*_n and x*_n as n → ∞?
1. x n x, wp1 and n(x n x ) N(0, Σ) SAA: Possible Goals
1. x n x, wp1 and n(x n x ) N(0, Σ) 2. z n z, wp1 and n(z n z ) N(0, σ 2 ) SAA: Possible Goals
1. x n x, wp1 and n(x n x ) N(0, Σ) 2. zn z, wp1 and n(zn z ) N(0, σ 2 ) 3. Ef(x n, ξ) z, wp1 SAA: Possible Goals
1. x n x, wp1 and n(x n x ) N(0, Σ) 2. z n z, wp1 and n(z n z ) N(0, σ 2 ) 3. Ef(x n, ξ) z, wp1 SAA: Possible Goals 4. lim n P (Ef(x n, ξ) z ε n ) 1 α where ε n 0
1. x n x, wp1 and n(x n x ) N(0, Σ) 2. z n z, wp1 and n(z n z ) N(0, σ 2 ) 3. Ef(x n, ξ) z, wp1 SAA: Possible Goals 4. lim n P (Ef(x n, ξ) z ε n ) 1 α where ε n 0 Modeling Issues: If (SP n ) is for maximum-likelihood estimation then goal 1 could be appropriate If (SP ) is to price a financial option then goal 2 could be appropriate When (SP ) is a decision-making model, 1 may be more than we need and 2 is of secondary interest. Goals 3 and 4 arguably suffice
1. x n x, wp1 and n(x n x ) N(0, Σ) 2. z n z, wp1 and n(z n z ) N(0, σ 2 ) 3. Ef(x n, ξ) z, wp1 SAA: Possible Goals 4. lim n P (Ef(x n, ξ) z ε n ) 1 α where ε n 0 Modeling Issues: If (SP n ) is for maximum-likelihood estimation then goal 1 could be appropriate If (SP ) is to price a financial option then goal 2 could be appropriate When (SP ) is a decision-making model, 1 may be more than we need and 2 is of secondary interest. Goals 3 and 4 arguably suffice Technical Issues: In general, we shouldn t expect {x n} n=1 to converge when (SP ) has multiple optimal solutions. In this case, we want: limit points of {x n} n=1 solve (SP ) If we achieve limit points result, X is compact & Ef(, ξ) is continuous, then we obtain goal 3 The limiting distributions may not be normal
1. x n x, wp1 and n(x n x ) N(0, Σ) 2. z n z, wp1 and n(z n z ) N(0, σ 2 ) 3. Ef(x n, ξ) z, wp1 SAA: Possible Goals 1 4. lim n P (Ef(x n, ξ) z ε n ) 1 α where ε n 0 Modeling Issues: If (SP n ) is for maximum-likelihood estimation then goal 1 could be appropriate If (SP ) is to price a financial option then goal 2 could be appropriate When (SP ) is a decision-making model, 1 may be more than we need and 2 is of secondary interest. Goals 3 and 4 arguably suffice Technical Issues: In general, we shouldn t expect {x n} n=1 to converge when (SP ) has multiple optimal solutions. In this case, we want: limit points of {x n} n=1 solve (SP ) If we achieve limit points result, X is compact & Ef(, ξ) is continuous, then we obtain goal 3 The limiting distributions may not be normal 1 Again, these goals aren t true in general; i.e., they may be impossible goals.
1. Bias 2. Consistency 3. CLT
SAA: Example

z* = min_{−1 ≤ x ≤ 1} E f(x, ξ), where f(x, ξ) = ξx and ξ ~ N(0, 1).

Every feasible solution x ∈ [−1, 1] is optimal, and z* = 0. The SAA problem is

z*_n = min_{−1 ≤ x ≤ 1} ( (1/n) Σ_{j=1}^n ξ^j ) x = −| ξ̄_n |,

so x*_n = ±1 and ξ̄_n ~ N(0, 1/n).

Observations:
1. E z*_n ≤ z*, ∀n (negative bias)
2. E z*_n ≤ E z*_{n+1}, ∀n (monotonically shrinking bias)
3. z*_n → z*, wp1 (strongly consistent)
4. √n (z*_n − z*) = −| N(0, 1) | (non-normal errors)
5. b(z*_n) ≡ E z*_n − z* = −a/√n (O(n^{−1/2}) bias)

So, optimization changes the nature of sample-mean estimators.
Note: What if x ∈ [−1, 1] is replaced by x ∈ R? SAA fails, spectacularly.
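The negative bias in this example is easy to reproduce in simulation, since the SAA problem solves in closed form: z*_n = −|ξ̄_n|, with E z*_n = −E|N(0, 1/n)| = −√(2/(πn)). A minimal sketch (variable names are illustrative):

```python
import math
import random
import statistics

def saa_value(n, rng):
    """Solve the SAA problem min_{-1<=x<=1} xibar_n * x: optimal value -|xibar_n|."""
    xibar = statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
    return -abs(xibar)

rng = random.Random(1)
n, reps = 100, 20_000
mean_zn = statistics.fmean(saa_value(n, rng) for _ in range(reps))
theory = -math.sqrt(2.0 / (math.pi * n))  # E z_n^* = -sqrt(2 / (pi * n))
print(mean_zn, theory)
```

The simulated mean of z*_n is negative and close to −√(2/(πn)), even though z* = 0, illustrating observations 1 and 5.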
1. Bias 2. Consistency 3. CLT
1. Bias

All you need to know:

min_{x ∈ X} [f(x) + g(x)] ≥ min_{x ∈ X} f(x) + min_{x ∈ X} g(x)
SAA: Bias

Theorem. Assume (A1), (A2), and E f̄_n(x) = E f(x, ξ), ∀x ∈ X. Then E z*_n ≤ z*. If, in addition, ξ^1, ξ^2, ..., ξ^n are iid, then E z*_n ≤ E z*_{n+1}.

Notes:
- The first result does not require iid realizations, just an unbiased estimator.
- The hypothesis can be relaxed to: E f̄_n(x) ≤ E f(x, ξ), ∀x ∈ X.
- The iid hypothesis can be relaxed to: ξ^1, ξ^2, ..., ξ^n are exchangeable random variables.
Proof of Bias Result

Since E (1/n) Σ_{j=1}^n f(x, ξ^j) = E f(x, ξ), we have

min_{x ∈ X} E (1/n) Σ_{j=1}^n f(x, ξ^j) = min_{x ∈ X} E f(x, ξ) = z*,

and so we obtain

E z*_n = E min_{x ∈ X} (1/n) Σ_{j=1}^n f(x, ξ^j) ≤ min_{x ∈ X} E (1/n) Σ_{j=1}^n f(x, ξ^j) = z*.

Aside: A simple special case, when n = 1:

E min_{x ∈ X} f(x, ξ) ≤ min_{x ∈ X} E f(x, ξ)

Interpretation: We'll do better if we wait and see ξ's realization before choosing x.
Next, we show the bias decreases monotonically, E z*_n ≤ E z*_{n+1}. Intuition...
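The n = 1 aside can be verified with exact arithmetic on a tiny hypothetical instance: X = {0, 1}, ξ ∈ {0, 1} with probability 1/2 each, and f(x, ξ) = (x − ξ)². Waiting to see ξ before choosing x gives value 0, while committing to x first gives 1/2:

```python
# Tiny wait-and-see illustration (hypothetical data): X = {0, 1},
# xi in {0, 1} with probability 1/2 each, and f(x, xi) = (x - xi)^2.
X = [0, 1]
scenarios = [(0, 0.5), (1, 0.5)]

def f(x, xi):
    return (x - xi) ** 2

# min-then-expect: wait, see xi, then choose x (wait-and-see value)
e_min = sum(p * min(f(x, xi) for x in X) for xi, p in scenarios)
# expect-then-min: choose x before xi is revealed (here-and-now value z*)
min_e = min(sum(p * f(x, xi) for xi, p in scenarios) for x in X)
print(e_min, min_e)  # prints: 0.0 0.5
```

So E min_x f(x, ξ) = 0 ≤ 1/2 = min_x E f(x, ξ), exactly the direction of the bias inequality.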
Proof of Bias Monotonicity Result

E z*_{n+1} = E min_{x ∈ X} (1/(n+1)) Σ_{i=1}^{n+1} f(x, ξ^i)
           = E min_{x ∈ X} (1/(n+1)) Σ_{i=1}^{n+1} (1/n) Σ_{j=1, j≠i}^{n+1} f(x, ξ^j)
           ≥ E (1/(n+1)) Σ_{i=1}^{n+1} min_{x ∈ X} (1/n) Σ_{j=1, j≠i}^{n+1} f(x, ξ^j)
           = (1/(n+1)) Σ_{i=1}^{n+1} E min_{x ∈ X} (1/n) Σ_{j=1, j≠i}^{n+1} f(x, ξ^j)
           = E z*_n
1. Bias ✓   2. Consistency: z*_n and x*_n   3. CLT
2. Consistency of z*_n

All you need to know: E f(x*, ξ) ≤ E f(x*_n, ξ) and f̄_n(x*_n) ≤ f̄_n(x*)
SAA: Consistency of z*_n

Theorem. Assume (A1), (A2), and the USLLN:

lim_{n→∞} sup_{x ∈ X} | f̄_n(x) − E f(x, ξ) | = 0, wp1.

Then z*_n → z*, wp1.

Notes:
- Does not assume ξ^1, ξ^2, ..., ξ^n are iid; instead, assumes the uniform strong law of large numbers (USLLN).
- Important to realize:

lim_{n→∞} sup_{x ∈ X} | f̄_n(x) − E f(x, ξ) | = 0, wp1
⟹ lim_{n→∞} | f̄_n(x) − E f(x, ξ) | = 0, wp1, ∀x ∈ X.

But the converse is false. Think of our example: f̄_n(x) = ξ̄_n x and X = R.
Proof of Consistency of z*_n

| z*_n − z* | = | f̄_n(x*_n) − E f(x*, ξ) |
             = max{ f̄_n(x*_n) − E f(x*, ξ),  E f(x*, ξ) − f̄_n(x*_n) }
             ≤ max{ f̄_n(x*) − E f(x*, ξ),  E f(x*_n, ξ) − f̄_n(x*_n) }
             ≤ max{ | f̄_n(x*) − E f(x*, ξ) |,  | f̄_n(x*_n) − E f(x*_n, ξ) | }
             ≤ sup_{x ∈ X} | f̄_n(x) − E f(x, ξ) |.

Taking n → ∞ completes the proof.
2. Consistency of x*_n

All you need to know: If g is continuous and lim_{k→∞} x_k = x̂, then lim_{k→∞} g(x_k) = g(x̂).
SAA: Consistency of x*_n

Theorem. Assume (A1), (A2), that E f(·, ξ) is continuous, and the USLLN:

lim_{n→∞} sup_{x ∈ X} | f̄_n(x) − E f(x, ξ) | = 0, wp1.

Then every limit point of {x*_n} solves (SP), wp1.

Notes:
- Assumes the USLLN rather than assuming ξ^1, ξ^2, ..., ξ^n are iid, and assumes continuity of E f(·, ξ).
- The result doesn't say: lim_{n→∞} x*_n = x*, wp1. Why not?
Proof of Consistency of x*_n

Let x̂ be a limit point of {x*_n}_{n=1}^∞ and let n ∈ N index a convergent subsequence. (Such a limit point exists, and x̂ ∈ X, because X is compact.) By the USLLN,

lim_{n→∞, n ∈ N} f̄_n(x*_n) = z*, wp1,   where f̄_n(x*_n) = z*_n,

and

| f̄_n(x*_n) − E f(x̂, ξ) | = | f̄_n(x*_n) − E f(x*_n, ξ) + E f(x*_n, ξ) − E f(x̂, ξ) |
                            ≤ | f̄_n(x*_n) − E f(x*_n, ξ) | + | E f(x*_n, ξ) − E f(x̂, ξ) |.

Taking n → ∞ for n ∈ N, the first term goes to zero by the USLLN, and the second goes to zero by continuity of E f(·, ξ). Thus, E f(x̂, ξ) = z*.
1. Bias ✓   2. Consistency: z*_n and x*_n ✓   3. CLT

When does the USLLN hold? And what if we have a stochastic MIP, for which continuity doesn't make sense?
Sufficient Conditions for the USLLN

Fact.² Assume X is compact and assume:
- f(·, ξ) is continuous, wp1, on X;
- ∃ g(ξ) satisfying sup_{x ∈ X} | f(x, ξ) | ≤ g(ξ), wp1, with E g(ξ) < ∞;
- ξ^1, ξ^2, ..., ξ^n are iid as ξ.
Then the USLLN holds.

² Facts are theorems that we won't prove.
Sufficient Conditions for the USLLN

Fact. Let X be compact and convex and assume:
- f(·, ξ) is convex and continuous, wp1, on X;
- the LLN holds pointwise: lim_{n→∞} | f̄_n(x) − E f(x, ξ) | = 0, wp1, ∀x ∈ X.
Then the USLLN holds.
SAA: Consistency of z*_n and x*_n under Finite X

Fact. Assume X is finite and lim_{n→∞} | f̄_n(x) − E f(x, ξ) | = 0, wp1, ∀x ∈ X. Then the USLLN holds, z*_n → z*, and every limit point of {x*_n} solves (SP), wp1.

Notes:
- E f(·, ξ) need not be continuous (continuity would be unnatural since the domain X is finite).
- Assumes the pointwise LLN rather than the USLLN: here, X finite implies

lim_{n→∞} | f̄_n(x) − E f(x, ξ) | = 0, wp1, ∀x ∈ X
⟹ lim_{n→∞} sup_{x ∈ X} | f̄_n(x) − E f(x, ξ) | = 0, wp1.
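A quick numerical illustration of the finite-X case, on a hypothetical instance with X = {−1, 0, 1}, f(x, ξ) = (x − ξ)², and ξ ~ N(0, 1), so that E f(x, ξ) = x² + 1, z* = 1, and x* = 0. The sup in the USLLN is just a max over three points:

```python
import random
import statistics

# Finite feasible set (hypothetical instance): f(x, xi) = (x - xi)^2 with
# xi ~ N(0,1), so E f(x, xi) = x^2 + 1, z* = 1, attained uniquely at x* = 0.
X = [-1.0, 0.0, 1.0]
true_obj = {x: x * x + 1.0 for x in X}

def saa(n, rng):
    xis = [rng.gauss(0.0, 1.0) for _ in range(n)]
    fbar = {x: statistics.fmean((x - xi) ** 2 for xi in xis) for x in X}
    xn = min(fbar, key=fbar.get)                              # SAA solution x_n^*
    sup_dev = max(abs(fbar[x] - true_obj[x]) for x in X)      # finite max = sup
    return fbar[xn], xn, sup_dev

rng = random.Random(2)
zn_small, xn_small, dev_small = saa(50, rng)
zn_big, xn_big, dev_big = saa(200_000, rng)
print(zn_small, dev_small, zn_big, dev_big)
```

At n = 200,000 the sup deviation is tiny, z*_n ≈ z* = 1, and x*_n recovers x* = 0, as the fact predicts.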
SAA: Consistency of z*_n and x*_n under LSC f(·, ξ)

Fact. Assume:
- ξ^1, ξ^2, ..., ξ^n are iid as ξ;
- f(·, ξ) is lower semicontinuous on X, ∀ξ;
- ∃ g(ξ) satisfying inf_{x ∈ X} f(x, ξ) ≥ g(ξ), wp1, where E |g(ξ)| < ∞.
Then z*_n → z*, wp1, and every limit point of {x*_n} solves (SP), wp1.

Notes:
- The proof relies on epi-convergence of f̄_n(x) to E f(x, ξ).
- Epi-convergence provides theory for approximation in optimization beyond SAA.
- For f̄_n(x) convex and continuous on compact, convex X: epi-convergence ⟺ USLLN. But epi-convergence provides a more general framework in the non-convex setting.
- Epi-convergence can be viewed as precisely the relaxation of uniform convergence that yields the desired convergence results.
[Image: first page of P. Kall, "Approximation to Optimization Problems: An Elementary Review," Mathematics of Operations Research 11(1), 1986, which reviews how the arguments in the epi-convergence approach relate to the classical theory of convergence of functions.]
1. Bias ✓   2. Consistency: z*_n and x*_n ✓   3. CLT
3. One-Sided CLT for z*_n

All you need to know: the CLT for iid random variables, and f̄_n(x*_n) ≤ f̄_n(x), ∀x ∈ X.
SAA: Towards a CLT for z*_n

We have conditions under which z*_n − z* shrinks to zero. Is √n the correct scaling factor, so that √n (z*_n − z*) converges to something nontrivial?

Notation:
- f̄_n(x) = (1/n) Σ_{j=1}^n f(x, ξ^j)
- σ²(x) = var[f(x, ξ)]
- s²_n(x) = (1/(n−1)) Σ_{j=1}^n [ f(x, ξ^j) − f̄_n(x) ]²
- X* is the set of optimal solutions to (SP)
- z_α satisfies P(N(0, 1) ≤ z_α) = 1 − α
SAA: Towards a CLT for z*_n

z*_n = f̄_n(x*_n) ≤ f̄_n(x), wp1, ∀x ∈ X,

and so

(z*_n − z*) / (σ(x)/√n) ≤ (f̄_n(x) − z*) / (σ(x)/√n), wp1.

Let x* ∈ X* ⊆ X. Then,

P( (z*_n − z*) / (σ(x*)/√n) ≤ z_α ) ≥ P( (f̄_n(x*) − z*) / (σ(x*)/√n) ≤ z_α ).

By the CLT for iid random variables,

lim_{n→∞} P( (f̄_n(x*) − z*) / (σ(x*)/√n) ≤ z_α ) = 1 − α.

Thus...
SAA: One-Sided CLT for z*_n

Theorem. Assume a pointwise CLT:

lim_{n→∞} P( (f̄_n(x) − E f(x, ξ)) / (σ(x)/√n) ≤ u ) = P(N(0, 1) ≤ u), ∀x ∈ X.

Let x* ∈ X*. Then,

lim inf_{n→∞} P( (z*_n − z*) / (σ(x*)/√n) ≤ z_α ) ≥ 1 − α.

Notes:
- (A3) and ξ^1, ξ^2, ..., ξ^n iid as ξ suffice for the pointwise CLT. There are other possibilities, too.
- For sufficiently large n, we infer that P{ z*_n − z_α σ(x*)/√n ≤ z* } ⪆ 1 − α.
- Of course, we don't know σ(x*), and so this is practically useless. But...
SAA: Towards (a Better) CLT for z*_n

z*_n = f̄_n(x*_n) ≤ f̄_n(x), wp1, ∀x ∈ X,

and so

(z*_n − z*) / (s_n(x*_n)/√n) ≤ (f̄_n(x) − z*) / (s_n(x*_n)/√n), wp1.

Let x* = x*_min ∈ arg min_{x ∈ X*} σ²(x). Then,

P( (z*_n − z*) / (s_n(x*_n)/√n) ≤ z_α ) ≥ P( (f̄_n(x*_min) − z*) / (s_n(x*_n)/√n) ≤ z_α )
  = P( (f̄_n(x*_min) − z*) / (σ(x*_min)/√n) ≤ z_α [ s_n(x*_n) / σ(x*_min) ] ).

If z_α > 0 and lim inf_{n→∞} s_n(x*_n) ≥ inf_{x ∈ X*} σ(x), then

lim inf_{n→∞} P( (z*_n − z*) / (s_n(x*_n)/√n) ≤ z_α ) ≥ 1 − α.
SAA: One-Sided CLT for z*_n

Theorem. Assume:
- (A1)-(A3);
- ξ^1, ξ^2, ..., ξ^n are iid as ξ;
- inf_{x ∈ X*} σ²(x) ≤ lim inf_{n→∞} s²_n(x*_n) ≤ lim sup_{n→∞} s²_n(x*_n) ≤ sup_{x ∈ X*} σ²(x), wp1.
Then, given 0 < α < 1,

lim inf_{n→∞} P( (z*_n − z*) / (s_n(x*_n)/√n) ≤ z_α ) ≥ 1 − α.

Notes:
- We could instead have assumed a pointwise CLT.
- For sufficiently large n, we infer that P{ z*_n − z_α s_n(x*_n)/√n ≤ z* } ⪆ 1 − α.
- How does this relate to the bias result, E z*_n ≤ z*?
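The one-sided bound P{z*_n − z_α s_n(x*_n)/√n ≤ z*} ⪆ 1 − α can be checked on the running example, where z* = 0. A sketch, with z_α hard-coded for α = 0.05; in this example z*_n ≤ 0 = z* always, so the lower confidence bound holds in every replication, one concrete instance of the conservatism the bias result suggests:

```python
import math
import random
import statistics

# One-sided lower-bound check on the running example: f(x, xi) = xi * x on
# X = [-1, 1] with xi ~ N(0,1), so z* = 0.  z_alpha = 1.645 for alpha = 0.05.
Z_ALPHA = 1.645

def lower_bound(n, rng):
    xis = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xbar = statistics.fmean(xis)
    xn = -1.0 if xbar > 0 else 1.0                 # SAA solution x_n^*
    zn = xn * xbar                                 # z_n^* = -|xbar|
    sn = statistics.stdev(xn * xi for xi in xis)   # s_n(x_n^*)
    return zn - Z_ALPHA * sn / math.sqrt(n)        # lower confidence bound on z*

rng = random.Random(3)
n, reps = 200, 2_000
coverage = statistics.fmean(lower_bound(n, rng) <= 0.0 for _ in range(reps))
print(coverage)
```

The empirical coverage is (essentially) 1, comfortably above the nominal 1 − α = 0.95.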
1. Bias ✓   2. Consistency: z*_n and x*_n ✓   3. One-sided CLT for z*_n ✓   Next: a two-sided CLT for z*_n?
Two-Sided CLT for z*_n

Fact. Assume:
- (A1)-(A3);
- ξ^1, ξ^2, ..., ξ^n are iid as ξ;
- | f(x₁, ξ) − f(x₂, ξ) | ≤ g(ξ) ‖x₁ − x₂‖, ∀x₁, x₂ ∈ X, where E g²(ξ) < ∞.
If (SP) has a unique optimal solution x*, then:

√n (z*_n − z*) ⇒ N(0, σ²(x*)).

Note: But, there are frequently multiple optimal solutions...
Two-Sided CLT for z*_n

Fact. Assume:
- (A1)-(A3);
- ξ^1, ξ^2, ..., ξ^n are iid as ξ;
- | f(x₁, ξ) − f(x₂, ξ) | ≤ g(ξ) ‖x₁ − x₂‖, ∀x₁, x₂ ∈ X, where E g²(ξ) < ∞.
Then,

√n (z*_n − z*) ⇒ inf_{x ∈ X*} N(0, σ²(x)).

Notes:
- What is inf_{x ∈ X*} N(0, σ²(x))? We have √n ( f̄_n(x) − E f(x, ξ) ) ⇒ N(0, σ²(x)), where {N(0, σ²(x))}_{x ∈ X*} is a family of correlated normal random variables.
- Recall our example: inf_{x ∈ X*} N(0, σ²(x)) = −| N(0, 1) |.
- How does inf_{x ∈ X*} N(0, σ²(x)) relate to the bias result, E z*_n ≤ z*?
1. Bias ✓   2. Consistency: z*_n and x*_n ✓   3. CLT for z*_n ✓   Next: a CLT for x*_n
SAA: CLT for x*_n

Fact. Assume:
- (A1)-(A3);
- f(·, ξ) is convex and twice continuously differentiable;
- X = {x : Ax ≤ b};
- (SP) has a unique optimal solution x*;
- (x₁ − x₂)ᵀ H (x₁ − x₂) > 0, ∀x₁, x₂ ∈ X, x₁ ≠ x₂, where H = E ∇²ₓ f(x*, ξ);
- ∇ₓ f(x, ξ) satisfies ‖∇ₓ f(x₁, ξ) − ∇ₓ f(x₂, ξ)‖ ≤ g(ξ) ‖x₁ − x₂‖, ∀x₁, x₂ ∈ X, where E g²(ξ) < ∞ for some real-valued function g.
Then √n (x*_n − x*) ⇒ u*, where u* solves the random QP:

min_u (1/2) uᵀ H u + cᵀ u
s.t.  A_i u ≤ 0,  i ∈ {i : A_i x* = b_i}
      uᵀ E ∇ₓ f(x*, ξ) = 0,

and c is multivariate normal with mean 0 and covariance matrix Σ, where Σ_ij = cov( ∂f(x*, ξ)/∂x_i, ∂f(x*, ξ)/∂x_j ).
Bias: z*_n ✓   Consistency: z*_n and x*_n ✓   CLT: z*_n and x*_n ✓
SAA: Revisiting Possible Goals

1. x*_n → x*, wp1, and √n (x*_n − x*) ⇒ u*, where u* solves a random QP
2. z*_n → z*, wp1, and √n (z*_n − z*) ⇒ inf_{x ∈ X*} N(0, σ²(x))
3. E f(x*_n, ξ) → z*, wp1
4. lim_{n→∞} P(E f(x*_n, ξ) − z* ≤ ε_n) ≥ 1 − α, where ε_n ↓ 0

We now have conditions under which variants of 1-3 hold. Let's next aim for a more modest version of 4: given x̂ ∈ X and α, find a (random) CI width ε with

P(E f(x̂, ξ) − z* ≤ ε) ≥ 1 − α.
An SAA Algorithm
Assessing Solution Quality: Towards an SAA Algorithm

z* = min_{x ∈ X} E f(x, ξ)

Goal: Given x̂ ∈ X and α, find a (random) CI width ε with

P(E f(x̂, ξ) − z* ≤ ε) ≥ 1 − α.

Using the bias result,

E [ (1/n) Σ_{j=1}^n f(x̂, ξ^j) − min_{x ∈ X} (1/n) Σ_{j=1}^n f(x, ξ^j) ] ≥ E f(x̂, ξ) − z*,

where the bracketed quantity is the gap estimator G_n(x̂).

Remarks:
- With common samples in both terms, anticipate var G_n(x̂) ≪ var[(1/n) Σ_{j=1}^n f(x̂, ξ^j)] + var z*_n.
- G_n(x̂) ≥ 0, but it is not asymptotically normal (what to do?).
- Not much of an algorithm if the solution, x̂, comes as input!
An SAA Algorithm

Input: CI level 1 − α, sample sizes n_x and n, replication count n_g.
Output: Solution x*_{n_x} and an approximate (1 − α)-level CI on E f(x*_{n_x}, ξ) − z*.

0. Sample iid observations ξ^1, ξ^2, ..., ξ^{n_x}, and solve (SP_{n_x}) to obtain x*_{n_x}.
1. For k = 1, 2, ..., n_g:
   1.1. Sample iid observations ξ^{k1}, ξ^{k2}, ..., ξ^{kn} from the distribution of ξ.
   1.2. Solve (SP_n) using ξ^{k1}, ξ^{k2}, ..., ξ^{kn} to obtain x^k_n.
   1.3. Calculate G^k_n(x*_{n_x}) = (1/n) Σ_{j=1}^n f(x*_{n_x}, ξ^{kj}) − (1/n) Σ_{j=1}^n f(x^k_n, ξ^{kj}).
2. Calculate the gap estimate and sample variance:
   Ḡ_n(n_g) = (1/n_g) Σ_{k=1}^{n_g} G^k_n(x*_{n_x})
   s²_G(n_g) = (1/(n_g − 1)) Σ_{k=1}^{n_g} ( G^k_n(x*_{n_x}) − Ḡ_n(n_g) )².
3. Let ε_g = t_{n_g−1, α} s_G(n_g)/√n_g, and output x*_{n_x} and the one-sided CI [0, Ḡ_n(n_g) + ε_g].
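A minimal sketch of the algorithm on the running toy instance f(x, ξ) = ξx, X = [−1, 1], where (SP_n) solves in closed form, so steps 0 and 1.2 are one-liners. The hard-coded one-sided t quantiles and all function names are assumptions of this sketch, not part of the slides:

```python
import math
import random
import statistics

# Sketch of the SAA algorithm on the toy instance f(x, xi) = xi * x,
# X = [-1, 1], xi ~ N(0,1): argmin over [-1,1] of xbar * x is -sign(xbar).
T_TABLE_095 = {14: 1.761, 29: 1.699}  # one-sided t quantiles t_{df, 0.05}

def solve_saa(xis):
    """Closed-form solution of (SP_n) for this instance."""
    return -1.0 if statistics.fmean(xis) > 0 else 1.0

def fbar(x, xis):
    return statistics.fmean(x * xi for xi in xis)

def saa_algorithm(t_quantile, n_x, n, n_g, rng):
    # Step 0: candidate solution from a sample of size n_x
    x_hat = solve_saa([rng.gauss(0.0, 1.0) for _ in range(n_x)])
    gaps = []
    for _ in range(n_g):                           # Step 1: n_g replications
        xis = [rng.gauss(0.0, 1.0) for _ in range(n)]   # step 1.1
        x_k = solve_saa(xis)                            # step 1.2
        gaps.append(fbar(x_hat, xis) - fbar(x_k, xis))  # step 1.3, common sample
    gbar = statistics.fmean(gaps)                  # Step 2
    eps = t_quantile * statistics.stdev(gaps) / math.sqrt(n_g)  # Step 3
    return x_hat, gbar, gbar + eps                 # CI on the gap: [0, gbar + eps]

rng = random.Random(4)
x_hat, gbar, ub = saa_algorithm(T_TABLE_095[14], n_x=1000, n=100, n_g=15, rng=rng)
print(x_hat, gbar, ub)
```

In this instance every x is optimal, so the true gap is 0 and the interval [0, Ḡ_n(n_g) + ε_g] covers it trivially; the point of the sketch is the mechanics of steps 0-3.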
An SAA Algorithm

Input: CI level 1 − α, sample sizes n_x and n, replication count n_g.
- Fix α = 0.05 and n_g = 15 (say).
- Choose n_x and n based on what is computationally reasonable.
- Choose n_x > n, perhaps n_x ≫ n.
Then:
- For fixed n and n_x, we can justify the algorithm with n_g → ∞.
- For fixed n_g, we can justify the algorithm with n → ∞.
- We can even use n_g = 1, albeit with a different variance estimator.
An SAA Algorithm

Output: Solution x*_{n_x} and an approximate (1 − α)-level CI on E f(x*_{n_x}, ξ) − z*.
- x*_{n_x} is the decision we will make.
- The confidence interval is on x*_{n_x}'s optimality gap, E f(x*_{n_x}, ξ) − z*.
- Here, E f(x*_{n_x}, ξ) = E_ξ [ f(x*_{n_x}, ξ) | x*_{n_x} ]. So, this is a posterior assessment, given the decision we will make.
An SAA Algorithm

0. Sample iid observations ξ^1, ξ^2, ..., ξ^{n_x}, and solve (SP_{n_x}) to obtain x*_{n_x}.
- ξ^1, ξ^2, ..., ξ^{n_x} need not be iid.
- Agnostic to the algorithm used to solve (SP_{n_x}).
An SAA Algorithm

1. For k = 1, 2, ..., n_g: sample ξ^{k1}, ..., ξ^{kn}; solve (SP_n) to obtain x^k_n; calculate G^k_n(x*_{n_x}).
- ξ^{k1}, ξ^{k2}, ..., ξ^{kn} need not be iid, but should satisfy E f̄_n(x) = E f(x, ξ) (could use Latin hypercube sampling or randomized quasi-Monte Carlo sampling).
- The batches (ξ^{k1}, ξ^{k2}, ..., ξ^{kn}), k = 1, 2, ..., n_g, should be iid.
- Agnostic to the algorithm used to solve (SP_n).
- Can solve a relaxation of (SP_n) if a lower bound is used in the second term of step 1.3 (recall the E f̄_n(x) ≤ E f(x, ξ) relaxation in the bias result).
- Can also use independent samples, and different sample sizes n_u and n_l, for the upper- and lower-bound estimators in step 1.3.
An SAA Algorithm

2.-3. Calculate the gap estimate Ḡ_n(n_g) and sample variance s²_G(n_g); let ε_g = t_{n_g−1, α} s_G(n_g)/√n_g; output x*_{n_x} and the one-sided CI [0, Ḡ_n(n_g) + ε_g].
- Standard calculation of the sample mean and sample variance.
- Standard calculation of a one-sided confidence interval for a nonnegative parameter; again, here the parameter is E f(x*_{n_x}, ξ) − z*.
- The SAA Algorithm tends to be conservative, i.e., exhibit over-coverage. Why?
SAA Algorithm Applied to a Few Two-Stage SLPs

Problem                          DB     WRPM    20TERM   SSN
n_x in (SP_{n_x}) for x*_{n_x}   50     50      50       2000
n (optimality gap)               25     25      25       1000
n_g                              30     30      30       30
95% CI width                     0.2%   0.08%   0.5%     8%
Var. red.                        4300   480     1300     17

Variance reduction is with respect to an algorithm that estimates the upper and lower bounds defining G with independent, rather than common, random-number streams.
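The variance reduction from common random numbers (CRN) can be illustrated on a small hypothetical instance, f(x, ξ) = (x − ξ)² with X = {−1, 0, 1} and candidate x̂ = 1: with common samples, the mean-of-ξ² term cancels from the gap estimator. The factor here is modest; the table above reports far larger factors for real two-stage SLPs:

```python
import random
import statistics

# Variance reduction from common random numbers, on a hypothetical instance:
# f(x, xi) = (x - xi)^2, X = {-1, 0, 1}, xi ~ N(0,1), candidate x_hat = 1.
X = [-1.0, 0.0, 1.0]
x_hat = 1.0

def fbar(x, xis):
    return statistics.fmean((x - xi) ** 2 for xi in xis)

def gap_crn(n, rng):
    """Upper- and lower-bound estimators share one sample (common streams)."""
    xis = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return fbar(x_hat, xis) - min(fbar(x, xis) for x in X)

def gap_indep(n, rng):
    """Upper- and lower-bound estimators use independent samples."""
    xis_u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xis_l = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return fbar(x_hat, xis_u) - min(fbar(x, xis_l) for x in X)

rng = random.Random(5)
n, reps = 50, 4_000
v_crn = statistics.variance([gap_crn(n, rng) for _ in range(reps)])
v_ind = statistics.variance([gap_indep(n, rng) for _ in range(reps)])
print(v_crn, v_ind, v_ind / v_crn)
```

The independent-stream gap estimator has noticeably larger variance than the common-stream one, which is exactly why step 1.3 of the algorithm evaluates both terms of G^k_n on the same batch.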
SAA Algorithm: Network Capacity Expansion Model (z* ≈ 8.3) (Higle & Sen)

[Figures: a seven-arc network (nodes A-E, arcs 1-7) and plots of the gap and sampling-error estimates (as % of z*) and of the upper and lower bounds on z*, as n = n_x grows from 10 to 10000; the gap and sampling error shrink toward zero, and the bounds tighten around z* ≈ 8.3.]

If E G_n(x*_n) = a n^{−p}, then log[E G_n(x*_n)] = log[a] − p log[n]. From these four points, p ≈ 0.74, with R² = 0.9998.
SAA Algorithm: Network Capacity Expansion Model (z* ≈ 8.3) (Higle & Sen)

Now enforce symmetry constraints: x₁ = x₆, x₂ = x₇, x₃ = x₅.
SAA Algorithm Network Capacity Expansion Model (z 8.3) (Higle & Sen) 6 6 5 5 4 4 gap (% of z*) 3 2 1 samp err gap gap (% of z*) 3 2 1 samp err gap 0 10 100 1000 10000 0 10 100 1000 10000 n=n x n=n x no extra constraints with symmetry constraints
SAA Algorithm Network Capacity Expansion Model (z 8.3) (Higle & Sen) 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 gap (% of z*) 0.5 0.4 0.3 0.2 samp err gap gap (% of z*) 0.5 0.4 0.3 0.2 samp err gap 0.1 0.1 0 100 1000 10000 0 100 1000 10000 n=n x n=n x no extra constraints with symmetry constraints
SAA Algorithm Network Capacity Expansion Model (z 8.3) (Higle & Sen) gap (% of z*) 0.2 0.18 0.16 0.14 0.12 0.1 0.08 0.06 0.04 0.02 0 1000 10000 samp err gap gap (% of z*) 0.2 0.18 0.16 0.14 0.12 0.1 0.08 0.06 0.04 0.02 0 1000 10000 samp err gap n=n x n=n x no extra constraints with symmetry constraints
SAA Algorithm Network Capacity Expansion Model (z 8.3) (Higle & Sen) 8.35 8.35 8.3 8.3 Upper and Lower Bounds 8.25 8.2 8.15 8.1 8.05 Upper and Lower Bounds 8.25 8.2 8.15 8.1 8.05 8 8 7.95 1 10 100 1000 10000 7.95 1 10 100 1000 10000 n n no extra constraints with symmetry constraints
SAA Algorithm Network Capacity Expansion Model (z 8.3) (Higle & Sen) 8.32 8.32 8.31 8.31 Upper and Lower Bounds 8.3 8.29 8.28 8.27 8.26 Upper and Lower Bounds 8.3 8.29 8.28 8.27 8.26 8.25 1 10 100 1000 10000 8.25 1 10 100 1000 10000 n n no extra constraints with symmetry constraints
SAA Algorithm Network Capacity Expansion Model (z 8.3) (Higle & Sen) 8.304 8.304 8.302 8.302 Upper and Lower Bounds 8.3 8.298 8.296 8.294 8.292 Upper and Lower Bounds 8.3 8.298 8.296 8.294 8.292 8.29 8.29 8.288 1 10 100 1000 10000 8.288 1 10 100 1000 10000 n n no extra constraints with symmetry constraints
SAA Algorithm Network Capacity Expansion Model (z* ≈ 8.3) (Higle & Sen)
[Figures: log-log plots of the gap versus n, with no extra constraints (left) and with symmetry constraints (right).]
Fitting E G_n(x*_n) = a n^(-p): no extra constraints gives p ≈ 0.74 (R² = 0.999); symmetry constraints give p ≈ 0.61 (R² = 0.986)
Rate worse, constant a better
If you are happy with your results from the SAA Algorithm, then stop now!
Why Are You Unhappy?
1. Computational effort to solve n_g = 15 instances of (SP_n) is prohibitive;
2. Bias of z*_n is large;
3. Sampling error, ε_g, is large; or,
4. Solution x*_n is far from optimal to (SP)
Remedy 1: Single-replication procedure: n_g = 1
Remedy 2: LHS, randomized QMC, adaptive jackknife estimator
Remedy 3: CRNs reduce variance. Other ideas help: LHS and randomized QMC
Remedy 4: A sequential SAA algorithm
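Remedies 2 and 3 both mention Latin hypercube sampling. In one dimension, LHS reduces to stratified sampling: one uniform draw from each of n equal-width cells. A minimal sketch, with an illustrative test function u², comparing the variance of the sample mean under iid sampling and LHS:

```python
import random

def latin_hypercube_1d(n, rng=random):
    """One-dimensional Latin hypercube sample of U(0,1):
    one uniform draw from each of n equal-width strata, then shuffled."""
    points = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(points)
    return points

def sample_mean(points, f=lambda u: u * u):
    return sum(f(u) for u in points) / len(points)

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Replicate each estimator many times and compare variances of the mean.
random.seed(1)
reps, n = 200, 50
iid_means = [sample_mean([random.random() for _ in range(n)]) for _ in range(reps)]
lhs_means = [sample_mean(latin_hypercube_1d(n)) for _ in range(reps)]
print(var(lhs_means) < var(iid_means))  # → True: stratification cuts variance
```

Inside SAA, such schemes replace the iid draws ξ^1, ..., ξ^n, reducing both the variance and (empirically) the bias of z*_n.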
A Sequential SAA Algorithm
A Sequential SAA Algorithm
Step 1: Generate a candidate solution
Step 2: Check stopping criterion. If satisfied, stop. Else, go to Step 1
Instead of a single candidate solution x̂ = x*_n ∈ X, we have a sequence {x̂_k} with each x̂_k ∈ X
Stopping criterion rooted in the above procedure (with n_g = 1):
G_k = G_{n_k}(x̂_k) = (1/n_k) Σ_{j=1..n_k} [ f(x̂_k, ξ^j) − f(x*_{n_k}, ξ^j) ]
and
s_k² = s²_{n_k}(x*_{n_k}) = (1/(n_k − 1)) Σ_{j=1..n_k} [ ( f(x̂_k, ξ^j) − f(x*_{n_k}, ξ^j) ) − ( f̄_{n_k}(x̂_k) − f̄_{n_k}(x*_{n_k}) ) ]²
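Both estimators use common random numbers: the same ξ^j evaluates the candidate x̂_k and the SAA-optimal x*_{n_k}, so their difference has much smaller variance than two independent estimates. A minimal sketch, with a hypothetical newsvendor-style cost function; the solutions and constants are illustrative, not from the slides:

```python
import math
import random

def gap_and_variance(f, x_cand, x_star, xis):
    """Point estimate G and sample variance s2 of the optimality gap
    E f(x_cand, xi) - E f(x_star, xi), using common random numbers:
    each xi evaluates both solutions, and we average the differences."""
    n = len(xis)
    diffs = [f(x_cand, xi) - f(x_star, xi) for xi in xis]
    G = sum(diffs) / n
    s2 = sum((d - G) ** 2 for d in diffs) / (n - 1)
    return G, s2

# Hypothetical newsvendor-style cost: order x at unit cost 2, sell
# min(x, demand xi) at price 3; E f is minimized at x = 10/3 here.
def f(x, xi):
    return 2.0 * x - 3.0 * min(x, xi)

random.seed(7)
xis = [random.uniform(0, 10) for _ in range(1000)]
G, s2 = gap_and_variance(f, x_cand=4.0, x_star=10.0 / 3.0, xis=xis)
upper = G + 1.645 * math.sqrt(s2 / len(xis))  # one-sided ~95% bound on the gap
```

Here `upper` plays the role of G_k + (confidence term): a statistical upper bound on how suboptimal the candidate is.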
A Sequential SAA Algorithm
Stopping criterion:
T = inf { k ≥ 1 : G_k ≤ h s_k }   (1)
Sample-size criterion:
n_k ≥ (1/h)² ( c_{q,α} + 2q ln² k )   (2)
Fact. Consider the sequential sampling procedure in which the sample size is increased according to (2), and the procedure stops at iteration T according to (1). Then, under some regularity assumptions (including uniform integrability of a moment generating function),
lim inf_{h↓0} P ( E f(x̂_T, ξ) − z* ≤ h s_T ) ≥ 1 − α
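A sketch of the loop implied by (1) and (2). The hooks `solve_saa` and `estimate_gap`, the toy cost function, and the constants h, q, c_{q,α} are all hypothetical stand-ins for the real subproblem solves:

```python
import math
import random

def sequential_saa(solve_saa, estimate_gap, h=0.1, q=0.5, c_q_alpha=2.0,
                   max_iters=50):
    """Sketch: at iteration k, use at least n_k per criterion (2),
    get a candidate and a gap estimate, and stop per criterion (1)."""
    for k in range(1, max_iters + 1):
        n_k = math.ceil((1.0 / h) ** 2 * (c_q_alpha + 2 * q * math.log(k) ** 2))
        x_hat = solve_saa(n_k)               # candidate from an SAA instance
        G_k, s_k = estimate_gap(x_hat, n_k)  # gap estimate, fresh samples
        if G_k <= h * s_k:                   # stopping criterion (1)
            return x_hat, k, n_k
    return x_hat, max_iters, n_k

# Hypothetical toy problem: newsvendor cost over a grid X.
def f(x, xi):
    return 2.0 * x - 3.0 * min(x, xi)

random.seed(11)
grid = [i / 10 for i in range(0, 101)]       # X = {0, 0.1, ..., 10}

def solve_saa(n):
    xis = [random.uniform(0, 10) for _ in range(n)]
    return min(grid, key=lambda x: sum(f(x, xi) for xi in xis))

def estimate_gap(x_hat, n):
    xis = [random.uniform(0, 10) for _ in range(n)]   # fresh assessment sample
    x_star_n = min(grid, key=lambda x: sum(f(x, xi) for xi in xis))
    diffs = [f(x_hat, xi) - f(x_star_n, xi) for xi in xis]
    G = sum(diffs) / n                        # G >= 0: x_star_n is SAA-optimal
    s2 = sum((d - G) ** 2 for d in diffs) / (n - 1)
    return G, math.sqrt(s2)

x_hat, k, n_k = sequential_saa(solve_saa, estimate_gap)
```

At stopping, h·s_T is the quality guarantee delivered by the Fact above.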
A Word (well, pictures) About Multi-Stage Stochastic Programming
What Does Solution Mean? In the multi-stage setting, assessing solution quality means assessing policy quality
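A policy's expected cost can be estimated by Monte Carlo: simulate the policy along independently sampled scenario paths and average the incurred cost; the expectation of that average equals the policy's cost, which is at least z* for a minimization problem. A minimal sketch; the three-stage setup and the fixed-order policy are purely illustrative:

```python
import random

def evaluate_policy(policy, sample_path, n_paths, rng=random):
    """Estimate a policy's expected cost by simulating it along
    independently sampled scenario paths and averaging the cost."""
    costs = []
    for _ in range(n_paths):
        path = sample_path(rng)
        cost, state = 0.0, None
        for t, xi in enumerate(path):
            state, stage_cost = policy(t, state, xi)
            cost += stage_cost
        costs.append(cost)
    mean = sum(costs) / n_paths
    var = sum((c - mean) ** 2 for c in costs) / (n_paths - 1)
    return mean, var

# Hypothetical 3-stage demo: iid U(0,10) demands, fixed order quantity.
random.seed(3)
def sample_path(rng):
    return [rng.uniform(0, 10) for _ in range(3)]
def policy(t, state, xi):
    x = 4.0  # fixed order each stage; state passes through unchanged
    return state, 2.0 * x - 3.0 * min(x, xi)
mean, var = evaluate_policy(policy, sample_path, n_paths=500)
```

With interstage dependence, `sample_path` must draw from the true path distribution, and `state` must carry whatever the policy needs from earlier stages.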
One Family of Algorithms & SAA
Assume interstage independence, or dependence with special structure.
Stochastic dual dynamic programming (SDDP):
[Figures: (a) forward pass; (b) backward pass]
Small Sampling of Things We Didn't Talk About
Non-iid sampling (well, we did a bit)
Bias and variance reduction techniques (some brief allusion)
Multi-stage SAA (in any detail)
Large-deviation results, concentration-inequality results, finite-sample guarantees
More generally, results with coefficients that are difficult to estimate
SAA for expected-value constraints, including chance constraints
SAA for other models, such as those with equilibrium constraints
Results that exploit more specific special structure of f, ξ, and/or X
Results that study the interaction between an optimization algorithm and SAA: stochastic approximation, stochastic gradient descent, stochastic mirror descent, stochastic cutting-plane methods, stochastic dual dynamic programming, ...
Statistical testing of optimality conditions
Results for risk measures not expressed as expected (dis)utility
Decision-dependent probability distributions
Distributionally robust, data-driven variants of SAA
Summary
SAA
Results for Monte Carlo estimators: no optimization
What results should we want for SAA?
Results for SAA: 1. Bias 2. Consistency 3. CLT
SAA Algorithm: a basic algorithm; a sequential algorithm
Multi-Stage Problems
What We Didn't Discuss
Small Sampling of References
Lagrange, Bernoulli, Euler, Laplace, Gauss, Edgeworth, Hotelling, Fisher... (leading to maximum likelihood)
H. Robbins and S. Monro, A stochastic approximation method, Annals of Mathematical Statistics 22, 400-407, 1951.
G. Dantzig and A. Madansky, On the solution of two-stage linear programs under uncertainty, Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1961.
Overviews and Tutorials
A. Shapiro, A. Ruszczyński and D. Dentcheva, Lectures on Stochastic Programming: Modeling and Theory (Chapter 5, Statistical Inference), 2014.
A. Shapiro, Monte Carlo sampling methods. In A. Ruszczyński and A. Shapiro (eds.), Stochastic Programming, Handbooks in Operations Research and Management Science, 2003.
S. Kim, R. Pasupathy and S. Henderson, A guide to sample-average approximation. In M. Fu (ed.), Handbook of Simulation Optimization, 2015.
T. Homem-de-Mello and G. Bayraksan, Monte Carlo sampling-based methods for stochastic optimization, Surveys in Operations Research and Management Science 19, 56-85, 2014.
G. Bayraksan and D.P. Morton, Assessing solution quality in stochastic programs via sampling. In M.R. Oskoorouchi (ed.), Tutorials in Operations Research, 102-122, INFORMS, 2009.
Further References
G. Bayraksan and D.P. Morton, Assessing solution quality in stochastic programs, Mathematical Programming 108, 495-514, 2006.
G. Bayraksan and D.P. Morton, A sequential sampling procedure for stochastic programming, Operations Research 59, 898-913, 2011.
J. Dupačová and R. Wets, Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems, The Annals of Statistics 16, 1517-1549, 1988.
M. Freimer, J. Linderoth and D. Thomas, The impact of sampling methods on bias and variance in stochastic linear programs, Computational Optimization and Applications 51, 51-75, 2012.
P. Glynn and G. Infanger, Simulation-based confidence bounds for two-stage stochastic programs, Mathematical Programming 138, 15-42, 2013.
J. Higle and S. Sen, Stochastic decomposition: an algorithm for two-stage linear programs with recourse, Mathematics of Operations Research 16, 650-669, 1991.
Small Sampling of References
J. Higle and S. Sen, Duality and statistical tests of optimality for two stage stochastic programs, Mathematical Programming 75, 257-275, 1996.
T. Homem-de-Mello, On rates of convergence for stochastic optimization problems under non-iid sampling, SIAM Journal on Optimization 19, 524-551, 2008.
G. Infanger, Monte Carlo (importance) sampling within a Benders decomposition algorithm for stochastic linear programs, Annals of Operations Research 39, 69-95, 1992.
A. King and R. Rockafellar, Asymptotic theory for solutions in statistical estimation and stochastic programming, Mathematics of Operations Research 18, 148-162, 1993.
A. King and R. Wets, Epi-consistency of convex stochastic programs, Stochastics 34, 83-92, 1991.
A. Kleywegt, A. Shapiro and T. Homem-de-Mello, The sample average approximation method for stochastic discrete optimization, SIAM Journal on Optimization 12, 479-502, 2001.
V. Kozmik and D.P. Morton, Evaluating policies in risk-averse multi-stage stochastic programming, Mathematical Programming 152, 275-300, 2015.
J. Luedtke and S. Ahmed, A sample approximation approach for optimization with probabilistic constraints, SIAM Journal on Optimization 19, 674-699, 2008.
J. Linderoth, A. Shapiro and S. Wright, The empirical behavior of sampling methods for stochastic programming, Annals of Operations Research 142, 215-241, 2006.
W. Mak, D. Morton and R. Wood, Monte Carlo bounding techniques for determining solution quality in stochastic programs, Operations Research Letters 24, 47-56, 1999.
B. Pagnoncelli, S. Ahmed and A. Shapiro, Sample average approximation method for chance constrained programming: theory and applications, Journal of Optimization Theory and Applications 142, 399-416, 2009.
R. Pasupathy, On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization, Operations Research 58, 889-901, 2010.
J. Royset and R. Szechtman, Optimal budget allocation for sample average approximation, Operations Research 61, 762-776, 2013.
Sorry for All the Acronyms (SAA)
CI: Confidence Interval
CLT: Central Limit Theorem
CRN: Common Random Numbers
DB: Donohue-Birge test instance
iid: independent and identically distributed
iidrvs: iid random variables
LHS: Latin Hypercube Sampling
LLN: Law of Large Numbers
LSC: Lower Semicontinuous
MIP: Mixed Integer Program
QMC: Quasi-Monte Carlo
QP: Quadratic Program
SAA: Sample Average Approximation
SDDP: Stochastic Dual Dynamic Programming
SLP: Stochastic Linear Program
SSN: SONET Switched Network test instance (or, Suvrajeet Sen's Network)
SONET: Synchronous Optical Networking
USLLN: Uniform Strong LLN
wp1: with probability one
WRPM: West-coast Regional Planning Model
20TERM: 20-TERMinal test instance