Stability of optimization problems with stochastic dominance constraints


D. Dentcheva and W. Römisch
Stevens Institute of Technology, Hoboken, and Humboldt-University Berlin
www.math.hu-berlin.de/~romisch

SIAM Optimization Conference, San Diego, May 19-22, 2014

Introduction and contents

The use of stochastic orderings as a modeling tool has become standard in theory and applications of stochastic optimization. Much of the theory is developed and many successful applications are known.

Research topics:
- multivariate concepts and analysis,
- scenario generation and approximation schemes,
- analysis of (Quasi-) Monte Carlo approximations,
- numerical methods and decomposition schemes,
- applications.

Contents of the talk:
(1) Introduction, stochastic dominance, probability metrics
(2) Quantitative stability results
(3) Sensitivity of optimal values
(4) Limit theorem for empirical approximations

Optimization models with stochastic dominance constraints

We consider the convex optimization model
$$\min \{ f(x) : x \in D,\ G(x,\xi) \succeq_{(k)} Y \},$$
where $k \in \mathbb{N}$, $k \ge 2$, $D$ is a nonempty closed convex subset of $\mathbb{R}^m$, $\Xi$ a closed convex subset of $\mathbb{R}^s$, $f : \mathbb{R}^m \to \mathbb{R}$ is convex, $\xi$ is a random vector with support $\Xi$ and $Y$ a real random variable on some probability space, both having finite moments of order $k-1$, and $G : \mathbb{R}^m \times \mathbb{R}^s \to \mathbb{R}$ is continuous, concave with respect to the first argument and satisfies the linear growth condition
$$|G(x,\xi)| \le C(B) \max\{1, \|\xi\|\} \qquad (x \in B,\ \xi \in \Xi)$$
for every bounded subset $B \subset \mathbb{R}^m$ and some constant $C(B)$ (depending on $B$).

The random variable $Y$ plays the role of a benchmark outcome.

D. Dentcheva, A. Ruszczyński: Optimization with stochastic dominance constraints, SIAM J. Optim. 14 (2003), 548-566.

Stochastic dominance relation $\succeq_{(k)}$

$$X \succeq_{(k)} Y \quad :\Longleftrightarrow \quad F^{(k)}_X(\eta) \le F^{(k)}_Y(\eta) \quad (\forall \eta \in \mathbb{R}),$$
where $X$ and $Y$ are real random variables belonging to $L_{k-1}(\Omega,\mathcal{F},\mathbb{P})$ with norm $\|\cdot\|_{k-1}$ for some probability space $(\Omega,\mathcal{F},\mathbb{P})$. By $L_0$ we denote consistently the space of all scalar random variables.

Let $P_X$ denote the probability distribution of $X$ and $F^{(1)}_X = F_X$ its distribution function, i.e.,
$$F^{(1)}_X(\eta) = \mathbb{P}(\{X \le \eta\}) = \int_{-\infty}^{\eta} P_X(d\xi) = \int_{-\infty}^{\eta} dF_X(\xi) \qquad (\forall \eta \in \mathbb{R}),$$
and
$$F^{(k+1)}_X(\eta) = \int_{-\infty}^{\eta} F^{(k)}_X(\xi)\, d\xi = \int_{-\infty}^{\eta} \frac{(\eta-\xi)^k}{k!}\, P_X(d\xi) = \int_{-\infty}^{\eta} \frac{(\eta-\xi)^k}{k!}\, dF_X(\xi) = \frac{1}{k!}\, \big\| \max\{0,\eta - X\} \big\|_k^k \qquad (\forall \eta \in \mathbb{R}),$$
where $\|X\|_k = \big( \mathbb{E}(|X|^k) \big)^{1/k}$ $(k \ge 1)$.

A. Müller and D. Stoyan: Comparison Methods for Stochastic Models and Risks, Wiley, Chichester, 2002.
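Since $F^{(k)}_X(\eta) = \mathbb{E}\big[(\eta - X)_+^{k-1}\big]/(k-1)!$ for $k \ge 2$, these functions are easy to estimate from samples. The following Python sketch (illustrative only: the distributions, sample sizes and grid are arbitrary choices, not data from the talk) estimates $F^{(k)}_X$ empirically and checks the relation $X \succeq_{(2)} Y$ on a grid of benchmark levels.

```python
import numpy as np
from math import factorial

def F_k(samples, eta, k):
    """Empirical estimate of F_X^{(k)}(eta) = E[(eta - X)_+^{k-1}] / (k-1)!;
    for k = 1 this is the ordinary empirical distribution function."""
    s = np.asarray(samples, dtype=float)
    if k == 1:
        return np.mean(s <= eta)
    return np.mean(np.maximum(eta - s, 0.0) ** (k - 1)) / factorial(k - 1)

# check X >=_(2) Y on a grid of benchmark levels (toy data)
rng = np.random.default_rng(0)
X = rng.normal(1.0, 1.0, size=10_000)   # candidate outcome
Y = rng.normal(0.5, 1.5, size=10_000)   # benchmark outcome
grid = np.linspace(-4.0, 5.0, 200)
print(all(F_k(X, eta, 2) <= F_k(Y, eta, 2) for eta in grid))
```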

The original problem is equivalent to its split variable formulation
$$\min \big\{ f(x) : x \in D,\ G(x,\xi) \ge X \ (\mathbb{P}\text{-a.s.}),\ F^{(k)}_X(\eta) \le F^{(k)}_Y(\eta),\ \eta \in \mathbb{R} \big\}$$
obtained by introducing a new real random variable $X$ and the constraint $G(x,\xi) \ge X$ $\mathbb{P}$-almost surely.

This formulation motivates the need for two different metrics to handle the two constraints of different nature: the almost sure constraint $G(x,\xi) \ge X$ ($\mathbb{P}$-a.s.) and the functional constraint $F^{(k)}_X(\cdot) \le F^{(k)}_Y(\cdot)$, respectively.

D. Dentcheva, A. Ruszczyński: Optimality and duality theory for stochastic optimization problems with nonlinear dominance constraints, Math. Progr. 99 (2004), 329-350.

Properties:
(i) Equivalent characterization of $\succeq_{(2)}$: $X \succeq_{(2)} Y$ holds if and only if $\mathbb{E}[u(X)] \ge \mathbb{E}[u(Y)]$ for each nondecreasing concave utility $u : \mathbb{R} \to \mathbb{R}$ such that the expectations are finite.
(ii) The function $F^{(k)}_X : \mathbb{R} \to \mathbb{R}$ is nondecreasing for $k \ge 1$ and convex for $k \ge 2$.
(iii) For every $k \in \mathbb{N}$ the SD relation $\succeq_{(k)}$ introduces a partial ordering in $L_{k-1}(\Omega,\mathcal{F},\mathbb{P})$ which is not generated by a convex cone if $Y$ is not deterministic.

Extensions: By imposing appropriate assumptions all results remain valid in the following two extended situations:
(a) a finite number of $k$th order stochastic dominance constraints,
(b) the objective $f$ is replaced by an expectation function of the form $\mathbb{E}[g(\cdot,\xi)]$, where $g$ is a real-valued function defined on $\mathbb{R}^m \times \mathbb{R}^s$.

The case of discrete distributions: Let $\xi_j$, $X_j$ and $Y_j$ be the scenarios of $\xi$, $X$ and $Y$ with probabilities $p_j$, $j = 1,\dots,n$. Then the second order dominance constraints (i.e. $k = 2$) in the split variable formulation can be expressed as
$$\sum_{j=1}^{n} p_j\, [\eta - X_j]_+ \le \sum_{j=1}^{n} p_j\, [\eta - Y_j]_+ \qquad (\forall \eta \in I).$$
The latter condition can be shown to be equivalent to
$$\sum_{j=1}^{n} p_j\, [Y_k - X_j]_+ \le \sum_{j=1}^{n} p_j\, [Y_k - Y_j]_+ \qquad (k = 1,\dots,n)$$
if $Y_k \in I$, $k = 1,\dots,n$. Here, $[\cdot]_+ = \max\{0,\cdot\}$.

Hence, the second order dominance constraints may be reformulated as linear constraints for the $X_j$, $j = 1,\dots,n$, which enter the model through $G(x,\xi_j) \ge X_j$ $(j = 1,\dots,n)$.

D. Dentcheva, A. Ruszczyński: Optimality and duality theory for stochastic optimization problems with nonlinear dominance constraints, Math. Progr. 99 (2004), 329-350.
J. Luedtke: New formulations for optimization under stochastic dominance constraints, SIAM J. Optim. 19 (2008), 1433-1450.
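As an illustration of this discrete reformulation, the sketch below sets up a small SSD-constrained portfolio model ($k = 2$, $G(x,\xi) = \xi^\top x$, so $X_j = \xi_j^\top x$) with cvxpy. All data are random placeholders, and the dominance constraint is kept in the equivalent convex form $\sum_j p_j [Y_k - \xi_j^\top x]_+ \le \sum_j p_j [Y_k - Y_j]_+$ rather than fully linearized with slack variables; this is a minimal sketch, not the formulation used in the talk or in the cited references.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n, m = 50, 5                              # scenarios, assets (toy sizes)
xi = rng.normal(0.05, 0.10, size=(n, m))  # return scenarios xi_j
p = np.full(n, 1.0 / n)                   # scenario probabilities p_j
Y = rng.normal(0.03, 0.08, size=n)        # benchmark scenarios Y_j

x = cp.Variable(m, nonneg=True)           # portfolio weights
constraints = [cp.sum(x) == 1]
for Yk in Y:                              # one constraint per benchmark scenario
    lhs = p @ cp.pos(Yk - xi @ x)         # sum_j p_j [Y_k - G(x, xi_j)]_+
    rhs = float(p @ np.maximum(Yk - Y, 0.0))
    constraints.append(lhs <= rhs)

problem = cp.Problem(cp.Maximize(p @ (xi @ x)), constraints)
problem.solve()
print(problem.value, x.value)
```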

Metrics associated to $\succeq_{(k)}$

Rachev metrics on $L_{k-1}$:
$$D_{k,p}(X,Y) := \begin{cases} \Big( \displaystyle\int_{\mathbb{R}} \big| F^{(k)}_X(\eta) - F^{(k)}_Y(\eta) \big|^p\, d\eta \Big)^{\frac{1}{p}}, & 1 \le p < \infty, \\[6pt] \displaystyle\sup_{\eta \in \mathbb{R}} \big| F^{(k)}_X(\eta) - F^{(k)}_Y(\eta) \big|, & p = \infty. \end{cases}$$

Proposition: For any $X, Y \in L_{k-1}$ it holds that
$$D_{k,p}(X,Y) = \zeta_{k,p}(X,Y) := \sup_{f \in \mathcal{D}_{k,p}} \Big| \int_{\mathbb{R}} f(x)\, P_X(dx) - \int_{\mathbb{R}} f(x)\, P_Y(dx) \Big|$$
if $\mathbb{E}(X^i) = \mathbb{E}(Y^i)$, $i = 1,\dots,k-1$. Here, $\mathcal{D}_{k,p}$ denotes the set of continuous functions $f : \mathbb{R} \to \mathbb{R}$ that have measurable $k$th order derivatives $f^{(k)}$ on $\mathbb{R}$ such that $\big( \int_{\mathbb{R}} |f^{(k)}(x)|^{\frac{p}{p-1}}\, dx \big)^{\frac{p-1}{p}} \le 1$ $(p > 1)$ or $\operatorname{ess\,sup}_{x \in \mathbb{R}} |f^{(k)}(x)| \le 1$ $(p = 1)$.
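A rough numerical sketch of $D_{k,p}$ from samples (the integral or supremum over $\mathbb{R}$ is truncated to the joint sample range, and `F_k` is the empirical helper from the earlier sketch; grid size and truncation are arbitrary choices):

```python
import numpy as np
from math import factorial

def F_k(samples, eta, k):
    """Empirical F_X^{(k)}(eta), as in the earlier sketch."""
    s = np.asarray(samples, dtype=float)
    if k == 1:
        return np.mean(s <= eta)
    return np.mean(np.maximum(eta - s, 0.0) ** (k - 1)) / factorial(k - 1)

def rachev_distance(X, Y, k=2, p=np.inf, n_grid=500):
    """Grid approximation of the Rachev metric D_{k,p}(X, Y) from samples."""
    lo = min(np.min(X), np.min(Y))
    hi = max(np.max(X), np.max(Y))
    grid = np.linspace(lo, hi, n_grid)
    diff = np.abs([F_k(X, eta, k) - F_k(Y, eta, k) for eta in grid])
    if np.isinf(p):
        return float(np.max(diff))
    return float((np.sum(diff ** p) * (grid[1] - grid[0])) ** (1.0 / p))
```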

Note that the condition $\mathbb{E}(X^i) = \mathbb{E}(Y^i)$, $i = 1,\dots,k-1$, is implied by the finiteness of $\zeta_{k,p}(X,Y)$, since $\mathcal{D}_{k,p}$ contains all polynomials of degree $k-1$. Conversely, if $X$ and $Y$ belong to $L_{k-1}$ and $\mathbb{E}(X^i) = \mathbb{E}(Y^i)$, $i = 1,\dots,k-1$, holds, then the distance $D_{k,p}(X,Y)$ is finite.

Proposition: There exists $c_k > 0$ (only depending on $k$) such that
$$\zeta_{k,\infty}(X,Y) \le \zeta_{1,\infty}(X,Y) \le c_k\, \zeta_{k,\infty}(X,Y)^{\frac{1}{k}} \qquad (\forall X, Y \in L_{k-1}).$$
Here $\zeta_{1,\infty}$ is the Kolmogorov metric and $\zeta_{1,1}$ the first order Fortet-Mourier or Wasserstein metric.

S. T. Rachev: Probability Metrics and the Stability of Stochastic Models, Wiley, 1991.

Structure and stability

We consider the $k$th order SD constrained optimization model
$$\min \big\{ f(x) : x \in D,\ F^{(k)}_{G(x,\xi)}(\eta) \le F^{(k)}_Y(\eta),\ \eta \in \mathbb{R} \big\}$$
as a semi-infinite program.

Relaxation: Replace $\mathbb{R}$ by some compact interval $I = [a,b]$.

Proposition: Under the general convexity assumptions the feasible set
$$\mathcal{X}(\xi,Y) = \big\{ x \in D : F^{(k)}_{G(x,\xi)}(\eta) \le F^{(k)}_Y(\eta),\ \eta \in I \big\}$$
is closed and convex in $\mathbb{R}^m$.

Uniform dominance condition of $k$th order (kudc) at $(\xi,Y)$: There exists $\bar{x} \in D$ such that
$$\min_{\eta \in I} \Big( F^{(k)}_Y(\eta) - F^{(k)}_{G(\bar{x},\xi)}(\eta) \Big) > 0.$$

Metrics on $L^s_{k-1} \times L_{k-1}$:
$$d_k\big((\xi,Y),(\tilde{\xi},\tilde{Y})\big) = \ell_{k-1}(\xi,\tilde{\xi}) + D_{k,\infty}(Y,\tilde{Y}),$$
where $k \in \mathbb{N}$, $k \ge 2$, is the degree of the SD constraint, $D_{k,\infty}$ is the $k$th order Rachev metric, and $\ell_{k-1}$ is the $L_{k-1}$-minimal or $(k-1)$th order Wasserstein distance defined by
$$\ell_{k-1}(\xi,\tilde{\xi}) := \inf \Big\{ \Big( \int_{\Xi \times \Xi} \|x - \tilde{x}\|^{k-1}\, \eta(dx,d\tilde{x}) \Big)^{\frac{1}{k-1}} \Big\},$$
where the infimum is taken over all probability measures $\eta$ on $\Xi \times \Xi$ with marginals $P_\xi$ and $P_{\tilde{\xi}}$, respectively.
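For scalar $\xi$ (i.e. $s = 1$) and two equally weighted samples of the same size, $\ell_{k-1}$ is attained by the sorted-sample (quantile) coupling, which gives a simple sketch of $d_k$. This reuses `rachev_distance` from the previous sketch and is illustrative only:

```python
import numpy as np

def wasserstein_order(a, b, r):
    """L_r-minimal (Wasserstein) distance of order r >= 1 between two equally
    weighted scalar samples of equal size, via the sorted-sample coupling."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    return float(np.mean(np.abs(a - b) ** r) ** (1.0 / r))

def d_k(xi, xi_t, Y, Y_t, k=2):
    """Sketch of d_k((xi, Y), (xi_t, Y_t)) = l_{k-1}(xi, xi_t) + D_{k,inf}(Y, Y_t)
    for s = 1; rachev_distance is the helper defined in the previous sketch."""
    return wasserstein_order(xi, xi_t, k - 1) + rachev_distance(Y, Y_t, k=k, p=np.inf)
```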

Proposition: Let $D$ be compact and assume that the function $G$ satisfies
$$|G(x,u) - G(x,\tilde{u})| \le L_G\, \|u - \tilde{u}\|$$
for all $x \in D$, $u, \tilde{u} \in \Xi$ and some constant $L_G > 0$. Assume that the $k$th order uniform dominance condition is satisfied at $(\xi,Y)$. Then there exist constants $L(k) > 0$ and $\delta > 0$ such that
$$d_H\big(\mathcal{X}(\xi,Y),\, \mathcal{X}(\tilde{\xi},\tilde{Y})\big) \le L(k)\, d_k\big((\xi,Y),(\tilde{\xi},\tilde{Y})\big)$$
whenever the pair $(\tilde{\xi},\tilde{Y})$ is chosen such that $d_k((\xi,Y),(\tilde{\xi},\tilde{Y})) < \delta$. ($d_H$ denotes the Pompeiu-Hausdorff distance on compact subsets of $\mathbb{R}^m$.)

Note that $L(k)$ gets smaller with increasing $k \in \mathbb{N}$ if $\|\xi\|_{k-1}$ grows at most exponentially with $k$. Hence, higher order stochastic dominance constraints may have improved stability properties.

Let $v(\xi,Y)$ denote the optimal value and $S(\xi,Y)$ the solution set of
$$\min \{ f(x) : x \in D,\ x \in \mathcal{X}(\xi,Y) \}.$$
We consider the growth function
$$\psi_{(\xi,Y)}(\tau) := \inf \{ f(x) - v(\xi,Y) : d(x, S(\xi,Y)) \ge \tau,\ x \in \mathcal{X}(\xi,Y) \}$$
and
$$\Psi_{(\xi,Y)}(\theta) := \theta + \psi^{-1}_{(\xi,Y)}(2\theta) \qquad (\theta \in \mathbb{R}_+),$$
where we set $\psi^{-1}_{(\xi,Y)}(t) := \sup\{\tau \in \mathbb{R}_+ : \psi_{(\xi,Y)}(\tau) \le t\}$.
Note that $\Psi_{(\xi,Y)}$ is increasing, lower semicontinuous and vanishes at $\theta = 0$.
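A hedged example of this calculus: if the objective grows quadratically away from the solution set, i.e. $\psi_{(\xi,Y)}(\tau) \ge c\,\tau^2$ for some (assumed) constant $c > 0$, then $\psi^{-1}_{(\xi,Y)}(t) \le \sqrt{t/c}$ and hence $\Psi_{(\xi,Y)}(\theta) \le \theta + \sqrt{2\theta/c}$, so the solution-set estimate in the stability theorem below yields a Hölder rate of order $1/2$ in the distance $d_k$.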

Main stability result

Theorem: Let $D$ be compact and assume that the function $G$ satisfies
$$|G(x,u) - G(x,\tilde{u})| \le L_G\, \|u - \tilde{u}\|$$
for all $x \in D$, $u, \tilde{u} \in \Xi$ and some constant $L_G > 0$. Assume that the $k$th order uniform dominance condition is satisfied at $(\xi,Y)$. Then there exist positive constants $L(k)$ and $\delta$ such that
$$|v(\xi,Y) - v(\tilde{\xi},\tilde{Y})| \le L(k)\, d_k\big((\xi,Y),(\tilde{\xi},\tilde{Y})\big),$$
$$\sup_{x \in S(\tilde{\xi},\tilde{Y})} d\big(x, S(\xi,Y)\big) \le \Psi_{(\xi,Y)}\Big( L(k)\, d_k\big((\xi,Y),(\tilde{\xi},\tilde{Y})\big) \Big)$$
whenever $d_k((\xi,Y),(\tilde{\xi},\tilde{Y})) < \delta$.
(Klatte 94, Rockafellar-Wets 98)

Dual multipliers and utilities

Let $\mathcal{Y} = C(I)$ and $\mathcal{Y}^*$ its dual, which is isometrically isomorphic to the space $\mathrm{rca}(I)$ of regular countably additive measures $\mu$ on $I$ having finite total variation $|\mu|(I)$. The dual pairing is given by
$$\langle \mu, y \rangle = \int_I y(\eta)\, \mu(d\eta) \qquad (y \in \mathcal{Y},\ \mu \in \mathrm{rca}(I)).$$
We consider the closed convex cone
$$K = \{ y \in \mathcal{Y} : y(\eta) \ge 0,\ \eta \in I \}$$
and its polar cone
$$K^\circ = \{ \mu \in \mathrm{rca}(I) : \langle \mu, y \rangle \le 0,\ \forall y \in K \}.$$
The semi-infinite constraint may be written as
$$G_k(x; P_\xi, P_Y) := F^{(k)}_Y - F^{(k)}_{G(x,\xi)} \in K,$$
and the semi-infinite program is
$$\min \{ f(x) : x \in D,\ G_k(x; P_\xi, P_Y) \in K \}.$$

Lemma: (Dentcheva-Ruszczyński 03) Let $k \ge 2$, $I = [a,b]$ and $\mu \in K^\circ$. There exists $u \in U_{k-1}$ such that
$$\langle \mu, F^{(k)}_X \rangle = \int_I F^{(k)}_X(\eta)\, \mu(d\eta) = \mathbb{E}[u(X)]$$
holds for every $X \in L_{k-1}$. Here, $U_{k-1}$ denotes the set of all $u \in C^{k-2}(\mathbb{R})$ whose $(k-1)$th derivative exists almost everywhere and for which there is a nonnegative, nonincreasing, left-continuous, bounded function $\varphi : I \to \mathbb{R}$ such that
$$u^{(k-1)}(t) = (-1)^k \varphi(t) \ \ \mu\text{-a.e. } t \in [a,b], \qquad u^{(k-1)}(t) = (-1)^k \varphi(a),\ t < a, \qquad u(t) = 0,\ t \ge b, \qquad u^{(i)}(b) = 0,\ i = 1,\dots,k-2,$$
where the symbol $u^{(i)}$ denotes the $i$th derivative of $u$. In particular, the utilities $u \in U_{k-1}$ are nondecreasing and concave on $\mathbb{R}$.

Proof: The function $u \in U_{k-1}$ is defined by putting $u(t) = 0$ for $t \ge b$, $u^{(k-1)}(t) = (-1)^k \mu([t,b])$ for $\mu$-a.e. $t \le b$, and $u^{(i)}(b) = 0$, $i = 1,\dots,k-2$. One obtains by repeated integration by parts, for any $X \in L_{k-1}$,
$$\langle \mu, F^{(k)}_X \rangle = (-1)^k \int_a^b F^{(k)}_X(t)\, du^{(k-1)}(t) = \int_{-\infty}^b u(t)\, dF_X(t) = \mathbb{E}[u(X)].$$

Optimality and duality

Define the Lagrange-like function $L : \mathbb{R}^m \times U_{k-1} \to \mathbb{R}$ as
$$L(x, u; P_\xi, P_Y) := f(x) - \int_\Xi u(G(x,z))\, P_\xi(dz) + \int_{\mathbb{R}} u(t)\, P_Y(dt).$$

Theorem: (Dentcheva-Ruszczyński) Let $k \ge 2$ and assume the $k$th order uniform dominance condition at $(\xi,Y)$. A feasible $\hat{x}$ is optimal if and only if a function $\hat{u} \in U_{k-1}$ exists such that
$$L(\hat{x}, \hat{u}; P_\xi, P_Y) = \min_{x \in D} L(x, \hat{u}; P_\xi, P_Y) \qquad \text{and} \qquad \int_\Xi \hat{u}(G(\hat{x},z))\, P_\xi(dz) = \int_{\mathbb{R}} \hat{u}(t)\, P_Y(dt).$$

Furthermore, the dual problem is
$$\max_{u \in U_{k-1}} \Big[ \inf_{x \in D} \big[ f(x) - \mathbb{E}[u(G(x,\xi))] + \mathbb{E}[u(Y)] \big] \Big]$$
or, equivalently,
$$\max_{\mu \in K^\circ} \Big[ \inf_{x \in D} \big[ f(x) + \langle \mu, G_k(x; P_\xi, P_Y) \rangle \big] \Big],$$
and primal and dual optimal values coincide.

Sensitivity of the optimal value function

Let the infimal function $v : C(D) \to \mathbb{R}$ be given by $v(g) := \inf_{x \in D} g(x)$. If $D$ is compact, $v$ is finite and concave on $C(D)$, and Lipschitz continuous with respect to the supremum norm on $C(D)$. Hence, it is Hadamard directionally differentiable on $C(D)$ and
$$v'(g; d) = \min \big\{ d(x) : x \in \arg\min_{x \in D} g(x) \big\} \qquad (g, d \in C(D)).$$

Let $U^*_{k-1}$ denote the solution set of the dual problem. Any $\bar{u} \in U^*_{k-1}$ is called a shadow utility. For some shadow utility $\bar{u}$ and $g_{\bar{u}} := L(\cdot, \bar{u}; P_\xi, P_Y)$, the duality theorem implies $v(g_{\bar{u}}) = v(P_\xi, P_Y)$.

Corollary: Let $D$ be compact and let the assumptions of the duality theorem be satisfied. Then the optimal value function is Hadamard directionally differentiable and the directional derivative in direction $d \in C(D)$ is
$$v'(g_{\bar{u}}; d) = v'(P_\xi, P_Y; d) = \min \big\{ d(x) : x \in S(P_\xi, P_Y) \big\}.$$
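A small numerical illustration of the formula $v'(g; d) = \min\{ d(x) : x \in \arg\min_D g \}$ on a discretized set $D$ (a toy example with assumed functions $g$ and $d$, not taken from the talk):

```python
import numpy as np

# Toy check of v'(g; d) = min{ d(x) : x in argmin_D g(x) } for the infimal
# function v(g) = min_{x in D} g(x) on a discretized D = [-1, 1].
D = np.linspace(-1.0, 1.0, 201)
g = lambda t: np.abs(t)                  # argmin over D is {0}
d = lambda t: np.cos(3.0 * t)            # perturbation direction, d(0) = 1

v = lambda func: func(D).min()           # infimal function on the grid
tau = 1e-6
finite_difference = (v(lambda t: g(t) + tau * d(t)) - v(g)) / tau
argmin_set = D[np.isclose(g(D), v(g))]
formula_value = d(argmin_set).min()
print(finite_difference, formula_value)  # both should be close to 1.0
```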

Limit theorems for empirical approximations

Let $(\xi^n, Y^n)$, $n \in \mathbb{N}$, be a sequence of i.i.d. (independent and identically distributed) random vectors on some probability space. Let $P^{(n)}_\xi$ and $P^{(n)}_Y$ denote the corresponding empirical measures and $P_n := P^{(n)}_\xi \otimes P^{(n)}_Y$.

Empirical approximation:
$$\min \Big\{ f(x) : x \in D,\ \sum_{i=1}^{n} \big[ \eta - G(x,\xi^i) \big]_+^{k-1} \le \sum_{i=1}^{n} \big[ \eta - Y^i \big]_+^{k-1},\ \eta \in I \Big\}$$

Optimal value:
$$v(P_\xi, P_Y) = \inf_{x \in D} L(x, \bar{u}; P_\xi, P_Y) = \inf_{x \in D} \mathbb{E}\big[ f(x) - \bar{u}(G(x,\xi)) + \bar{u}(Y) \big] = \inf_{x \in D} P\big( f(x) - \bar{u}(G(x,z)) + \bar{u}(t) \big),$$
where $\bar{u}$ is a shadow utility and $P := P_\xi \otimes P_Y$.

Proposition: (Donsker class) Let the assumptions of the main stability theorem be satisfied and let $D$ and the supports $\Xi = \mathrm{supp}(P_\xi)$ and $\Upsilon = \mathrm{supp}(P_Y)$ be compact. Then $\Gamma_k$ is a Donsker class, i.e., the empirical process $\mathbb{E}_n$ indexed by $g \in \Gamma_k$,
$$\mathbb{E}_n g = \sqrt{n}\, (P_n - P) g = \sqrt{n} \Big( n^{-1} \sum_{i=1}^{n} g(\xi^i, Y^i) - \mathbb{E}[g(\xi,Y)] \Big) \;\xrightarrow{d}\; \mathbb{G}(g) \qquad (g \in \Gamma_k),$$
converges in distribution to a Gaussian limit process $\mathbb{G}$ on the space $\ell^\infty(\Gamma_k)$ (of bounded functions on $\Gamma_k$) equipped with the supremum norm, where
$$\Gamma_k = \big\{ g_x : g_x(z,t) = f(x) - \bar{u}(G(x,z)) + \bar{u}(t),\ (z,t) \in \Xi \times \Upsilon,\ x \in D \big\}.$$
The Gaussian process $\mathbb{G}$ has zero mean and covariances
$$\mathbb{E}[\mathbb{G}(x)\, \mathbb{G}(\tilde{x})] = \mathbb{E}_P[g_x\, g_{\tilde{x}}] - \mathbb{E}_P[g_x]\, \mathbb{E}_P[g_{\tilde{x}}] \qquad \text{for } x, \tilde{x} \in D.$$

Proposition: (functional delta method) Let $B_1$ and $B_2$ be Banach spaces equipped with their Borel $\sigma$-fields and let $B_1$ be separable. Let $(X_n)$ be random elements of $B_1$, let $h : B_1 \to B_2$ be a mapping and let $(\tau_n)$ be a sequence of positive numbers tending to infinity as $n \to \infty$. If $\tau_n(X_n - \theta) \xrightarrow{d} X$ for some $\theta \in B_1$ and some random element $X$ of $B_1$, and $h$ is Hadamard directionally differentiable at $\theta$, then
$$\tau_n\big( h(X_n) - h(\theta) \big) \;\xrightarrow{d}\; h'(\theta; X),$$
where $\xrightarrow{d}$ means convergence in distribution.

Application: $B_1 = C(D)$, $B_2 = \mathbb{R}$, $h(g) = \inf_{x \in D} g(x)$; $h$ is concave and Lipschitz continuous w.r.t. $\|\cdot\|_\infty$, and $h'(g; d) = \min\{ d(y) : y \in \arg\min_{x \in D} g(x) \}$.

Theorem: (Limit theorem) Let the assumptions of the Donsker class proposition be satisfied. Then the optimal values $v(P^{(n)}_\xi, P^{(n)}_Y)$, $n \in \mathbb{N}$, satisfy the limit theorem
$$\sqrt{n}\, \big( v(P^{(n)}_\xi, P^{(n)}_Y) - v(P_\xi, P_Y) \big) \;\xrightarrow{d}\; \min\{ \mathbb{G}(x) : x \in S(P_\xi, P_Y) \},$$
where $\mathbb{G}$ is a Gaussian process with zero mean and covariances
$$\mathbb{E}[\mathbb{G}(x)\, \mathbb{G}(\tilde{x})] = \mathbb{E}_P[g_x\, g_{\tilde{x}}] - \mathbb{E}_P[g_x]\, \mathbb{E}_P[g_{\tilde{x}}] \qquad \text{for } x, \tilde{x} \in S(P_\xi, P_Y).$$
If $S(P_\xi, P_Y)$ is a singleton, i.e., $S(P_\xi, P_Y) = \{\bar{x}\}$, the limit $\mathbb{G}(\bar{x})$ is normal with zero mean and variance $\mathbb{E}_P[g_{\bar{x}}^2] - (\mathbb{E}_P[g_{\bar{x}}])^2$.

The result allows the application of resampling techniques to determine asymptotic confidence intervals for the optimal value $v(P_\xi, P_Y)$, in particular, bootstrapping if $S(P_\xi, P_Y)$ is a singleton and subsampling in the general case.

A. Eichhorn and W. Römisch: Stochastic integer programming: Limit theorems and confidence intervals, Math. Oper. Res. 32 (2007), 118-135.
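The sketch below indicates how such a bootstrap could look in practice. Here `solve_empirical` is a hypothetical placeholder for any routine that returns the optimal value of the dominance-constrained problem built from the supplied scenarios (for instance, the cvxpy model sketched earlier), and the basic bootstrap interval is only one of several possible constructions.

```python
import numpy as np

def bootstrap_ci(xi, Y, solve_empirical, B=500, alpha=0.05, seed=0):
    """Basic bootstrap confidence interval for the true optimal value, obtained
    by resampling the scenarios (xi_i, Y_i) with replacement; `solve_empirical`
    is a user-supplied placeholder returning the empirical optimal value."""
    rng = np.random.default_rng(seed)
    xi, Y = np.asarray(xi), np.asarray(Y)
    n = len(Y)
    v_hat = solve_empirical(xi, Y)
    v_boot = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)          # resample scenario indices
        v_boot[b] = solve_empirical(xi[idx], Y[idx])
    q_lo, q_hi = np.quantile(v_boot - v_hat, [alpha / 2.0, 1.0 - alpha / 2.0])
    return v_hat - q_hi, v_hat - q_lo             # basic (reverse percentile) interval
```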

Conclusions

- Quantitative continuity properties of optimal values and solution sets in terms of a suitable distance of probability distributions have been obtained.
- A limit theorem for optimal values of empirical approximations of stochastic dominance constrained optimization models is shown, which allows one to derive confidence intervals.
- Extensions of the results to the study of (modern) Quasi-Monte Carlo approximations of such models are desirable (convergence rate $O(n^{-1+\delta})$, $\delta \in (0, \tfrac{1}{2}]$).
- Extensions of the asymptotic result to the situation of estimated shadow utilities are desirable.
- Extensions to multivariate dominance constraints are desirable, e.g., for the concept $X \succeq_{(m,k)} Y$ iff $v^\top X \succeq_{(k)} v^\top Y$ for all $v \in V$, where $V$ is a convex subset of $\mathbb{R}^m_+$ and $X, Y \in L^m_{k-1}$. For example, $V = \{ v \in \mathbb{R}^m_+ : \|v\|_1 = 1 \}$ is studied in (Dentcheva-Ruszczyński 09) and $V \subseteq \{ v \in \mathbb{R}^m : \|v\|_1 \le 1 \}$ in (Hu-Homem-de-Mello-Mehrotra 11).

References

D. Dentcheva, R. Henrion and A. Ruszczyński: Stability and sensitivity of optimization problems with first order stochastic dominance constraints, SIAM Journal on Optimization 18 (2007), 322-337.
D. Dentcheva and W. Römisch: Stability and sensitivity of stochastic dominance constrained optimization models, SIAM Journal on Optimization 23 (2013), 1672-1688.
Y. Liu, W. Römisch and H. Xu: Quantitative stability analysis of stochastic generalized equations, SIAM Journal on Optimization 24 (2014), 467-497.
Y. Liu and H. Xu: Stability and sensitivity analysis of stochastic programs with second order dominance constraints, Mathematical Programming 142 (2013), 435-460.