MFE 5100: Optimization                                      2015-16 First Term
Handout 8: Dealing with Data Uncertainty
Instructor: Anthony Man-Cho So                               December 1, 2015

1 Introduction

Conic linear programming (CLP), and in particular semidefinite programming (SDP), has received a lot of attention in recent years. Such popularity can partly be attributed to the wide applicability of CLP, as well as to recent advances in the design of provably efficient interior-point algorithms. In this lecture we consider how CLP techniques can be used to tackle data uncertainty in optimization problems.

2 Robust Linear Programming

To motivate the problem considered in this section, consider the following LP:
$$ \begin{array}{ll} \mbox{minimize} & c^Tx \\ \mbox{subject to} & \hat{A}x \le \hat{b}, \end{array} \eqno(1) $$
where $\hat{A} \in \mathbb{R}^{m\times n}$, $\hat{b} \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$ are given. As often occurs in practice, the data of the above LP, namely $\hat{A}$ and $\hat{b}$, are uncertain. In an attempt to tackle such uncertainties, Ben-Tal and Nemirovski [3] introduced the idea of robust optimization. In the robust optimization setting, we assume that the uncertain data lie in some given uncertainty set $\mathcal{U}$. Our goal is then to find a solution that is feasible for all possible realizations of the uncertain data and optimizes some given objective.

To derive the robust counterpart of (1), let us first rewrite it as
$$ \begin{array}{ll} \mbox{minimize} & \bar{c}^Tz \\ \mbox{subject to} & \bar{A}z \le 0, \quad z_{n+1} = 1, \end{array} \eqno(2) $$
where $\bar{A} = [\hat{A} \;\; -\hat{b}\,] \in \mathbb{R}^{m\times(n+1)}$, $\bar{c} = (c,0) \in \mathbb{R}^{n+1}$, and $z \in \mathbb{R}^{n+1}$. Now, suppose that each row $\bar{a}_i \in \mathbb{R}^{n+1}$ of the matrix $\bar{A}$ lies in an ellipsoidal region $\mathcal{U}_i$ whose center $\bar{u}_i \in \mathbb{R}^{n+1}$ is given. Specifically, for $i = 1, \ldots, m$, we impose the constraint that
$$ \bar{a}_i \in \mathcal{U}_i = \left\{ x \in \mathbb{R}^{n+1} : x = \bar{u}_i + B_iv, \; \|v\|_2 \le 1 \right\}, $$
where $B_i$ is some $(n+1)\times(n+1)$ positive semidefinite matrix. Then, the robust counterpart of (2) is the following optimization problem:
$$ \begin{array}{ll} \mbox{minimize} & \bar{c}^Tz \\ \mbox{subject to} & \bar{A}z \le 0 \mbox{ for all } \bar{a}_i \in \mathcal{U}_i, \; i = 1, \ldots, m, \\ & z_{n+1} = 1. \end{array} \eqno(3) $$
Note that in general there are uncountably many constraints in (3). However, we claim that (3) is equivalent to an SOCP. To see this, observe that $\bar{a}_i^Tz \le 0$ for all $\bar{a}_i \in \mathcal{U}_i$ iff
$$ 0 \ge \max_{v \in \mathbb{R}^{n+1}:\, \|v\|_2 \le 1} \left\{ (\bar{u}_i + B_iv)^Tz \right\} = \bar{u}_i^Tz + \|B_iz\|_2, $$
where $i = 1, \ldots, m$. Hence, we conclude that (3) is equivalent to
$$ \begin{array}{ll} \mbox{minimize} & \bar{c}^Tz \\ \mbox{subject to} & \|B_iz\|_2 \le -\bar{u}_i^Tz \mbox{ for } i = 1, \ldots, m, \\ & z_{n+1} = 1, \end{array} $$
which is an SOCP. For further reading on robust optimization, we refer the reader to [11, 1, 4, 2].

3 Chance-Constrained Linear Programming

In the previous section, we showed how one can handle data uncertainties in a linear optimization problem using the robust optimization approach. Such an approach is particularly suitable if there is little information about the nature of the data uncertainty, and/or if the constraints are not to be violated under any circumstance. However, it can produce very conservative solutions. In situations where the distribution of the data uncertainty is partially known, one can often obtain a better solution by allowing the constraints to be violated with small probability. To illustrate this possibility, let us consider the following simple LP:
$$ \begin{array}{ll} \mbox{minimize} & c^Tx \\ \mbox{subject to} & a^Tx \le b, \quad x \in P, \end{array} \eqno(4) $$
where $a, c \in \mathbb{R}^n$ are given vectors, $b \in \mathbb{R}$ is a given scalar, and $P \subseteq \mathbb{R}^n$ is a given polyhedron. For the sake of simplicity, suppose that $P$ is deterministic, but the data $a, b$ are randomly affinely perturbed; i.e.,
$$ a = a^0 + \sum_{i=1}^l \epsilon_ia^i, \qquad b = b^0 + \sum_{i=1}^l \epsilon_ib^i, $$
where $a^0, a^1, \ldots, a^l \in \mathbb{R}^n$ and $b^0, b^1, \ldots, b^l \in \mathbb{R}$ are given, and $\epsilon_1, \ldots, \epsilon_l$ are i.i.d. mean-zero random variables supported on $[-1,1]$. Then, for any given tolerance parameter $\delta \in (0,1)$, we can formulate the following chance-constrained counterpart of (4):
$$ \begin{array}{ll} \mbox{minimize} & c^Tx \\ \mbox{subject to} & \Pr\left( a^Tx > b \right) \le \delta, \quad x \in P. \end{array} \eqno(5) $$
In other words, a solution $x \in P$ is feasible for (5) if it violates the constraint $a^Tx \le b$ with probability at most $\delta$. The probabilistic constraint in (5) is known as a chance constraint. Note that when $\delta = 0$, (5) reduces to a robust linear optimization problem.
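To make the ellipsoid maximization above concrete, the following sketch (with made-up data $\bar{u}_i$, $B_i$, $z$; all names are illustrative) numerically confirms that $\max_{\|v\|_2 \le 1} (\bar{u}_i + B_iv)^Tz = \bar{u}_i^Tz + \|B_iz\|_2$ when $B_i$ is symmetric positive semidefinite:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4                                  # dimension (plays the role of n+1 in the notes)
u = rng.normal(size=n)                 # ellipsoid center u_i
M = rng.normal(size=(n, n))
B = M @ M.T                            # symmetric PSD shape matrix B_i
z = rng.normal(size=n)

# Closed form: the maximum over the ellipsoid is u^T z + ||B z||_2
closed_form = u @ z + np.linalg.norm(B @ z)

# Since B is symmetric, (u + Bv)^T z = u^T z + v^T (Bz); the maximum of
# v^T (Bz) over ||v||_2 <= 1 is attained at v* = Bz / ||Bz||_2 (Cauchy-Schwarz).
v_star = B @ z / np.linalg.norm(B @ z)
attained = (u + B @ v_star) @ z

# Random unit vectors never exceed the closed-form value
vs = rng.normal(size=(1000, n))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)
sampled_max = ((u[None, :] + vs @ B) @ z).max()
```

The same calculation is what turns each semi-infinite robust constraint in (3) into a single second-order cone constraint.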
Moreover, if $x \in \mathbb{R}^n$ is feasible for (5) at some tolerance level $\delta > 0$, then it is also feasible for (5) at any $\delta' \ge \delta$. Thus, by varying $\delta$, we can adjust the conservatism of the solution.
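For a given candidate $x$, the violation probability $\Pr(a^Tx > b)$ can be estimated by Monte Carlo simulation. A minimal sketch, with made-up data ($l = 1$, uniform perturbations — one admissible choice of mean-zero distribution on $[-1,1]$):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative nominal data and perturbation directions
a0, b0 = np.array([1.0, 0.5]), 2.0     # nominal a^0, b^0
a1, b1 = np.array([0.1, 0.0]), 0.0     # perturbation a^1, b^1
x = np.array([1.0, 0.0])               # candidate solution
delta = 0.05                           # tolerance

# i.i.d. mean-zero eps supported on [-1, 1] (here: uniform)
eps = rng.uniform(-1.0, 1.0, size=100_000)

# a^T x > b  <=>  (a0 + eps*a1)^T x > b0 + eps*b1
lhs = a0 @ x + eps * (a1 @ x)
rhs = b0 + eps * b1
violation_rate = np.mean(lhs > rhs)    # Monte Carlo estimate of Pr(a^T x > b)
```

For this particular $x$ the constraint holds for every realization of $\epsilon_1$, so the estimate is zero; a less conservative $x$ would yield a positive estimate to be compared against $\delta$. Note that such simulation can check a given $x$ but does not by itself yield a tractable optimization formulation, which motivates the approximations below.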
Despite the attractiveness of the chance-constrained formulation, it has one major drawback: it is generally computationally intractable. Indeed, even when the distributions of $\epsilon_1, \ldots, \epsilon_l$ are very simple, the feasible set defined by the chance constraint can be non-convex. One way to tackle this problem is to replace the chance constraint by a safe tractable approximation; i.e., a system of deterministic constraints $H$ such that (i) $x \in \mathbb{R}^n$ is feasible for (5) whenever it is feasible for $H$ (safe approximation), and (ii) the constraints in $H$ are efficiently computable (tractability).

To develop a safe tractable approximation of (5), we first observe that the chance constraint in (5) is equivalent to the following system of constraints:
$$ \Pr\left( y_0 + \sum_{i=1}^l \epsilon_iy_i > 0 \right) \le \delta, \eqno(6) $$
$$ y_i = (a^i)^Tx - b^i \mbox{ for } i = 0, 1, \ldots, l. \eqno(7) $$
Since $\epsilon_1, \ldots, \epsilon_l$ are i.i.d. mean-zero random variables supported on $[-1,1]$, by Hoeffding's inequality [9], we have
$$ \Pr\left( \sum_{i=1}^l \epsilon_iy_i > t \right) \le \exp\left( -\frac{t^2}{2\sum_{i=1}^l y_i^2} \right) $$
for any $t > 0$. It follows that when
$$ y_0 + \sqrt{2\ln\frac{1}{\delta}} \sqrt{\sum_{i=1}^l y_i^2} \le 0, \eqno(8) $$
we have
$$ \Pr\left( y_0 + \sum_{i=1}^l \epsilon_iy_i > 0 \right) \le \exp\left( -\frac{y_0^2}{2\sum_{i=1}^l y_i^2} \right) \le \delta. $$
In other words, (8) is a sufficient condition for (6) to hold. The upshot is that (8) is a second-order cone constraint. Hence, we conclude that constraints (7) and (8) together serve as a safe tractable approximation of the chance constraint.

So far we have only discussed how to handle a single scalar chance constraint. Of course, joint scalar chance constraints, i.e., chance constraints of the form
$$ \Pr\left( (a^i)^Tx > b^i \mbox{ for some } i = 1, \ldots, m \right) \le \delta, $$
are also of great interest. However, they are not as easy to handle as a single scalar chance constraint. For some recent results in this direction, see [2, 5, 14, 7].

It is instructive to examine the relationship between the robust optimization approach and the safe tractable approximation approach.
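Before doing so, here is a minimal numerical sketch of the safe approximation (7)-(8), with illustrative data ($l = 2$; all names and values are made up): form the $y_i$'s, test the second-order cone condition (8), and confirm that it implies the Hoeffding bound $\exp(-y_0^2 / (2\sum_{i=1}^l y_i^2)) \le \delta$:

```python
import numpy as np

delta = 0.05
# Illustrative problem data: rows are a^0, a^1, a^2; entries of b are b^0, b^1, b^2
A = np.array([[1.0, 0.0],              # a^0
              [0.2, 0.1],              # a^1
              [0.0, 0.3]])             # a^2
b = np.array([2.0, 0.0, 0.1])
x = np.array([1.0, 1.0])               # candidate solution

# Constraint (7): y_i = (a^i)^T x - b^i for i = 0, 1, ..., l
y = A @ x - b
y0, y_tail = y[0], y[1:]

# Constraint (8): y_0 + sqrt(2 ln(1/delta)) * ||(y_1, ..., y_l)||_2 <= 0
soc_ok = y0 + np.sqrt(2 * np.log(1 / delta)) * np.linalg.norm(y_tail) <= 0

# Hoeffding bound on the violation probability (meaningful when y_0 < 0)
hoeffding = np.exp(-y0**2 / (2 * np.sum(y_tail**2)))
```

When `soc_ok` holds, the derivation above guarantees $\Pr(y_0 + \sum_i \epsilon_iy_i > 0) \le$ `hoeffding` $\le \delta$, so this $x$ is feasible for the chance constraint, regardless of which mean-zero distribution on $[-1,1]$ the $\epsilon_i$ follow.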
Upon following the derivations in the previous section, we see that constraint (8) is equivalent to the following robust constraint:
$$ d^Ty \le 0 \mbox{ for all } d \in \mathcal{U}, $$
where $y = (y_0, y_1, \ldots, y_l)$ and $\mathcal{U}$ is the ellipsoidal uncertainty set given by
$$ \mathcal{U} = \left\{ x \in \mathbb{R}^{l+1} : x = e_1 + Bv, \; \|v\|_2 \le 1 \right\}, $$
and $B \in \mathbb{R}^{(l+1)\times(l+1)}$ is given by
$$ B = \sqrt{2\ln\frac{1}{\delta}} \begin{bmatrix} 0 & 0^T \\ 0 & I \end{bmatrix}. $$
In other words, we are using the robust optimization problem
$$ \begin{array}{ll} \mbox{minimize} & c^Tx \\ \mbox{subject to} & \displaystyle\sum_{i=0}^l d_i\left( (a^i)^Tx - b^i \right) \le 0 \mbox{ for all } d \in \mathcal{U}, \\ & x \in P \end{array} $$
as a safe tractable approximation of the chance-constrained optimization problem (5). For further elaboration on the connection between robust optimization and chance-constrained optimization, see [6, 5]. We remark that chance-constrained optimization has been employed in many recent applications; see, e.g., [8, 13, 10, 17, 16]. For a more comprehensive treatment and some recent advances of chance-constrained optimization, we refer the reader to [12, Chapter 4] and [15].

References

[1] F. Alizadeh and D. Goldfarb. Second-Order Cone Programming. Mathematical Programming, Series B, 95:3-51, 2003.

[2] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust Optimization. Princeton Series in Applied Mathematics. Princeton University Press, Princeton, NJ, 2009.

[3] A. Ben-Tal and A. Nemirovski. Robust Convex Optimization. Mathematics of Operations Research, 23(4):769-805, 1998.

[4] A. Ben-Tal and A. Nemirovski. Selected Topics in Robust Convex Optimization. Mathematical Programming, Series B, 112(1):125-158, 2008.

[5] W. Chen, M. Sim, J. Sun, and C.-P. Teo. From CVaR to Uncertainty Set: Implications in Joint Chance Constrained Optimization. Operations Research, 58(2):470-485, 2010.

[6] X. Chen, M. Sim, and P. Sun. A Robust Optimization Perspective on Stochastic Programming. Operations Research, 55(6):1058-1071, 2007.

[7] S.-S. Cheung, A. M.-C. So, and K. Wang. Linear Matrix Inequalities with Stochastically Dependent Perturbations and Applications to Chance Constrained Semidefinite Optimization. SIAM Journal on Optimization, 22(4):1394-1430, 2012.

[8] L. El Ghaoui, M. Oks, and F. Oustry. Worst-Case Value-at-Risk and Robust Portfolio Optimization: A Conic Programming Approach. Operations Research, 51(4):543-556, 2003.

[9] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables.
Journal of the American Statistical Association, 58(301):13-30, 1963.
[10] W. W.-L. Li, Y. J. Zhang, A. M.-C. So, and M. Z. Win. Slow Adaptive OFDMA Systems through Chance Constrained Programming. IEEE Transactions on Signal Processing, 58(7):3858-3869, 2010.

[11] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret. Applications of Second-Order Cone Programming. Linear Algebra and Its Applications, 284:193-228, 1998.

[12] A. Shapiro, D. Dentcheva, and A. Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory, volume 9 of MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 2009.

[13] M. B. Shenouda and T. N. Davidson. Outage-Based Designs for Multi-User Transceivers. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pages 2389-2392, 2009.

[14] A. M.-C. So. Moment Inequalities for Sums of Random Matrices and Their Applications in Optimization. Mathematical Programming, Series A, 130(1):125-151, 2011.

[15] A. M.-C. So. Lecture Notes on Chance Constrained Optimization. Available at http://www.se.cuhk.edu.hk/~manchoso/papers/ccopt.pdf, 2013.

[16] K.-Y. Wang, A. M.-C. So, T.-H. Chang, W.-K. Ma, and C.-Y. Chi. Outage Constrained Robust Transmit Optimization for Multiuser MISO Downlinks: Tractable Approximations by Conic Optimization. IEEE Transactions on Signal Processing, 62(21):5690-5705, 2014.

[17] Y. J. Zhang and A. M.-C. So. Optimal Spectrum Sharing in MIMO Cognitive Radio Networks via Semidefinite Programming. IEEE Journal on Selected Areas in Communications, 29(2):362-373, 2011.