Selected Examples of CONIC DUALITY AT WORK
  Robust Linear Optimization
  Synthesis of Linear Controllers
  Matrix Cube Theorem
A. Nemirovski, Arkadi.Nemirovski@isye.gatech.edu

Linear Optimization Problem, its Data and Structure

Linear Optimization problem:

    min_x { c^T x + d : Ax ≤ b }   (LO)

x ∈ R^n: vector of decision variables; c ∈ R^n and d ∈ R form the objective; A: an m×n constraint matrix; b ∈ R^m: right hand side.
Problem's structure: its sizes m, n. Problem's data: (c, d, A, b).

Data Uncertainty

The data of typical real-world LOs are partially uncertain: not known exactly when the problem is being solved. Sources of data uncertainty:
Prediction errors: some of the data entries (future demands, returns, etc.) do not exist yet when the problem is solved and hence are replaced with their forecasts.

Measurement errors: some of the data (parameters of technological devices and processes, contents associated with raw materials, etc.) cannot be measured exactly, and their true values drift around the measured "nominal" values.
Implementation errors: some of the decision variables (planned intensities of technological processes, parameters of physical devices we are designing, etc.) cannot be implemented exactly as computed. Implementation errors are equivalent to artificial data uncertainties. Indeed, the impact of implementation errors

    x_j ↦ (1 + ϵ_j) x_j + δ_j

on the validity of the constraint

    a_i1 x_1 + ... + a_in x_n ≤ b_i

is as if there were no implementation errors, but the data of the constraint were subject to the perturbations

    a_ij ↦ (1 + ϵ_j) a_ij,   b_i ↦ b_i − Σ_j a_ij δ_j.
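The equivalence above is a one-line identity, but it is easy to confirm numerically. Below is a minimal sketch (with made-up numbers; all names are illustrative) checking that implementing x with errors gives the same constraint slack as keeping x exact and perturbing the data:

```python
import numpy as np

rng = np.random.default_rng(0)
a, x = rng.normal(size=5), rng.normal(size=5)   # one constraint a^T x <= b
b = 1.0
eps = 0.01 * rng.normal(size=5)                 # multiplicative implementation errors
delta = 0.001 * rng.normal(size=5)              # additive implementation errors

slack_implementation = a @ ((1 + eps) * x + delta) - b          # erroneous x, exact data
slack_perturbed_data = ((1 + eps) * a) @ x - (b - a @ delta)    # exact x, perturbed data

assert np.isclose(slack_implementation, slack_perturbed_data)
```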

Data Uncertainty: Traditional Treatment and Dangers

Traditionally, small (fractions of a percent) data uncertainty is simply ignored: the problem is solved as is, with the nominal data, and the resulting nominal optimal solution is forwarded to the end user; large data uncertainty is assigned a probability distribution and is treated via Stochastic Programming techniques.
Fact: in many cases, even small data uncertainty can make the nominal solution heavily infeasible and thus practically meaningless.

Example: Antenna Design

[Physics:] The directional density of energy transmitted by a monochromatic antenna placed at the origin is proportional to |D(δ)|², where the antenna's diagram D(δ) is a complex-valued function of the 3-D direction (unit 3-D vector) δ.
[Physics:] For an antenna array (a complex antenna comprised of a number of antenna elements), the diagram is

    D(δ) = Σ_j x_j D_j(δ)   (*)

D_j(·): diagrams of the elements; x_j: complex weights, the design parameters responsible for how the elements in the array are invoked.
Antenna Design problem: given the diagrams D_1(·), ..., D_k(·) and a target diagram D_*(·), find the weights x_j ∈ C such that the synthesized diagram (*) is as close as possible to the target diagram D_*(·).
When D_j(·), D_*(·), as well as the weights, are real and closeness is quantified by the uniform norm on a finite grid Γ of directions, Antenna Design becomes the LO problem

    min_{x ∈ R^n, τ} { τ : −τ ≤ D_*(δ) − Σ_j x_j D_j(δ) ≤ τ  ∀ δ ∈ Γ }.
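This uniform-fit problem is an LP in (x, τ). Here is a minimal sketch in cvxpy; the grid values D[i, j] = D_j(δ_i) and the target Dstar are synthetic placeholders, not the actual antenna data of the example:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n_grid, n_elem = 240, 10
D = rng.normal(size=(n_grid, n_elem))   # placeholder element diagrams on the grid
Dstar = rng.normal(size=n_grid)         # placeholder target diagram

x, tau = cp.Variable(n_elem), cp.Variable()
resid = Dstar - D @ x
prob = cp.Problem(cp.Minimize(tau), [resid <= tau, -resid <= tau])
prob.solve()
print("optimal uniform error tau* =", tau.value)
```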

Example: consider a planar antenna array comprised of 10 elements (a circle surrounded by 9 rings of equal areas) in the plane XY ("Earth's surface"); our goal is to send most of the energy up, along the 12° cone around the Z-axis.
Diagram of a ring {z = 0, a ≤ √(x² + y²) ≤ b}:

    D_{a,b}(θ) = (1/2) ∫_a^b [ ∫_0^{2π} r cos( 2π r λ^{−1} cos(θ) cos(ϕ) ) dϕ ] dr

θ: altitude angle; λ: wavelength.

[Figure: diagrams of the 10 antenna elements (equal areas, outer radius 1 m) vs. the altitude angle θ; λ = 50 cm.]

Nominal design problem:

    τ_* = min_{x ∈ R^10, τ} { τ : −τ ≤ D_*(θ_i) − Σ_{j=1}^{10} x_j D_j(θ_i) ≤ τ, 1 ≤ i ≤ 240 },   θ_i = iπ/480

[Figure: target (blue) and nominal optimal (magenta) diagrams; τ_* = 0.0589.]

But: the design variables are characteristics of physical devices, and as such they cannot be implemented exactly as computed. What happens when there are implementation errors

    x_j^fact = (1 + ξ_j) x_j^comp,   ξ_j ~ Uniform[−ρ, ρ],

with small ρ?

[Figure: dream and reality of the nominal optimal design: samples of 100 actual diagrams (red) vs. the target diagram (blue) for the uncertainty levels ρ = 0, 0.0001, 0.001, 0.01.]

Quality of the nominal antenna design, dream and reality (data over 100 samples of actuation errors per uncertainty level ρ):

                            Dream  |                       Reality
                            ρ = 0  |  ρ = 0.0001          ρ = 0.001           ρ = 0.01
                            value  |  min   mean   max    min   mean   max    min   mean   max
  ‖·‖∞-distance to target   0.059  |  1.280 5.671  14.04  11.42 56.84  176.6  39.25 506.5  1484
  energy concentration      85.1%  |  0.5%  16.4%  51.0%  0.1%  16.5%  48.3%  0.5%  14.9%  47.1%

Conclusion: the nominal optimal design is completely meaningless...
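The "reality" columns come from a simple Monte Carlo experiment. A hedged sketch of it follows; D, Dstar and x_nominal are placeholders standing in for the actual element diagrams, target, and nominal optimal weights:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(size=(240, 10))        # placeholder element diagrams
Dstar = rng.normal(size=240)          # placeholder target diagram
x_nominal = rng.normal(size=10)       # placeholder nominal optimal design

for rho in [0.0, 1e-4, 1e-3, 1e-2]:
    dists = []
    for _ in range(100):              # 100 samples of actuation errors
        xi = rng.uniform(-rho, rho, size=10)
        x_fact = (1 + xi) * x_nominal           # x_j^fact = (1 + xi_j) x_j^comp
        dists.append(np.max(np.abs(Dstar - D @ x_fact)))
    print(f"rho={rho:g}: min={min(dists):.4g} mean={np.mean(dists):.4g} max={max(dists):.4g}")
```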

Example: NETLIB Case Study. NETLIB: a collection of LO problems for testing LO algorithms.
Constraint #372 of the NETLIB problem PILOT4:

    a^T x ≡ −15.79081 x_826 − 8.598819 x_827 − 1.88789 x_828 − 1.362417 x_829 − 1.526049 x_830
            − 0.031883 x_849 − 28.725555 x_850 − 10.792065 x_851 − 0.19004 x_852 − 2.757176 x_853
            − 12.290832 x_854 + 717.562256 x_855 − 0.057865 x_856 − 3.785417 x_857 − 78.30661 x_858
            − 122.163055 x_859 − 6.46609 x_860 − 0.48371 x_861 − 0.615264 x_862 − 1.353783 x_863
            − 84.644257 x_864 − 122.459045 x_865 − 43.15593 x_866 − 1.712592 x_870 − 0.401597 x_871
            + x_880 − 0.946049 x_898 − 0.946049 x_916
          ≥ b ≡ 23.387405   (C)

The related nonzero coordinates in the optimal solution x of the problem, as reported by CPLEX, are:

    x_826 = 255.6112787181108     x_827 = 6240.488912232100
    x_828 = 3624.613324098961     x_829 = 18.20205065283259
    x_849 = 174397.0389573037     x_870 = 14250.00176680900
    x_871 = 25910.00731692178     x_880 = 104958.3199274139

This solution makes (C) an equality within machine precision.
Note: the coefficients in a, except for the coefficient 1 at x_880, are "ugly" reals like −15.79081 or −84.644257. Ugly coefficients characterize certain technological devices and processes; as such, they could hardly be known to high accuracy, and plausibly coincide with the true data within an accuracy of 3-4 digits, not more.
Question: assuming that the ugly entries in a are 0.1%-accurate approximations of the true data ã, what is the effect of this uncertainty on the validity of the true constraint ã^T x ≥ b as evaluated at the nominal solution x?

Answer: the minimum, over all 0.1% perturbations ã of the ugly entries in a, of the value of ã^T x − b is < −104.9. That is, with 0.1% perturbations of the ugly coefficients, the violation of the constraint, as evaluated at the nominal solution, can be as large as 450% of the right hand side!
With independent random 0.1%-perturbations of the ugly coefficients, the violation of the constraint is on average as large as 125% of the right hand side; the probability of violating the constraint by at least 150% of the right hand side is as large as 0.18.
Among 90 NETLIB problems, perturbing the ugly coefficients by just 0.01% results in violating some of the constraints, as evaluated at the nominal optimal solutions, by more than 50% in 13 problems, by more than 100% in 6 problems, and by 210,000% in PILOT4.
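The worst-case figure has a closed form: with entrywise perturbations |ã_j − a_j| ≤ ρ|a_j| of the ugly coefficients, min_ã ã^T x − b = a^T x − ρ Σ_j |a_j x_j| − b, the sum running over the ugly coefficients with x_j ≠ 0. A short sketch reproducing the ≈ −104.9 and ≈ 450% figures from the data above:

```python
import numpy as np

# {index: (coefficient of x_j in (C), value of x_j in the nominal solution)}
data = {
    826: (-15.79081, 255.6112787181108),  827: (-8.598819, 6240.488912232100),
    828: (-1.88789, 3624.613324098961),   829: (-1.362417, 18.20205065283259),
    849: (-0.031883, 174397.0389573037),  870: (-1.712592, 14250.00176680900),
    871: (-0.401597, 25910.00731692178),  880: (1.0, 104958.3199274139),
}
b, rho = 23.387405, 0.001
a = np.array([coef for coef, _ in data.values()])
x = np.array([val for _, val in data.values()])
ugly = np.array([j != 880 for j in data])    # the coefficient 1 at x_880 is exact

worst_slack = (a @ x - b) - rho * np.sum(np.abs(a[ugly] * x[ugly]))
print("worst-case value of a~^T x - b:", worst_slack)    # about -104.9
print("violation, % of rhs:", -worst_slack / b * 100)    # about 450%
```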

Conclusion: in applications of LO, there is a real need for a technique capable of detecting cases when data uncertainty can heavily affect the quality of the nominal solution, and in these cases generating a reliable solution, one that is immunized against uncertainty.
Robust Optimization is aimed at satisfying this need.

Uncertain Linear Optimization Problems

Definition: an uncertain LO problem is a collection

    { min_x { c^T x + d : Ax ≤ b } }_{(c,d,A,b) ∈ U}   (LO_U)

of LO problems (instances) min_x { c^T x + d : Ax ≤ b } of common structure (i.e., with common numbers m of constraints and n of variables), with the data varying in a given uncertainty set U ⊂ R^{(m+1)×(n+1)}.
Usually we assume that the uncertainty set is parameterized, in an affine fashion, by a perturbation vector ζ varying in a given perturbation set Z:

    U = { [c^T, d; A, b] = [c_0^T, d_0; A_0, b_0] + Σ_{l=1}^L ζ_l [c_l^T, d_l; A_l, b_l] : ζ ∈ Z ⊂ R^L },

where D_0 = [c_0^T, d_0; A_0, b_0] is the nominal data and D_l = [c_l^T, d_l; A_l, b_l], l = 1, ..., L, are the basic shifts.

Example: when speaking about PILOT4, we tacitly used the following model of uncertainty:
Uncertainty affects only the ugly coefficients {a_ij : (i,j) ∈ J} in the constraint matrix, and every one of them is allowed to run, independently of all other coefficients, through the interval

    [a^n_ij − ρ_ij a^n_ij, a^n_ij + ρ_ij a^n_ij]

a^n_ij: nominal values of the data; ρ_ij: perturbation levels (which in the experiment were all set to ρ = 0.001).
Perturbation set: the box

    Z = { ζ = {ζ_ij}_{(i,j) ∈ J} : −ρ_ij ≤ ζ_ij ≤ ρ_ij }

Parameterization of the data by the perturbation vector:

    [c^T, d; A, b] = [c^{n,T}, d^n; A^n, b^n] + Σ_{(i,j) ∈ J} ζ_ij a^n_ij [0, 0; e_i e_j^T, 0]

    { min_x { c^T x + d : Ax ≤ b } }_{(c,d,A,b) ∈ U}   (LO_U)

    U = { [c^T, d; A, b] = [c_0^T, d_0; A_0, b_0] + Σ_{l=1}^L ζ_l [c_l^T, d_l; A_l, b_l] : ζ ∈ Z ⊂ R^L }

(D_0 = [c_0^T, d_0; A_0, b_0]: nominal data; D_l = [c_l^T, d_l; A_l, b_l]: basic shifts.)

There is no universally defined notion of a solution to a family of optimization problems like (LO_U). Consider the following decision environment:
A.1. All decision variables in (LO_U) represent "here and now" decisions; they should be assigned specific numerical values, as a result of solving the problem, before the actual data reveals itself.
A.2. The decision maker is fully responsible for the consequences of the decisions to be made when, and only when, the actual data is within the prespecified uncertainty set U.
A.3. The constraints in (LO_U) are hard: we cannot tolerate violations of constraints, even small ones, when the data is in U.

    { min_x { c^T x + d : Ax ≤ b } }_{(c,d,A,b) ∈ U}   (LO_U)

In the above decision environment, the only meaningful candidate solutions to (LO_U) are the robust feasible ones.
Definition: x ∈ R^n is called a robust feasible solution to (LO_U) if x is feasible for all instances:

    Ax ≤ b   ∀ (c, d, A, b) ∈ U.

Indeed, by A.1 a meaningful candidate solution should be independent of the data, i.e., it should be just a fixed vector x. By A.2-3, it should satisfy the constraints whatever the realization of the data from U.
Acting in the same worst-case-oriented fashion, it makes sense to quantify the quality of a candidate solution x by the guaranteed (i.e., worst over the data from U) value of the objective:

    sup { c^T x + d : (c, d, A, b) ∈ U }.

    { min_x { c^T x + d : Ax ≤ b } }_{(c,d,A,b) ∈ U}   (LO_U)

Now we can associate with (LO_U) the problem of finding the best, in terms of the guaranteed value of the objective, among the robust feasible solutions:

    min_{t,x} { t : c^T x + d ≤ t, Ax ≤ b  ∀ (c, d, A, b) ∈ U }   (RC)

This problem is called the Robust Counterpart of (LO_U).
Note: passing from LOs of the form min_x { c^T x + d : Ax ≤ b } to their equivalents min_{t,x} { t : c^T x + d ≤ t, Ax ≤ b }, we may always assume that the objective is certain, and the RC respects this equivalence. We therefore lose nothing by assuming the objective in (LO_U) certain, in which case we can think of U as a set in the space R^{m×(n+1)} of the [A, b]-data, and the RC reads

    min_x { c^T x : Ax ≤ b  ∀ [A, b] ∈ U }.   (RC)

    { min_x { c^T x : Ax ≤ b } }_{(A,b) ∈ U}   (LO_U)
    min_x { c^T x : Ax ≤ b  ∀ [A, b] ∈ U }   (RC)

Fact I: the RC of an uncertain LO with certain objective is a purely constraint-wise construction: when building the RC, we replace every constraint a_i^T x ≤ b_i of the instances with its RC

    a_i^T x ≤ b_i   ∀ [a_i^T, b_i] ∈ U_i,   (RC_i)

where U_i is the projection of the uncertainty set U onto the space of the data [a_i^T, b_i] of the i-th constraint.
Fact II: the RC remains intact when extending the uncertainty set U to its closed convex hull. When (LO_U) has certain objective, the RC also remains intact when extending U to the direct product of the closed convex hulls of the U_i. Thus, the transformation

    U ↦ U^+ = [cl Conv(U_1)] × ... × [cl Conv(U_m)]

keeps the RC intact.
From now on, we always assume the uncertainty set U convex, and the perturbation set Z convex and closed.

    { min_x { c^T x : Ax ≤ b } }_{[A,b] ∈ U}   (LO_U)
    min_x { c^T x : Ax ≤ b  ∀ [A, b] ∈ U }   (RC)

The central questions associated with the concept of RC are:
A. What is the computational status of the RC? When is it possible to process the RC efficiently? (To be addressed in depth below.)
B. How to come up with meaningful uncertainty sets? (A modeling issue, NOT to be addressed in this talk.)

    min_x { c^T x : Ax ≤ b  ∀ [A, b] ∈ U }   (RC)

Potentially bad news: the RC is a semi-infinite optimization problem (finitely many variables, infinitely many constraints) and as such can be computationally intractable.
Example: consider the nearly linear semi-infinite constraint

    ‖Px − p‖_1 ≤ 1   ∀ [P, p] ∈ U,   U = { [P, p] : p = Bζ, ‖ζ‖_2 ≤ 1 }.

Even to check whether x = 0 is robust feasible (which amounts to verifying max_{‖ζ‖_2 ≤ 1} ‖Bζ‖_1 ≤ 1) is, in general, NP-hard!

    min_x { c^T x : Ax ≤ b  ∀ [A, b] ∈ U }   (RC)

Good news: the RC of an uncertain LO problem is computationally tractable, provided the uncertainty set U is so.
Explanation, I: the RC can be written down as the optimization problem

    min_x { c^T x : f_i(x) ≤ 0, i = 1, ..., m },   f_i(x) = sup_{[A,b] ∈ U} [ a_i^T x − b_i ].

The functions f_i(x) are convex (due to their origin) and efficiently computable (as maxima of affine functions over computationally tractable convex sets). Thus, the RC is a Convex Programming program with efficiently computable objective and constraints, and problems of this type are efficiently solvable.

Recalling that the RC is a constraint-wise construction, all we need is to reformulate in a tractable form a single semi-infinite constraint

    ∀ α = [a; b] ∈ { α^n + Aζ : ζ ∈ Z } ⊂ R^{n+1} :   α^T [x; 1] ≡ a^T x + b ≤ 0.   (*)

Consider several instructive cases where a tractable reformulation of (*) is easy and does not require any theory.

1. Scenario uncertainty Z = Conv{ζ^1, ..., ζ^N}. Setting α^j = α^n + Aζ^j, 1 ≤ j ≤ N, we get

    (*)  ⇔  { [α^j]^T [x; 1] ≤ 0, 1 ≤ j ≤ N }

2. ‖·‖_p-uncertainty Z = { ζ ∈ R^L : ‖ζ‖_p ≤ 1 }, where

    ‖ζ‖_p = (Σ_{l=1}^L |ζ_l|^p)^{1/p} for 1 ≤ p < ∞,   ‖ζ‖_∞ = max_l |ζ_l|.

We have

    (*)  ⇔  [α^n]^T [x; 1] + ‖A^T [x; 1]‖_{p_*} ≤ 0,   1/p + 1/p_* = 1.

In particular, interval uncertainty

    { α : α^n_j − δ_j ≤ α_j ≤ α^n_j + δ_j, 1 ≤ j ≤ n+1 }

leads to

    (*)  ⇔  [α^n]^T [x; 1] + Σ_{i=1}^n δ_i |x_i| + δ_{n+1} ≤ 0.
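The interval-uncertainty case is the easiest to try out. A minimal cvxpy sketch, with made-up data and an illustrative box bound to keep the problem bounded, stated for a constraint in the equivalent form a^T x ≤ b with a ∈ [a_n − δ, a_n + δ]:

```python
import cvxpy as cp
import numpy as np

a_n = np.array([1.0, 2.0, -1.0])     # nominal coefficients
delta = np.array([0.1, 0.2, 0.05])   # entrywise perturbation bounds
b = 1.0
c = np.array([-1.0, -1.0, -1.0])

x = cp.Variable(3)
robust = a_n @ x + delta @ cp.abs(x) <= b    # tractable reformulation of the
                                             # semi-infinite interval constraint
prob = cp.Problem(cp.Minimize(c @ x), [robust, cp.norm(x, "inf") <= 10])
prob.solve()
print("robust optimal x:", x.value)
```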

General Well-Structured Case

Definition: let us say that a set X ⊂ R^N is well-structured if it admits a well-structured representation, i.e., a representation of the form

    X = { x ∈ R^N : ∃ u ∈ R^M :  A_0 x + B_0 u + c_0 = 0,
                                 A_k x + B_k u + c_k ∈ K_k, 1 ≤ k ≤ K }

where every K_k is a simple cone, specifically,
either a nonnegative orthant R^m_+ = { y ∈ R^m : y ≥ 0 }, m = m_k,
or a Lorentz cone L^m = { y ∈ R^m : y_m ≥ √(y_1² + ... + y_{m−1}²) }, m = m_k,
or a semidefinite cone S^m_+, the cone of positive semidefinite matrices in the space S^m of real symmetric m×m matrices, m = m_k.

Example 1: the set X = { x ∈ R^N : ‖x‖_1 ≤ 1 } admits the polyhedral representation

    X = { x ∈ R^N : ∃ u ∈ R^N :  −u_i ≤ x_i ≤ u_i ∀ i,  Σ_i u_i ≤ 1 }
      = { x ∈ R^N : ∃ u ∈ R^N :  [u_1 − x_1; u_1 + x_1; ...; u_N − x_N; u_N + x_N; 1 − Σ_i u_i] ∈ R^{2N+1}_+ }

Example 2: the set X = { x ∈ R^4_+ : x_1 x_2 x_3 x_4 ≥ 1 } admits the conic quadratic representation

    X = { x ∈ R^4_+ : ∃ u ∈ R^3 :  0 ≤ u_1 ≤ √(x_1 x_2),  0 ≤ u_2 ≤ √(x_3 x_4),  1 ≤ u_3 ≤ √(u_1 u_2) }
      = { x : ∃ u ∈ R^3 :  [x_1; x_2; x_3; x_4; u_1; u_2; u_3 − 1] ∈ R^7_+,
                           [2u_1; x_1 − x_2; x_1 + x_2] ∈ L^3,
                           [2u_2; x_3 − x_4; x_3 + x_4] ∈ L^3,
                           [2u_3; u_1 − u_2; u_1 + u_2] ∈ L^3 }

Example 3: the set X of m×n matrices X with nuclear norm (sum of singular values) ≤ 1 admits the semidefinite representation

    X = { X ∈ R^{m×n} : ∃ (U ∈ S^m, V ∈ S^n) :  Tr(U) + Tr(V) ≤ 2,  [U, X; X^T, V] ⪰ 0 }.
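A quick way to see Example 2 in action: each Lorentz-cone membership above is exactly ‖[2u_1; x_1 − x_2]‖_2 ≤ x_1 + x_2, etc. A sketch in cvxpy (cp.SOC(t, z) encodes ‖z‖_2 ≤ t), minimizing Σ_i x_i over X; by the AM-GM inequality the optimum is 4, attained at x = (1, 1, 1, 1):

```python
import cvxpy as cp

x = cp.Variable(4, nonneg=True)
u = cp.Variable(3)
cons = [
    u[2] >= 1,
    cp.SOC(x[0] + x[1], cp.hstack([2 * u[0], x[0] - x[1]])),  # u1^2 <= x1 x2
    cp.SOC(x[2] + x[3], cp.hstack([2 * u[1], x[2] - x[3]])),  # u2^2 <= x3 x4
    cp.SOC(u[0] + u[1], cp.hstack([2 * u[2], u[0] - u[1]])),  # u3^2 <= u1 u2
]
prob = cp.Problem(cp.Minimize(cp.sum(x)), cons)
prob.solve()
print("min sum(x) over {x >= 0 : x1 x2 x3 x4 >= 1} =", prob.value)  # ~4
```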

    X = { x ∈ R^N : ∃ u ∈ R^M :  A_0 x + B_0 u + c_0 = 0,  A_k x + B_k u + c_k ∈ K_k, 1 ≤ k ≤ K }   (*)

Good news on well-structured representations:
Computational tractability: minimizing a linear objective over a set given by (*) reduces to solving the well-structured conic program

    min_{x,u} { c^T x :  A_0 x + B_0 u + c_0 = 0,  A_k x + B_k u + c_k ∈ K_k, 1 ≤ k ≤ K }

and thus can be done in a theoretically (and to some extent also practically) efficient manner by polynomial time interior point algorithms.
Extremely powerful expressive abilities: w.-s.r.'s admit a simple, fully algorithmic calculus which makes it easy to build a w.-s.r. of the result of a convexity-preserving operation on convex sets (taking intersections, direct sums, affine images, inverse affine images, polars, etc.) from w.-s.r.'s of the operands. As a result, for all practical purposes, all computationally tractable convex sets arising in Optimization admit explicit w.-s.r.'s.

Example: the set given by the constraints

    (a)  Ax = b,  x ≥ 0
    (b)  [ Σ_{i=1}^8 x_i^3 ]^{1/3} ≤ x_2^{1/7} x_3^{2/7} x_4^{3/7} + 2 x_1^{1/5} x_5^{2/5} x_6^{1/5}
    (c)  two matrices whose entries are given nonlinear expressions in x (terms like 5x_1^2, x_1^{1/2} x_2^2, products x_i x_j, and 2 x_2^{1/3} x_3^3 x_4^{5/8}) are positive semidefinite
    (d)  x_1 + x_2 sin(ϕ) + x_3 sin(2ϕ) + x_4 sin(4ϕ) ≥ 0   ∀ ϕ ∈ [0, π/2]

admits an explicit w.-s.r. This w.-s.r. can be built fully automatically, by a compiler, and this is how CVX works.
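The same "compiler" idea is available in Python via cvxpy, a close analog of CVX: constraints are stated at a high level, and the modeling layer compiles them into the conic form (*) behind the scenes. A hedged sketch with arbitrary data; the weighted geometric means below only resemble constraint (b), they are not the exact constraint from the slide:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 8))
b = A @ np.ones(8)                      # chosen so that x = (1,...,1) is feasible

x = cp.Variable(8, nonneg=True)
cons = [
    A @ x == b,                                               # (a)
    cp.norm(x, 3) <= cp.geo_mean(x[[1, 2, 3]], p=[1, 2, 4])   # a (b)-like mix of a
    + 2 * cp.geo_mean(x[[0, 4, 5]], p=[1, 2, 1]),             # 3-norm and geo-means
]
prob = cp.Problem(cp.Minimize(cp.sum(x)), cons)
prob.solve()    # cvxpy builds the conic representation and calls a conic solver
print(prob.status, prob.value)
```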

Coming back to the Robust Counterpart of a scalar linear inequality, we want to answer the following
Question: how to describe in a tractable form the fact that a vector x satisfies the semi-infinite constraint

    ∀ α = [a; b] ∈ { α^n + Aζ : ζ ∈ Z } ⊂ R^{n+1} :   α^T [x; 1] ≡ a^T x + b ≤ 0.   (*)

We are about to give a nice answer in the case when the perturbation set Z is well-structured.
Substituting α = α^n + Aζ into α^T [x; 1] ≤ 0, we arrive at the inequality

    ζ^T A^T [x; 1] + [α^n]^T [x; 1] ≤ 0,

or, which is the same, setting p(x) = A^T [x; 1] and q(x) = −[α^n]^T [x; 1],

    (p(x))^T ζ ≤ q(x).   (!)

Note that p(x) and q(x) are affine in x, and that x satisfies (*) if and only if (!) holds true for every ζ ∈ Z, that is, if and only if the relation (!), considered as a scalar linear inequality in the variables ζ, is a consequence of the constraints defining Z.
Thus, our goal is to understand when a scalar linear inequality is a consequence of the system of constraints defining a well-structured set. The answer is given by Conic Duality.

Excursion to Conic Duality

Situation: we are given a system of linear equations and conic constraints

    A_0 w − b_0 = 0  &  A_i w − b_i ∈ K_i, 1 ≤ i ≤ I   (C)

w: variable from a Euclidean space E with inner product ⟨·,·⟩;
K_i: closed convex pointed cones with nonempty interiors in Euclidean spaces E_i with inner products ⟨·,·⟩_i;
w ↦ A_i w − b_i : E → E_i: affine mappings.
Case of primary interest: every K_i is
either a nonnegative orthant R^{m_i}_+ = { y ∈ R^{m_i} : y ≥ 0 } (E_i = R^{m_i}, ⟨y, z⟩_i = y^T z),
or a Lorentz cone L^{m_i} = { y ∈ R^{m_i} : y_{m_i} ≥ ‖[y_1; ...; y_{m_i − 1}]‖_2 } (E_i = R^{m_i}, ⟨y, z⟩_i = y^T z),
or a semidefinite cone S^{m_i}_+ (E_i = S^{m_i}, the space of m_i × m_i symmetric matrices with the Frobenius inner product ⟨y, z⟩_i = Tr(yz) = Σ_{r,s} y_rs z_rs).
Goal: to understand when a scalar linear inequality

    ⟨p, w⟩ ≥ q   (*)

is a consequence of (C), meaning that (*) is satisfied at every solution to (C).

Given a system

    A_0 w − b_0 = 0  &  A_i w − b_i ∈ K_i, 1 ≤ i ≤ I   (C)

of linear equations and conic constraints (w: variable from a Euclidean space (E, ⟨·,·⟩); K_i: closed convex pointed cones with nonempty interiors in Euclidean spaces (E_i, ⟨·,·⟩_i)), we want to understand when a scalar linear inequality

    ⟨p, w⟩ ≥ q   (*)

is a consequence of the system.
Linear aggregation: choose Lagrange multipliers y_i, 0 ≤ i ≤ I, such that for i ≥ 1, y_i belongs to the cone dual to K_i:

    i > 0 :  y_i ∈ K_i^* := { y : ⟨y, u⟩_i ≥ 0  ∀ u ∈ K_i }.

Since y_i ∈ K_i^*, the conic constraint A_i w − b_i ∈ K_i implies that ⟨y_i, A_i w − b_i⟩_i ≥ 0, that is,

    ⟨A_i^* y_i, w⟩ ≥ ⟨b_i, y_i⟩_i   (+)

where A^* : F → E is the conjugate of A : E → F: ⟨A^* y, x⟩_E ≡ ⟨y, Ax⟩_F. The equation A_0 w − b_0 = 0 implies (+) for i = 0 and every y_0. Summing up the inequalities (+) over i = 0, 1, ..., I, we arrive at the following conclusion:
Whenever y_i, 0 ≤ i ≤ I, are such that y_i ∈ K_i^* for i ≥ 1, and

    Σ_{i=0}^I A_i^* y_i = p   and   Σ_{i=0}^I ⟨b_i, y_i⟩_i ≥ q,

the scalar linear inequality (*) is a consequence of (C).

    A_0 w − b_0 = 0  &  A_i w − b_i ∈ K_i, 1 ≤ i ≤ I   (C)
    ⟨p, w⟩ ≥ q   (*)

(w: variable from a Euclidean space (E, ⟨·,·⟩); K_i: closed convex pointed cones with nonempty interiors in Euclidean spaces (E_i, ⟨·,·⟩_i).)

Conic Farkas Lemma: let (C) be strictly feasible:

    ∃ w̄ :  A_0 w̄ − b_0 = 0  &  A_i w̄ − b_i ∈ int K_i, 1 ≤ i ≤ I.

Then (*) is a consequence of (C) if and only if (*) can be obtained from (C) by linear aggregation, i.e., if and only if

    ∃ y_0, { y_i ∈ K_i^* }_{i=1}^I :   Σ_{i=0}^I A_i^* y_i = p  and  Σ_{i=0}^I ⟨b_i, y_i⟩_i ≥ q.

When all K_i are nonnegative orthants, the statement remains true when strict feasibility is replaced with feasibility.

Corollary [Conic Duality]: consider a conic problem

    Opt(P) = min_w { ⟨c, w⟩ :  A_0 w = b_0  &  A_i w − b_i ∈ K_i, 1 ≤ i ≤ I }   (P)

along with its dual problem

    Opt(D) = max_{y_0, ..., y_I} { Σ_{i=0}^I ⟨b_i, y_i⟩_i :  y_i ∈ K_i^* for i ≥ 1,  Σ_{i=0}^I A_i^* y_i = c }   (D)

Weak Duality: one always has Opt(P) ≥ Opt(D).
Strong Duality: assuming that (P) is strictly feasible and bounded below, (D) is solvable and Opt(P) = Opt(D). When all K_i are nonnegative orthants, Strong Duality holds with strict feasibility of (P) replaced by feasibility.
Proof of Strong Duality: apply the Conic Farkas Lemma to the inequality ⟨c, w⟩ ≥ Opt(P), which clearly is a consequence of the system of constraints in (P).
Note: nonnegative orthants, Lorentz cones, and semidefinite cones are self-dual. When all K_i are nonnegative orthants, or Lorentz cones, or semidefinite cones, the constraints y_i ∈ K_i^* in (D) thus take the form y_i ∈ K_i.
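Weak/Strong Duality is easy to check numerically in the LP case (all K_i nonnegative orthants). A small sketch, with data constructed so that (P) is strictly feasible and bounded below; the dual multipliers reported by the solver recover Opt(D) = Opt(P):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = A @ rng.normal(size=3) - 1.0          # A w0 - b = 1 > 0: strict feasibility
c = A.T @ rng.uniform(1, 2, size=5)       # c = A^T y with y > 0: bounded below

w = cp.Variable(3)
cons = [A @ w >= b]                       # K = nonnegative orthant
prob = cp.Problem(cp.Minimize(c @ w), cons)
prob.solve()
y = cons[0].dual_value                    # y >= 0 and A^T y = c at optimality
print("Opt(P) =", prob.value)
print("dual objective b^T y =", b @ y)    # equal to Opt(P) by Strong Duality
```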

The Conic Farkas Lemma is really deep already in the case when all conic inequalities in question are scalar: K_i = R_+ for all i. In this case it becomes the usual Farkas Lemma:
A linear inequality

    p^T x ≤ q   (*)

is a consequence of a solvable system of linear inequalities Ax ≤ b if and only if there exists y ≥ 0 such that A^T y = p and b^T y ≤ q, or, which is the same, if and only if the target inequality (*) can be obtained by linear aggregation (taking a weighted sum with nonnegative weights) of the inequalities from the system and the identically true inequality 0^T x ≤ 1.
While simply-looking, this fact is really deep. Indeed, consider the following derivation:

    { −1 ≤ x ≤ 1, −1 ≤ y ≤ 1 }  ⇒  { x² ≤ 1, y² ≤ 1 }  ⇒  x² + y² ≤ 2
    ⇒  x + y = 1·x + 1·y ≤ √(1² + 1²) · √(x² + y²) ≤ √2 · √2 = 2
    [we have used the Cauchy Inequality]

In this chain of implications, the starting point is a solvable system of linear inequalities, and the conclusion x + y ≤ 2 is a linear inequality as well; however, the steps use highly nonlinear arguments. According to the Farkas Lemma, every chain of this type, however long and with however complicated steps, can be replaced with linear aggregation. A statement with such a huge predictive power indeed must be deep!

Back to Robust Linear Optimization

Now we are ready to answer the question of when the scalar linear inequality in variables ζ

    (p(x))^T ζ ≤ q(x)   (!)

with p(x), q(x) affine in x, holds true for all ζ ∈ Z, where Z is well-structured:

    Z = { ζ ∈ R^L : ∃ u ∈ R^M :  A_0 ζ + B_0 u + c_0 = 0,  A_k ζ + B_k u + c_k ∈ K_k, 1 ≤ k ≤ K }   (#)

K_k: nonnegative orthants, Lorentz cones, or semidefinite cones.
Assume that either all K_k are nonnegative orthants, or (#) is strictly feasible. We ask when the scalar linear inequality

    (p(x))^T ζ + 0^T u ≤ q(x)

in the variables w = [ζ; u] holds true for all [ζ; u] satisfying the constraints in (#). By the Conic Farkas Lemma, this is so if and only if there exist y_0, y_1, ..., y_K such that

    A_0^T y_0 + Σ_{k=1}^K A_k^T y_k = −p(x),
    B_0^T y_0 + Σ_{k=1}^K B_k^T y_k = 0,
    y_0^T c_0 + Σ_{k=1}^K y_k^T c_k ≤ q(x),
    y_k ∈ K_k^* = K_k, 1 ≤ k ≤ K.   (##)

Since p(x) and q(x) are affine in x, (##) is a system of linear equations and conic constraints in the variables x, y_0, y_1, ..., y_K. We conclude that
the set X of those x for which (!) holds true for all ζ ∈ Z is well-structured:

    X = { x : ∃ y_0, y_1, ..., y_K :  (x, y_0, ..., y_K) satisfy (##) }.
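For the simplest well-structured perturbation set, the Euclidean ball Z = {ζ : ‖ζ‖_2 ≤ 1}, the certificate (##) collapses to the closed form ‖p(x)‖_2 ≤ q(x): the robust constraint (a^n + Aζ)^T x ≤ b for all ‖ζ‖_2 ≤ 1 becomes the second-order cone constraint a^{n,T} x + ‖A^T x‖_2 ≤ b. A closing sketch with made-up data (the box bound on x is only there to keep the problem bounded), including a sampling spot-check of robustness:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, L = 4, 3
a_n = rng.normal(size=n)              # nominal constraint coefficients
A = 0.1 * rng.normal(size=(n, L))     # basic shifts: a(zeta) = a_n + A @ zeta
b, c = 1.0, -np.ones(4)

x = cp.Variable(n)
robust = a_n @ x + cp.norm(A.T @ x, 2) <= b   # worst case over ||zeta||_2 <= 1
prob = cp.Problem(cp.Minimize(c @ x), [robust, cp.norm(x, "inf") <= 10])
prob.solve()

zs = rng.normal(size=(1000, L))
zs /= np.linalg.norm(zs, axis=1, keepdims=True)        # points on the unit sphere
print("max sampled a(zeta)^T x:", np.max((a_n + zs @ A.T) @ x.value), "<= b =", b)
```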