4 Duality

4.1 Numerical perturbation analysis example. Consider the quadratic program

minimize    x_1^2 + 2 x_2^2 − x_1 x_2 − x_1
subject to  x_1 + 2 x_2 ≤ u_1
            x_1 − 4 x_2 ≤ u_2
            5 x_1 + 76 x_2 ≤ 1,

with variables x_1, x_2, and parameters u_1, u_2.

(a) Solve this QP, for parameter values u_1 = −2, u_2 = −3, to find optimal primal variable values x_1* and x_2*, and optimal dual variable values λ_1*, λ_2*, and λ_3*. Let p* denote the optimal objective value. Verify that the KKT conditions hold for the optimal primal and dual variables you found (within reasonable numerical accuracy).

Hint: See §3.7 of the CVX users' guide to find out how to retrieve optimal dual variables. To specify the quadratic objective, use quad_form().

(b) We will now solve some perturbed versions of the QP, with

u_1 = −2 + δ_1,    u_2 = −3 + δ_2,

where δ_1 and δ_2 each take values from {−0.1, 0, 0.1}. (There are a total of nine such combinations, including the original problem with δ_1 = δ_2 = 0.) For each combination of δ_1 and δ_2, make a prediction p_pred of the optimal value of the perturbed QP, and compare it to p_exact, the exact optimal value of the perturbed QP (obtained by solving the perturbed QP). Put your results in the two righthand columns in a table with the form shown below. Check that the inequality p_pred ≤ p_exact holds.

 δ_1     δ_2     p_pred    p_exact
  0       0
  0      −0.1
  0       0.1
 −0.1     0
 −0.1    −0.1
 −0.1     0.1
  0.1     0
  0.1    −0.1
  0.1     0.1
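To fix notation, here is a minimal CVX sketch of part (a); the names Q, f, and lambda are ours, and the sensitivity prediction in the final comment is the standard one based on the optimal dual variables.

u1 = -2; u2 = -3;
Q = [1 -0.5; -0.5 2];            % so that x'*Q*x = x1^2 + 2*x2^2 - x1*x2
f = [-1; 0];
A = [1 2; 1 -4; 5 76];
cvx_begin
    variable x(2)
    dual variable lambda
    minimize( quad_form(x, Q) + f'*x )
    subject to
        lambda : A*x <= [u1; u2; 1];
cvx_end
p_star = cvx_optval;
% KKT check (up to numerical tolerance): lambda >= 0, A*x <= [u1; u2; 1],
% lambda .* ([u1; u2; 1] - A*x) = 0, and 2*Q*x + f + A'*lambda = 0.
% For part (b), the prediction based on the dual variables is
%   p_pred = p_star - lambda(1)*delta1 - lambda(2)*delta2.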
4.2 A determinant maximization problem. We consider the problem

minimize    log det X^{−1}
subject to  A_i^T X A_i ⪯ B_i,   i = 1, ..., m,

with variable X ∈ S^n, and problem data A_i ∈ R^{n×k_i}, B_i ∈ S^{k_i}_{++}, i = 1, ..., m. The constraint X ≻ 0 is implicit.

We can give several interpretations of this problem. Here is one, from statistics. Let z be a random variable in R^n, with covariance matrix X, which is unknown. However, we do have (matrix) upper bounds on the covariance of the random variables y_i = A_i^T z ∈ R^{k_i}, which is A_i^T X A_i. The problem is to find the covariance matrix for z that is consistent with the known upper bounds on the covariance of the y_i, and has the largest volume confidence ellipsoid.

Derive the Lagrange dual of this problem. Be sure to state what the dual variables are (e.g., vectors, scalars, matrices), any constraints they must satisfy, and what the dual function is. If the dual function has any implicit equality constraints, make them explicit. You can assume that sum_{i=1}^m A_i A_i^T ≻ 0, which implies the feasible set of the original problem is bounded. What can you say about the optimal duality gap for this problem?

4.3 The relative entropy between two vectors x, y ∈ R^n_{++} is defined as

sum_{k=1}^n x_k log(x_k / y_k).

This is a convex function, jointly in x and y. In the following problem we calculate the vector x that minimizes the relative entropy with a given vector y, subject to equality constraints on x:

minimize    sum_{k=1}^n x_k log(x_k / y_k)
subject to  Ax = b
            1^T x = 1.

The optimization variable is x ∈ R^n. The domain of the objective function is R^n_{++}. The parameters y ∈ R^n_{++}, A ∈ R^{m×n}, and b ∈ R^m are given. Derive the Lagrange dual of this problem and simplify it to get

maximize    −b^T z − log sum_{k=1}^n y_k e^{−a_k^T z}

(a_k is the kth column of A).
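This dual is easy to sanity-check numerically, since CVX provides rel_entr and log_sum_exp; the data below are made up, with b generated from a feasible point so the primal is surely feasible.

m = 3; n = 10;
y = 0.1 + rand(n,1); A = randn(m,n);
x0 = rand(n,1); x0 = x0/sum(x0);     % a strictly positive feasible point
b = A*x0;
cvx_begin
    variable x(n)
    minimize( sum(rel_entr(x, y)) )  % rel_entr(x,y) = x.*log(x./y)
    subject to
        A*x == b;
        sum(x) == 1;
cvx_end
p_primal = cvx_optval;
cvx_begin
    variable z(m)
    maximize( -b'*z - log_sum_exp(log(y) - A'*z) )
cvx_end
% by strong duality, cvx_optval here should agree with p_primal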
4.4 Source localization from range measurements. A signal emitted by a source at an unknown position x ∈ R^n (n = 2 or n = 3) is received by m sensors at known positions y_1, ..., y_m ∈ R^n. From the strength of the received signals, we can obtain noisy estimates d_k of the distances ‖x − y_k‖_2. We are interested in estimating the source position x based on the measured distances d_k.

In the following problem the error between the squares of the actual and observed distances is minimized:

minimize    f_0(x) = sum_{k=1}^m ( ‖x − y_k‖_2^2 − d_k^2 )^2.

Introducing a new variable t = x^T x, we can express this as

minimize    sum_{k=1}^m ( t − 2 y_k^T x + ‖y_k‖_2^2 − d_k^2 )^2
subject to  x^T x − t = 0.                                        (4)

The variables are x ∈ R^n, t ∈ R. Although this problem is not convex, it can be shown that strong duality holds. (It is a variation on the problem discussed on page 229 and in exercise 5.29 of Convex Optimization.)

Solve (4) for an example with m = 5,

y_1 = (1.8, 2.5),  y_2 = (2.0, 1.7),  y_3 = (1.5, 1.5),  y_4 = (1.5, 2.0),  y_5 = (2.5, 1.5),

and d = (2.00, 1.24, 0.59, 1.31, 1.44). The figure shows some contour lines of the cost function f_0, with the positions y_k indicated by circles.

To solve the problem, you can note that x is easily obtained from the KKT conditions for (4) if the optimal multiplier ν for the equality constraint is known. You can use one of the following two methods to find ν.

• Derive the dual problem, express it as an SDP, and solve it using CVX.
• Reduce the KKT conditions to a nonlinear equation in ν, and pick the correct solution (similarly as in exercise 5.29 of Convex Optimization).

4.5 Projection on the ℓ_1 ball. Consider the problem of projecting a point a ∈ R^n on the unit ball in ℓ_1-norm:

minimize    (1/2) ‖x − a‖_2^2
subject to  ‖x‖_1 ≤ 1.

Derive the dual problem and describe an efficient method for solving it. Explain how you can obtain the optimal x from the solution of the dual problem.

4.6 A nonconvex problem with strong duality. On page 229 of Convex Optimization, we consider the problem

minimize    f(x) = x^T A x + 2 b^T x
subject to  x^T x ≤ 1.                                            (5)
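As a numerical aside, the global optimality conditions for (5), namely (A + νI)x = −b for some ν ≥ 0 with A + νI ⪰ 0 and ν(1 − x^T x) = 0, can be solved directly by bisection on ν. The sketch below is our own code, not part of the exercise, and it ignores the degenerate "hard case" in which b is orthogonal to the eigenspace of the smallest eigenvalue of A.

function x = solve_tr(A, b)
    % minimize x'*A*x + 2*b'*x subject to x'*x <= 1 (A symmetric, possibly indefinite)
    lam_min = min(eig(A));
    if lam_min > 0 && norm(A\b) <= 1
        x = -A\b; return;                 % constraint inactive, nu = 0
    end
    phi = @(nu) norm((A + nu*eye(length(b))) \ b);
    lo = max(0, -lam_min) + 1e-10;        % need A + nu*I positive definite
    hi = lo + 1;
    while phi(hi) > 1, hi = 2*hi; end     % phi is decreasing on (lo, inf)
    for k = 1:100                         % bisection on phi(nu) = 1
        nu = (lo + hi)/2;
        if phi(nu) > 1, lo = nu; else, hi = nu; end
    end
    x = -(A + nu*eye(length(b))) \ b;
end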
(a) Explain how to find c_i^max and c_i^min. Your method can involve the solution of a reasonable number (not exponential in n, m, or r) of convex or quasiconvex optimization problems.

(b) Carry out your method using the data found in deducing_costs_data.m. You may need to determine whether individual inequality constraints are tight; to do so, use a tolerance threshold of ε = 10^{−3}. (In other words: if a_k^T x ≥ b_k − 10^{−3}, you can consider this inequality as tight.) Give the values of c_i^max and c_i^min, and make a very brief comment on the results.

4.14 Kantorovich inequality.

(a) Suppose a ∈ R^n with a_1 ≥ a_2 ≥ ··· ≥ a_n > 0, and b ∈ R^n with b_k = 1/a_k. Derive the KKT conditions for the convex optimization problem

minimize    −log(a^T x) − log(b^T x)
subject to  x ⪰ 0,  1^T x = 1.

Show that x = (1/2, 0, ..., 0, 1/2) is optimal.

(b) Suppose A ∈ S^n_{++} with eigenvalues λ_k sorted in decreasing order. Apply the result of part (a), with a_k = λ_k, to prove the Kantorovich inequality:

(u^T A u)^{1/2} (u^T A^{−1} u)^{1/2} ≤ (1/2) ( (λ_1/λ_n)^{1/2} + (λ_n/λ_1)^{1/2} )

for all u with ‖u‖_2 = 1.

4.15 State and solve the optimality conditions for the problem

minimize    log det ( [ X_1  X_2 ; X_2^T  X_3 ]^{−1} )
subject to  tr X_1 = α
            tr X_2 = β
            tr X_3 = γ.

The optimization variable is the matrix

X = [ X_1  X_2 ; X_2^T  X_3 ],

with X_1 ∈ S^n, X_2 ∈ R^{n×n}, X_3 ∈ S^n. The domain of the objective function is S^{2n}_{++}. We assume α > 0, and αγ > β^2.

4.16 Consider the optimization problem

minimize    −log det X + tr(SX)
subject to  X is tridiagonal,

with domain S^n_{++} and variable X ∈ S^n. The matrix S ∈ S^n is given. Show that the optimal X_opt satisfies

(X_opt^{−1})_{ij} = S_{ij},   |i − j| ≤ 1.
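This optimality condition is easy to confirm numerically. In the sketch below (our own construction), S is made positive definite so the problem is certainly bounded, and CVX's log_det handles the implicit constraint X ≻ 0.

n = 6;
S = randn(n); S = S*S'/n + eye(n);       % a positive definite test matrix
band = abs((1:n)' - (1:n)) <= 1;         % logical mask of the tridiagonal band
cvx_begin
    variable X(n,n) symmetric
    minimize( -log_det(X) + trace(S*X) )
    subject to
        X(~band) == 0;                   % X tridiagonal
cvx_end
Xinv = inv(X);
max(abs(Xinv(band) - S(band)))           % should be near zero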
4.21 Robust LP with polyhedral cost uncertainty. We consider a robust linear programming problem, with polyhedral uncertainty in the cost:

minimize    sup_{c ∈ C} c^T x
subject to  Ax ⪯ b,

with variable x ∈ R^n, where C = {c | Fc ⪯ g}. You can think of x as the quantities of n products to buy (or sell, when x_i < 0), Ax ⪯ b as constraints, requirements, or limits on the available quantities, and C as giving our knowledge or assumptions about the product prices at the time we place the order. The objective is then the worst (i.e., largest) possible cost, given the quantities x, consistent with our knowledge of the prices.

In this exercise, you will work out a tractable method for solving this problem. You can assume that C ≠ ∅, and the inequalities Ax ⪯ b are feasible.

(a) Let f(x) = sup_{c ∈ C} c^T x be the objective in the problem above. Explain why f is convex.

(b) Find the dual of the problem

maximize    c^T x
subject to  Fc ⪯ g,

with variable c. (The problem data are x, F, and g.) Explain why the optimal value of the dual is f(x).

(c) Use the expression for f(x) found in part (b) in the original problem, to obtain a single LP equivalent to the original robust LP.

(d) Carry out the method found in part (c) to solve a robust LP with data

rand('seed',0);
A = rand(30,10); b = rand(30,1);
c_nom = 1+rand(10,1); % nominal c values

and C described as follows. Each c_i deviates no more than 25% from its nominal value, i.e., 0.75 c_nom ⪯ c ⪯ 1.25 c_nom, and the average of c does not deviate more than 10% from the average of the nominal values, i.e., 0.9 (1^T c_nom)/n ≤ 1^T c/n ≤ 1.1 (1^T c_nom)/n. Compare the worst-case cost f(x) and the nominal cost c_nom^T x for x optimal for the robust problem, and for x optimal for the nominal problem (i.e., the case where C = {c_nom}). Compare the values and make a brief comment.
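For part (d), the set C first has to be put in the form {c | Fc ⪯ g}; the stacking below is one natural encoding, and the LP that follows is our own sketch of the kind of problem part (c) produces (it presumes the standard inequality-form LP dual in part (b)), not a prescribed solution.

n = 10;
% encode C = {c : F*c <= g}: 25% box constraints, then the 10% average constraint
F = [ eye(n); -eye(n); ones(1,n)/n; -ones(1,n)/n ];
g = [ 1.25*c_nom; -0.75*c_nom; 1.1*mean(c_nom); -0.9*mean(c_nom) ];
% one equivalent LP: lam plays the role of the multiplier for F*c <= g
cvx_begin
    variables x(n) lam(2*n+2)
    minimize( g'*lam )
    subject to
        A*x <= b;
        F'*lam == x;
        lam >= 0;
cvx_end
% the worst-case cost of this x is g'*lam; compare it with c_nom'*x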
4.22 Diagonal scaling with prescribed column and row sums. Let A be an n×n matrix with positive entries, and let c and d be positive n-vectors that satisfy 1^T c = 1^T d = 1. Consider the geometric program

minimize    x^T A y
subject to  prod_{i=1}^n x_i^{c_i} = 1
            prod_{j=1}^n y_j^{d_j} = 1,

with variables x, y ∈ R^n (and implicit constraints x ≻ 0, y ≻ 0). Write this geometric program in convex form and derive the optimality conditions. Show that if x and y are optimal, then the matrix

B = (1/(x^T A y)) diag(x) A diag(y)

satisfies B1 = c and B^T 1 = d.

4.23 A theorem due to Schoenberg. Suppose m balls in R^n, with centers a_i and radii r_i, have a nonempty intersection. We define y to be a point in the intersection, so

‖y − a_i‖_2 ≤ r_i,   i = 1, ..., m.                               (18)

Suppose we move the centers to new positions b_i in such a way that the distances between the centers do not increase:

‖b_i − b_j‖_2 ≤ ‖a_i − a_j‖_2,   i, j = 1, ..., m.                (19)

We will prove that the intersection of the translated balls is nonempty, i.e., there exists a point x with ‖x − b_i‖_2 ≤ r_i, i = 1, ..., m. To show this we prove that the optimal value of

minimize    t
subject to  ‖x − b_i‖_2^2 ≤ r_i^2 + t,   i = 1, ..., m,           (20)

with variables x ∈ R^n and t ∈ R, is less than or equal to zero.

(a) Show that (19) implies that

t ≤ (x − b_i)^T (x − b_j) − (y − a_i)^T (y − a_j)   for i, j ∈ I,

if (x, t) is feasible in (20), and I ⊆ {1, ..., m} is the set of active constraints at x, t.

(b) Suppose x, t are optimal in (20) and that λ_1, ..., λ_m are optimal dual variables. Use the optimality conditions for (20) and the inequality in part (a) to show that

t = t − ‖ sum_{i=1}^m λ_i (x − b_i) ‖_2^2 ≤ −‖ sum_{i=1}^m λ_i (y − a_i) ‖_2^2.
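The certificate in (20) is easy to observe numerically. In the made-up instance below (our own data), the four original balls share the point (1, 1), and the new centers are a contraction of the old ones, so (19) holds; the optimal t should come out nonpositive.

m = 4; n = 2;
a = [0 2 0 2; 0 0 2 2];              % original centers a_i, one per column
r = 1.6*ones(m,1);                   % each ball then contains the point (1,1)
bc = 0.9*a;                          % new centers: ||bc_i - bc_j|| <= ||a_i - a_j||
cvx_begin
    variables x(n) t
    minimize( t )
    subject to
        for i = 1:m
            sum_square(x - bc(:,i)) <= r(i)^2 + t;
        end
cvx_end
% t <= 0 certifies that the translated balls still intersect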