Homework Set #8 - Solutions


EE 150 - Applications of Convex Optimization in Signal Processing and Communications
Dr. Andre Tkacenko, JPL, Third Term 2011-2012

1. For all parts of this problem, we will use the fact that H_R(f) can be expressed as

    H_R(f) = c(f)^T b, where [c(f)]_k = cos(2π(k−1)f), k = 1, ..., M+1.

In addition, we will define the following functions for the sake of convenience:

    g_1(b) ≜ sup_{0 ≤ f ≤ f_P} H_R(f) = sup_{0 ≤ f ≤ f_P} c(f)^T b,
    g_2(b) ≜ inf_{0 ≤ f ≤ f_P} H_R(f) = inf_{0 ≤ f ≤ f_P} c(f)^T b,
    g_3(b) ≜ sup_{f_S ≤ f ≤ 1/2} H_R(f) = sup_{f_S ≤ f ≤ 1/2} c(f)^T b,
    g_4(b) ≜ inf_{f_S ≤ f ≤ 1/2} H_R(f) = inf_{f_S ≤ f ≤ 1/2} c(f)^T b.

Note that, as g_1 and g_3 are each the pointwise supremum of affine (actually linear) functions of b, they are each convex. Similarly, as g_2 and g_4 are each the pointwise infimum of affine (or rather linear) functions of b, they are each concave.

(a) The optimization problem here is the following:

    minimize    δ_S
    subject to  g_1(b) ≤ 1 + δ_P,
                g_2(b) ≥ 1 − δ_P,
                g_3(b) ≤ δ_S,
                g_4(b) ≥ −δ_S,

with variables δ_S ∈ R and b ∈ R^{M+1}. This is identical to the following problem:

    minimize    δ_S
    subject to  g_1(b) − (1 + δ_P) ≤ 0,
                −g_2(b) + (1 − δ_P) ≤ 0,
                g_3(b) − δ_S ≤ 0,
                −g_4(b) − δ_S ≤ 0.

As the objective δ_S is clearly convex, g_1 and g_3 are convex, and g_2 and g_4 are concave, this problem is a convex optimization problem.
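To see the functions g_1, ..., g_4 concretely, the following short MATLAB sketch (not part of the original solution; the values of M, b, and the grid density are illustrative placeholders) approximates them by sampling H_R(f) = c(f)^T b on a dense frequency grid.

% Approximate g_1(b),...,g_4(b) by sampling H_R(f) = c(f)^T b on a grid.
% All numerical values below are illustrative placeholders.
f_P = 1./6; f_S = 1./5;          % example band edges
M = 10;                          % example filter order
b = randn(M+1,1);                % example amplitude coefficient vector
L = 1000;                        % grid density
f = (0:L)./(2*L);                % frequency grid over [0, 1/2]
C = cos(2*pi*(0:M)'*f);          % C(:,l) = c(f_l), with k-1 running over 0:M
H_R = C'*b;                      % sampled amplitude response
g1 = max(H_R(f <= f_P));         % approximates the sup over the passband
g2 = min(H_R(f <= f_P));         % approximates the inf over the passband
g3 = max(H_R(f >= f_S));         % approximates the sup over the stopband
g4 = min(H_R(f >= f_S));         % approximates the inf over the stopband

On such a grid, the suprema and infima become maxima and minima over finitely many linear functions of b, which is exactly how the discretization in part (d) below turns these constraints into linear inequalities.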

(b) Define the function g_5(b) as follows:

    g_5(b) ≜ inf {φ : −δ_S ≤ H_R(f) ≤ δ_S for φ ≤ f ≤ 1/2}
           = inf {φ : −δ_S ≤ c(f)^T b ≤ δ_S for φ ≤ f ≤ 1/2}.

Note that g_5(b) is quasiconvex as a function of b. To see this, note that the φ_0-sublevel set C_{φ_0} is given by

    C_{φ_0} = {b : g_5(b) ≤ φ_0} = {b : −δ_S ≤ c(f)^T b ≤ δ_S for φ_0 ≤ f ≤ 1/2},

which is the intersection of an infinite number of halfspaces. As such, C_{φ_0} is convex, and so g_5(b) is quasiconvex. With the function g_5 defined as such, the optimization problem becomes the following:

    minimize    g_5(b)
    subject to  g_1(b) ≤ 1 + δ_P,
                g_2(b) ≥ 1 − δ_P,

with variable b ∈ R^{M+1}. This is identical to the problem

    minimize    g_5(b)
    subject to  g_1(b) − (1 + δ_P) ≤ 0,
                −g_2(b) + (1 − δ_P) ≤ 0.

As the objective function g_5(b) is quasiconvex and the constraints are convex, this problem is a quasiconvex optimization problem.

(c) Define the function g_6(b) as follows:

    g_6(b) ≜ min {k : b_{k+1} = ··· = b_{M+1} = 0}.

Note that g_6(b) is a quasiconvex function of b. To see this, note that the k_0-sublevel set C_{k_0} is given by

    C_{k_0} = {b : g_6(b) ≤ k_0} = {b : b_{k_0+1} = ··· = b_{M+1} = 0},

which is an affine set. As such, C_{k_0} is convex, and so g_6(b) is quasiconvex. With the function g_6 defined as such, the optimization problem becomes the following:

    minimize    g_6(b)
    subject to  g_1(b) ≤ 1 + δ_P,
                g_2(b) ≥ 1 − δ_P,
                g_3(b) ≤ δ_S,
                g_4(b) ≥ −δ_S,

with variable b ∈ R^{M+1}. This is identical to the following problem:

    minimize    g_6(b)
    subject to  g_1(b) − (1 + δ_P) ≤ 0,
                −g_2(b) + (1 − δ_P) ≤ 0,
                g_3(b) − δ_S ≤ 0,
                −g_4(b) − δ_S ≤ 0.

As the objective function g_6(b) is quasiconvex and the constraints are convex, this problem is a quasiconvex optimization problem.

(d) After discretizing the frequency to f_l = l/(2L), define the index sets I_P and I_S as follows:

    I_P ≜ {l : 0 ≤ f_l ≤ f_P},
    I_S ≜ {l : f_S ≤ f_l ≤ 1/2}.

In other words, I_P and I_S denote the sets of discretized frequency indices corresponding to the passband and stopband, respectively. With these sets defined as such, the problem in part (c) can be solved via bisection by solving the following feasibility LP at each step:

    find        b
    subject to  c(f_l)^T b − (1 + δ_P) ≤ 0,  l ∈ I_P,
                −c(f_l)^T b + (1 − δ_P) ≤ 0,  l ∈ I_P,
                c(f_l)^T b − δ_S ≤ 0,  l ∈ I_S,
                −c(f_l)^T b − δ_S ≤ 0,  l ∈ I_S.

Using cvx in MATLAB, we find M = 27, and that the optimal filter which achieves this minimal filter length has a magnitude response (in dB) as shown in Figure 1. Close-ups of the passband and stopband regions are shown in Figure 2(a) and (b), respectively, and clearly satisfy the design specifications.

[Figure 1: Minimal length FIR low-pass filter magnitude response in dB.]

The following MATLAB code (employing cvx) was used to generate these results.

% Define filter specifications
f_P = 1./6;
f_S = 1./5;
delta_P_dB = 0.1;
delta_S_dB = -30;

% Map magnitude response specifications to amplitude response ones
K = 10.^(delta_P_dB./20);
delta_P = (K-1)./(K+1);
delta_S = 10.^(delta_S_dB./20);

[Figure 2: Minimal length FIR low-pass filter magnitude response in dB: (a) close-up of the passband region 0 ≤ f ≤ f_P and (b) close-up of the stopband region f_S ≤ f ≤ 1/2.]

% Set discretized frequency parameters
L = 16383;
f = [0:L]./(2.*L);
I_P = find(f >= 0 & f <= f_P);
I_S = find(f >= f_S & f <= 0.5);

% Define bisection parameters
M_LL = 0;
M_UL = 100;
bisec_tol = 1;

% Carry out bisection using a while-loop
while (M_UL - M_LL >= bisec_tol)
    % Set the midpoint filter order
    M_MP = round((M_UL+M_LL)./2);
    % Compute the discretized cosine vectors
    k = (0:M_MP)';
    C = cos(2.*pi.*k*f);
    % Solve the feasibility LP
    cvx_begin
        variable b(M_MP + 1);
        % Passband constraints
        C(:,I_P)'*b - (1+delta_P) <= 0;
        -C(:,I_P)'*b + (1-delta_P) <= 0;
        % Stopband constraints
        C(:,I_S)'*b - delta_S <= 0;
        -C(:,I_S)'*b - delta_S <= 0;
    cvx_end
    % Check for feasibility and update the bisection ignorance interval
    if (strcmp(cvx_status,'Solved'))
        M_UL = M_MP;
        b_opt = b;
        M_opt = M_MP;
    else
        M_LL = M_MP;
    end
end

% Plot the optimal magnitude response in dB
f = linspace(0,0.5,8192);
k = (0:M_opt)';
C = cos(2.*pi.*k*f);
H_R = C'*b_opt;
plot(f,20*log10(abs(H_R)),'LineWidth',1)
grid on
xlabel('$f$','Interpreter','LaTeX','FontSize',20)
ylabel('$\left| H\!\left( e^{j2\pi f} \right) \right|$ (dB)',...
    'Interpreter','LaTeX','FontSize',20)
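As an optional sanity check (not part of the original solution), the designed filter can be re-evaluated on a finer grid than the one used during bisection, to confirm that the specifications also hold between the original grid points; the grid size below is an arbitrary choice, and b_opt, M_opt, f_P, f_S, delta_P, and delta_S are assumed to remain in the workspace from the script above.

% Re-verify the optimal filter against the specifications on a finer grid
L_chk = 65536;                                  % arbitrary, denser than L
f_chk = (0:L_chk)./(2*L_chk);
H_chk = cos(2*pi*(0:M_opt)'*f_chk)'*b_opt;
pass_ok = all(abs(H_chk(f_chk <= f_P) - 1) <= delta_P);
stop_ok = all(abs(H_chk(f_chk >= f_S)) <= delta_S);
fprintf('passband spec met: %d, stopband spec met: %d\n', pass_ok, stop_ok);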

2. (a) We get a subset P ⊆ R^3 (which will soon be shown to be a polyhedron) of locations x that are consistent with the camera measurements. To find the smallest box that covers any subset of R^3, all we need to do is minimize and maximize the (linear) functions x_1, x_2, and x_3 to get l and u. Here, P is a polyhedron, so we will end up solving 6 LPs, one to get each of l_1, l_2, l_3, u_1, u_2, and u_3.

To verify that P is a polyhedron, note that our measurements tell us that

    v̂_i − ρ_i/2 ≤ (A_i x + b_i)/(c_i^T x + d_i) ≤ v̂_i + ρ_i/2,  i = 1, ..., m.

Multiplying through by c_i^T x + d_i, which is positive, we get

    (v̂_i − ρ_i/2)(c_i^T x + d_i) ≤ A_i x + b_i ≤ (v̂_i + ρ_i/2)(c_i^T x + d_i),  i = 1, ..., m,

which is a set of 2m linear (vector) inequalities in x. In particular, it defines a set P which is a polyhedron. To get l_k and u_k, we solve the LPs

    minimize/maximize  x_k
    subject to         (v̂_i − ρ_i/2)(c_i^T x + d_i) ≤ A_i x + b_i,  i = 1, ..., m,
                       A_i x + b_i ≤ (v̂_i + ρ_i/2)(c_i^T x + d_i),  i = 1, ..., m,

for k = 1, 2, 3. Here, it is understood that l_k is found by solving the minimization problem, whereas u_k is found by solving the maximization problem.
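For illustration, the 2m vector inequalities can also be stacked into a single system F x ⪯ g. The sketch below is not from the original solution, and the camera data it uses are random placeholders; it builds F and g for a single camera.

% Stack the two vector inequalities of one camera into F*x <= g form.
% The camera data below are random placeholders for illustration.
A_i = randn(2,3); b_i = randn(2,1);    % hypothetical camera matrix blocks
c_i = randn(3,1); d_i = 5;             % choose d_i so that c_i'*x + d_i > 0
vhat_i = randn(2,1); rho_i = 0.1;      % hypothetical measurement data
lo = vhat_i - rho_i/2;                 % lower measurement bounds
hi = vhat_i + rho_i/2;                 % upper measurement bounds
% lo*(c_i'*x + d_i) <= A_i*x + b_i  <=>  (lo*c_i' - A_i)*x <= b_i - lo*d_i
% A_i*x + b_i <= hi*(c_i'*x + d_i)  <=>  (A_i - hi*c_i')*x <= hi*d_i - b_i
F = [lo*c_i' - A_i; A_i - hi*c_i'];
g = [b_i - lo*d_i; hi*d_i - b_i];

Repeating this for each camera and stacking the blocks gives the full polyhedron P = {x : F x ⪯ g}.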

(b) The following MATLAB script (using cvx) solves this specific problem instance.

% Load the data and extract the sections of the camera matrices
camera_data;
A1 = P(1:2,1:3,1); b1 = P(1:2,4,1); c1 = P(3,1:3,1)'; d1 = P(3,4,1);
A2 = P(1:2,1:3,2); b2 = P(1:2,4,2); c2 = P(3,1:3,2)'; d2 = P(3,4,2);
A3 = P(1:2,1:3,3); b3 = P(1:2,4,3); c3 = P(3,1:3,3)'; d3 = P(3,4,3);
A4 = P(1:2,1:3,4); b4 = P(1:2,4,4); c4 = P(3,1:3,4)'; d4 = P(3,4,4);

% Solve the 6 LPs to find the smallest bounding box consistent with the
% measurements
cvx_quiet(true);
for bounds = 1:6
    cvx_begin
        variable x(3);
        switch bounds
            case 1
                minimize(x(1))
            case 2
                maximize(x(1))
            case 3
                minimize(x(2))
            case 4
                maximize(x(2))
            case 5
                minimize(x(3))
            case 6
                maximize(x(3))
        end
        % Constraints for the 1st camera
        (vhat(:,1)-rho(1)/2)*(c1'*x + d1) <= A1*x + b1;
        A1*x + b1 <= (vhat(:,1)+rho(1)/2)*(c1'*x + d1);
        % Constraints for the 2nd camera
        (vhat(:,2)-rho(2)/2)*(c2'*x + d2) <= A2*x + b2;
        A2*x + b2 <= (vhat(:,2)+rho(2)/2)*(c2'*x + d2);
        % Constraints for the 3rd camera
        (vhat(:,3)-rho(3)/2)*(c3'*x + d3) <= A3*x + b3;
        A3*x + b3 <= (vhat(:,3)+rho(3)/2)*(c3'*x + d3);
        % Constraints for the 4th camera
        (vhat(:,4)-rho(4)/2)*(c4'*x + d4) <= A4*x + b4;
        A4*x + b4 <= (vhat(:,4)+rho(4)/2)*(c4'*x + d4);
    cvx_end
    val(bounds) = cvx_optval;
end

% Display the minimal bounding box bounds
disp(['l1 = ',num2str(val(1))]);
disp(['u1 = ',num2str(val(2))]);
disp(['l2 = ',num2str(val(3))]);
disp(['u2 = ',num2str(val(4))]);
disp(['l3 = ',num2str(val(5))]);
disp(['u3 = ',num2str(val(6))]);

The script returns the following results:

    l_1 = 0.9956,   u_1 = 0.8245,
    l_2 = 0.2753,   u_2 = 0.37837,
    l_3 = 0.67899,  u_3 = 0.57352.

3. (a) Consider the desired measurement equations

    z_i = φ^{−1}(y_i),  i = 1, ..., m.    (1)

The function φ is unknown and needs to be estimated here; however, we do have bounds on its derivative. Specifically, we have

    1/β ≤ (φ^{−1})'(v) ≤ 1/α    (2)

for all v. To show this, note that we have

    φ(φ^{−1}(v)) = v.

Differentiating both sides of the above relation and using the chain rule, we have

    φ'(u) (φ^{−1})'(v) = 1, where u = φ^{−1}(v).

Hence, we get

    (φ^{−1})'(v) = 1/φ'(u), where u = φ^{−1}(v).    (3)

But as α ≤ φ'(u) ≤ β for all u, where 0 < α < β, we have

    1/β ≤ 1/φ'(u) ≤ 1/α for all u.

Substituting this into (3) thus yields

    1/β ≤ (φ^{−1})'(v) ≤ 1/α for all v.

Returning to the problem at hand, note that from (1) and the mean value theorem, there is a v ∈ (y_i, y_{i+1}) for which we have

    (φ^{−1})'(v) = (z_{i+1} − z_i)/(y_{i+1} − y_i)

for any i = 1, ..., m−1. Substituting this into (2), and assuming that the data is given with y_i in nondecreasing order, we get

    (1/β)(y_{i+1} − y_i) ≤ z_{i+1} − z_i ≤ (1/α)(y_{i+1} − y_i),  i = 1, ..., m−1.

Now, as v_i = z_i − a_i^T x and the v_i are i.i.d. with distribution N(0, σ²) for i = 1, ..., m, the log-likelihood function l(x, z) has the form

    l(x, z) = −C_1 Σ_{i=1}^m (z_i − a_i^T x)² + C_2,

where C_1 ≜ 1/(2σ²) (which satisfies C_1 > 0) and C_2 ≜ −(m/2) log(2πσ²) are constants. Thus, to find an ML estimate of x and z, we can minimize the objective

    f_0(x, z) ≜ Σ_{i=1}^m (z_i − a_i^T x)² = ||z − Ax||₂²

subject to the constraints. Note that here, the i-th row of A is a_i^T for i = 1, ..., m. This leads to the following problem:

    minimize    ||z − Ax||₂²
    subject to  (1/β)(y_{i+1} − y_i) ≤ z_{i+1} − z_i ≤ (1/α)(y_{i+1} − y_i),  i = 1, ..., m−1,

with variables x ∈ R^n and z ∈ R^m. This is a QP, and hence a convex optimization problem. Note that the problem does not depend on the noise variance σ².

(b) The following MATLAB code, which invokes cvx, was used to solve this specific problem instance.

% Load the nonlinear measurement data
nonlin_meas_data;

% Generate the (m-1) x m first order difference matrix Delta_1
Delta_row = zeros(1,m);
Delta_row(1) = -1;
Delta_row(2) = 1;
Delta_col = zeros(m-1,1);
Delta_col(1) = -1;
Delta_1 = toeplitz(Delta_col,Delta_row);

% Solve the QP used to get the ML estimates of x and z using cvx
cvx_begin
    variable x(n);
    variable z(m);
    minimize(norm(z - A*x))
    (1./beta).*Delta_1*y <= Delta_1*z;
    Delta_1*z <= (1./alpha).*Delta_1*y;
cvx_end

% Display the ML estimate of x
disp('ML estimate of x:');
disp(x);

% Plot the estimated function \widehat{\phi}_{\mathrm{ml}}
plot(z,y,'LineWidth',1)
xlabel('$u$','Interpreter','LaTeX','FontSize',20)
ylabel('$\widehat{\phi}_{\mathrm{ml}}\!\left( u \right)$','Interpreter',...
    'LaTeX','FontSize',20)

This yields the ML estimate of x given as follows:

    x_ml = [0.489  0.4657  0.9364  0.9297]^T.
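As a quick sanity check (not part of the original solution), one can verify numerically that the returned estimate ẑ satisfies the slope constraints; this assumes z, y, alpha, and beta from the script above are still in the workspace.

% Check the slope constraints on the ML estimate z, up to solver tolerance
dz = diff(z);
dy = diff(y);
tol = 1e-6;                  % small slack for numerical solver tolerance
ok = all(dz >= (1./beta).*dy - tol) && all(dz <= (1./alpha).*dy + tol);
fprintf('slope constraints satisfied: %d\n', ok);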

A plot of the estimated function φ̂_ml(u) is shown in Figure 3.

[Figure 3: Plot of the ML estimate of the nonlinear function φ(u), namely φ̂_ml(u). This was constructed by plotting [ẑ_ml]_i versus y_i for i = 1, ..., m.]

4. In this problem, to eliminate outliers successively, at each iteration we compute the Löwner-John ellipsoid for the current set of data points, remove the point corresponding to the largest Lagrange multiplier, and add it to the set of outliers. Let D_k and O_k denote, respectively, the data and outlier sets at the k-th iteration. Also, let x_i^(k) denote the point removed at the k-th iteration, i.e., the one that corresponds to the largest Lagrange multiplier. Then we have the following ellipsoidal peeling algorithm.

Initialization: D_0 ≜ {x_1, ..., x_N}, O_0 ≜ ∅.

Iteration: For k = 1, ..., N_rem, where N_rem denotes the total number of points to remove from the original data set, do the following.

1. Determine x_i^(k) as the point corresponding to the largest Lagrange multiplier of the following convex optimization problem:

       minimize    log det A^{−1}
       subject to  ||A x_i + b||₂ ≤ 1,  x_i ∈ D_{k−1},

   or, equivalently,

       minimize    −(det A)^{1/n}
       subject to  ||A x_i + b||₂ ≤ 1,  x_i ∈ D_{k−1},

   with variables A ∈ S^n and b ∈ R^n. When it comes to solving instantiations of this problem using cvx, we will use the second equivalent form of the problem, as the det_rootn subroutine is more efficient than the log_det one.

2. Update the data and outlier sets as follows:

       D_k = D_{k−1} \ {x_i^(k)},   O_k = O_{k−1} ∪ {x_i^(k)}.

3. Increment k as k ← k + 1.

4. If k ≤ N_rem, then go to 1. Otherwise, stop.

The following MATLAB code, which uses cvx, was used to remove outliers from the given data set here.

% Load the data
ellip_peel_data;

% Set the number of outliers to remove
N_rem = 30;

% Initialize the ellipsoidal peeling algorithm
D = data;
O = [];
log_vol_array = [];
n = size(data,1);

% Run for-loop iteration to remove outliers successively
for k = 1:(N_rem+1)
    % Find the minimum volume ellipsoid containing all current data points
    cvx_begin
        variable A(n,n) symmetric;
        variable b(n);
        dual variable lambda;
        minimize(-det_rootn(A))
        %minimize(-log_det(A))
        lambda : norms(A*D + repmat(b,1,size(D,2))) <= 1;
    cvx_end
    % Find the maximum dual variable (i.e., the largest Lagrange multiplier)
    % and remove the corresponding data point
    [lambda_max,outlier_index] = max(lambda);
    % Update the outlier and data sets
    O = [O D(:,outlier_index)];
    D(:,outlier_index) = [];
    % Store the logarithm of the volume of the optimal ellipsoid
    log_vol_array = [log_vol_array 0.5.*log(1./(det(A)))];
    clear A b;
end

% Plot the log of the volume of the optimal ellipsoid as a function of the
% number of points removed
figure
plot(0:N_rem,log_vol_array,'LineWidth',1)
grid on
xlabel('$\mathrm{card}\!\left( \mathcal{O} \right)$','Interpreter',...
    'LaTeX','FontSize',20)
ylabel('$\log\!\left(\mathrm{vol}\!\left(\mathcal{E}\right)\right)$',...
    'Interpreter','LaTeX','FontSize',20)

% Plot the data points to determine visually the number of outliers present
figure
plot(data(1,:),data(2,:),'.','LineWidth',1)
xlabel('$x$','Interpreter','LaTeX','FontSize',20)
ylabel('$y$','Interpreter','LaTeX','FontSize',20)

A plot of the logarithm of the volume of the resulting minimum volume covering ellipsoid E as a function of the number of points removed, card(O), is shown in Figure 4(a). Along with this, a plot of the data points themselves is given in Figure 4(b).

[Figure 4: Ellipsoidal peeling plots: (a) logarithm of the volume of the minimum covering ellipsoid, i.e., log(vol(E)), versus the number of data points removed, i.e., card(O), and (b) original data set.]

From the eyeball test, it is clear that there are 4 outliers. This is not immediately clear from the plot in Figure 4(a), which exhibits two knees. From Figure 4(a), it can be seen that there is a dramatic decrease in the volume of the minimum covering ellipsoid for the first 3 points, and then there is some stalling for the next two points, which suggests that non-outlier points were erroneously added to O. This corresponds to the first knee of the curve. Then there is another dramatic decrease at the next point, followed by a slow and gradual shrinking of the volume from that point on. This corresponds to the second knee of the curve. If we add up only the number of points leading to a dramatic decrease of the ellipsoidal volume, we obtain 4 outliers, consistent with the eyeball test.

The results of this simulation bring to light some of the advantages and disadvantages of this ellipsoidal peeling approach. If we accumulate only the points corresponding to large decreases in the ellipsoidal volume, then we obtain the correct number of outliers. However, the algorithm did not extract all of the outliers before extracting some of the normal data points.
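The "count only the large decreases" rule discussed above can be automated by thresholding the successive differences of the log-volume curve. The sketch below is not part of the original solution, and the threshold value is an arbitrary choice that would need tuning for a given data set; it assumes log_vol_array from the script above.

% Estimate the number of outliers from the drops in log(vol(E)).
drops = -diff(log_vol_array);        % decrease in log-volume per removed point
thresh = 0.5;                        % illustrative threshold on the drop size
num_outliers = sum(drops > thresh);
fprintf('estimated number of outliers: %d\n', num_outliers);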

*5. (a) Let ν ∈ R^n denote the dual variable. Then the Lagrangian is given by the following:

    L(x, ν) = ||Ax − b||₂² + Σ_{k=1}^n ν_k (x_k² − 1)
            = x^T A^T A x − 2 b^T A x + b^T b + x^T diag(ν) x − 1^T ν
            = x^T (A^T A + diag(ν)) x − 2 b^T A x − 1^T ν + b^T b.    (4)

From (4), it can be seen that L(x, ν) is unbounded below in x if A^T A + diag(ν) is not positive semidefinite or if A^T b ∉ R(A^T A + diag(ν)). When both A^T A + diag(ν) ⪰ 0 and A^T b ∈ R(A^T A + diag(ν)), the infimum of the Lagrangian occurs when the gradient with respect to x vanishes. This yields

    ∇_x L(x, ν) = 2 (A^T A + diag(ν)) x − 2 A^T b = 0,

which is equivalent to the normal equations

    (A^T A + diag(ν)) x = A^T b.

The solution to the normal equations is given by

    x = (A^T A + diag(ν))^# A^T b,

where (·)^# denotes the pseudo-inverse. Substituting this into (4), it follows that the dual function g(ν) = inf_x L(x, ν) is given by

    g(ν) = −1^T ν − b^T A (A^T A + diag(ν))^# A^T b + b^T b,  if A^T A + diag(ν) ⪰ 0 and A^T b ∈ R(A^T A + diag(ν)),
    g(ν) = −∞,  otherwise.

Hence, the dual problem is the following:

    maximize    −1^T ν − b^T A (A^T A + diag(ν))^# A^T b + b^T b
    subject to  A^T A + diag(ν) ⪰ 0,
                A^T b ∈ R(A^T A + diag(ν)).

Expressing the problem in epigraph form (with additional variable t ∈ R) yields the following equivalent problem:

    maximize    −1^T ν − t + b^T b
    subject to  b^T A (A^T A + diag(ν))^# A^T b ≤ t,
                A^T A + diag(ν) ⪰ 0,
                A^T b ∈ R(A^T A + diag(ν)).

In turn, this is identical to the problem

    maximize    −1^T ν − t + b^T b
    subject to  t − b^T A (A^T A + diag(ν))^# A^T b ≥ 0,
                A^T A + diag(ν) ⪰ 0,
                A^T b ∈ R(A^T A + diag(ν)),

which, upon using Schur complements, leads to the following equivalent problem, which is an SDP:

    maximize    −1^T ν − t + b^T b
    subject to  [A^T A + diag(ν), A^T b; b^T A, t] ⪰ 0.

(b) We first write the dual problem as a minimization problem by negating the objective function. This leads to the problem

    minimize    1^T ν + t − b^T b
    subject to  [A^T A + diag(ν), A^T b; b^T A, t] ⪰ 0.

To handle the linear matrix inequality (LMI) constraint, we introduce a Lagrange multiplier matrix of the form

    [Z, −z; −z^T, λ].

With this, the Lagrangian is given by the following:

    L(ν, t, Z, z, λ) = 1^T ν + t − b^T b − tr( [Z, −z; −z^T, λ] [A^T A + diag(ν), A^T b; b^T A, t] )
                     = 1^T ν + t − b^T b − tr( Z (A^T A + diag(ν)) − z b^T A ) − ( −z^T A^T b + λ t )
                     = 1^T ν + t − b^T b − tr(A^T A Z) − diag(Z)^T ν + b^T A z + z^T A^T b − λ t
                     = (1 − diag(Z))^T ν + t (1 − λ) − tr(A^T A Z) + 2 b^T A z − b^T b.

This is unbounded below unless diag(Z) = 1 and λ = 1. As such, the dual function is given by

    g(Z, z, λ) = −tr(A^T A Z) + 2 b^T A z − b^T b,  if diag(Z) = 1 and λ = 1,
    g(Z, z, λ) = −∞,  otherwise.

Note that [Z, −z; −z^T, λ] ⪰ 0 if and only if [Z, z; z^T, λ] ⪰ 0 (apply the congruence transformation diag(I_n, −1)), so the multiplier constraint may be written in terms of +z. Thus, the dual problem of the SDP of part (a) is the following:

    maximize    −tr(A^T A Z) + 2 b^T A z − b^T b
    subject to  diag(Z) = 1,
                [Z, z; z^T, 1] ⪰ 0.

Expressing this maximization problem as a minimization problem by negating the objective leads to the following equivalent problem:

    minimize    tr(A^T A Z) − 2 b^T A z + b^T b
    subject to  diag(Z) = 1,
                [Z, z; z^T, 1] ⪰ 0.    (5)

This is the desired form here.
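As a numerical cross-check (not part of the original solution), one can solve the SDP of part (a) and the problem (5) on a small random instance and confirm that their optimal values agree; both equal the SDP lower bound on the binary least-squares problem, since the max-form SDP is strictly feasible (take ν and t large enough), so strong duality holds for this dual pair.

% Cross-check that the SDP of part (a) and problem (5) have equal optimal
% values on a small random instance.  Requires cvx.
randn('state',1);
m = 8; n = 5;
A = randn(m,n); b = randn(m,1);
cvx_begin sdp quiet
    variables nu_v(n) t;
    maximize(-ones(1,n)*nu_v - t + b'*b)
    [A'*A + diag(nu_v), A'*b; b'*A, t] >= 0;
cvx_end
val_a = cvx_optval;
cvx_begin sdp quiet
    variable Z(n,n) symmetric;
    variable z(n);
    minimize(trace(A'*A*Z) - 2*b'*A*z + b'*b)
    diag(Z) == 1;
    [Z, z; z', 1] >= 0;
cvx_end
fprintf('SDP of part (a): %.4f, problem (5): %.4f\n', val_a, cvx_optval);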

To see that (5) is a relaxation of the original problem, note that the binary least-squares problem is equivalent to

    minimize    tr(A^T A Z) − 2 b^T A z + b^T b
    subject to  diag(Z) = 1,
                Z = z z^T.    (6)

Suppose we relax the equality constraint Z = z z^T with the weaker inequality constraint Z ⪰ z z^T. Using Schur complements, this weaker inequality constraint is equivalent to

    [1, z^T; z, Z] ⪰ 0.

But note that, for the permutation matrix P = [0_n, I_n; 1, 0_n^T], we have

    P [1, z^T; z, Z] P^T = [Z, z; z^T, 1],

so that

    [1, z^T; z, Z] ⪰ 0  ⟺  [Z, z; z^T, 1] ⪰ 0.

Substituting this equivalent weaker inequality constraint into (6), we get the problem from (5). If we have

    rank( [Z, z; z^T, 1] ) = 1

at the optimum of (5), which is equivalent to saying that Z = z z^T, then we satisfy the equality constraint Z = z z^T of (6), and so the relaxation is exact in this case. Thus, the optimal values of problems (6) and (5) are equal, and the optimal solution z of (5) is optimal for (6).

(c) Consider the problem

    minimize    E[ ||Av − b||₂² ]
    subject to  E[ v_k² ] = 1,  k = 1, ..., n.    (7)

Note that we have the following upon expanding the objective:

    E[ ||Av − b||₂² ] = E[ (Av − b)^T (Av − b) ]
                      = E[ tr( (Av − b)(Av − b)^T ) ]
                      = tr( E[ (Av − b)(Av − b)^T ] )
                      = tr( E[ A v v^T A^T − A v b^T − b v^T A^T + b b^T ] )
                      = tr( A E[v v^T] A^T − A E[v] b^T − b (E[v])^T A^T + b b^T )
                      = tr( A Z A^T − A z b^T − b z^T A^T + b b^T )    (8)
                      = tr(A^T A Z) − b^T A z − z^T A^T b + b^T b
                      = tr(A^T A Z) − 2 b^T A z + b^T b.    (9)

Here, (8) follows from the fact that z = E[v] and Z = E[v v^T]. Similarly, for the left-hand side of the equality constraints of (7), we get the following:

    E[ v_k² ] = Z_kk = [Z]_kk,  k = 1, ..., n.    (10)

Finally, the covariance matrix of v, given by

    C_v ≜ E[ (v − E[v])(v − E[v])^T ] = Z − z z^T,

must be positive semidefinite, i.e., C_v ⪰ 0. Hence, we must have Z − z z^T ⪰ 0, which was found in part (b) to be equivalent to

    [Z, z; z^T, 1] ⪰ 0.

Combining this constraint with the results of (9) and (10), it follows that the problem from (7) is equivalent to the problem

    minimize    tr(A^T A Z) − 2 b^T A z + b^T b
    subject to  diag(Z) = 1,
                [Z, z; z^T, 1] ⪰ 0,

which is the same as (5) from part (b).

(d) The following MATLAB code employing cvx was used to compute suboptimal feasible solutions using the desired heuristics.

% Initialize the problem data
randn('state',0);
s = 0.5;
%s = 1;
%s = 2;
%s = 3;
m = 50;
n = 40;
A = randn(m,n);
xhat = sign(randn(n,1));
b = A*xhat + s.*randn(m,1);
f_xhat = norm(A*xhat - b).^2;

% (i) Compute the sign of the solution to the LS problem
x_a = sign(A\b);
f_x_a = norm(A*x_a - b).^2;

% (ii) Compute the sign of the solution to the SDP of part b)
%cvx_precision high
cvx_begin sdp
    variable z(n);

    variable Z(n,n) symmetric;
    minimize(trace(A'*A*Z) - 2*b'*A*z + b'*b)
    [Z z ; z' 1] >= 0;
    diag(Z) == 1;
cvx_end
x_b = sign(z);
f_x_b = norm(A*x_b - b).^2;

% (iii) Compute the sign of the rank-one approximation of the optimal
% solution of the SDP of part b)
Y = [Z z ; z' 1];
[V,D] = eig(Y);
[eig_sorted,eig_index_sorted] = sort(diag(D),'descend');
v_1 = V(1:n,eig_index_sorted(1));
x_c = sign(v_1);
f_x_c = norm(A*x_c - b).^2;

% (iv) Compute the sign of the random sample with mean z and second moment
% Z
N = 100;
U = randn(n,N);
V_tilde = sqrtm(Z - z*z')*U + repmat(z,1,N);
X_d = sign(real(V_tilde));
R = A*X_d - repmat(b,1,N);
F_x_d = sum(R.^2);
[f_x_d,x_d_index] = min(F_x_d);
x_d = X_d(:,x_d_index);

% Report the objective function values for each approach
disp(['f_xhat = ',num2str(f_xhat)]);
disp(['f_x_a = ',num2str(f_x_a)]);
disp(['f_x_b = ',num2str(f_x_b)]);
disp(['f_x_c = ',num2str(f_x_c)]);
disp(['f_x_d = ',num2str(f_x_d)]);
disp(['SDP lower bound = ',num2str(cvx_optval)]);

% Report the square of the l_2-norm of the difference between each
% heuristic solution and the actual input to the problem
disp(['||xhat - x_a||_2^2 = ',num2str(norm(xhat-x_a).^2)]);
disp(['||xhat - x_b||_2^2 = ',num2str(norm(xhat-x_b).^2)]);
disp(['||xhat - x_c||_2^2 = ',num2str(norm(xhat-x_c).^2)]);
disp(['||xhat - x_d||_2^2 = ',num2str(norm(xhat-x_d).^2)]);

To generate samples of N(z, Z − z z^T), we used the following result from probability theory: if u ∼ N(µ, Σ), then w ≜ B u + c is such that w ∼ N(B µ + c, B Σ B^T). Thus, if u ∼ N(0_n, I_n), then w ≜ (Z − z z^T)^{1/2} u + z is such that w ∼ N(z, Z − z z^T).
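A quick empirical check of this fact (not part of the original solution; the sample size is an arbitrary choice): the sample mean and sample second moment of w = (Z − z z^T)^{1/2} u + z should approach z and Z as the number of samples grows, with Z and z taken from the SDP solution above.

% Empirically verify that w = sqrtm(Z - z*z')*u + z has mean z and
% second moment E[w*w'] = Z (up to sampling error).
N_chk = 100000;                                 % arbitrary sample size
U_chk = randn(n,N_chk);
W = real(sqrtm(Z - z*z'))*U_chk + repmat(z,1,N_chk);
mean_err = norm(mean(W,2) - z);
secmom_err = norm((W*W')./N_chk - Z,'fro');
fprintf('mean error: %.3g, second moment error: %.3g\n', mean_err, secmom_err);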

The following table lists, for each s, the values of f(x) = ||Ax − b||₂² for x = x̂ and for the four suboptimal solutions (i)-(iv), along with the lower bound obtained from the SDP relaxation given in (5).

    s    | f(x̂)     | f(x^(a)) | f(x^(b)) | f(x^(c)) | f(x^(d)) | SDP lower bound
    0.5  | 7.3243   | 7.3243   | 7.3243   | 7.3243   | 7.3243   | 6.427
    1    | 69.2974  | 62.0505  | 69.2974  | 69.2974  | 69.2974  | 6.9026
    2    | 277.895  | 908.5323 | 277.895  | 277.895  | 277.895  | 230.4446
    3    | 623.6765 | 5.57     | 673.6883 | 673.6883 | 623.6765 | 489.0238

From this table and the results of the simulations carried out here, we make the following observations.

- For s = 0.5, all heuristics return x̂. This is likely to be the global optimum, but that is not necessarily true. However, from the lower bound, we know that the global optimum is in the interval [6.427, 7.3243], so even if 7.3243 is not the global optimum, it is quite close.
- For higher values of s, the result from the first heuristic, x^(a), is substantially worse than those derived from the SDP-based heuristics.
- All three SDP-based heuristics return x̂ for s = 1 and s = 2. For s = 3, the randomized rounding method returns x̂. The other SDP-based heuristics give slightly higher values.