Tutorial on Convex Optimization: Part II Dr. Khaled Ardah Communications Research Laboratory TU Ilmenau Dec. 18, 2018
Outline Convex Optimization Review Lagrangian Duality Applications Optimal Power Allocation for Rate Maximization Downlink Beamforming as SDP and SOCP Uplink-Downlink Duality via Lagrangian Duality Disciplined convex programming and CVX
Convex Optimization Review
Mathematical optimization problem (P):
min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, ..., m
     h_j(x) = 0, j = 1, ..., p
Variable x ∈ R^n, domain (set) X = {x | f_i(x) ≤ 0 ∀i, h_j(x) = 0 ∀j}
P is convex if: the objective and the domain X are convex, and the h_j(x) are affine
P is still convex if: min is replaced by max and f_i(x) ≥ 0 with all f_i(x) concave
Feasible solution x: p = f_0(x), x ∈ X
Local optimal solution x̄: p = f_0(x̄) ≤ f_0(x) for all x ∈ X with ‖x − x̄‖ ≤ ε
Global optimal solution x*: p* = f_0(x*) ≤ f_0(x) for all x ∈ X
If P is convex, then every local optimal solution x̄ is a global optimal solution x*.
Lagrangian Duality
Mathematical optimization problem (P) (not necessarily convex):
min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, ..., m
     h_j(x) = 0, j = 1, ..., p
Variable x ∈ R^n, domain X, optimal value p* = f_0(x*)
Lagrangian function (named after Joseph-Louis Lagrange, 1811):
L(x, λ, ν) = f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{j=1}^p ν_j h_j(x)
L is a weighted sum of the objective and constraint functions
λ_i is the Lagrange multiplier associated with f_i(x) ≤ 0
ν_j is the Lagrange multiplier associated with h_j(x) = 0
Other names: weights, penalties, prices, ...
Lagrange dual function (problem)
Lagrange dual function:
g(λ, ν) = min_x L(x, λ, ν) = min_x ( f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{j=1}^p ν_j h_j(x) )
For any fixed x, L(x, λ, ν) is affine in (λ, ν); g(λ, ν) is thus a pointwise minimum of a family of affine functions of (λ, ν), so g(λ, ν) is always a concave function
We say that (λ, ν) is dual feasible if λ ≥ 0 and g(λ, ν) is finite
Lower bound property: g(λ, ν) ≤ f_0(x) for any feasible x
Proof: for feasible x and λ ≥ 0, we have Σ_i λ_i f_i(x) ≤ 0 and Σ_j ν_j h_j(x) = 0, so g(λ, ν) ≤ L(x, λ, ν) ≤ f_0(x)
This means: for any dual feasible vector (λ, ν), the dual function serves as a lower bound on the primal optimal value.
Lagrange dual function (problem)
Dual problem (D):
max_{(λ, ν)} g(λ, ν)
s.t. λ ≥ 0
Variable: vector (λ, ν), optimal value d* = g(λ*, ν*)
The dual problem D is always convex, regardless of the convexity of the original (primal) problem P
Duality gap: e = p* − d*
In general, e ≥ 0, i.e., there may be a gap between the primal and dual optimal values (weak duality)
If the primal problem P is convex and a constraint qualification holds (e.g., Slater's condition), strong duality holds and thus e = 0
Optimality Conditions
The necessary conditions for x* to be a (local) optimal solution to the primal problem P are that there exist some (λ*, ν*) such that
Primal feasibility conditions:
f_i(x*) ≤ 0, i = 1, ..., m
h_j(x*) = 0, j = 1, ..., p
Dual feasibility condition:
λ* ≥ 0
Complementary slackness conditions:
λ_i* f_i(x*) = 0, i = 1, ..., m
First-order optimality condition:
∇_x L(x*, λ*, ν*) = ∇_x f_0(x*) + Σ_{i=1}^m λ_i* ∇_x f_i(x*) + Σ_{j=1}^p ν_j* ∇_x h_j(x*) = 0
Optimality Conditions
The optimality conditions provided above are called the Karush-Kuhn-Tucker (KKT) conditions
In general, the KKT conditions are necessary, but not sufficient
If the problem is convex, the KKT conditions are also sufficient
Remark:
For an unconstrained optimization problem, the KKT conditions reduce to only the first-order optimality condition ∇_x f_0(x*) = 0. The local optimum must be attained at a stationary point
For constrained optimization problems, the (local) optimum is no longer necessarily attained at a stationary point of f_0; instead, it is attained at a KKT point.
Example
Solve the following problem:
min x² + y² + 2z²
s.t. 2x + 2y − 4z ≥ 8
Example
Solve the following problem:
min x² + y² + 2z²
s.t. 2x + 2y − 4z ≥ 8
The Lagrangian:
L(x, y, z, λ) = x² + y² + 2z² + λ(8 − 2x − 2y + 4z)
Dual function: g(λ) = min_{x,y,z} L(x, y, z, λ)
∂L/∂x = 2x − 2λ = 0
∂L/∂y = 2y − 2λ = 0
∂L/∂z = 4z + 4λ = 0
Hence x = y = λ and z = −λ
Substituting x = y = λ, z = −λ into 2x + 2y − 4z = 8, we get λ = 1, so (x, y, z) = (1, 1, −1)
What is the dual problem? Are the optimality conditions satisfied?
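As a sanity check, the worked example above can be verified numerically. Since the objective is quadratic and the constraint is active at the optimum, the KKT conditions form a linear system; a minimal sketch (assuming NumPy is available):

```python
import numpy as np

# KKT system for  min x^2 + y^2 + 2z^2  s.t.  2x + 2y - 4z >= 8,
# assuming the constraint is active at the optimum (as derived above):
#   stationarity:        2x - 2*lam = 0,  2y - 2*lam = 0,  4z + 4*lam = 0
#   active constraint:   2x + 2y - 4z = 8
A = np.array([[2.0, 0.0, 0.0, -2.0],
              [0.0, 2.0, 0.0, -2.0],
              [0.0, 0.0, 4.0,  4.0],
              [2.0, 2.0, -4.0, 0.0]])
b = np.array([0.0, 0.0, 0.0, 8.0])
x, y, z, lam = np.linalg.solve(A, b)
print(x, y, z, lam)  # → x = y = 1, z = -1, lam = 1
```

Since λ = 1 > 0, dual feasibility and complementary slackness both hold, confirming the KKT point.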
Example
Least-norm solution of linear equations:
min_x x^T x
s.t. Ax = b
Recall the optimal solution is x* = A^+ b = A^T (A A^T)^{−1} b
If A is very large, forming this solution directly is too expensive
Lagrangian function: L(x, λ) = x^T x + λ^T (Ax − b)
Dual function: g(λ) = min_x L(x, λ)
∇_x L(x, λ) = 2x + A^T λ = 0  ⟹  x = −(1/2) A^T λ, which minimizes L(x, λ)
g(λ) = L(−(1/2) A^T λ, λ) = −(1/4) λ^T A A^T λ − b^T λ, a concave function of λ
Lower bound property: p* ≥ −(1/4) λ^T A A^T λ − b^T λ for all λ
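The lower bound property can be checked on random data. A small NumPy sketch (the matrix size and the random dual point are illustrative choices, not from the slides): any λ gives a lower bound on p*, and the particular choice λ* = −2(AA^T)^{−1} b makes the bound tight.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))   # wide matrix: Ax = b is underdetermined
b = rng.standard_normal(3)

# least-norm solution x* = A^T (A A^T)^{-1} b
M = np.linalg.inv(A @ A.T)
x_star = A.T @ M @ b
p_star = x_star @ x_star

# dual function g(lam) = -(1/4) lam^T A A^T lam - b^T lam is a lower bound
lam = rng.standard_normal(3)      # an arbitrary dual point
g = -0.25 * lam @ A @ A.T @ lam - b @ lam
assert g <= p_star + 1e-9         # weak duality

# at lam* = -2 (A A^T)^{-1} b the bound is tight (strong duality)
lam_star = -2 * M @ b
g_star = -0.25 * lam_star @ A @ A.T @ lam_star - b @ lam_star
print(abs(g_star - p_star))       # ~0
```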
Example
Standard form LP:
min_x c^T x
s.t. Ax = b
Lagrangian function: L(x, λ) = c^T x + λ^T (Ax − b) = −b^T λ + (c + A^T λ)^T x, which is affine in x
Dual function:
g(λ) = min_x L(x, λ) = −b^T λ if c + A^T λ = 0, and −∞ otherwise
which is linear on the affine domain {λ | c + A^T λ = 0}
Lower bound property: p* ≥ −b^T λ if c + A^T λ = 0
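A quick numerical illustration of this lower bound (random dimensions chosen for the sketch): when c + A^T λ = 0, every feasible x satisfies c^T x = −λ^T Ax = −b^T λ, so the dual bound is attained by every feasible point.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))
x_feas = rng.standard_normal(5)   # pick any point, define b so it is feasible
b = A @ x_feas
lam = rng.standard_normal(3)
c = -A.T @ lam                    # choose c so that c + A^T lam = 0

# every feasible x attains the dual bound -b^T lam exactly:
# c^T x = -lam^T A x = -lam^T b
print(c @ x_feas, -b @ lam)       # the two values coincide
```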
Optimal Power Allocation for Rate Maximization
Optimal Power Allocation for Rate Maximization
Assume that we have n channels, where the i-th channel gain is α_i
For each channel i, the transmit power is p_i
The SNR of channel i (where σ is the noise power):
Γ_i = α_i p_i / σ
The rate of channel i is then:
r_i = log(1 + Γ_i)
Our problem: find the power allocation vector p = [p_1, ..., p_n]^T that maximizes the sum rate subject to a total power constraint, i.e.,
max_p Σ_{i=1}^n r_i = Σ_{i=1}^n log(1 + α_i p_i / σ)
s.t. Σ_{i=1}^n p_i = p_max
     p_i ≥ 0, ∀i
Optimal Power Allocation for Rate Maximization
L(p, λ, µ) = Σ_{i=1}^n log(1 + α_i p_i / σ) − µ( Σ_{i=1}^n p_i − p_max ) + Σ_{i=1}^n λ_i p_i
Taking the gradient w.r.t. p_i, we have
∂L/∂p_i = α_i / (σ + α_i p_i) − µ + λ_i = 0. Thus, µ = α_i / (σ + α_i p_i) + λ_i
From the complementary slackness condition, we have λ_i p_i = 0
Case 1: λ_i = 0 and p_i > 0, thus
µ = α_i / (σ + α_i p_i)  ⟹  p_i = 1/µ − σ/α_i, valid where 1/µ ≥ σ/α_i
Case 2: p_i = 0 and λ_i > 0, thus
µ = α_i/σ + λ_i  ⟹  λ_i = µ − α_i/σ > 0  ⟹  µ > α_i/σ  ⟹  1/µ < σ/α_i
Optimal Power Allocation for Rate Maximization
From the above, the optimal power allocation is
p_i* = max{ 1/µ − σ/α_i, 0 } = [ 1/µ − σ/α_i ]^+
We find µ such that
Σ_{i=1}^n [ 1/µ − σ/α_i ]^+ = p_max
Remark: if α_i increases, σ/α_i decreases, so p_i increases. Can we draw a diagram illustrating this relation?
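This is the classic water-filling solution: the level 1/µ can be found by bisection, since the total allocated power is monotone in the level. A minimal NumPy sketch (the function name and the example channel gains are illustrative):

```python
import numpy as np

def waterfill(alpha, sigma, p_max, tol=1e-10):
    """Water-filling: p_i = max(w - sigma/alpha_i, 0), with the water
    level w = 1/mu chosen by bisection so the powers sum to p_max."""
    floors = sigma / np.asarray(alpha, dtype=float)   # sigma/alpha_i
    lo, hi = floors.min(), floors.max() + p_max       # bracket the level
    while hi - lo > tol:
        w = 0.5 * (lo + hi)
        if np.maximum(w - floors, 0.0).sum() > p_max:
            hi = w      # too much power allocated: lower the level
        else:
            lo = w      # too little: raise the level
    return np.maximum(0.5 * (lo + hi) - floors, 0.0)

alpha = np.array([2.0, 1.0, 0.25])
p = waterfill(alpha, sigma=1.0, p_max=3.0)
print(p, p.sum())   # ≈ [1.75, 1.25, 0.0], sum = 3: the weakest channel gets nothing
```

Note how the weakest channel (α = 0.25) falls below the water level and is switched off entirely, matching Case 2 above.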
Downlink Beamforming as SDP and SOCP
Downlink Beamforming as SDP and SOCP
A wireless network consisting of one Tx and K Rxs
The Tx has N antennas, while each Rx has one antenna
Problem: minimize the transmit power subject to SINR targets
First, the received signal at the k-th Rx is
y_k = Σ_{j=1}^K h_k^H w_j s_j + n_k = h_k^H w_k s_k (desired signal) + Σ_{j≠k} h_k^H w_j s_j (interference) + n_k (noise)
y_k ∈ C is the received signal
h_j ∈ C^N is the channel between the Tx and the j-th Rx
w_j ∈ C^N is the transmit beamforming vector for the j-th Rx
E[|s_k|²] = 1, while E[s_k s_j*] = 0 for j ≠ k
Thus, the SINR at Rx k is given as
Γ_k = |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ )
Downlink Beamforming as SDP and SOCP
The SINR at Rx k is given as
Γ_k = |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ )
The QoS constraints require that Γ_k ≥ γ_k, ∀k
Mathematical optimization problem (nonconvex):
min_{w_k, ∀k} Σ_k ‖w_k‖²
s.t. |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ ) ≥ γ_k, ∀k
Note that the transmit power is represented by ‖w_k‖² = p_k
In other problems, you may want to design the beamforming direction and the beamforming power independently: w_k = √p_k w̄_k with ‖w̄_k‖² = 1
Downlink Beamforming as SDP and SOCP
Solve the above problem using the relaxed SDP:
‖w_k‖² = w_k^H w_k = Tr(w_k w_k^H) = Tr(W_k), where W_k = w_k w_k^H ∈ C^{N×N}
|h_k^H w_k|² = (h_k^H w_k)^H (h_k^H w_k) = w_k^H h_k h_k^H w_k = Tr(w_k w_k^H h_k h_k^H) = Tr(W_k H_k), where H_k = h_k h_k^H ∈ C^{N×N}
W_k and H_k are both rank-one matrices
Rearrange the SINR constraints as
|h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ ) ≥ γ_k  ⟺  |h_k^H w_k|² ≥ γ_k ( Σ_{j≠k} |h_k^H w_j|² + σ )
Modify the SINR constraints using the above results:
Tr(W_k H_k) ≥ γ_k ( Σ_{j≠k} Tr(H_k W_j) + σ )
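The trace identities behind this lifting are easy to verify numerically on a random channel and beamformer (dimensions below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
h = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # channel h
w = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # beamformer w

W = np.outer(w, w.conj())   # W = w w^H, rank one
H = np.outer(h, h.conj())   # H = h h^H, rank one

power = np.linalg.norm(w) ** 2
print(np.isclose(np.trace(W).real, power))                      # ||w||^2 = Tr(W)
print(np.isclose(np.trace(W @ H).real, abs(h.conj() @ w) ** 2)) # |h^H w|^2 = Tr(W H)
```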
Downlink Beamforming as SDP and SOCP
The original problem can be written as (nonconvex)
min_{W_k, ∀k} Σ_k Tr(W_k)
s.t. Tr(W_k H_k) ≥ γ_k ( Σ_{j≠k} Tr(H_k W_j) + σ ), ∀k
     W_k ⪰ 0, rank(W_k) = 1, ∀k
The above problem is still nonconvex, due to the rank-one constraints
Ignoring the rank-one constraints, the problem becomes a relaxed SDP, which is convex
Downlink Beamforming as SDP and SOCP
Based on the observation that an arbitrary phase rotation can be added to the beamforming vectors without affecting the SINR functions, h_k^H w_k can be chosen to be real without loss of generality
Let W = [w_1, ..., w_K] ∈ C^{N×K}
The SINR constraints become
(1 + 1/γ_k) |h_k^H w_k|² ≥ ‖ [ h_k^H W, √σ ] ‖²
Because h_k^H w_k can be assumed real (and nonnegative), we can take the square root:
√(1 + 1/γ_k) h_k^H w_k ≥ ‖ [ h_k^H W, √σ ] ‖
which is a second-order cone constraint
The original problem can be written as an SOCP (convex):
min_{w_k, ∀k} Σ_k ‖w_k‖²
s.t. √(1 + 1/γ_k) h_k^H w_k ≥ ‖ [ h_k^H W, √σ ] ‖, ∀k
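The equivalence between the original SINR constraint and its second-order cone form holds for any beamformers, since the step only moves the |h_k^H w_k|² term across the inequality. A small NumPy check on random data (dimensions and the target γ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, sigma = 4, 3, 1.0
H = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))  # row k is h_k^H
W = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))  # column k is w_k
gamma, k = 2.0, 0
hkH = H[k]                                    # h_k^H as a row vector

# original SINR constraint:  |h_k^H w_k|^2 >= gamma * (interference + noise)
num = abs(hkH @ W[:, k]) ** 2
den = sum(abs(hkH @ W[:, j]) ** 2 for j in range(K) if j != k) + sigma
sinr_ok = num >= gamma * den

# SOC form:  (1 + 1/gamma) |h_k^H w_k|^2 >= || [h_k^H W, sqrt(sigma)] ||^2
lhs = (1 + 1 / gamma) * num
rhs = np.linalg.norm(np.append(hkH @ W, np.sqrt(sigma))) ** 2
print(sinr_ok == (lhs >= rhs))   # True: the two constraints coincide
```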
Uplink-Downlink Duality via Lagrangian Duality
Uplink-Downlink Duality via Lagrangian Duality
In engineering design, we are not only interested in the numerical solution to the problem, but also in the structure of the optimal solution. The Lagrangian dual of the original problem often reveals such structure.
Uplink-Downlink Duality: refers to the fact that the total transmit power required to satisfy a given set of SINR constraints in the downlink is equal to the total transmit power required to satisfy the same SINR constraints in the uplink:
Σ_{k=1}^K p_k = Σ_{k=1}^K q_k
where p_k is the downlink power and q_k is the uplink power
Note that p_k does not have to be equal to q_k. It is the sum of powers that is equal!
Uplink-Downlink Duality via Lagrangian Duality
The original optimization problem is
min_{w_k, ∀k} Σ_k ‖w_k‖² = Σ_k w_k^H w_k
s.t. |h_k^H w_k|² / ( Σ_{j≠k} |h_k^H w_j|² + σ ) ≥ γ_k, ∀k
Lagrangian function:
L(w_k, λ_k) = Σ_k w_k^H w_k − Σ_k λ_k ( (1/γ_k) |h_k^H w_k|² − Σ_{j≠k} |h_k^H w_j|² − σ )
= Σ_k λ_k σ + Σ_k w_k^H [ I_N + Σ_{j≠k} λ_j h_j h_j^H − (λ_k/γ_k) h_k h_k^H ] w_k
Note that minimizing over w_k gives g = −∞ unless I_N + Σ_{j≠k} λ_j h_j h_j^H − (λ_k/γ_k) h_k h_k^H ⪰ 0 for all k, in which case the minimum is attained at w_k = 0
Uplink-Downlink Duality via Lagrangian Duality
The dual optimization problem:
max_{λ_k, ∀k} Σ_k λ_k σ
s.t. Σ_j λ_j h_j h_j^H + I_N ⪰ (1 + 1/γ_k) λ_k h_k h_k^H, ∀k
     λ_k ≥ 0, ∀k
Uplink-Downlink Duality via Lagrangian Duality
Let us now consider the uplink problem
The received uplink signal at the Rx with regard to Tx k:
y_k = w_k^H h_k √q_k s_k (desired signal) + Σ_{j≠k} w_k^H h_j √q_j s_j (interference) + w_k^H n (noise)
The SINR of Tx k in the uplink is given as
Γ_k = q_k |h_k^H w_k|² / ( Σ_{j≠k} q_j |h_j^H w_k|² + w_k^H w_k σ )
The mathematical optimization problem:
min_{q_k, ∀k} Σ_k q_k
s.t. q_k |h_k^H w_k|² / ( Σ_{j≠k} q_j |h_j^H w_k|² + w_k^H w_k σ ) ≥ γ_k, ∀k
Uplink-Downlink Duality via Lagrangian Duality
The optimal beamforming direction w_k is given by the MMSE receiver as
w_k = ρ ( Σ_j q_j h_j h_j^H + σ I )^{−1} h_k
where ρ is a normalization parameter so that ‖w_k‖ = 1
After substituting w_k into the SINR constraints and rearranging, we have
min_{q_k, ∀k} Σ_k q_k
s.t. Σ_j q_j h_j h_j^H + σ I_N ⪰ (1 + 1/γ_k) q_k h_k h_k^H, ∀k
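The uplink problem can be solved by the standard fixed-point power iteration: alternate between MMSE filters for the current powers and a power update q_k ← (γ_k / SINR_k) q_k. A NumPy sketch (dimensions, targets, and iteration count are illustrative choices; when the targets are feasible the iteration converges to the minimal powers):

```python
import numpy as np

rng = np.random.default_rng(4)
N, K, sigma = 4, 2, 1.0
H = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))  # column k is h_k
gamma = np.array([1.0, 2.0])    # SINR targets
q = np.ones(K)                  # uplink powers, initial guess

for _ in range(500):
    # total covariance; the MMSE direction R^{-1} h_k is scale-invariant for SINR
    R = sigma * np.eye(N) + sum(q[j] * np.outer(H[:, j], H[:, j].conj())
                                for j in range(K))
    for k in range(K):
        w = np.linalg.solve(R, H[:, k])
        sig = q[k] * abs(H[:, k].conj() @ w) ** 2
        intf = sum(q[j] * abs(H[:, j].conj() @ w) ** 2 for j in range(K) if j != k)
        sinr = sig / (intf + sigma * np.linalg.norm(w) ** 2)
        q[k] *= gamma[k] / sinr  # standard fixed-point power update

print(q)  # uplink powers meeting the targets with equality
```

At the fixed point every SINR constraint holds with equality, which is how the matrix inequality above becomes tight.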
Uplink-Downlink Duality via Lagrangian Duality
The dual optimization problem of the downlink problem:
max_{λ_k, ∀k} Σ_k λ_k σ
s.t. Σ_j λ_j h_j h_j^H + I_N ⪰ (1 + 1/γ_k) λ_k h_k h_k^H, ∀k
The uplink optimization problem:
min_{q_k, ∀k} Σ_k q_k
s.t. Σ_j q_j h_j h_j^H + σ I_N ⪰ (1 + 1/γ_k) q_k h_k h_k^H, ∀k
Note that q_k = λ_k σ. Thus, the two problems are identical, except that the maximization and the minimization are interchanged
Disciplined convex programming and CVX
Disciplined convex programming and CVX
LP solvers: lots available (GLPK, Excel, Matlab's linprog, ...)
Cone solvers: typically handle (combinations of) LP, SOCP, SDP cones; several available (SDPT3, SeDuMi, CSDP, ...)
General convex solvers: some available (CVXOPT, MOSEK, ...)
Could write your own
Use some tricks to transform the problem into an equivalent one that has a standard form (e.g., LP, SDP)
Modeling systems can partly automate this step
CVX
Runs in Matlab, between the cvx_begin and cvx_end commands
Relies on SDPT3 or SeDuMi (LP/SOCP/SDP) solvers
Refer to the user guide and online help for more info
The CVX example library has more than a hundred examples
Example: Constrained norm minimization
Between cvx_begin and cvx_end, x is a CVX variable
The statement subject to does nothing, but can be added for readability
Inequalities are treated element-wise
What CVX does
After cvx_end, CVX transforms the problem into an LP and calls the solver SDPT3
Overwrites the (object) x with its (numeric) optimal value
Assigns the problem's optimal value to cvx_optval
Assigns the problem's status (which here is Solved) to cvx_status
Some useful functions