Convex Optimization. Optimality conditions. (EE227BT: UC Berkeley) Lecture 9 (Optimality; Conic duality) 9/25/14. Laurent El Ghaoui.

Organizational
Midterm: 10/7/14 (1.5 hours, in class, double-sided cheat sheet allowed)
Project: initial proposal by 10/28/14
Project midpoint review: 11/4/14
Project final report: finals week

Optimality conditions

Consider the problem
$$\min_x \; f_0(x) \quad \text{s.t.} \quad f_i(x) \le 0, \quad i = 1, \dots, m.$$
Recall: $\langle \nabla f_0(x^*), x - x^* \rangle \ge 0$ for all feasible $x \in \mathcal{X}$.
Can we simplify this using the Lagrangian?
$$g(\lambda) = \inf_x L(x, \lambda), \qquad L(x, \lambda) := f_0(x) + \sum_{i=1}^m \lambda_i f_i(x).$$
Assume strong duality, and that both $p^*$ and $d^*$ are attained. Then there exists a pair $(x^*, \lambda^*)$ such that
$$p^* = f_0(x^*) = d^* = g(\lambda^*) = \min_x L(x, \lambda^*) \le L(x^*, \lambda^*) \le f_0(x^*) = p^*.$$
The last inequality uses $\lambda_i^* \ge 0$ and $f_i(x^*) \le 0$. Thus equalities hold throughout the chain, and in particular
$$x^* \in \operatorname{argmin}_x L(x, \lambda^*).$$

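Optimality conditions

We have $x^* \in \operatorname{argmin}_x L(x, \lambda^*)$. If $f_0, f_1, \dots, f_m$ are differentiable, this implies
$$\nabla_x L(x, \lambda^*)\big|_{x = x^*} = \nabla f_0(x^*) + \sum_{i=1}^m \lambda_i^* \nabla f_i(x^*) = 0.$$
Moreover, since $L(x^*, \lambda^*) = f_0(x^*)$, we also have
$$\sum_{i=1}^m \lambda_i^* f_i(x^*) = 0.$$
But $\lambda_i^* \ge 0$ and $f_i(x^*) \le 0$, so every term of the sum is nonpositive and must vanish. This is complementary slackness:
$$\lambda_i^* f_i(x^*) = 0, \quad i = 1, \dots, m.$$

KKT optimality conditions

Karush-Kuhn-Tucker (KKT) conditions:
$f_i(x^*) \le 0, \; i = 1, \dots, m$ (primal feasibility)
$\lambda_i^* \ge 0, \; i = 1, \dots, m$ (dual feasibility)
$\lambda_i^* f_i(x^*) = 0, \; i = 1, \dots, m$ (complementary slackness)
$\nabla_x L(x, \lambda^*)\big|_{x = x^*} = 0$ (Lagrangian stationarity)

We showed: if strong duality holds, and $(x^*, \lambda^*)$ exist, then the KKT conditions are necessary for the pair $(x^*, \lambda^*)$ to be optimal.
If the problem is convex, then the KKT conditions are also sufficient.
Exercise: Prove the above sufficiency of KKT. Hint: Use that $L(x, \lambda^*)$ is convex in $x$, and conclude from the KKT conditions that $g(\lambda^*) = f_0(x^*)$, so that $(x^*, \lambda^*)$ is an optimal primal-dual pair.
Read Ch. 5 of BV.

Minimax

Example: Lasso-like problem
$$p^* := \min_x \; \|Ax - b\|_2 + \lambda \|x\|_1.$$
Use the variational forms of the norms:
$$\|x\|_1 = \max \{ x^T v : \|v\|_\infty \le 1 \}, \qquad \|x\|_2 = \max \{ x^T u : \|u\|_2 \le 1 \}.$$
Saddle-point formulation:
$$
\begin{aligned}
p^* &= \min_x \max_{u, v} \; \{ u^T (b - Ax) + v^T x \;:\; \|u\|_2 \le 1, \; \|v\|_\infty \le \lambda \} \\
    &= \max_{u, v} \min_x \; \{ u^T (b - Ax) + x^T v \;:\; \|u\|_2 \le 1, \; \|v\|_\infty \le \lambda \} \\
    &= \max_{u, v} \; \{ u^T b \;:\; A^T u = v, \; \|u\|_2 \le 1, \; \|v\|_\infty \le \lambda \} \\
    &= \max_u \; \{ u^T b \;:\; \|u\|_2 \le 1, \; \|A^T u\|_\infty \le \lambda \}.
\end{aligned}
$$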
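The final dual form of the Lasso-like problem can be sanity-checked numerically: for any $x$ and any dual-feasible $u$, the dual objective $u^T b$ should not exceed the primal objective. The following is a minimal sketch in plain Python; the data `A`, `b`, `lam` and the helper names (`primal_value`, `make_dual_feasible`, `dual_value`) are hypothetical, chosen only for illustration, and the check samples random points rather than solving either problem.

```python
import math
import random

def matvec(A, x):
    """Return A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, u):
    """Return A^T u."""
    return [sum(A[i][j] * u[i] for i in range(len(A))) for j in range(len(A[0]))]

def primal_value(A, b, lam, x):
    """Primal objective ||Ax - b||_2 + lam * ||x||_1."""
    residual = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return math.sqrt(sum(r * r for r in residual)) + lam * sum(abs(t) for t in x)

def make_dual_feasible(A, lam, u):
    """Shrink u until ||u||_2 <= 1 and ||A^T u||_inf <= lam both hold (lam > 0)."""
    norm2 = math.sqrt(sum(t * t for t in u))
    norm_inf = max(abs(t) for t in rmatvec(A, u))
    scale = max(1.0, norm2, norm_inf / lam)
    return [t / scale for t in u]

def dual_value(b, u):
    """Dual objective u^T b (u is assumed to be dual feasible)."""
    return sum(ui * bi for ui, bi in zip(u, b))

# Hypothetical data, not taken from the lecture.
A = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
b = [1.0, -2.0, 0.5]
lam = 0.7

rng = random.Random(0)
worst_gap = math.inf
for _ in range(1000):
    x = [rng.uniform(-2, 2) for _ in range(2)]
    u = make_dual_feasible(A, lam, [rng.uniform(-1, 1) for _ in range(3)])
    worst_gap = min(worst_gap, primal_value(A, b, lam, x) - dual_value(b, u))

# Weak duality predicts a nonnegative gap for every sampled pair.
print("smallest primal - dual gap seen:", worst_gap)
```

A nonnegative `worst_gap` is only evidence of weak duality on the sampled points; showing that the gap closes would require actually solving both problems.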

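Minimax problems

Minimax theory treats problems involving a combination of minimization and maximization.
Let $X$ and $Y$ be arbitrary nonempty sets.
Let $\phi : X \times Y \to \mathbb{R} \cup \{\pm\infty\}$.
inf over $x \in X$, followed by sup over $y \in Y$:
$$\sup_{y \in Y} \inf_{x \in X} \phi(x, y) = \sup_{y \in Y} \psi(y), \qquad \psi(y) := \inf_{x \in X} \phi(x, y).$$
sup over $y \in Y$, followed by inf over $x \in X$:
$$\inf_{x \in X} \sup_{y \in Y} \phi(x, y) = \inf_{x \in X} \xi(x), \qquad \xi(x) := \sup_{y \in Y} \phi(x, y).$$
When are inf sup and sup inf equal?

Weak minimax

Theorem. Let $\phi : X \times Y \to \mathbb{R} \cup \{\pm\infty\}$ be any function. Then
$$\sup_{y \in Y} \inf_{x \in X} \phi(x, y) \le \inf_{x \in X} \sup_{y \in Y} \phi(x, y).$$
Proof:
$$
\begin{aligned}
\forall x, y: \quad & \inf_{x' \in X} \phi(x', y) \le \phi(x, y) \\
\forall x, y: \quad & \inf_{x' \in X} \phi(x', y) \le \sup_{y' \in Y} \phi(x, y') \\
\forall x: \quad & \sup_{y \in Y} \inf_{x' \in X} \phi(x', y) \le \sup_{y' \in Y} \phi(x, y') \\
\Longrightarrow \quad & \sup_{y \in Y} \inf_{x \in X} \phi(x, y) \le \inf_{x \in X} \sup_{y \in Y} \phi(x, y).
\end{aligned}
$$
Exercise: Show that weak duality follows from the above minimax inequality. Hint: Use $\phi = L$ (the Lagrangian), and choose $y$ suitably.

Strong minimax

If inf sup equals sup inf, the common value is called the saddle-value.
The value exists if there is a saddle-point, i.e., a pair $(x^*, y^*)$ such that
$$\phi(x^*, y) \le \phi(x^*, y^*) \le \phi(x, y^*) \quad \text{for all } x \in X, \; y \in Y.$$
Exercise: Verify the above inequality!

Classes of problems dual to each other can be generated by studying classes of functions $\phi$.
More interesting question: starting from the primal problem over $X$, how do we introduce a space $Y$ and a useful function $\phi$ on $X \times Y$ so that we have a saddle-point?

Sufficient conditions for a saddle-point:
The function $\phi$ is continuous, and
it is convex-concave ($\phi(\cdot, y)$ convex for every $y \in Y$, and $\phi(x, \cdot)$ concave for every $x \in X$), and
both $X$ and $Y$ are convex; one of them is compact.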
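When $X$ and $Y$ are finite, $\phi$ is just a table and the weak minimax inequality can be checked by direct enumeration. The sketch below assumes rows index $x$ (the minimizing variable) and columns index $y$ (the maximizing variable); the two matrices are hypothetical examples, one with a strict gap and one with a saddle-point.

```python
def sup_inf(phi):
    """max over y (columns) of min over x (rows) of phi[x][y]."""
    n_cols = len(phi[0])
    return max(min(row[j] for row in phi) for j in range(n_cols))

def inf_sup(phi):
    """min over x (rows) of max over y (columns) of phi[x][y]."""
    return min(max(row) for row in phi)

def is_saddle_point(phi, i, j):
    """Check phi[i][y] <= phi[i][j] <= phi[x][j] for all x, y."""
    row_ok = all(phi[i][y] <= phi[i][j] for y in range(len(phi[0])))
    col_ok = all(phi[i][j] <= phi[x][j] for x in range(len(phi)))
    return row_ok and col_ok

# Hypothetical payoff tables (rows: x, columns: y).
phi_gap = [[1, 3],
           [4, 2]]      # sup inf = 2, inf sup = 3: strict inequality
phi_saddle = [[3, 1],
              [4, 2]]   # saddle-point at (x, y) = (0, 0), saddle-value 3

for name, phi in [("gap", phi_gap), ("saddle", phi_saddle)]:
    print(name, "sup inf =", sup_inf(phi), "inf sup =", inf_sup(phi))
```

In the first table no entry is simultaneously a row maximum and a column minimum, which is why the two values differ.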

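Strong minimax

Def. Let $\phi$ be as before. A point $(x^*, y^*)$ is a saddle-point of $\phi$ (min over $X$ and max over $Y$) iff the infimum in the expression $\inf_{x \in X} \sup_{y \in Y} \phi(x, y)$ is attained at $x^*$, the supremum in the expression $\sup_{y \in Y} \inf_{x \in X} \phi(x, y)$ is attained at $y^*$, and these two extrema are equal:
$$x^* \in \operatorname{argmin}_{x \in X} \max_{y \in Y} \phi(x, y), \qquad y^* \in \operatorname{argmax}_{y \in Y} \min_{x \in X} \phi(x, y).$$

Optimality via minimax

The point $(x^*, y^*)$ is a saddle-point if and only if
$$0 \in \partial \phi(x^*, y^*) = \partial_x \phi(x^*, y^*) \times \partial_y \phi(x^*, y^*),$$
where $\partial_y$ is understood for the concave function $\phi(x^*, \cdot)$, i.e., as $-\partial_y(-\phi)$.
When $\phi$ is of convex-concave form, this yields the KKT conditions.

Conic duality

LP duality

Consider the linear program
$$\min_x \; c^T x \quad \text{s.t.} \quad Ax \le b.$$
The corresponding dual is
$$\max_\lambda \; -b^T \lambda \quad \text{s.t.} \quad A^T \lambda + c = 0, \quad \lambda \ge 0.$$
LP duality facts:
If either $p^*$ or $d^*$ is finite, then $p^* = d^*$, and both the primal and the dual problem have optimal solutions.
If $p^* = -\infty$, then $d^* = -\infty$ (follows from weak duality).
If $d^* = +\infty$, then $p^* = +\infty$ (again, weak duality).
Proof: See lecture notes.
If the LP is feasible, strong duality holds.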
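The LP primal-dual pair can be illustrated on a tiny instance where both optimal solutions are known by inspection. The sketch below uses a hypothetical LP (the data `A`, `b`, `c` and the candidate points are assumptions made for illustration); it checks feasibility on both sides, compares objectives, and verifies complementary slackness, which together certify optimality via weak duality.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical LP: minimize c^T x subject to A x <= b.
A = [[1.0, 1.0],    # x1 + x2 <= 4
     [1.0, 0.0],    # x1      <= 3
     [-1.0, 0.0],   # -x1     <= 0
     [0.0, -1.0]]   # -x2     <= 0
b = [4.0, 3.0, 0.0, 0.0]
c = [-1.0, -2.0]

def primal_feasible(x, tol=1e-12):
    return all(dot(row, x) <= bi + tol for row, bi in zip(A, b))

def dual_feasible(lam, tol=1e-12):
    """Check A^T lam + c = 0 and lam >= 0."""
    residual = [dot([row[j] for row in A], lam) + c[j] for j in range(len(c))]
    return all(abs(r) <= tol for r in residual) and all(l >= -tol for l in lam)

def primal_objective(x):
    return dot(c, x)

def dual_objective(lam):
    return -dot(b, lam)

x_star = [0.0, 4.0]               # candidate primal solution
lam_star = [2.0, 0.0, 1.0, 0.0]   # candidate dual solution
lam_other = [2.0, 1.0, 2.0, 0.0]  # dual feasible but not optimal

# Complementary slackness: lam_i * (a_i^T x - b_i) = 0 for every constraint.
slackness = [l * (dot(row, x_star) - bi) for l, row, bi in zip(lam_star, A, b)]

print(primal_objective(x_star), dual_objective(lam_star), dual_objective(lam_other))
```

Because `x_star` and `lam_star` are feasible with equal objective values, weak duality implies both are optimal; `lam_other` only gives a looser lower bound.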

Consder SOCP SOCP Dualty mn f T x A x + b 2 c T x + d, = 1,..., m. Lagrangan (ordnary) L(x, λ) := f T x + λ ( A x + b 2 c T x + d ) Recall that x 2 = sup u T x u 2 1. λ A x + b 2 = max u (λ u ) T (A x + b ) u 2 1 = max v v T (A x + b ) v 2 λ Thus, wth v 1,..., v m also as dual varables we have p = nf x sup f T x + v T λ,v 1,...,v m (A x + b ) λ (c T x + d ) s.t. v 2 λ, = 1,..., m. The dual problem s SOCP Dualty d = sup nf λ,v 1,...,v m x f T x + v T (A x + b ) λ (c T x + d ) s.t. v 2 λ, = 1,..., m. Inner mnmzaton over x very easy (unconstraned) f + AT v λ c = 0 Dual problem becomes d = sup λ T d + v T λ,v 1,...,v m s.t. f + AT v λ c = 0, v 2 λ, = 1,..., m. Also an SOCP, lke the prmal Apply Slater to obtan a condton for strong dualty. b SDP prmal form SDP dualty p := mn Tr(CX ), s.t. Tr(A X ) = b, = 1,..., m, X 0. How to handle the matrx constrant X 0? Introduce conc Lagrangan L(X, ν, Y ) := Tr(CX ) + ν (Tr(A X ) b ) Tr(YX ) where we have a matrx dual varable Y 0. Note: Tr(YX ) 0; so p sup ν,y L(X, ν, Y ) for any feasble X As before, p d := sup ν,y 0 nf X L(X, ν, Y ) Smplfyng nf X L, we obtan dual functon b T ν f C g(ν, Y ) = ν A Y = 0, otherwse. SDP Dualty Dual problem max ν,y 0 max ν b T ν s.t. C ν A = Y 0 b T ν s.t. ν A C. Ths s the conc form we saw n Lecture 5! Weak-dualty: Tr(CX ) ν T b for any feasble par (X, ν) Strong-dualty: If prmal strctly feasble,x 0 such that Tr(A X ) = b, for = 1,..., m, we have strong dualty. Alternatvely, f dual strctly feasble, we have strong dualty. But, contrary to LPs, feasblty alone does not suffce!

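Example: failure of strong duality

Primal problem:
$$p^* = \min_{x_1, x_2} \; x_2 \quad \text{s.t.} \quad X := \begin{bmatrix} x_2 + 1 & 0 & 0 \\ 0 & x_1 & x_2 \\ 0 & x_2 & 0 \end{bmatrix} \succeq 0.$$
Any primal feasible point requires
$$\begin{bmatrix} x_1 & x_2 \\ x_2 & 0 \end{bmatrix} \succeq 0, \quad \text{i.e.,} \quad x_1 \ge 0 \text{ and } -x_2^2 \ge 0.$$
Thus we have $x_2 = 0$, whereby $p^* = 0$.
Primal objective: $x_2 = \operatorname{Tr}(CX)$ with $c_{23} = c_{32} = 1/2$ (rest zeros).
Lagrangian, with dual variable $Y \succeq 0$:
$$
\begin{aligned}
L(x, Y) = \operatorname{Tr}(CX) - \operatorname{Tr}(XY)
  &= -(x_2 + 1) y_{11} - x_1 y_{22} + x_2 - 2 x_2 y_{23} \\
  &= -y_{11} - x_1 y_{22} + x_2 - x_2 y_{11} - 2 x_2 y_{23}.
\end{aligned}
$$
Dual function:
$$g(Y) = \inf_x L(x, Y) = \begin{cases} -y_{11} & \text{if } y_{22} = 0, \; 1 - y_{11} - 2 y_{23} = 0, \\ -\infty & \text{otherwise.} \end{cases}$$
Dual SDP:
$$d^* = \max_{Y \succeq 0} \; -y_{11} \quad \text{s.t.} \quad y_{22} = 0, \quad 1 - y_{11} - 2 y_{23} = 0.$$
Any feasible $Y$ satisfies $y_{23} = 0$ (since $y_{22} = 0$ and $Y \succeq 0$).
Thus $y_{11} = 1$, so $d^* = -1$.
Duality gap: $p^* - d^* = 1$.

References
S. Boyd, EE364b Slides
Bertsekas, Nonlinear Programming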
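The duality-gap example can be checked numerically without a solver. The sketch below is plain Python under stated assumptions: positive semidefiniteness is tested through principal minors (adequate for 3x3 matrices, not a general-purpose method), and the helper names (`is_psd`, `primal_matrix`, `dual_feasible`) and the test matrices are hypothetical. It confirms that primal feasibility forces $x_2 = 0$ and that a dual-feasible $Y$ attains the value $-1$.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def is_psd(m, tol=1e-12):
    """A symmetric matrix is PSD iff every principal minor is nonnegative."""
    n = len(m)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            sub = [[m[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True

def primal_matrix(x1, x2):
    """The matrix X(x) from the primal constraint of the example."""
    return [[x2 + 1.0, 0.0, 0.0],
            [0.0, x1, x2],
            [0.0, x2, 0.0]]

def dual_feasible(Y, tol=1e-12):
    """Check y22 = 0, 1 - y11 - 2*y23 = 0, and Y PSD."""
    y11, y22, y23 = Y[0][0], Y[1][1], Y[1][2]
    return abs(y22) <= tol and abs(1.0 - y11 - 2.0 * y23) <= tol and is_psd(Y)

# Primal: x2 = 0 is feasible, any x2 != 0 breaks the lower-right 2x2 block.
p_value = 0.0  # objective x2 at the feasible point (x1, x2) = (1, 0)

# Dual: y23 must vanish, which forces y11 = 1.
Y_star = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
Y_bad = [[0.5, 0.0, 0.0], [0.0, 0.0, 0.25], [0.0, 0.25, 1.0]]  # y23 != 0: not PSD
d_value = -Y_star[0][0]

print("p =", p_value, "d =", d_value, "gap =", p_value - d_value)
```

`Y_bad` satisfies both linear dual constraints but fails the PSD test, which illustrates why $y_{22} = 0$ forces $y_{23} = 0$.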