Convex Optimization Lecture 13

Today:
- Interior-Point methods (continued)
- Central Path method for SDP
- Feasibility and Phase I Methods
- From Central Path to Primal/Dual

Central Path Log Barrier Method (recap)

Init: strictly feasible x^(0) and some t^(0).
Do: solve the t^(k)-barrier problem using Newton starting at x^(k); set x^(k+1) = x*(t^(k)); stop if m/t^(k) ≤ ε; otherwise set t^(k+1) = μ t^(k) (for some parameter μ > 1).

Access to: 2nd-order oracle for f_0 and the f_i; explicit access to A, b; a strictly feasible point x^(0).
Assumptions: f_0 convex and self-concordant; f_i convex quadratic (or linear); x^(0) strictly feasible, i.e. f_i(x^(0)) < 0.

Overall #Newton iterations: O(√m log(1/ε)) (up to lower-order log log terms). Overall runtime: that many Newton steps, each costing the 2nd-order oracle evaluations plus one n×n linear solve.

[Slide portraits: John von Neumann, Narendra Karmarkar, Arkadi Nemirovski, Yuri Nesterov]
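For concreteness, here is a minimal sketch of this log-barrier central path loop for scalar constraints f_i(x) ≤ 0, omitting equality constraints and using a feasibility-only backtracking line search; the callables, the toy problem, and the default parameters are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def barrier_central_path(grad_f0, hess_f0, fis, grad_fis, hess_fis, x0,
                         t0=1.0, mu=10.0, eps=1e-8, newton_tol=1e-10):
    """Log-barrier central path method (sketch, no equality constraints).

    Minimizes f0(x) s.t. f_i(x) <= 0, starting from a strictly feasible x0.
    fis / grad_fis / hess_fis are lists of callables, one per constraint.
    """
    m = len(fis)
    x, t = np.asarray(x0, dtype=float), t0
    while m / t > eps:                         # m/t bounds the suboptimality of x*(t)
        # Centering step: minimize  t*f0(x) - sum_i log(-f_i(x))  by damped Newton.
        for _ in range(50):
            fx = [fi(x) for fi in fis]
            g = t * grad_f0(x) - sum(grad_fis[i](x) / fx[i] for i in range(m))
            H = t * hess_f0(x)
            for i in range(m):
                gi = grad_fis[i](x)
                H = H + np.outer(gi, gi) / fx[i] ** 2 - hess_fis[i](x) / fx[i]
            dx = np.linalg.solve(H, -g)
            if -g @ dx < 2 * newton_tol:       # Newton decrement squared is small
                break
            # feasibility-only backtracking; a full implementation would also
            # enforce sufficient decrease of the barrier objective
            s = 1.0
            while any(fi(x + s * dx) >= 0 for fi in fis):
                s *= 0.5
            x = x + s * dx
        t *= mu                                # increase the barrier parameter
    return x

# toy usage: minimize ||x||^2 subject to x_1 >= 1 (optimum at x = (1, 0))
print(barrier_central_path(lambda x: 2 * x, lambda x: 2 * np.eye(2),
                           [lambda x: 1 - x[0]],
                           [lambda x: np.array([-1.0, 0.0])],
                           [lambda x: np.zeros((2, 2))], x0=[2.0, 2.0]))
```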

Optimizing with Matrix Inequalities

min_{x ∈ R^n} f_0(x)   s.t.  f_i(x) ⪯ 0,  Ax = b,   where f_i: R^n → S^{k_i} is matrix-valued.

The central path is given by the solutions of

min_{x ∈ R^n} f_0(x) − (1/t) Σ_i log det(−f_i(x))   s.t.  Ax = b.
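A small sketch of how the log-det barrier term can be evaluated in practice, using a Cholesky factorization both as the strict-feasibility check and for the determinant; the affine map F(x) = F0 + Σ_j x_j F_j and the example matrices are assumptions made for illustration.

```python
import numpy as np

def logdet_barrier(x, F0, Fs):
    """phi(x) = -log det(-F(x)) for the affine matrix map F(x) = F0 + sum_j x_j * Fs[j].

    Returns +inf when F(x) is not negative definite (x not strictly feasible).
    """
    Fx = F0 + sum(xj * Fj for xj, Fj in zip(x, Fs))
    try:
        L = np.linalg.cholesky(-Fx)              # fails unless -F(x) is positive definite
    except np.linalg.LinAlgError:
        return np.inf
    return -2.0 * np.sum(np.log(np.diag(L)))     # log det(-F(x)) = 2 * sum(log diag(L))

# example: F(x) = diag(x) - I, strictly feasible iff all x_j < 1
print(logdet_barrier(np.array([0.5, 0.0]), -np.eye(2),
                     [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]))
```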

min_{x ∈ R^n} f_0(x) − (1/t) Σ_{i=1}^m log det(−f_i(x))   s.t.  Ax = b

Let x*(t) be its optimum and ν*(t) the corresponding dual optimum for the equality constraint, so that

0 = ∇f_0(x*(t)) − (1/t) Σ_i ⟨f_i(x*(t))^{-1}, ∇f_i(x*(t))⟩ + A^T ν*(t).

Define Λ_i*(t) = −(1/t) f_i(x*(t))^{-1} ⪰ 0, and consider the Lagrangian of the original problem min_{x ∈ R^n} f_0(x) s.t. f_i(x) ⪯ 0, Ax = b:

L(x, Λ, ν) = f_0(x) + Σ_i ⟨Λ_i, f_i(x)⟩ + ⟨ν, Ax − b⟩.

Then ∇_x L(x*(t), Λ*(t), ν*(t)) = ∇f_0(x*(t)) + Σ_i ⟨Λ_i*(t), ∇f_i(x*(t))⟩ + A^T ν*(t) = 0.

x*(t) is strictly feasible. How suboptimal is x*(t)?

g(Λ*(t), ν*(t)) = inf_x L(x, Λ*(t), ν*(t)) = L(x*(t), Λ*(t), ν*(t))
= f_0(x*(t)) − (1/t) Σ_i ⟨f_i(x*(t))^{-1}, f_i(x*(t))⟩ + ⟨ν*(t), Ax*(t) − b⟩
= f_0(x*(t)) − (Σ_i k_i)/t.

So (Λ*(t), ν*(t)) is dual (strictly) feasible with f_0(x*(t)) − g(Λ*(t), ν*(t)) = (Σ_{i=1}^m k_i)/t.
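A tiny numeric sanity check of this gap; the particular negative definite matrix standing in for f_i(x*(t)), and the values of t and k, are arbitrary illustrative choices.

```python
import numpy as np

t, k = 10.0, 3
rng = np.random.default_rng(0)
M = rng.standard_normal((k, k))
Fx = -(M @ M.T + np.eye(k))            # some f_i(x*(t)), negative definite
Lam = -np.linalg.inv(Fx) / t           # dual point Lambda*(t) = -(1/t) f_i(x*(t))^{-1}

print(np.all(np.linalg.eigvalsh(Lam) > 0))   # Lambda*(t) is positive definite
print(np.trace(Lam @ Fx), -k / t)            # <Lambda*(t), f_i(x*(t))> = -k/t, i.e. a gap of k/t
```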

Optimizing with Matrix Inequalities

min_{x ∈ R^n} f_0(x)   s.t.  f_i(x) ⪯ 0,  Ax = b        (f_i: R^n → S^{k_i})

min_{x ∈ R^n} f_0(x) − (1/t) Σ_i log det(−f_i(x))   s.t.  Ax = b

An optimum x*(t) of the t-barrier problem is (Σ_i k_i)/t-suboptimal for the constrained problem.

Central Path method:
Init: strictly feasible x^(0) and some t^(0).
Do: solve the t^(k)-barrier problem using Newton starting at x^(k); set x^(k+1) = x*(t^(k)); stop if (Σ_i k_i)/t^(k) ≤ ε; otherwise set t^(k+1) = μ t^(k) (for some parameter μ > 1).

Central Path Method for SDP

Init: strictly feasible x^(0) and some t^(0).
Do: solve the t^(k)-barrier problem using Newton starting at x^(k); set x^(k+1) = x*(t^(k)); stop if (Σ_i k_i)/t^(k) ≤ ε; otherwise set t^(k+1) = μ t^(k) (for some parameter μ > 1).

Access to: 2nd-order oracle for f_0 and the f_i; explicit access to A, b; a strictly feasible point x^(0).
Assumptions: f_0: R^n → R convex and self-concordant; f_i: R^n → S^{k_i} convex quadratic (or linear); x^(0) strictly feasible, i.e. f_i(x^(0)) ≺ 0.

Overall #Newton iterations: O(√(Σ_i k_i) log(1/ε)) (up to lower-order log log terms). Overall runtime: that many Newton steps, each costing the 2nd-order oracle evaluations plus one n×n linear solve.

[Slide portraits: Arkadi Nemirovski, Yuri Nesterov]

Feasibility and Phase I Methods

Recall that in the Log Barrier Central Path method we need to start with a (strictly) feasible x^(0). Two phases:
Phase I: solve a feasibility problem.
Phase II: use its solution as the starting point for the barrier method.

We can convert feasibility into an optimization problem:

(P)   Find x ∈ R^n  s.t.  f_i(x) ≤ 0
(P̃)   min_{x ∈ R^n, s ∈ R}  s   s.t.  f_i(x) ≤ s

This optimization problem is always feasible: we can start from a solution of Ax^(0) = b and set s = max_i f_i(x^(0)). Then we can apply the log barrier method to solve (P̃).

(P̃)   min_{x ∈ R^n, s ∈ R}  s   s.t.  f_i(x) ≤ s

How well do we need to optimize?
- If we find a (P̃)-feasible (x, s) with s < 0, then x is strictly (P)-feasible.
- If we get an ε-suboptimal solution to (P̃) with s > ε, then (P) is infeasible.
- Otherwise, there could be a solution that is feasible but not strictly so.
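A minimal sketch of this Phase I construction and of how the resulting s is interpreted; the helper names, the "+1" margin used for strictness, and the toy constraints are assumptions for illustration (the actual Phase I solve would use the barrier method sketched above).

```python
import numpy as np

def phase1_start(fis, x0):
    """Strictly feasible start for the Phase I problem  min_{x,s} s  s.t.  f_i(x) <= s."""
    x0 = np.asarray(x0, dtype=float)
    s0 = max(fi(x0) for fi in fis) + 1.0       # f_i(x0) < s0 for every i
    return x0, s0

def interpret_phase1(s, eps):
    """Interpret an eps-suboptimal Phase I solution (x, s)."""
    if s < 0:
        return "x is strictly feasible for (P): use it to start Phase II"
    if s > eps:
        return "(P) is infeasible (the Phase I optimum is strictly positive)"
    return "inconclusive: (P) may be feasible but not strictly feasible"

# toy usage: constraints x1 + x2 - 1 <= 0 and -x1 <= 0
fis = [lambda x: x[0] + x[1] - 1.0, lambda x: -x[0]]
x0, s0 = phase1_start(fis, [5.0, 5.0])
print(x0, s0)
print(interpret_phase1(-0.3, 1e-6))
```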

We can convert feasibility to optimization with matrix constraints too:

Find x ∈ R^n  s.t.  f_i(x) ⪯ 0        min_{x ∈ R^n, s ∈ R}  s   s.t.  f_i(x) ⪯ sI

Finally, note that we can also reduce optimization to feasibility:

min f_0(x)  s.t.  f_i(x) ≤ 0        (P_s)   Find x  s.t.  f_i(x) ≤ 0,  f_0(x) ≤ s

and then search over s.
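The last reduction, optimization to feasibility followed by a search over s, is naturally done by bisection on s; a minimal sketch, where the feasibility oracle is_feasible and the initial bracket [lo, hi] are assumed to be given.

```python
def minimize_by_bisection(is_feasible, lo, hi, tol=1e-6):
    """Reduce  min f0(x) s.t. f_i(x) <= 0  to feasibility problems (P_s).

    is_feasible(s) answers whether {x : f_i(x) <= 0, f0(x) <= s} is nonempty.
    Assumes is_feasible(hi) is True and is_feasible(lo) is False.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if is_feasible(mid):
            hi = mid          # a point with f0(x) <= mid exists: shrink from above
        else:
            lo = mid          # no such point: the optimal value lies above mid
    return hi

# toy usage: min x^2 s.t. x >= 1 has optimal value 1
print(minimize_by_bisection(lambda s: s >= 1.0, lo=0.0, hi=4.0))
```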

From Central Path to Primal/Dual

Let us review our approach. We would like to solve the KKT conditions of (P):

(KKT)   f_i(x) ≤ 0,  Ax = b,  λ ≥ 0,
        ∇f_0(x) + Σ_i λ_i ∇f_i(x) + A^T ν = 0,
        λ_i f_i(x) = 0.

At each iteration we instead consider the barrier problem (P_t), i.e., we solve

∇f_0(x) + Σ_i (1/(−t f_i(x))) ∇f_i(x) + A^T ν = 0,   Ax = b,

and we do this by Newton's method: linearize w.r.t. x (and ν) around x^(k).

This can be viewed as solving a modified KKT system:

(KKT_t)   f_i(x) ≤ 0,  λ ≥ 0,
          ∇f_0(x) + Σ_i λ_i ∇f_i(x) + A^T ν = 0,
          λ_i f_i(x) = −1/t.

Solve by:
(i) eliminating λ_i = −1/(t f_i(x)), which gives back the barrier stationarity condition in (x, ν) from the previous slide;
(ii) linearizing w.r.t. (x, ν) around x^(k).

Instead, in the primal/dual approach we maintain both x^(k) and λ^(k), and linearize (KKT_t) w.r.t. both x and λ, around x^(k) and λ^(k), without first eliminating λ.

Primal-dual method

Define the residuals:

r_pri(x) = Ax − b ∈ R^p
r_dual(x, λ, ν) = ∇f_0(x) + Σ_i λ_i ∇f_i(x) + A^T ν ∈ R^n
r_cent(t)(x, λ) = (λ_1 f_1(x) + 1/t, ..., λ_m f_m(x) + 1/t) ∈ R^m

Jointly: r^(t)(x, λ, ν) = (r_pri, r_dual, r_cent(t)) ∈ R^{p+n+m}.

If (x, λ, ν) satisfy r^(t)(x, λ, ν) = 0 (and f_i(x) < 0, λ > 0), then x = x*(t), λ = λ*(t), and ν = ν*(t).
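A small sketch of these residuals in code, with the problem data passed in as callables and arrays; all names are placeholders, and the sign conventions follow the slide above.

```python
import numpy as np

def residuals(x, lam, nu, t, grad_f0, fis, grad_fis, A, b):
    """r^(t)(x, lam, nu) = (r_pri, r_dual, r_cent) with the lecture's sign conventions."""
    fx = np.array([fi(x) for fi in fis])             # f_i(x), must stay < 0
    Df = np.vstack([gfi(x) for gfi in grad_fis])     # Jacobian, rows = grad f_i(x)^T
    r_pri = A @ x - b                                # in R^p
    r_dual = grad_f0(x) + Df.T @ lam + A.T @ nu      # in R^n
    r_cent = lam * fx + 1.0 / t                      # in R^m, components lam_i f_i(x) + 1/t
    return r_pri, r_dual, r_cent
```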

Therefore, at each iteration we would like to (approximately) solve

r^(t)(x + Δx, λ + Δλ, ν + Δν) = 0   s.t.  f_i(x + Δx) ≤ 0,  λ + Δλ ≥ 0.

This is done by linearizing w.r.t. (x, λ, ν) around the current point:

r^(t)(x, λ, ν) + ∇_x r^(t) Δx + ∇_λ r^(t) Δλ + ∇_ν r^(t) Δν = 0.

With the residuals defined above, this boils down to the linear system

[ ∇²f_0(x) + Σ_i λ_i ∇²f_i(x)    Df(x)^T       A^T ] [Δx]        [ r_dual ]
[ diag(λ) Df(x)                  diag(f(x))     0  ] [Δλ]  =  −  [ r_cent ]
[ A                              0              0  ] [Δν]        [ r_pri  ]

where f(x) = (f_1(x), ..., f_m(x)) and Df(x) is its Jacobian (this is eq. (11.54) of Boyd & Vandenberghe, with the centrality row negated to match the sign of r_cent above). The primal and dual search directions are coupled through both the coefficient matrix and the residuals. Note that if r_pri = 0 (i.e. Ax = b), then A Δx = 0, so Δx is a feasible direction: A(x + s Δx) = b for any s. Throughout, we always maintain f_i(x) < 0 and λ_i > 0.
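A sketch of assembling and solving this linear system for the primal-dual search direction, in the same conventions as the residual sketch above; the Hessian callables and other names are placeholders.

```python
import numpy as np

def pd_newton_step(x, lam, nu, t, grad_f0, hess_f0, fis, grad_fis, hess_fis, A, b):
    """Solve the primal-dual Newton system for (dx, dlam, dnu)."""
    n, m, p = x.size, lam.size, b.size
    fx = np.array([fi(x) for fi in fis])
    Df = np.vstack([gfi(x) for gfi in grad_fis])
    r_pri = A @ x - b
    r_dual = grad_f0(x) + Df.T @ lam + A.T @ nu
    r_cent = lam * fx + 1.0 / t

    H = hess_f0(x) + sum(lam[i] * hess_fis[i](x) for i in range(m))
    KKT = np.block([
        [H,                   Df.T,              A.T],
        [np.diag(lam) @ Df,   np.diag(fx),       np.zeros((m, p))],
        [A,                   np.zeros((p, m)),  np.zeros((p, p))],
    ])
    rhs = -np.concatenate([r_dual, r_cent, r_pri])
    step = np.linalg.solve(KKT, rhs)
    return step[:n], step[n:n + m], step[n + m:]
```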

It follows that:

- r_pri(x) = 0  ⇒  x is primal feasible.
- r_dual(x, λ, ν) = 0  ⇒  ∇_x L(x, λ, ν) = 0, so x minimizes L(·, λ, ν), and
  g(λ, ν) = f_0(x) + Σ_i λ_i f_i(x) + ν^T(Ax − b) > −∞, so (λ, ν) are dual feasible.

If in addition we have r_cent = 0, then

g(λ, ν) = f_0(x) + Σ_i λ_i (−1/(t λ_i)) + 0 = f_0(x) − m/t.

So the gap between (P) and (D) is f_0(x) − g(λ, ν) = m/t: suboptimality ≤ m/t.

Even if r_cent ≠ 0, as long as r_pri = 0 and r_dual = 0, then

g(λ, ν) = f_0(x) + Σ_i λ_i f_i(x),   so   f_0(x) − g(λ, ν) = −Σ_i λ_i f_i(x) =: η̂(x, λ),

where η̂(x, λ) > 0 is the surrogate (duality) gap, and we are η̂-suboptimal.

Primal-dual interior-point algorithm

Start at initial x^(0), λ^(0), ν^(0) s.t. f_i(x^(0)) < 0 and λ_i^(0) > 0.

Iterate:
- Determine t^(k): set t^(k) = μ m / η̂(x^(k), λ^(k)).
- Compute the search direction: linearize (KKT_t) at x = x^(k) + Δx, λ = λ^(k) + Δλ, ν = ν^(k) + Δν, and solve to obtain Δx^(k), Δλ^(k), Δν^(k).
- Set the step size s^(k) by line search on ||r^(t)(x, λ, ν)||, ensuring f_i(x) < 0 and λ_i > 0.
- Update: (x^(k+1), λ^(k+1), ν^(k+1)) = (x^(k), λ^(k), ν^(k)) + s^(k) (Δx^(k), Δλ^(k), Δν^(k)).
- Stop if ||r_pri|| ≤ ε_feas and ||r_dual|| ≤ ε_feas (approximately feasible), and η̂(x^(k), λ^(k)) ≤ ε.

Important:
- x^(k) need not be feasible: it is OK if Ax^(k) ≠ b.
- Also, (λ^(k), ν^(k)) need not be dual feasible: g(λ^(k), ν^(k)) can be −∞.

Advantages: a single loop, no Phase I.
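Putting the pieces together, a minimal sketch of this primal-dual loop, reusing the residuals and pd_newton_step sketches above; the parameter defaults (μ = 10, tolerances, backtracking constants) are illustrative assumptions, not values prescribed by the lecture.

```python
import numpy as np

def primal_dual_ip(x0, lam0, nu0, grad_f0, hess_f0, fis, grad_fis, hess_fis, A, b,
                   mu=10.0, eps=1e-8, eps_feas=1e-8, max_iter=100, beta=0.5):
    """Primal-dual interior-point method (sketch), using residuals / pd_newton_step above."""
    x = np.asarray(x0, dtype=float)
    lam = np.asarray(lam0, dtype=float)
    nu = np.asarray(nu0, dtype=float)
    m = lam.size
    for _ in range(max_iter):
        eta = -sum(lam[i] * fis[i](x) for i in range(m))      # surrogate gap eta_hat
        t = mu * m / eta                                      # t^(k) = mu * m / eta_hat
        dx, dlam, dnu = pd_newton_step(x, lam, nu, t, grad_f0, hess_f0,
                                       fis, grad_fis, hess_fis, A, b)
        r0 = np.concatenate(residuals(x, lam, nu, t, grad_f0, fis, grad_fis, A, b))
        s = 1.0                                               # backtracking on ||r^(t)||
        for _ in range(60):
            xn, lamn, nun = x + s * dx, lam + s * dlam, nu + s * dnu
            if np.all(lamn > 0) and all(fi(xn) < 0 for fi in fis):
                rn = np.concatenate(residuals(xn, lamn, nun, t, grad_f0, fis, grad_fis, A, b))
                if np.linalg.norm(rn) <= (1 - 0.01 * s) * np.linalg.norm(r0):
                    break
            s *= beta
        x, lam, nu = xn, lamn, nun
        r_pri, r_dual, _ = residuals(x, lam, nu, t, grad_f0, fis, grad_fis, A, b)
        eta = -sum(lam[i] * fis[i](x) for i in range(m))
        if (np.linalg.norm(r_pri) <= eps_feas and
                np.linalg.norm(r_dual) <= eps_feas and eta <= eps):
            break
    return x, lam, nu
```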

Why is no Phase I needed? We don't need to ensure Ax = b, but we do need f_i(x) < 0 and λ > 0. We can rewrite (P) as:

min_{x ∈ R^n, s ∈ R}  f_0(x)   s.t.  f_i(x) ≤ s,  s = 0.

Now we can start with any x^(0) s.t. f_i(x^(0)) < ∞ (i.e. x^(0) in the domain of each f_i), then set s = max_i f_i(x^(0)) + 1.

If finding such an x^(0) is hard, we can rewrite further as:

min_{x ∈ R^n, s ∈ R, x_1, ..., x_m ∈ R^n}  f_0(x)   s.t.  f_i(x_i) ≤ s,  s = 0,  x = x_i for all i.

Then we can find a point in the domain of each f_i separately. But this uses many more variables (mn).