LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE


CONVEX ANALYSIS AND DUALITY
- Basic concepts of convex analysis
- Basic concepts of convex optimization
- Geometric duality framework - MC/MC
- Constrained optimization duality
- Subgradients - Optimality conditions

CONVEX OPTIMIZATION ALGORITHMS
- Special problem classes
- Subgradient methods
- Polyhedral approximation methods
- Proximal methods
- Dual proximal methods - Augmented Lagrangeans
- Interior point methods
- Incremental methods
- Optimal complexity methods
- Various combinations around the proximal idea and generalizations

All figures are courtesy of Athena Scientific, and are used with permission.

BASIC CONCEPTS OF CONVEX ANALYSIS

- Epigraphs, level sets, closedness, semicontinuity.
[Figure: the epigraph of a convex function and of a nonconvex function over dom(f).]
- Finite representations of generated cones and convex hulls - Caratheodory's Theorem.
- Relative interior:
  - Nonemptiness for a convex set.
  - Line segment principle.
  - Calculus of relative interiors.
- Continuity of convex functions.
- Nonemptiness of intersections of nested sequences of closed sets.
- Closure operations and their calculus.
- Recession cones and their calculus.
- Preservation of closedness by linear transformations and vector sums.

HYPERPLANE SEPARATION

[Figure: (a), (b) hyperplanes separating two sets C1 and C2.]
- Separating/supporting hyperplane theorem.
- Strict and proper separation theorems.
- Dual representation of closed convex sets as unions of points and intersections of halfspaces.
[Figure: a closed convex set as a union of points and as an intersection of halfspaces.]
- Nonvertical separating hyperplanes.

CONJUGATE FUNCTIONS

[Figure: the conjugate as the maximal crossing level of nonvertical hyperplanes with normal (-y, 1) under epi(f):
  inf_{x ∈ R^n} {f(x) - x'y} = -f*(y).]
- Conjugacy theorem: f = f**.
- Support functions: σ_X(y) = sup_{x ∈ X} x'y.
- Polar cone theorem: C = C**.
  - Special case: Linear Farkas' lemma.
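As a quick numerical illustration of the definition f*(y) = sup_x {x'y - f(x)} (an illustrative sketch, not part of the original slides): for f(x) = x^2 the conjugate has the closed form f*(y) = y^2/4, and a grid approximation of the supremum recovers it.

```python
import numpy as np

# Grid approximation of the conjugate of f(x) = x^2.
# Exact closed form: f*(y) = sup_x { x*y - x^2 } = y^2 / 4, attained at x = y/2.
xs = np.linspace(-10.0, 10.0, 100001)   # grid wide enough to contain x = y/2

def conjugate(y):
    # sup over the grid of the linear-minus-f expression
    return float(np.max(xs * y - xs**2))
```

The same grid check also illustrates the conjugacy theorem: conjugating y^2/4 again returns x^2.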

BASIC CONCEPTS OF CONVEX OPTIMIZATION

- Weierstrass Theorem and extensions.
- Characterization of existence of solutions in terms of nonemptiness of nested set intersections.
[Figure: level sets of f over X shrinking toward the optimal solution set.]
- Role of recession cone and lineality space.
- Partial Minimization Theorems: Characterization of closedness of
  f(x) = inf_{z ∈ R^m} F(x, z)
  in terms of closedness of F.
[Figure: epi(f) obtained by projecting epi(F) along the z-axis, in two cases.]

MIN COMMON/MAX CROSSING DUALITY

[Figure: min common point w* and max crossing point q* of a set M, in cases (a), (b), (c).]
- Defined by a single set M ⊂ R^{n+1}:
  w* = inf_{(0,w) ∈ M} w
  q* = sup_{μ ∈ R^n} q(μ), where q(μ) = inf_{(u,w) ∈ M} {w + μ'u}
- Weak duality: q* ≤ w*.
- Two key questions:
  - When does strong duality q* = w* hold?
  - When do there exist optimal primal and dual solutions?

MC/MC THEOREMS (M CONVEX, w* < ∞)

- MC/MC Theorem I: We have q* = w* if and only if for every sequence {(u_k, w_k)} ⊂ M with u_k → 0, there holds
  w* ≤ liminf_{k→∞} w_k.
- MC/MC Theorem II: Assume in addition that -∞ < w* and that
  D = {u | there exists w ∈ R with (u, w) ∈ M}
  contains the origin in its relative interior. Then q* = w* and there exists μ such that q(μ) = q*.
- MC/MC Theorem III: Similar to II but involves special polyhedral assumptions.
  (1) M is a horizontal translation of M̃ by -P,
      M = M̃ - {(u, 0) | u ∈ P},
      where P is polyhedral and M̃ is convex.
  (2) We have ri(D̃) ∩ P ≠ Ø, where
      D̃ = {u | there exists w ∈ R with (u, w) ∈ M̃}.

IMPORTANT SPECIAL CASE

- Constrained optimization: inf_{x ∈ X, g(x) ≤ 0} f(x)
- Perturbation function (or primal function):
  p(u) = inf_{x ∈ X, g(x) ≤ u} f(x)
[Figure: M = epi(p); the set {(g(x), f(x)) | x ∈ X} lies above the graph of p, and w* = p(0).]
- Introduce L(x, μ) = f(x) + μ'g(x). Then
  q(μ) = inf_{u ∈ R^r} {p(u) + μ'u}
       = inf_{u ∈ R^r, x ∈ X, g(x) ≤ u} {f(x) + μ'u}
       = inf_{x ∈ X} L(x, μ) if μ ≥ 0, and -∞ otherwise.
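For a concrete instance of these formulas, consider the following toy problem (my own illustrative example, not from the slides): minimize f(x) = x^2 subject to g(x) = 1 - x ≤ 0 with X = R. The primal optimum is f* = 1 at x* = 1, and for μ ≥ 0 the Lagrangian x^2 + μ(1 - x) is minimized at x = μ/2, giving q(μ) = μ - μ^2/4.

```python
import numpy as np

# Dual function of: minimize x^2 subject to 1 - x <= 0 (X = R).
# Closed form for mu >= 0:  q(mu) = inf_x { x^2 + mu*(1 - x) } = mu - mu^2/4.
mus = np.linspace(0.0, 10.0, 100001)
q = mus - mus**2 / 4.0
q_star = float(q.max())            # dual optimal value
mu_star = float(mus[q.argmax()])   # dual optimal solution
# q_star = 1 = f*, i.e. strong duality holds (x = 2 satisfies g(x) < 0).
```

The maximizer μ* = 2 is the Lagrange multiplier, and q* = f* confirms strong duality for this instance.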

NONLINEAR FARKAS' LEMMA

- Let X ⊂ R^n, f : X → R, and g_j : X → R, j = 1, ..., r, be convex. Assume that
  f(x) ≥ 0, for all x ∈ X with g(x) ≤ 0.
  Let
  Q* = {μ | μ ≥ 0, f(x) + μ'g(x) ≥ 0, for all x ∈ X}.
- Nonlinear version: Q* is nonempty and compact if and only if there exists a vector x̄ ∈ X such that g_j(x̄) < 0 for all j = 1, ..., r.
[Figure: the set {(g(x), f(x)) | x ∈ X} and hyperplanes with normal (μ, 1) through the origin, in cases (a), (b), (c).]
- Polyhedral version: Q* is nonempty if g is linear [g(x) = Ax - b] and there exists a vector x̄ ∈ ri(X) such that Ax̄ - b ≤ 0.

CONSTRAINED OPTIMIZATION DUALITY

minimize f(x)
subject to x ∈ X, g_j(x) ≤ 0, j = 1, ..., r,

where X ⊂ R^n, and f : X → R and g_j : X → R are convex. Assume f*: finite.
- Connection with MC/MC: M = epi(p) with
  p(u) = inf_{x ∈ X, g(x) ≤ u} f(x)
- Dual function:
  q(μ) = inf_{x ∈ X} L(x, μ) if μ ≥ 0, and -∞ otherwise,
  where L(x, μ) = f(x) + μ'g(x) is the Lagrangian function.
- Dual problem of maximizing q(μ) over μ ≥ 0.
- Strong Duality Theorem: q* = f* and there exists a dual optimal solution if one of the following two conditions holds:
  (1) There exists x̄ ∈ X such that g(x̄) < 0.
  (2) The functions g_j, j = 1, ..., r, are affine, and there exists x̄ ∈ ri(X) such that g(x̄) ≤ 0.

OPTIMALITY CONDITIONS

- We have q* = f*, and the vectors x* and μ* are optimal solutions of the primal and dual problems, respectively, iff x* is feasible, μ* ≥ 0, and
  x* ∈ arg min_{x ∈ X} L(x, μ*),  μ*_j g_j(x*) = 0 for all j.
- For the linear/quadratic program
  minimize (1/2) x'Qx + c'x
  subject to Ax ≤ b,
  where Q is positive semidefinite, (x*, μ*) is a primal and dual optimal solution pair if and only if:
  (a) Primal and dual feasibility holds: Ax* ≤ b, μ* ≥ 0.
  (b) Lagrangian optimality holds [x* minimizes L(x, μ*) over x ∈ R^n]. (Unnecessary for LP.)
  (c) Complementary slackness holds:
      (Ax* - b)'μ* = 0,
      i.e., μ*_j > 0 implies that the jth constraint is tight. (Applies to inequality constraints only.)
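These three conditions are easy to verify numerically for a small instance. The check below (an illustrative sketch with a hand-picked QP, not from the slides) uses Q = I, c = 0, and the single constraint x1 + x2 ≥ 2, whose optimal pair is x* = (1, 1), μ* = 1.

```python
import numpy as np

# QP: minimize (1/2) x'Qx + c'x  subject to  Ax <= b.
# Here: minimize (1/2)||x||^2 subject to x1 + x2 >= 2, i.e. -x1 - x2 <= -2.
Q = np.eye(2)
c = np.zeros(2)
A = np.array([[-1.0, -1.0]])
b = np.array([-2.0])
x_star = np.array([1.0, 1.0])   # candidate primal solution
mu_star = np.array([1.0])       # candidate dual solution

# (a) Primal and dual feasibility
primal_feasible = bool(np.all(A @ x_star <= b + 1e-9))
dual_feasible = bool(np.all(mu_star >= 0))
# (b) Lagrangian optimality: grad_x L(x*, mu*) = Qx* + c + A'mu* = 0
lagrangian_opt = np.allclose(Q @ x_star + c + A.T @ mu_star, 0.0)
# (c) Complementary slackness: (Ax* - b)'mu* = 0
comp_slack = abs(float((A @ x_star - b) @ mu_star)) < 1e-9
```

All four flags come out true, certifying (x*, μ*) as a primal-dual optimal pair without solving the QP.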

FENCHEL DUALITY

- Primal problem:
  minimize f1(x) + f2(x)
  subject to x ∈ R^n,
  where f1 : R^n → (-∞, ∞] and f2 : R^n → (-∞, ∞] are closed proper convex functions.
- Dual problem:
  minimize f1*(λ) + f2*(-λ)
  subject to λ ∈ R^n,
  where f1* and f2* are the conjugates.
[Figure: q(λ) as the vertical gap between the supporting lines of slope λ to f1 and -f2; at the optimal slope, f* = q*.]

CONIC DUALITY

- Consider minimizing f(x) over x ∈ C, where f : R^n → (-∞, ∞] is a closed proper convex function and C is a closed convex cone in R^n.
- We apply Fenchel duality with the definitions
  f1(x) = f(x),  f2(x) = 0 if x ∈ C, ∞ if x ∉ C.
- Linear-Conic Programming:
  minimize c'x
  subject to x - b ∈ S, x ∈ C.
- The dual linear-conic problem is equivalent to
  minimize b'λ
  subject to λ - c ∈ S⊥, λ ∈ Ĉ.
- Special Linear-Conic Forms:
  min_{Ax = b, x ∈ C} c'x  ⟺  max_{c - A'λ ∈ Ĉ} b'λ,
  min_{Ax - b ∈ C} c'x  ⟺  max_{A'λ = c, λ ∈ Ĉ} b'λ,
  where x ∈ R^n, λ ∈ R^m, c ∈ R^n, b ∈ R^m, A : m × n.

SUBGRADIENTS

[Figure: a subgradient g of f at x defines a nonvertical supporting hyperplane with normal (-g, 1) at (x, f(x)).]
- ∂f(x) ≠ Ø for x ∈ ri(dom(f)).
- Conjugate Subgradient Theorem: If f is closed proper convex, the following are equivalent for a pair of vectors (x, y):
  (i) x'y = f(x) + f*(y).
  (ii) y ∈ ∂f(x).
  (iii) x ∈ ∂f*(y).
- Characterization of the optimal solution set X* = arg min_{x ∈ R^n} f(x) of a closed proper convex f:
  (a) X* = ∂f*(0).
  (b) X* is nonempty if 0 ∈ ri(dom(f*)).
  (c) X* is nonempty and compact if and only if 0 ∈ int(dom(f*)).

CONSTRAINED OPTIMALITY CONDITION

- Let f : R^n → (-∞, ∞] be proper convex, let X be a convex subset of R^n, and assume that one of the following four conditions holds:
  (i) ri(dom(f)) ∩ ri(X) ≠ Ø.
  (ii) f is polyhedral and dom(f) ∩ ri(X) ≠ Ø.
  (iii) X is polyhedral and ri(dom(f)) ∩ X ≠ Ø.
  (iv) f and X are polyhedral, and dom(f) ∩ X ≠ Ø.
- Then, a vector x* minimizes f over X iff there exists g ∈ ∂f(x*) such that -g belongs to the normal cone N_X(x*), i.e.,
  g'(x - x*) ≥ 0, for all x ∈ X.
[Figure: level sets of f over a set C, with a (sub)gradient at x* whose negative lies in the normal cone N_C(x*).]
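The normal cone condition can be checked directly on a small differentiable example (an illustrative sketch of mine, not from the slides): minimize f(x) = (1/2)||x - a||^2 over the box X = [0, 1]^2 with a = (2, 0.5). The minimizer is the projection x* = (1, 0.5), and since f is differentiable, g = ∇f(x*) = x* - a; because the condition g'(x - x*) ≥ 0 is linear in x, verifying it on the vertices of the box suffices.

```python
import numpy as np

# Minimize f(x) = 0.5*||x - a||^2 over the box X = [0,1]^2, a = (2, 0.5).
a = np.array([2.0, 0.5])
x_star = np.clip(a, 0.0, 1.0)      # projection of a on the box = minimizer
grad = x_star - a                  # gradient of f at x*; here (-1, 0)

# Optimality check: grad'(x - x*) >= 0 for all x in X.
# The left side is linear in x, so checking the box vertices is enough.
vertices = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
optimal = all(float(grad @ (v - x_star)) >= -1e-9 for v in vertices)
```

The check passes: -∇f(x*) = (1, 0) points outward through the active face x1 = 1, i.e. it lies in N_X(x*).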

COMPUTATION: PROBLEM RANKING IN INCREASING COMPUTATIONAL DIFFICULTY

- Linear and (convex) quadratic programming.
  - Favorable special cases.
- Second order cone programming.
- Semidefinite programming.
- Convex programming.
  - Favorable cases, e.g., separable, large sum.
  - Geometric programming.
- Nonlinear/nonconvex/continuous programming.
  - Favorable special cases.
  - Unconstrained.
  - Constrained.
- Discrete optimization/integer programming.
  - Favorable special cases.
- Caveats/questions:
  - Important role of special structures.
  - What is the role of optimal algorithms?
  - Is complexity the right philosophical view of convex optimization?

DESCENT METHODS

- Steepest descent method: Use the vector of minimum norm in ∂f(x); has convergence problems.
[Figure: contour and surface plots of a function for which steepest descent fails to converge to the minimum.]
- Subgradient method:
  x_{k+1} = P_X(x_k - α_k g_k),
  where g_k ∈ ∂f(x_k) and P_X is projection on X.
[Figure: a subgradient step from x_k followed by projection on X, shown against the level sets of f.]
- ε-subgradient method (approximate subgradient).
- Incremental (possibly randomized) variants for minimizing large sums (can be viewed as an approximate subgradient method).
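The projected subgradient iteration above can be sketched in a few lines (a minimal illustration with a hand-picked problem, not from the slides): minimize f(x) = |x1| + |x2| over the box X = [1, 2]^2, whose minimizer is (1, 1); `sign` is a valid subgradient selection for the l1 norm and diminishing step sizes are used.

```python
import numpy as np

def projected_subgradient(subgrad, project, x0, steps, iters=200):
    """Subgradient method: x_{k+1} = P_X(x_k - alpha_k * g_k)."""
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        g = subgrad(x)                  # any g in the subdifferential at x
        x = project(x - steps(k) * g)   # step, then project back onto X
    return x

# Example: f(x) = |x1| + |x2| over X = [1, 2]^2; the minimizer is (1, 1).
subgrad = lambda x: np.sign(x)               # subgradient of the l1 norm
project = lambda x: np.clip(x, 1.0, 2.0)     # projection onto the box
steps = lambda k: 1.0 / (k + 1)              # diminishing step sizes
x = projected_subgradient(subgrad, project, [2.0, 2.0], steps)
```

With diminishing, nonsummable step sizes the iterates approach the constrained minimizer even though f is nondifferentiable along the axes.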

OUTER AND INNER LINEARIZATION

- Outer linearization: Cutting plane.
[Figure: cutting plane approximation of f built from the linearizations f(x_i) + (x - x_i)'g_i at points x_0, x_1, ... of X.]
- Inner linearization: Simplicial decomposition.
[Figure: simplicial decomposition iterates x_0, ..., x_4 = x* over X, shown against the level sets of f and the gradients ∇f(x_i).]
- Duality between outer and inner linearization.
  - Extended monotropic programming framework.
  - Fenchel-like duality theory.
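The cutting plane idea can be demonstrated on a one-dimensional toy problem (an illustrative sketch, not from the slides): minimize f(x) = |x| over X = [-2, 2] by repeatedly minimizing the polyhedral model F_k(x) = max_i {f(x_i) + g_i (x - x_i)}; here the model minimization is approximated on a grid rather than by an LP solver.

```python
import numpy as np

# Cutting plane method for f(x) = |x| on X = [-2, 2].
f = lambda x: abs(x)
g = lambda x: 1.0 if x >= 0 else -1.0    # a subgradient of |x|
X = np.linspace(-2.0, 2.0, 4001)         # grid stand-in for the LP subproblem
points = [2.0]                           # initial point x_0
for k in range(15):
    # Polyhedral (outer) model: max over all cuts generated so far
    cuts = [f(xi) + g(xi) * (X - xi) for xi in points]
    model = np.max(cuts, axis=0)
    points.append(float(X[np.argmin(model)]))   # minimize the model over X
```

After the cut at 0 is added, the model equals |x| near the origin and the iterates stay at the minimizer.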

PROXIMAL MINIMIZATION ALGORITHM

- A general algorithm for convex function minimization:
  x_{k+1} ∈ arg min_{x ∈ R^n} {f(x) + (1/2c_k) ||x - x_k||^2}
  - f : R^n → (-∞, ∞] is closed proper convex.
  - c_k is a positive scalar parameter.
  - x_0 is an arbitrary starting point.
[Figure: x_{k+1} is the point where the graph of the quadratic γ_k - (1/2c_k)||x - x_k||^2 touches the graph of f.]
- x_{k+1} exists because of the quadratic.
- Strong convergence properties.
- Starting point for extensions (e.g., nonquadratic regularization) and combinations (e.g., with linearization).
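For f(x) = |x| the proximal subproblem has a well-known closed form (soft thresholding), which makes the iteration easy to trace by hand; the example below is an illustrative sketch, not part of the original slides.

```python
def prox_abs(v, c):
    """Proximal map of f(x) = |x|: argmin_x { |x| + (1/2c)(x - v)^2 }.
    Closed form: soft thresholding, shrinking v toward 0 by c."""
    sign = 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)
    return sign * max(abs(v) - c, 0.0)

# Proximal minimization: x_{k+1} = prox_{c_k f}(x_k), constant c_k = 0.4.
x = 3.0
for k in range(20):
    x = prox_abs(x, 0.4)
# Each step shrinks |x| by c_k until the minimizer x* = 0 is reached.
```

The subproblem always has a unique solution because of the strictly convex quadratic term, exactly as the slide notes.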

PROXIMAL-POLYHEDRAL METHODS

- Proximal-cutting plane method.
[Figure: the proximal step applied to the cutting plane model F_k of f.]
- Proximal-cutting plane-bundle methods: Replace f with a cutting plane approximation and/or change the quadratic regularization more conservatively.
- Dual Proximal - Augmented Lagrangian methods: Proximal method applied to the dual problem of a constrained optimization problem.
[Figure: primal proximal iteration on f versus the corresponding dual proximal iteration on the conjugate f*.]
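A minimal instance of the augmented Lagrangian (dual proximal) idea, on a problem chosen so the inner minimization has a closed form (my own illustrative example, not from the slides): minimize x^2 subject to x = 1. The augmented Lagrangian is L_c(x, λ) = x^2 + λ(x - 1) + (c/2)(x - 1)^2, and its unconstrained minimizer over x is (c - λ)/(2 + c).

```python
# Augmented Lagrangian method for: minimize x^2 subject to x = 1.
# L_c(x, lam) = x^2 + lam*(x - 1) + (c/2)*(x - 1)^2
c, lam = 1.0, 0.0
for k in range(50):
    x = (c - lam) / (2.0 + c)    # minimizer of L_c(., lam): set dL/dx = 0
    lam = lam + c * (x - 1.0)    # multiplier update (dual proximal step)
# Converges to x* = 1 with multiplier lam* = -2 (since 2*x* + lam* = 0).
```

The multiplier update is exactly a proximal step on the dual function, which is why convergence is geometric here even with a fixed penalty parameter c.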

DUALITY VIEW OF PROXIMAL METHODS

[Diagram: the Proximal Method maps under Outer Linearization to the Proximal Cutting Plane/Bundle Method, and under Fenchel Duality to the Dual Proximal Method; the Dual Proximal Method maps under Inner Linearization to the Proximal Simplicial Decomposition/Bundle Method, which is related to the Proximal Cutting Plane/Bundle Method by Fenchel Duality.]
- Applies also to cost functions that are sums of convex functions,
  f(x) = Σ_{i=1}^m f_i(x),
  in the context of extended monotropic programming.

INTERIOR POINT METHODS

- Barrier method: Let
  x_k = arg min_{x ∈ S} {f(x) + ε_k B(x)}, k = 0, 1, ...,
  where S = {x | g_j(x) < 0, j = 1, ..., r} and the parameter sequence {ε_k} satisfies 0 < ε_{k+1} < ε_k for all k and ε_k → 0.
[Figure: the barrier term ε B(x) blowing up at the boundary of S, for two values ε' < ε.]
- Ill-conditioning. Need for Newton's method.
[Figure: barrier method iterates against increasingly elongated, ill-conditioned level sets as ε decreases.]
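A one-dimensional barrier method sketch (an illustrative example of mine, not from the slides): minimize f(x) = x subject to 1 - x ≤ 0, so S = {x | x > 1}, with the log barrier B(x) = -log(x - 1). Each subproblem min_x {x + ε B(x)} has exact minimizer 1 + ε; here it is solved by Newton steps warm-started at the previous iterate, echoing the slide's point about Newton's method.

```python
# Barrier method for: minimize x subject to 1 - x <= 0, S = {x | x > 1}.
# Subproblem: minimize phi(x) = x + eps*(-log(x - 1)); minimizer is 1 + eps.
def newton(eps, x, steps=20):
    for _ in range(steps):
        d1 = 1.0 - eps / (x - 1.0)       # phi'(x)
        d2 = eps / (x - 1.0) ** 2        # phi''(x)
        x = x - d1 / d2                  # Newton step
        x = max(x, 1.0 + 1e-12)          # safeguard: stay in the interior
    return x

eps, x = 1.0, 1.5
iterates = []
for k in range(15):
    x = newton(eps, x)                   # x_k is close to 1 + eps_k
    iterates.append(x)
    eps *= 0.6                           # 0 < eps_{k+1} < eps_k, eps_k -> 0
# The iterates approach the constrained minimizer x* = 1 from the interior.
```

The reduction factor 0.6 keeps each warm start inside Newton's region of convergence for the next subproblem; shrinking ε much faster would make the subproblems harder, which is the ill-conditioning the slide refers to.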

ADVANCED TOPICS

- Incremental subgradient-proximal methods.
- Complexity view of first order algorithms:
  - Gradient-projection for differentiable problems.
  - Gradient-projection with extrapolation.
  - Optimal iteration complexity version (Nesterov).
  - Extension to nondifferentiable problems by smoothing.
- Gradient-proximal method.
- Useful extension of proximal: general (nonquadratic) regularization - Bregman distance functions.
  - Entropy-like regularization.
  - Corresponding augmented Lagrangean method (exponential).
  - Corresponding gradient-proximal method.
  - Nonlinear gradient/subgradient projection (entropic minimization methods).

MIT OpenCourseWare http://ocw.mit.edu 6.253 Convex Analysis and Optimization Spring 2012 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.