
IE 521: Convex Optimization (Spring 2017, UIUC)
Lecture 4: Convex Functions, Part I (February 1)
Instructor: Niao He
Scribe: Shuanglong Wang

Courtesy warning: These notes do not necessarily cover everything discussed in class. Please email the TA (swang157@illinois.edu) if you find any typos or mistakes.

In this lecture, we cover the following topics:

- Convex functions
- Examples
- Convexity-preserving operations

Reference: Boyd & Vandenberghe, Sections 3.1-3.2.

4.1 Convex Functions

Let $f$ be a function from $\mathbb{R}^n$ to $\mathbb{R} \cup \{+\infty\}$. The domain of $f$ is defined as $\mathrm{dom}(f) = \{x \in \mathbb{R}^n : f(x) < \infty\}$. For example,

- $f(x) = 1/x$: $\mathrm{dom}(f) = \mathbb{R} \setminus \{0\}$;
- $f(x) = \sum_{i=1}^n x_i \ln(x_i)$: $\mathrm{dom}(f) = \mathbb{R}^n_{++} = \{x : x_i > 0, \ i = 1, \ldots, n\}$.

Definition 4.1 (Convex function) A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if

(i) $\mathrm{dom}(f) \subseteq \mathbb{R}^n$ is a convex set;
(ii) for all $x, y \in \mathrm{dom}(f)$ and $\lambda \in [0, 1]$,
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y).$$

Geometrically, the line segment between $(x, f(x))$ and $(y, f(y))$ sits above the graph of $f$.

Definitions:

- A function is called strictly convex if (ii) holds with strict inequality whenever $x \neq y$ and $\lambda \in (0, 1)$, i.e., $f(\lambda x + (1-\lambda) y) < \lambda f(x) + (1-\lambda) f(y)$.
- A function is called $\alpha$-strongly convex if $f(x) - \frac{\alpha}{2} \|x\|_2^2$ is convex.
- A function is called concave if $-f(x)$ is convex.

Note that strongly convex $\Rightarrow$ strictly convex $\Rightarrow$ convex.
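
Definition 4.1 can be spot-checked numerically. The following Python sketch (an illustration added to these notes; the function and variable names are my own) samples random points in the domain of the negative entropy $f(x) = \sum_i x_i \ln x_i$ and verifies the defining inequality up to floating-point rounding:

```python
import numpy as np

def convexity_gap(f, x, y, lam):
    """Return lam*f(x) + (1-lam)*f(y) - f(lam*x + (1-lam)*y).

    A nonnegative gap at every tested (x, y, lam) is consistent with
    convexity; any negative gap certifies non-convexity.
    """
    return lam * f(x) + (1 - lam) * f(y) - f(lam * x + (1 - lam) * y)

# Negative entropy f(x) = sum_i x_i ln x_i, with dom(f) = R^n_++.
f = lambda x: np.sum(x * np.log(x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.uniform(0.1, 5.0, size=(2, 4))  # stay inside dom(f)
    lam = rng.uniform()
    assert convexity_gap(f, x, y, lam) >= -1e-9  # holds up to rounding
```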

[Figure 4.1: Convex function]

4.2 Examples

1. Simple univariate functions:

- Even powers: $x^p$, where $p$ is even
- Exponential: $e^{ax}$, $a \in \mathbb{R}$
- Negative logarithm: $-\log x$
- Absolute value: $|x|$
- Negative entropy: $x \log x$

2. Affine functions: $f(x) = a^T x + b$ is both convex and concave, but not strictly convex/concave.

3. Some quadratic functions: $f(x) = \frac{1}{2} x^T Q x + b^T x + c$ is

- convex if and only if $Q \succeq 0$ is positive semidefinite;
- strictly convex if and only if $Q \succ 0$ is positive definite;
- special case: $f(x) = \|Ax - b\|_2^2$ is convex.

4. Norms: A function $\pi(\cdot)$ is called a norm if

(a) $\pi(x) \ge 0$ for all $x$, and $\pi(x) = 0$ iff $x = 0$;
(b) $\pi(\alpha x) = |\alpha| \, \pi(x)$ for all $\alpha \in \mathbb{R}$;
(c) $\pi(x + y) \le \pi(x) + \pi(y)$.

Note that norms are convex: for $\lambda \in [0, 1]$,
$$\pi(\lambda x + (1-\lambda) y) \le \pi(\lambda x) + \pi((1-\lambda) y) = \lambda \pi(x) + (1-\lambda) \pi(y),$$
where the inequality comes from (c) and the equality comes from (b). Examples of norms include:

- the $\ell_p$-norm on $\mathbb{R}^n$: $\|x\|_p := (\sum_{i=1}^n |x_i|^p)^{1/p}$, where $p \ge 1$;
- the $Q$-norm on $\mathbb{R}^n$: $\|x\|_Q = \sqrt{x^T Q x}$, where $Q \succ 0$ is positive definite;
- the Frobenius norm on $\mathbb{R}^{m \times n}$: $\|A\|_F = (\sum_{i=1}^m \sum_{j=1}^n |A_{i,j}|^2)^{1/2}$;
- the spectral norm on $S^n$: $\|A\| = \max_{i=1,\ldots,n} |\lambda_i(A)|$, where the $\lambda_i$'s are the eigenvalues of $A$.
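
The sketch below (my own illustration, not from the original notes) ties examples 3 and 4 together: it draws a random positive definite $Q$, checks the eigenvalue criterion for the quadratic, and spot-checks convexity of both the quadratic and the induced $Q$-norm:

```python
import numpy as np

rng = np.random.default_rng(1)

# Q = M^T M + delta*I is symmetric positive definite by construction.
M = rng.standard_normal((4, 4))
Q = M.T @ M + 1e-3 * np.eye(4)
b = rng.standard_normal(4)

quad = lambda x: 0.5 * x @ Q @ x + b @ x   # example 3
qnorm = lambda x: np.sqrt(x @ Q @ x)       # example 4, the Q-norm

# Convexity of the quadratic <=> all eigenvalues of Q are nonnegative.
assert np.linalg.eigvalsh(Q).min() > 0

# Spot-check the defining inequality at random points.
for f in (quad, qnorm):
    for _ in range(1000):
        x, y = rng.standard_normal((2, 4))
        lam = rng.uniform()
        lhs = f(lam * x + (1 - lam) * y)
        assert lhs <= lam * f(x) + (1 - lam) * f(y) + 1e-9
```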

5. Indicator function:
$$I_C(x) = \begin{cases} 0, & x \in C \\ \infty, & x \notin C \end{cases}$$
The indicator function $I_C(x)$ is convex if the set $C$ is a convex set.

6. Support function: $I_C^*(x) = \sup_{y \in C} x^T y$. The support function $I_C^*(x)$ is always convex, for any set $C$.

Proof: Note that $\sup_{y \in C} [f(y) + g(y)] \le \sup_{y \in C} f(y) + \sup_{y \in C} g(y)$. Then for all $x_1, x_2$ and $\lambda \in [0, 1]$,
$$I_C^*(\lambda x_1 + (1-\lambda) x_2) = \sup_{y \in C} \left[ \lambda x_1^T y + (1-\lambda) x_2^T y \right] \le \sup_{y \in C} \lambda x_1^T y + \sup_{y \in C} (1-\lambda) x_2^T y = \lambda I_C^*(x_1) + (1-\lambda) I_C^*(x_2).$$

7. More examples:

- Piecewise-linear functions: $\max(a_1^T x + b_1, \ldots, a_k^T x + b_k)$
- Log of exponential sums: $\log(\sum_{i=1}^k e^{a_i^T x + b_i})$
- Negative log-determinant: $-\log(\det(X))$

How do we show the convexity of these functions?
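
Before developing the general rules, one of the examples above can at least be tested directly: the support function of a finite set is the pointwise maximum of finitely many linear functions. A minimal sketch (my own, assuming $C$ is a finite set of points stored as the rows of a matrix):

```python
import numpy as np

rng = np.random.default_rng(2)

# Support function of a finite set C = {y_1, ..., y_m}: the pointwise
# maximum of the linear functions x -> y_i^T x, hence convex.
C = rng.standard_normal((20, 3))        # rows are the points y_i
support = lambda x: np.max(C @ x)

for _ in range(1000):
    x1, x2 = rng.standard_normal((2, 3))
    lam = rng.uniform()
    lhs = support(lam * x1 + (1 - lam) * x2)
    assert lhs <= lam * support(x1) + (1 - lam) * support(x2) + 1e-9
```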

4.3 Convexity-Preserving Operations

1. Taking conic combinations: If $f_i(x)$, $i \in I$, are convex functions and $\alpha_i \ge 0$, $i \in I$, then $g(x) = \sum_{i \in I} \alpha_i f_i(x)$ is a convex function.

Proof: The domain of $g$ is $\mathrm{dom}(g) = \bigcap_{i : \alpha_i > 0} \mathrm{dom}(f_i)$. For any $x, y \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda x + (1-\lambda) y) = \sum_i \alpha_i f_i(\lambda x + (1-\lambda) y) \le \sum_i \alpha_i \left[ \lambda f_i(x) + (1-\lambda) f_i(y) \right] = \lambda \sum_i \alpha_i f_i(x) + (1-\lambda) \sum_i \alpha_i f_i(y) = \lambda g(x) + (1-\lambda) g(y).$$

Remark: The property extends to infinite sums and integrals. If $f(x, \omega)$ is convex in $x$ for every $\omega \in \Omega$ and $\alpha(\omega) \ge 0$ for all $\omega \in \Omega$, then $g(x) = \int_\Omega \alpha(\omega) f(x, \omega) \, d\omega$ is convex, if well defined. For example, if $\eta = \eta(\omega)$ is a well-defined random variable on $\Omega$ and $f(x, \eta(\omega))$ is convex in $x$ for every $\omega \in \Omega$, then $\mathbb{E}_\eta[f(x, \eta)]$ is a convex function.

2. Taking affine compositions: If $f : \mathbb{R}^n \to \mathbb{R}$ is convex and $\mathcal{A} : y \mapsto Ay + b$ is an affine mapping from $\mathbb{R}^m$ to $\mathbb{R}^n$, then $g(y) := f(Ay + b)$ is convex on $\mathbb{R}^m$.

Proof: $\mathrm{dom}(g) = \{y : Ay + b \in \mathrm{dom}(f)\}$. For any $y_1, y_2 \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda y_1 + (1-\lambda) y_2) = f(\lambda (Ay_1 + b) + (1-\lambda)(Ay_2 + b)) \le \lambda f(Ay_1 + b) + (1-\lambda) f(Ay_2 + b) = \lambda g(y_1) + (1-\lambda) g(y_2).$$

For example, $\|Ax - b\|_2^2$, $\sum_i e^{a_i^T x + b_i}$, and $-\sum_{i=1}^n \log(a_i^T x - b_i)$ are convex.
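
Rules 1 and 2 are easy to exercise numerically. A minimal Python sketch (my own illustration, not from the notes): a nonnegative combination of convex univariate functions, pre-composed with an affine map, should satisfy the convexity inequality at random points:

```python
import numpy as np

rng = np.random.default_rng(3)

# Building blocks known to be convex on R.
f1 = np.exp                      # e^x
f2 = np.abs                      # |x|
f3 = lambda x: x**2              # x^2

# Rule 1: conic combination with nonnegative weights stays convex.
g = lambda x: 2.0 * f1(x) + 0.5 * f2(x) + 3.0 * f3(x)

# Rule 2: affine pre-composition also preserves convexity.
a, b = -1.7, 0.4
h = lambda y: g(a * y + b)

for _ in range(1000):
    x, y = rng.uniform(-3, 3, size=2)
    lam = rng.uniform()
    assert h(lam * x + (1 - lam) * y) <= lam * h(x) + (1 - lam) * h(y) + 1e-9
```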

3. Taking pointwise maxima and suprema: If $f_i(x)$, $i \in I$, are convex, then $g(x) := \max_{i \in I} f_i(x)$ is also convex.

Proof: First of all, $\mathrm{dom}(g) = \bigcap_{i \in I} \mathrm{dom}(f_i)$ is convex. For any $x, y \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda x + (1-\lambda) y) = \max_{i \in I} f_i(\lambda x + (1-\lambda) y) \le \max_{i \in I} \left\{ \lambda f_i(x) + (1-\lambda) f_i(y) \right\} \le \max_{i \in I} \lambda f_i(x) + \max_{i \in I} (1-\lambda) f_i(y) = \lambda g(x) + (1-\lambda) g(y).$$

Remark: The property extends to the pointwise supremum over an infinite set. If $f(x, \omega)$ is convex in $x$ for every $\omega \in \Omega$, then $g(x) := \sup_{\omega \in \Omega} f(x, \omega)$ is convex.

For example, the following functions are convex:

(a) piecewise-linear functions: $f(x) = \max(a_1^T x + b_1, \ldots, a_k^T x + b_k)$;
(b) the support function: $I_C^*(x) = \sup_{y \in C} x^T y$;
(c) the maximum distance to any set $C$: $d_{\max}(x, C) = \sup_{y \in C} \|y - x\|_2$;
(d) the maximum eigenvalue of a symmetric matrix: $\lambda_{\max}(X) = \max_{\|y\|_2 = 1} y^T X y$.

Indeed, almost every convex function can be expressed as the pointwise supremum of a family of affine functions!

4. Taking convex monotone compositions:

Scalar case: If $f$ is a convex function on $\mathbb{R}^n$ and $F(\cdot)$ is a convex and non-decreasing function on $\mathbb{R}$, then $g(x) = F(f(x))$ is convex.

Vector case: If $f_i(x)$, $i = 1, \ldots, m$, are convex on $\mathbb{R}^n$ and $F(y_1, \ldots, y_m)$ is convex and non-decreasing (component-wise) in each argument, then $g(x) = F(f_1(x), \ldots, f_m(x))$ is convex.

Proof: By convexity of $f_i$, we have
$$f_i(\lambda x + (1-\lambda) y) \le \lambda f_i(x) + (1-\lambda) f_i(y), \quad \forall i, \ \lambda \in [0, 1].$$
Hence, for any $x, y \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$\begin{aligned}
g(\lambda x + (1-\lambda) y) &= F(f_1(\lambda x + (1-\lambda) y), \ldots, f_m(\lambda x + (1-\lambda) y)) \\
&\le F(\lambda f_1(x) + (1-\lambda) f_1(y), \ldots, \lambda f_m(x) + (1-\lambda) f_m(y)) && \text{(by monotonicity of } F\text{)} \\
&\le \lambda F(f_1(x), \ldots, f_m(x)) + (1-\lambda) F(f_1(y), \ldots, f_m(y)) && \text{(by convexity of } F\text{)} \\
&= \lambda g(x) + (1-\lambda) g(y) && \text{(by definition of } g\text{)}.
\end{aligned}$$

Remark: Taking a pointwise maximum is a special case of this rule, obtained by setting $F(y_1, \ldots, y_m) = \max(y_1, \ldots, y_m)$, so that $\max_{i=1,\ldots,m} f_i(x) = F(f_1(x), \ldots, f_m(x))$.

For example:

(a) $e^{f(x)}$ is convex if $f$ is convex;
(b) $-\log f(x)$ is convex if $f$ is concave and positive (here $F(t) = -\log t$ is convex and non-increasing; the rule extends to non-increasing $F$ composed with concave inner functions);
(c) $\log(\sum_{i=1}^k e^{f_i(x)})$ is convex if the $f_i$ are convex.
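
Rule 4 is exactly how one certifies the log-sum-exp example above. A small numerical sketch (my own, not from the notes), composing $F = \mathrm{logsumexp}$ (convex and non-decreasing in each argument) with convex inner functions $f_i(x) = (a_i^T x)^2$:

```python
import numpy as np

rng = np.random.default_rng(5)

def logsumexp(v):
    """Numerically stable log(sum(exp(v))): convex and non-decreasing
    in each component of v."""
    m = np.max(v)
    return m + np.log(np.sum(np.exp(v - m)))

# Inner functions f_i(x) = (a_i^T x)^2 are convex, so rule 4 gives
# convexity of g(x) = logsumexp(f_1(x), ..., f_5(x)).
A = rng.standard_normal((5, 3))
g = lambda x: logsumexp((A @ x) ** 2)

for _ in range(1000):
    x, y = rng.standard_normal((2, 3))
    lam = rng.uniform()
    assert g(lam * x + (1 - lam) * y) <= lam * g(x) + (1 - lam) * g(y) + 1e-9
```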

5. Taking partial minimizations: If $f(x, y)$ is convex in $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$ and $Y$ is a convex set, then $g(x) = \inf_{y \in Y} f(x, y)$ is convex.

Proof: $\mathrm{dom}(g) = \{x : (x, y) \in \mathrm{dom}(f) \text{ for some } y \in Y\}$ is a projection of $\mathrm{dom}(f)$, hence convex. Given any $x_1, x_2 \in \mathrm{dom}(g)$, by the definition of the infimum, for any $\epsilon > 0$ there exist $y_1 \in Y$ and $y_2 \in Y$ such that
$$f(x_1, y_1) \le g(x_1) + \epsilon/2, \qquad f(x_2, y_2) \le g(x_2) + \epsilon/2.$$
For any $\lambda \in [0, 1]$, taking the $\lambda$-weighted combination of the two inequalities gives
$$\lambda f(x_1, y_1) + (1-\lambda) f(x_2, y_2) \le \lambda g(x_1) + (1-\lambda) g(x_2) + \epsilon.$$
By convexity of $f(x, y)$, this implies
$$f(\lambda x_1 + (1-\lambda) x_2, \lambda y_1 + (1-\lambda) y_2) \le \lambda g(x_1) + (1-\lambda) g(x_2) + \epsilon.$$
Since $\lambda y_1 + (1-\lambda) y_2 \in Y$ by convexity of $Y$, we obtain, for every $\epsilon > 0$,
$$g(\lambda x_1 + (1-\lambda) x_2) \le \lambda g(x_1) + (1-\lambda) g(x_2) + \epsilon.$$
Letting $\epsilon \to 0$ leads to the convexity of $g$.

Examples:

(a) The minimum distance to a convex set: $d(x, C) = \min_{y \in C} \|x - y\|_2$, where $C$ is convex.
(b) $g(x) = \inf_y \{h(y) : Ay = x\}$ is convex if $h$ is convex. This is because $g(x) = \inf_y f(x, y)$, where
$$f(x, y) := \begin{cases} h(y), & Ay = x \\ \infty, & \text{otherwise} \end{cases}$$
is convex in $(x, y)$.
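
Example (a) can be checked numerically without solving the inner minimization explicitly: for the unit Euclidean ball, projecting $x$ onto the ball gives the closed form $d(x, C) = \max(\|x\|_2 - 1, 0)$. A minimal sketch (my own illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# Partial minimization example: distance to the unit Euclidean ball
# C = {y : ||y||_2 <= 1}. Projection onto C yields the closed form
# d(x, C) = max(||x||_2 - 1, 0), which should therefore be convex.
dist = lambda x: max(np.linalg.norm(x) - 1.0, 0.0)

for _ in range(1000):
    x1, x2 = rng.standard_normal((2, 3)) * 2.0
    lam = rng.uniform()
    lhs = dist(lam * x1 + (1 - lam) * x2)
    assert lhs <= lam * dist(x1) + (1 - lam) * dist(x2) + 1e-9
```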