Computational Optimization: Convexity and Unconstrained Optimization (1/9/08, revised)
Convex Sets A set S is convex if the line segment joining any two points in the set is also in the set, i.e., for any x, y ∈ S, λx + (1−λ)y ∈ S for all 0 ≤ λ ≤ 1. [Figure: examples of convex and non-convex sets]
Proving Convexity Prove C = {x : Ax ≤ b} is convex. Let x and y be elements of C = {x : Ax ≤ b}. For any λ ∈ (0,1), A(λx + (1−λ)y) = λAx + (1−λ)Ay ≤ λb + (1−λ)b = b, so λx + (1−λ)y ∈ C.
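The halfspace proof above can be spot-checked numerically. The following is a small NumPy sketch (my own illustration; the matrix A, vector b, and the two feasible points are made-up example data, not from the slides):

```python
import numpy as np

# Two points satisfying Ax <= b; every convex combination should too.
A = np.array([[1.0, 2.0], [-1.0, 0.5]])
b = np.array([4.0, 3.0])

x = np.array([0.0, 0.0])   # A @ x = [0, 0]   <= b
y = np.array([1.0, 1.0])   # A @ y = [3, -0.5] <= b
for lam in np.linspace(0, 1, 11):
    z = lam * x + (1 - lam) * y
    assert np.all(A @ z <= b + 1e-12)  # z stays in the set
print("all convex combinations feasible")
```

This only illustrates the claim for sampled λ, of course; the algebraic argument above is what proves it for all λ.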
You Try Prove D = {x : ‖x‖ ≤ 1} is convex.
Convex Functions A function f is (strictly) convex on a convex set S if and only if for any x, y ∈ S, f(λx + (1−λ)y) ≤ (<) λf(x) + (1−λ)f(y) for all 0 ≤ λ ≤ 1. [Figure: the chord from (x, f(x)) to (y, f(y)) lies above the graph of f at λx + (1−λ)y]
Proving a Function Convex Linear functions: f(x) = w'x = Σᵢ₌₁ⁿ wᵢxᵢ, where x ∈ Rⁿ. For any x, y ∈ Rⁿ and λ ∈ (0,1), f(λx + (1−λ)y) = w'(λx + (1−λ)y) = λw'x + (1−λ)w'y = λf(x) + (1−λ)f(y), so the defining inequality holds (with equality).
You Try Prove f(x₁, x₂) = x₁² + x₂² is convex.
Hint: x² is convex. Consider any two points x, y and λ ∈ (0,1): (λx + (1−λ)y)² = λ²x² + 2λ(1−λ)xy + (1−λ)²y² = λx² + (1−λ)y² − λ(1−λ)x² + 2λ(1−λ)xy − λ(1−λ)y² = λx² + (1−λ)y² − λ(1−λ)(x − y)² ≤ λx² + (1−λ)y². The first line expands the square. The second line uses λ² = λ − λ(1−λ) and similarly for (1−λ)². The third line observes the remaining terms are a square. The fourth line follows since λ(1−λ)(x − y)² ≥ 0.
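The completing-the-square identity in the hint can be verified numerically. A short NumPy sketch (my own check, not from the slides):

```python
import numpy as np

# Verify: lam*x^2 + (1-lam)*y^2 - (lam*x + (1-lam)*y)^2 == lam*(1-lam)*(x-y)^2
rng = np.random.default_rng(1)
for _ in range(1000):
    x, y, lam = rng.normal(), rng.normal(), rng.uniform()
    gap = lam * x**2 + (1 - lam) * y**2 - (lam * x + (1 - lam) * y) ** 2
    assert np.isclose(gap, lam * (1 - lam) * (x - y) ** 2)
    assert gap >= -1e-12  # the convexity inequality for x^2
print("identity verified on 1000 random samples")
```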
Handy Facts Let g₁(x), …, g_m(x) be convex functions and a > 0. Then f(x) = Σᵢ₌₁ᵐ gᵢ(x) is convex, and h(x) = a·g₁(x) is convex.
Convexity and Curvature Convex functions have nonnegative curvature everywhere. Curvature can be measured by the second derivative or Hessian. Properties of the Hessian indicate whether a function is convex or not.
Convex Functions (recall) A function f is (strictly) convex on a convex set S if and only if for any x, y ∈ S, f(λx + (1−λ)y) ≤ (<) λf(x) + (1−λ)f(y) for all 0 ≤ λ ≤ 1.
Theorem Let f be twice continuously differentiable. f(x) is convex on S if and only if for all x ∈ S, the Hessian at x, ∇²f(x), is positive semi-definite.
Definition The matrix H is positive semi-definite (p.s.d.) if and only if for any vector y, y'Hy ≥ 0. The matrix H is positive definite (p.d.) if and only if for any nonzero vector y, y'Hy > 0. Similarly for negative (semi-)definite.
Theorem Let f be twice continuously differentiable. If for all x ∈ S the Hessian at x, ∇²f(x), is positive definite, then f(x) is strictly convex on S. (Unlike the previous theorem, this is only a sufficient condition: f(x) = x⁴ is strictly convex although f″(0) = 0.)
Checking Matrix H is p.s.d./p.d. Manually For H = [4 1; 1 3]: [x₁ x₂][4 1; 1 3][x₁; x₂] = 4x₁² + 2x₁x₂ + 3x₂² = (x₁ + x₂)² + 3x₁² + 2x₂² > 0 for any (x₁, x₂) ≠ 0, so the matrix is positive definite.
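The completed-square form above can be checked against the quadratic form directly. A small NumPy sketch (my own verification of the slide's algebra):

```python
import numpy as np

# [x1 x2] H [x1; x2] = 4x1^2 + 2x1x2 + 3x2^2 = (x1+x2)^2 + 3x1^2 + 2x2^2
H = np.array([[4.0, 1.0], [1.0, 3.0]])
rng = np.random.default_rng(2)
for _ in range(1000):
    x1, x2 = rng.normal(size=2)
    q = np.array([x1, x2]) @ H @ np.array([x1, x2])
    assert np.isclose(q, (x1 + x2) ** 2 + 3 * x1**2 + 2 * x2**2)
print("quadratic form matches the completed square")
```

Since each term in the completed square is nonnegative and they cannot all vanish unless x₁ = x₂ = 0, the form is strictly positive away from the origin, which is exactly the p.d. condition.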
Useful Facts The sum of convex functions is convex. The composition g(h(x)) of a nondecreasing convex function g with a convex function h is convex. (The composition of arbitrary convex functions need not be convex.)
Via Eigenvalues The eigenvalues of [4 1; 1 3] are (7 ± √5)/2 ≈ 4.618 and 2.382, both positive, so the matrix is positive definite.
Summary: Using Eigenvalues
- If all eigenvalues are positive, the matrix is positive definite (p.d.).
- If all eigenvalues are nonnegative, the matrix is positive semi-definite (p.s.d.).
- If all eigenvalues are negative, the matrix is negative definite (n.d.).
- If all eigenvalues are nonpositive, the matrix is negative semi-definite (n.s.d.).
- Otherwise the matrix is indefinite.
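The eigenvalue rules above translate directly into code. A NumPy sketch (the `classify` helper and tolerance are my own, not part of the slides):

```python
import numpy as np

def classify(H, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    w = np.linalg.eigvalsh(H)      # eigenvalues of a symmetric matrix
    if np.all(w > tol):
        return "p.d."
    if np.all(w >= -tol):
        return "p.s.d."
    if np.all(w < -tol):
        return "n.d."
    if np.all(w <= tol):
        return "n.s.d."
    return "indefinite"

print(classify(np.array([[4.0, 1.0], [1.0, 3.0]])))       # p.d.
print(classify(np.array([[0.0, 0.0], [0.0, 4.0]])))       # p.s.d.
print(classify(np.array([[18.0, -12.0], [-12.0, 4.0]])))  # indefinite
```

The three test matrices are the ones that appear later in the deck, so this helper can be reused for the Hessian checks below.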
Try with Hessians f(x₁, x₂) = x₁² + 2x₂² ∇f(x) = [2x₁; 4x₂], ∇²f(x) = [2 0; 0 4] [a b][2 0; 0 4][a; b] = 2a² + 4b² > 0 for any (a, b) ≠ 0. Strictly convex.
Check Hessian H = [2 0; 0 4]. Eigs(H) are 2 and 4, so the Hessian matrix is always p.d. So the function is strictly convex.
Differentiability and Convexity For a convex function, the linear approximation underestimates the function: g(x) = f(x*) + (x − x*)'∇f(x*) [Figure: the tangent line at (x*, f(x*)) lies below the graph of f]
Theorem Assume f is continuously differentiable on a set S. f is convex on S if and only if f(y) ≥ f(x) + (y − x)'∇f(x) for all x, y ∈ S.
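The gradient inequality can be spot-checked on a concrete convex function. A NumPy sketch (my own illustration, using the convex quadratic f(x₁, x₂) = x₁² + 2x₂² as the example):

```python
import numpy as np

# f is convex, so f(y) >= f(x) + (y - x)' grad f(x) should hold everywhere.
f = lambda v: v[0] ** 2 + 2 * v[1] ** 2
grad = lambda v: np.array([2 * v[0], 4 * v[1]])

rng = np.random.default_rng(3)
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    assert f(y) >= f(x) + (y - x) @ grad(x) - 1e-10
print("linear approximation underestimates f at all sampled pairs")
```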
Theorem Consider the unconstrained problem min f(x). If ∇f(x) = 0 and f is convex, then x is a global minimum. Proof: For any y, f(y) ≥ f(x) + (y − x)'∇f(x) by convexity of f, and the right-hand side equals f(x) since ∇f(x) = 0.
Unconstrained Optimality Conditions Basic Problem: (1) min f(x), x ∈ S, where S is an open set, e.g. Rⁿ.
First Order Necessary Conditions Theorem: Let f be continuously differentiable. If x* is a local minimizer of (1), then ∇f(x*) = 0.
Stationary Points Note that the condition ∇f(x*) = 0 is not sufficient: it also holds at local maxima and saddle points.
Proof Assume false, i.e., ∇f(x*) ≠ 0. Let d = −∇f(x*). Then f(x* + λd) = f(x*) + λd'∇f(x*) + λ‖d‖α(x*, λd), so [f(x* + λd) − f(x*)]/λ = d'∇f(x*) + ‖d‖α(x*, λd) < 0 for λ sufficiently small, since d'∇f(x*) = −‖∇f(x*)‖² < 0 and α(x*, λd) → 0. So f(x* + λd) < f(x*) for small λ > 0. CONTRADICTION: x* is a local min.
Second Order Sufficient Conditions Theorem: Let f be twice continuously differentiable. If ∇f(x*) = 0 and ∇²f(x*) is positive definite, then x* is a strict local minimizer of (1).
Proof Any point x in a neighborhood of x* can be written as x* + λd for some vector d with ‖d‖ = 1 and 0 < λ ≤ λ*. Since f is twice continuously differentiable, we can choose λ* such that ∇²f(ε) is p.d. for all ε with ‖ε − x*‖ ≤ λ*. For any such d and λ ≤ λ*, f(x* + λd) = f(x*) + λd'∇f(x*) + (1/2)λ²d'∇²f(ε)d for some ε between x* and x* + λd. Since ∇f(x*) = 0, f(x* + λd) − f(x*) = (1/2)λ²d'∇²f(ε)d > 0. Therefore x* is a strict local min.
Second Order Necessary Conditions Theorem: Let f be twice continuously differentiable. If x* is a local minimizer of (1), then ∇f(x*) = 0 and ∇²f(x*) is positive semi-definite.
Proof by Contradiction Assume false, namely there exists some d such that d'∇²f(x*)d < 0. Then, since ∇f(x*) = 0, f(x* + λd) = f(x*) + λd'∇f(x*) + (1/2)λ²d'∇²f(x*)d + λ²‖d‖²α(x*, λd), so [f(x* + λd) − f(x*)]/λ² = (1/2)d'∇²f(x*)d + ‖d‖²α(x*, λd) < 0 for λ sufficiently small, since d'∇²f(x*)d < 0 and α(x*, λd) → 0. Contradiction: x* is a local min.
Example Say we are minimizing f(x₁, x₂) = x₁² − (1/2)x₁x₂ + 2x₂² − 15x₁ − 4x₂. Is [8, 2] the minimizer???
Solve FONC Solve the FONC to find the stationary point: ∇f(x₁, x₂) = [2 −1/2; −1/2 4][x₁; x₂] − [15; 4] = [0; 0] x* = [2 −1/2; −1/2 4]⁻¹[15; 4] = [8; 2]
Check SOSC The Hessian at x*, ∇²f(x₁*, x₂*) = [2 −1/2; −1/2 4], is p.d. since the eigenvalues, (6 ± √5)/2 ≈ 4.118 and 1.882, are positive. Therefore the SOSC are satisfied: x* is a strict local min.
Alternative Argument The Hessian at every value x is ∇²f(x₁, x₂) = [2 −1/2; −1/2 4], which is p.d. since the eigenvalues, 4.118 and 1.882, are positive. Therefore the function is strictly convex. Since ∇f(x*) = 0 and f is strictly convex, x* is the unique strict global minimum.
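For this quadratic, solving the FONC is a linear solve, and the deck suggests doing it in matlab; the same steps in NumPy look like this (a sketch reproducing the example's H and b as reconstructed above):

```python
import numpy as np

# f(x) = (1/2) x'Hx - b'x, so the FONC grad f(x) = Hx - b = 0 gives x* = H^{-1} b.
H = np.array([[2.0, -0.5], [-0.5, 4.0]])
b = np.array([15.0, 4.0])

x_star = np.linalg.solve(H, b)
print(x_star)                 # x* ≈ [8, 2]
print(np.linalg.eigvalsh(H))  # both eigenvalues positive => H is p.d.
```

Because H is p.d. everywhere, f is strictly convex and this stationary point is the unique global minimum.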
You Try Use the FONC and matlab to find the solution of min f(x₁, x₂, x₃) = 10x₁² + 5x₂² + 3x₃² − x₁x₂ + x₂x₃ − 4x₁x₃ + x₃ min f(x₁, x₂, x₃) = 10x₁² + 5x₂² − 3x₃² − x₁x₂ + x₂x₃ − 4x₁x₃ + x₃ Are the SOSC satisfied? Are the SONC? Is f convex?
Optimality Conditions for 1-Dimensional Functions First Order Necessary Condition: If x* is a local min, then f'(x*) = 0. If f'(x*) = 0, then??????????
2nd Derivatives – 1D Case Sufficient conditions: If f'(x*) = 0 and f''(x*) > 0, then x* is a strict local min. If f'(x*) = 0 and f''(x*) < 0, then x* is a strict local max. Necessary conditions: If x* is a local min, then f'(x*) = 0 and f''(x*) ≥ 0. If x* is a local max, then f'(x*) = 0 and f''(x*) ≤ 0.
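A tiny worked sketch of the 1D test (the example function f(x) = x³ − 3x is my own, not from the slides):

```python
# f(x) = x^3 - 3x has f'(x) = 3x^2 - 3, so the stationary points are x = -1, 1.
def fpp(x):
    """Second derivative f''(x) = 6x."""
    return 6 * x

for x in (-1.0, 1.0):
    assert abs(3 * x**2 - 3) < 1e-12  # FONC holds at both points
    kind = "strict local min" if fpp(x) > 0 else "strict local max"
    print(x, kind)
# x = 1:  f'' = 6 > 0  -> strict local min
# x = -1: f'' = -6 < 0 -> strict local max
```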
Optimality Conditions for Functions on Rⁿ First Order Necessary Condition: If x* is a local min, then ∇f(x*) = 0. If ∇f(x*) = 0, then??????????
Second Order Conditions Sufficient conditions: If ∇f(x*) = 0 and ∇²f(x*) is p.d., then x* is a strict local min. If ∇f(x*) = 0 and ∇²f(x*) is n.d., then x* is a strict local max.
Second Order Conditions Necessary conditions: If x* is a local min, then ∇f(x*) = 0 and ∇²f(x*) is p.s.d. If x* is a local max, then ∇f(x*) = 0 and ∇²f(x*) is n.s.d.
Optimality Conditions under Convexity Let f be a continuously differentiable convex function. Then x* is a global minimum of f if and only if ∇f(x*) = 0. Let f be a continuously differentiable strictly convex function. If ∇f(x*) = 0, then x* is the unique global minimum of f. Similar statements hold for max and concave functions.
Line Search Assume f maps a vector to a scalar: f : Rⁿ → R. The current point is x ∈ Rⁿ, and we have an interval [a, b]. We want to find λ ∈ [a, b] minimizing g(λ) = f(x + λd).
Example Say we are minimizing f(x₁, x₂) = x₁² − (1/2)x₁x₂ + 2x₂² − 15x₁ − 4x₂ = (1/2)[x₁ x₂][2 −1/2; −1/2 4][x₁; x₂] − [15 4][x₁; x₂]. The solution is [8, 2]. Say we are at [0, −1] and we want to do a line search in direction d = [1, 0].
Line Search We are at [0, −1] and we want to do a line search in direction d = [1, 0]. [Figure: the points [0, −1], [29/4, −1], and the minimizer [8, 2]]
Descent Directions If the directional derivative ∇f(x)'d < 0, then a line search along d will lead to a decrease in the function. [Figure: descent direction d at [0, −1] toward the minimizer [8, 2]]
Example Continued The exact stepsize can be found analytically: x + λd = [0; −1] + λ[1; 0] = [λ; −1] g(λ) = f(x + λd) = λ² + (1/2)λ + 2 − 15λ + 4 = λ² − (29/2)λ + 6 g'(λ) = ∇f(x + λd)'d = 2λ − 29/2 = 0 λ = 29/4
Example Continued So the new point is x + λd = [0; −1] + (29/4)[1; 0] = [29/4; −1] f([0, −1]) = 6, f([29/4, −1]) = −46.5625, f([8, 2]) = −64. In this case g(λ) = f(x + λd) is a convex function of λ (verify), so λ = 29/4 is a global min of the line search.
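For a quadratic f(x) = (1/2)x'Hx − b'x, the exact step has a closed form, since g'(λ) = d'Hd·λ + d'(Hx − b). A NumPy sketch of the line search above (using the example's H and b as reconstructed):

```python
import numpy as np

H = np.array([[2.0, -0.5], [-0.5, 4.0]])
b = np.array([15.0, 4.0])
f = lambda v: 0.5 * v @ H @ v - b @ v
grad = lambda v: H @ v - b

x = np.array([0.0, -1.0])   # current point
d = np.array([1.0, 0.0])    # search direction

# Setting g'(lam) = d'Hd * lam + d' grad f(x) = 0 gives the exact step.
lam = -(d @ grad(x)) / (d @ H @ d)
print(lam)                    # 7.25 (= 29/4)
print(f(x), f(x + lam * d))   # 6.0 -> -46.5625
```

Note the step only minimizes f along the line x + λd; reaching the true minimizer [8, 2] would take further search directions.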
Example Consider min x₁³ − x₁²x₂ + 2x₂². Find all points satisfying the FONC. What can you say about those points based on the SONC?
FONC The first order necessary conditions are 3x₁² − 2x₁x₂ = 0 −x₁² + 4x₂ = 0 So both (0, 0) and (6, 9) satisfy the FONC.
Second Order Conditions The Hessian is ∇²f(x₁, x₂) = [6x₁ − 2x₂  −2x₁; −2x₁  4]. The Hessian at (6, 9), ∇²f(6, 9) = [18 −12; −12 4], is indefinite, so (6, 9) is not a local min (or max).
Second Order Conditions The Hessian at (0, 0), ∇²f(0, 0) = [0 0; 0 4], is p.s.d., so this point satisfies the SONC but not the SOSC. It might be a local min, but in fact it is not: f(−ε, 0) = −ε³ < 0 = f(0, 0) for any ε > 0.
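The two stationary points and their Hessians can be checked mechanically. A NumPy sketch for f(x₁, x₂) = x₁³ − x₁²x₂ + 2x₂² (gradient and Hessian transcribed from the derivation above):

```python
import numpy as np

grad = lambda v: np.array([3 * v[0] ** 2 - 2 * v[0] * v[1],
                           -v[0] ** 2 + 4 * v[1]])
hess = lambda v: np.array([[6 * v[0] - 2 * v[1], -2 * v[0]],
                           [-2 * v[0], 4.0]])

for p in (np.array([0.0, 0.0]), np.array([6.0, 9.0])):
    assert np.allclose(grad(p), 0)            # FONC holds at both points
    print(p, np.linalg.eigvalsh(hess(p)))
# (0, 0): eigenvalues 0 and 4 -> p.s.d., SONC only (and in fact not a local min)
# (6, 9): one negative, one positive eigenvalue -> indefinite, a saddle point
```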
Do Lab