Nonlinear Programming Models
Fabio Schoen, 2008
http://gol.dsi.unifi.it/users/schoen
Introduction
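NLP problems
$$\min_{x \in S \subseteq \mathbb{R}^n} f(x)$$
Standard form:
$$\begin{aligned} \min\ & f(x) \\ & h_i(x) = 0, \quad i = 1,\dots,m \\ & g_j(x) \le 0, \quad j = 1,\dots,k \end{aligned}$$
Here $S = \{x \in \mathbb{R}^n : h_i(x) = 0\ \forall i,\ g_j(x) \le 0\ \forall j\}$.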
Local and global optima
A global minimum or global optimum is any $x^\star \in S$ such that
$$x \in S \Rightarrow f(x) \ge f(x^\star).$$
A point $\bar{x}$ is a local optimum if $\exists\, \varepsilon > 0$ such that
$$x \in S \cap B(\bar{x}, \varepsilon) \Rightarrow f(x) \ge f(\bar{x}),$$
where $B(\bar{x}, \varepsilon) = \{x \in \mathbb{R}^n : \|x - \bar{x}\| \le \varepsilon\}$ is a ball in $\mathbb{R}^n$.
Any global optimum is also a local optimum, but the converse is generally false.
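Convex Functions
A set $S \subseteq \mathbb{R}^n$ is convex if
$$x, y \in S \Rightarrow \lambda x + (1 - \lambda) y \in S$$
for all choices of $\lambda \in [0, 1]$.
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty convex set. A function $f : \Omega \to \mathbb{R}$ is convex iff, for all $x, y \in \Omega$ and $\lambda \in [0, 1]$,
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y).$$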
Convex Functions
[Figure: graph of a convex function with two points x and y marked.]
Properties of convex functions
Every convex function is continuous in the interior of $\Omega$. It may be discontinuous, but only on the boundary of $\Omega$.
If $f$ is continuously differentiable, then it is convex iff, for all $x, y \in \Omega$,
$$f(y) \ge f(x) + (y - x)^T \nabla f(x).$$
Convex functions
[Figure: graph of a convex function with two points x and y marked.]
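If $f$ is twice continuously differentiable, then it is convex iff its Hessian matrix
$$\nabla^2 f(x) := \left[ \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right]$$
is positive semi-definite at every $x \in \Omega$. Here
$$\nabla^2 f(x) \succeq 0 \quad \text{iff} \quad v^T \nabla^2 f(x)\, v \ge 0 \quad \forall\, v \in \mathbb{R}^n$$
or, equivalently, all eigenvalues of $\nabla^2 f(x)$ are non-negative.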
Example: an affine function is convex (and concave).
For a quadratic function ($Q$: symmetric matrix)
$$f(x) = \tfrac{1}{2} x^T Q x + b^T x + c$$
we have
$$\nabla f(x) = Qx + b, \qquad \nabla^2 f(x) = Q$$
$\Rightarrow$ $f$ is convex iff $Q \succeq 0$.
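The eigenvalue test for a quadratic function can be tried numerically. The following is only a sketch using NumPy; the helper names (`quadratic`, `is_convex_quadratic`) and the example matrices are illustrative assumptions, not part of the slides.

```python
import numpy as np

def quadratic(Q, b, c):
    """Return f and its gradient for f(x) = 1/2 x^T Q x + b^T x + c."""
    Q = 0.5 * (Q + Q.T)  # symmetrizing does not change x^T Q x
    f = lambda x: 0.5 * x @ Q @ x + b @ x + c
    grad = lambda x: Q @ x + b
    return f, grad

def is_convex_quadratic(Q, tol=1e-10):
    """Convex iff the smallest eigenvalue of the symmetric part of Q is >= 0."""
    Qs = 0.5 * (Q + Q.T)
    return bool(np.linalg.eigvalsh(Qs).min() >= -tol)

# Hypothetical example data.
Q_convex = np.array([[2.0, 0.0], [0.0, 1.0]])       # eigenvalues 2, 1
Q_indefinite = np.array([[1.0, 0.0], [0.0, -1.0]])  # eigenvalues 1, -1
b_vec = np.array([1.0, -1.0])
f, grad = quadratic(Q_convex, b_vec, 0.0)
```

For the convex case, the first-order inequality $f(y) \ge f(x) + (y - x)^T \nabla f(x)$ can also be checked at a few random pairs of points.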
Convex Optimization Problems
$$\min_{x \in S} f(x)$$
is a convex optimization problem iff $S$ is a convex set and $f$ is convex on $S$.
For a problem in standard form
$$\begin{aligned} \min\ & f(x) \\ & h_i(x) = 0, \quad i = 1,\dots,m \\ & g_j(x) \le 0, \quad j = 1,\dots,k \end{aligned}$$
if $f$ is convex, the $h_i(x)$ are affine functions and the $g_j(x)$ are convex functions, then the problem is convex.
Maximization
Slight abuse of notation: a problem
$$\max_{x \in S} f(x)$$
is called convex iff $S$ is a convex set and $f$ is a concave function. This is not to be confused with minimization of a concave function (or maximization of a convex function), which is NOT a convex optimization problem.
Convex and non-convex optimization
Convex optimization is easy; non-convex optimization is usually very hard.
Fundamental property of convex optimization problems: every local optimum is also a global optimum (a proof will be given later).
Minimizing a positive semi-definite quadratic function on a polyhedron is easy (polynomially solvable); if even a single eigenvalue of the Hessian is negative, the problem becomes NP-hard.
Convex functions: examples
Many (of course not all...) functions are convex!
- affine functions $a^T x + b$
- quadratic functions $\tfrac{1}{2} x^T Q x + b^T x + c$ with $Q = Q^T$, $Q \succeq 0$
- any norm is a convex function
- $x \log x$ (however $\log x$ is concave)
- $f$ is convex if and only if, $\forall\, x_0, d \in \mathbb{R}^n$, its restriction to any line, $\varphi(\alpha) = f(x_0 + \alpha d)$, is a convex function
- a linear non-negative combination of convex functions is convex
- $g(x, y)$ convex in $x$ for all $y$ $\Rightarrow$ $\int g(x, y)\, dy$ convex
more examples...
- $\max_i \{a_i^T x + b\}$ is convex
- $f, g$ convex $\Rightarrow$ $\max\{f(x), g(x)\}$ is convex
- $f_a$ convex for any $a \in A$ (a possibly uncountable set) $\Rightarrow$ $\sup_{a \in A} f_a(x)$ is convex
- $f$ convex $\Rightarrow$ $f(Ax + b)$ is convex
- let $S \subseteq \mathbb{R}^n$ be any set $\Rightarrow$ $f(x) = \sup_{s \in S} \|x - s\|$ is convex
- $\mathrm{Trace}(A^T X) = \sum_{i,j} A_{ij} X_{ij}$ is convex (it is linear!)
- $\log \det X^{-1}$ is convex over the set of matrices $X \in \mathbb{R}^{n \times n}$: $X \succ 0$
- $\lambda_{\max}(X)$ (the largest eigenvalue of a symmetric matrix $X$) is convex
Data Approximation
Table of contents
- norm approximation
- maximum likelihood
- robust estimation
Norm approximation
Problem:
$$\min_x \|Ax - b\|$$
where $A$, $b$ are parameters. Usually the system is over-determined, i.e. $b \notin \mathrm{Range}(A)$. For example, this typically happens when $A \in \mathbb{R}^{m \times n}$ with $m > n$ and $A$ has full rank.
$r := Ax - b$ is the residual.
Examples
- $\|r\| = \sqrt{r^T r}$: least squares (or "regression")
- $\|r\| = \sqrt{r^T P r}$ with $P \succ 0$: weighted least squares
- $\|r\| = \max_i |r_i|$: minimax, or $\ell_\infty$, or Chebyshev approximation
- $\|r\| = \sum_i |r_i|$: absolute or $\ell_1$ approximation
Possible (convex) additional constraints:
- maximum deviation from an initial estimate: $\|x - x_{\mathrm{est}}\| \le \epsilon$
- simple bounds: $\ell_i \le x_i \le u_i$
- ordering: $x_1 \le x_2 \le \dots \le x_n$
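As an illustration of how the choice of norm changes the computation, the $\ell_1$ approximation can be written as a linear program: minimize $\sum_i s_i$ subject to $-s \le Ax - b \le s$. The sketch below assumes SciPy's `linprog` (HiGHS backend) is available; the function name `l1_approximation` and the random test data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def l1_approximation(A, b):
    """Minimize ||Ax - b||_1 via the LP: min sum(s) s.t. -s <= Ax - b <= s."""
    m, n = A.shape
    cost = np.concatenate([np.zeros(n), np.ones(m)])  # variables are (x, s)
    I = np.eye(m)
    A_ub = np.block([[A, -I], [-A, -I]])   # Ax - s <= b  and  -Ax - s <= -b
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * n + [(0, None)] * m
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n]

# Hypothetical data: an over-determined 20 x 3 system.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)
x_l1 = l1_approximation(A, b)
x_l2, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solution
```

Each solution is optimal only for its own norm: `x_l1` has the smaller sum of absolute residuals, `x_l2` the smaller Euclidean residual norm.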
Example: matrix $A \in \mathbb{R}^{100 \times 30}$
[Figures: histograms of the residuals obtained with the $\ell_1$ norm, the $\ell_\infty$ norm and the $\ell_2$ norm; horizontal axis from -5 to 5.]
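Variants
$$\min \sum_i h(y_i - a_i^T x)$$
where $h$ is a convex function:
- linear-quadratic: $h(z) = \begin{cases} z^2 & |z| \le 1 \\ 2|z| - 1 & |z| > 1 \end{cases}$
- "dead zone": $h(z) = \begin{cases} 0 & |z| \le 1 \\ |z| - 1 & |z| > 1 \end{cases}$
- logarithmic barrier: $h(z) = \begin{cases} -\log(1 - z^2) & |z| < 1 \\ \infty & |z| \ge 1 \end{cases}$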
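The three penalty functions can be written down directly. This is a minimal NumPy sketch; the function names mirror the labels of the comparison plot (`linquad`, `deadzone`, `logbarrier`), but the code itself is an illustration and not taken from the slides.

```python
import numpy as np

def linquad(z):
    """Linear-quadratic penalty: z^2 for |z| <= 1, 2|z| - 1 otherwise."""
    a = np.abs(np.atleast_1d(np.asarray(z, dtype=float)))
    return np.where(a <= 1.0, a**2, 2.0 * a - 1.0)

def deadzone(z):
    """Dead-zone penalty: 0 for |z| <= 1, |z| - 1 otherwise."""
    a = np.abs(np.atleast_1d(np.asarray(z, dtype=float)))
    return np.maximum(a - 1.0, 0.0)

def logbarrier(z):
    """Log-barrier penalty: -log(1 - z^2) for |z| < 1, +inf otherwise."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.full(z.shape, np.inf)
    inside = np.abs(z) < 1.0
    out[inside] = -np.log(1.0 - z[inside] ** 2)  # log is only evaluated where defined
    return out
```

All three return a 1-D array, so they can be applied to a whole residual vector $y - Ax$ and summed.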
comparison
[Figure: the penalty functions norm 1, norm 2, linquad, deadzone and logbarrier plotted on the interval [-2, 2].]
Maximum likelihood
Given a sample $X_1, X_2, \dots, X_k$ and a parametric family of probability density functions $L(\cdot\,; \theta)$, the maximum likelihood estimate of $\theta$ given the sample is
$$\hat{\theta} = \arg\max_\theta L(X_1, \dots, X_k; \theta).$$
Example: linear measurements with additive i.i.d. (independent, identically distributed) noise:
$$X_i = a_i^T \theta + \varepsilon_i \qquad (1)$$
where the $\varepsilon_i$ are i.i.d. random variables with density $p(\cdot)$:
$$L(X_1, \dots, X_k; \theta) = \prod_{i=1}^{k} p(X_i - a_i^T \theta).$$
Max likelihood estimate - MLE
(taking the logarithm, which does not change the optimum points):
$$\hat{\theta} = \arg\max_\theta \sum_i \log p(X_i - a_i^T \theta)$$
If $p$ is log-concave, this problem is convex. Examples:
- $\varepsilon \sim N(0, \sigma^2)$, i.e. $p(z) = (2\pi\sigma^2)^{-1/2} \exp(-z^2 / 2\sigma^2)$ $\Rightarrow$ the MLE is the $\ell_2$ estimate: $\hat{\theta} = \arg\min_\theta \|A\theta - X\|_2$;
- $p(z) = (1/(2a)) \exp(-|z|/a)$ $\Rightarrow$ $\ell_1$ estimate: $\hat{\theta} = \arg\min_\theta \|A\theta - X\|_1$
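- $p(z) = (1/a) \exp(-z/a)\, 1_{\{z \ge 0\}}$ (negative exponential) $\Rightarrow$ the estimate can be found by solving the LP problem:
$$\begin{aligned} \min\ & \mathbf{1}^T (X - A\theta) \\ & A\theta \le X \end{aligned}$$
- $p$ uniform on $[-a, a]$ $\Rightarrow$ the MLE is any $\theta$ such that $\|A\theta - X\|_\infty \le a$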
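As a worked step for the Gaussian case, the log-likelihood can be expanded as follows. The derivation assumes that $A$ is the matrix whose rows are the $a_i^T$ and that $X$ is the vector of observations, which is how $A\theta - X$ appears to be used in these examples.

```latex
\begin{aligned}
\sum_{i=1}^{k} \log p(X_i - a_i^T\theta)
  &= \sum_{i=1}^{k} \left[ -\tfrac{1}{2}\log(2\pi\sigma^2)
       - \frac{(X_i - a_i^T\theta)^2}{2\sigma^2} \right] \\
  &= -\tfrac{k}{2}\log(2\pi\sigma^2)
       - \frac{1}{2\sigma^2}\,\|A\theta - X\|_2^2 .
\end{aligned}
```

The first term does not depend on $\theta$, so maximizing the log-likelihood is the same as minimizing $\|A\theta - X\|_2$. The same expansion with the Laplace density $(1/(2a))\exp(-|z|/a)$ gives $-k\log(2a) - \|A\theta - X\|_1 / a$, which leads to the $\ell_1$ estimate.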
Ellipsoids
An ellipsoid is a subset of $\mathbb{R}^n$ of the form
$$E = \{x \in \mathbb{R}^n : (x - x_0)^T P^{-1} (x - x_0) \le 1\}$$
where $x_0 \in \mathbb{R}^n$ is the center of the ellipsoid and $P$ is a symmetric positive-definite matrix.
Alternative representations:
$$E = \{x \in \mathbb{R}^n : \|Ax - b\|_2 \le 1\}$$
where $A \succ 0$, or
$$E = \{x \in \mathbb{R}^n : x = x_0 + Au,\ \|u\|_2 \le 1\}$$
where $A$ is square and non-singular (affine transformation of the unit ball).
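The three representations can be converted into one another through the symmetric square root of $P$: with $R = P^{1/2}$, the image form uses $x_0 + Ru$ and the norm form uses $A = R^{-1}$, $b = A x_0$. The sketch below checks this numerically; all names and the example data are illustrative assumptions.

```python
import numpy as np

def sqrt_spd(P):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(P)
    return (V * np.sqrt(w)) @ V.T

def in_quadratic(x, x0, P):
    """Membership test for {x : (x - x0)^T P^{-1} (x - x0) <= 1}."""
    d = x - x0
    return bool(d @ np.linalg.solve(P, d) <= 1.0)

def in_norm(x, A, b):
    """Membership test for {x : ||Ax - b||_2 <= 1}."""
    return bool(np.linalg.norm(A @ x - b) <= 1.0)

# Hypothetical example ellipsoid.
x0 = np.array([1.0, -1.0])
P = np.array([[4.0, 1.0], [1.0, 2.0]])

R = sqrt_spd(P)          # image form: E = {x0 + R u : ||u|| <= 1}
A = np.linalg.inv(R)     # norm form:  E = {x : ||A x - b|| <= 1}
b = A @ x0
```

Since $\|Ax - b\|_2^2 = (x - x_0)^T P^{-1} (x - x_0)$ for this choice of $A$ and $b$, the two membership tests agree.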
Robust Least Squares
Least squares:
$$\hat{x} = \arg\min_x \sqrt{\sum_i (a_i^T x - b_i)^2}$$
Assumption: $a_i$ is not known, but it is known that
$$a_i \in E_i = \{\bar{a}_i + P_i u : \|u\| \le 1\}$$
where $P_i = P_i^T \succeq 0$.
Definition: worst-case residual:
$$\max_{a_i \in E_i} \sqrt{\sum_i (a_i^T x - b_i)^2}$$
A robust estimate of $x$ is the solution of
$$\hat{x}_r = \arg\min_x \max_{a_i \in E_i} \sqrt{\sum_i (a_i^T x - b_i)^2}$$
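RLS
It holds:
$$|\alpha + \beta^T y| \le |\alpha| + \|\beta\| \, \|y\|.$$
Choosing $y^\star = \beta / \|\beta\|$ if $\alpha \ge 0$ and $y^\star = -\beta / \|\beta\|$ if $\alpha < 0$, we have $\|y^\star\| = 1$ and
$$|\alpha + \beta^T y^\star| = \left| \alpha + \frac{\beta^T \beta}{\|\beta\|} \, \mathrm{sign}(\alpha) \right| = |\alpha| + \|\beta\|.$$
Then, with $\alpha = \bar{a}_i^T x - b_i$ and $\beta = P_i x$:
$$\max_{a_i \in E_i} |a_i^T x - b_i| = \max_{\|u\| \le 1} |\bar{a}_i^T x - b_i + u^T P_i x| = |\bar{a}_i^T x - b_i| + \|P_i x\|.$$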
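The closed form for the worst-case residual of a single row can be checked by sampling points of the uncertainty ellipsoid. The following sketch uses hypothetical data (`a_bar`, `P`, `b`, `x`) and illustrative function names; it compares the closed form with the value at the maximizing $u$ and with random directions on the unit sphere.

```python
import numpy as np

def worst_case_residual(a_bar, P, b, x):
    """Closed form: |a_bar^T x - b| + ||P x||."""
    return abs(a_bar @ x - b) + np.linalg.norm(P @ x)

def worst_case_u(a_bar, P, b, x):
    """Maximizing direction u = sign(alpha) * P x / ||P x|| (requires P x != 0)."""
    alpha = a_bar @ x - b
    beta = P @ x
    sign = 1.0 if alpha >= 0 else -1.0
    return sign * beta / np.linalg.norm(beta)

# Hypothetical example data; P is symmetric positive definite.
a_bar = np.array([1.0, 2.0])
P = np.array([[1.0, 0.2], [0.2, 0.5]])
b = 0.3
x = np.array([0.5, -1.0])

bound = worst_case_residual(a_bar, P, b, x)
u_star = worst_case_u(a_bar, P, b, x)
attained = abs((a_bar + P @ u_star) @ x - b)  # residual at the worst-case a_i
```

With these numbers $\alpha = -1.8$ and $\|P x\| = 0.5$, so the negative-sign branch of the argument is the one exercised.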
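Thus the Robust Least Squares problem reduces to
$$\min_x \left( \sum_i \left( |\bar{a}_i^T x - b_i| + \|P_i x\| \right)^2 \right)^{1/2}$$
(a convex optimization problem). Transformation:
$$\begin{aligned} \min_{x,t}\ & \|t\|_2 \\ & |\bar{a}_i^T x - b_i| + \|P_i x\| \le t_i \quad \forall\, i \end{aligned}$$
i.e.
$$\begin{aligned} \min_{x,t}\ & \|t\|_2 \\ & \bar{a}_i^T x - b_i + \|P_i x\| \le t_i \quad \forall\, i \\ & -\bar{a}_i^T x + b_i + \|P_i x\| \le t_i \quad \forall\, i \end{aligned}$$
(Second Order Cone Problem). A norm cone is a convex set
$$C = \{(x, t) \in \mathbb{R}^{n+1} : \|x\| \le t\}$$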
Geometrical Problems
- projections and distances
- polyhedral intersection
- extremal volume ellipsoids
- classification problems
Projection on a set
Given a set $C$, the projection of $x$ on $C$ is defined as:
$$P_C(x) = \arg\min_{z \in C} \|z - x\|$$
Projection on a convex set
If $C = \{x : Ax = b,\ f_i(x) \le 0,\ i = 1,\dots,m\}$, where the $f_i$ are convex, then $C$ is a convex set and the problem
$$\begin{aligned} P_C(x) = \arg\min\ & \|x - z\| \\ & Az = b \\ & f_i(z) \le 0, \quad i = 1,\dots,m \end{aligned}$$
is convex.
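In general the projection requires solving the convex problem above, but for a few simple convex sets the Euclidean projection has a closed form. The sketch below covers three such special cases (an affine set, a box, a ball); these cases and the function names are illustrative choices, not taken from the slides.

```python
import numpy as np

def project_affine(x, A, b):
    """Projection onto {z : A z = b}, assuming A has full row rank."""
    return x - A.T @ np.linalg.solve(A @ A.T, A @ x - b)

def project_box(x, lo, hi):
    """Projection onto {z : lo <= z_i <= hi}: clip each coordinate."""
    return np.clip(x, lo, hi)

def project_ball(x, center, radius):
    """Projection onto {z : ||z - center||_2 <= radius}."""
    d = x - center
    dist = np.linalg.norm(d)
    return x.copy() if dist <= radius else center + radius * d / dist

# Hypothetical example: project (1, 2, 3) onto the plane z1 + z2 + z3 = 1.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
z = project_affine(np.array([1.0, 2.0, 3.0]), A, b)
```

In the affine case the residual $x - z$ lies in the row space of $A$, which is the optimality condition for the projection.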
Distance between convex sets
$$\mathrm{dist}(C^{(1)}, C^{(2)}) = \min_{x \in C^{(1)},\ y \in C^{(2)}} \|x - y\|$$
Distance between convex sets
If $C^{(j)} = \{x : A^{(j)} x = b^{(j)},\ f_i^{(j)}(x) \le 0\}$, then the minimum distance can be found through a convex model:
$$\begin{aligned} \min\ & \|x^{(1)} - x^{(2)}\| \\ & A^{(1)} x^{(1)} = b^{(1)} \\ & A^{(2)} x^{(2)} = b^{(2)} \\ & f_i^{(1)}(x^{(1)}) \le 0 \\ & f_i^{(2)}(x^{(2)}) \le 0 \end{aligned}$$
Polyhedral intersection
1: polyhedra described by means of linear inequalities:
$$P_1 = \{x : Ax \le b\}, \qquad P_2 = \{x : Cx \le d\}$$
Polyhedral intersection
$P_1 \cap P_2 = \emptyset$? It is a linear feasibility problem: $Ax \le b$, $Cx \le d$.
$P_1 \subseteq P_2$? Just check
$$\sup\{c_k^T x : Ax \le b\} \le d_k \quad \forall\, k$$
(solution of a finite number of LPs).
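Both checks can be sketched with an off-the-shelf LP solver. The code below assumes SciPy's `linprog` with the HiGHS backend; the function names and the `box` helper are hypothetical, and the containment test assumes that $P_1$ is non-empty.

```python
import numpy as np
from scipy.optimize import linprog

FREE = (None, None)  # linprog defaults to x >= 0, so free variables must be requested

def intersects(A, b, C, d):
    """True if {Ax <= b} and {Cx <= d} share a point (LP feasibility)."""
    n = A.shape[1]
    res = linprog(np.zeros(n), A_ub=np.vstack([A, C]),
                  b_ub=np.concatenate([b, d]), bounds=FREE, method="highs")
    return res.status == 0

def contained(A, b, C, d, tol=1e-9):
    """True if {Ax <= b} is a subset of {Cx <= d}; one LP per row of C."""
    for c_k, d_k in zip(C, d):
        # sup c_k^T x over P1, computed as -min(-c_k^T x)
        res = linprog(-c_k, A_ub=A, b_ub=b, bounds=FREE, method="highs")
        if res.status != 0 or -res.fun > d_k + tol:  # unbounded or violated
            return False
    return True

def box(lo, hi):
    """Inequality description of the square [lo, hi]^2 (illustrative helper)."""
    A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    b = np.array([hi, -lo, hi, -lo])
    return A, b
```

For example, `contained(*box(0, 1), *box(-1, 2))` asks whether the unit square lies inside the larger square $[-1, 2]^2$.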
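Polyhedral intersection (2)
2: polyhedra (polytopes) described through vertices:
$$P_1 = \mathrm{conv}\{v_1, \dots, v_k\}, \qquad P_2 = \mathrm{conv}\{w_1, \dots, w_h\}$$
$P_1 \cap P_2 = \emptyset$? Need to find $\lambda_1, \dots, \lambda_k, \mu_1, \dots, \mu_h \ge 0$:
$$\sum_i \lambda_i = 1, \qquad \sum_j \mu_j = 1, \qquad \sum_i \lambda_i v_i = \sum_j \mu_j w_j$$
$P_1 \subseteq P_2$? $\forall\, i = 1, \dots, k$ check whether $\exists\, \mu_j \ge 0$:
$$\sum_j \mu_j = 1, \qquad \sum_j \mu_j w_j = v_i$$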
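The containment test for vertex descriptions amounts to one LP feasibility problem per vertex of $P_1$. A minimal sketch, again assuming SciPy's `linprog` (HiGHS); the function names and the example polytopes are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def in_hull(point, W):
    """True if point is a convex combination of the rows of W (LP feasibility)."""
    h = W.shape[0]
    A_eq = np.vstack([W.T, np.ones((1, h))])   # sum_j mu_j w_j = point, sum_j mu_j = 1
    b_eq = np.concatenate([point, [1.0]])
    res = linprog(np.zeros(h), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")  # mu_j >= 0
    return res.status == 0

def hull_contained(V, W):
    """True if conv(rows of V) is a subset of conv(rows of W)."""
    return all(in_hull(v, W) for v in V)

# Hypothetical example: a triangle inside the unit square.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangle = np.array([[0.2, 0.2], [0.8, 0.2], [0.5, 0.8]])
```

Checking only the vertices of $P_1$ is enough because $P_2$ is convex: if every $v_i$ belongs to $P_2$, so does every convex combination of them.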
Minimal ellipsoid containing k points
Given $v_1, \dots, v_k \in \mathbb{R}^n$, find an ellipsoid
$$E = \{x : \|Ax - b\| \le 1\}$$
with minimal volume containing the $k$ given points.
$A = A^T \succ 0$. The volume of $E$ is proportional to $\det A^{-1}$ $\Rightarrow$ convex optimization problem (in the unknowns $A$, $b$):
$$\begin{aligned} \min\ & \log \det A^{-1} \\ & A = A^T \\ & A \succ 0 \\ & \|A v_i - b\| \le 1, \quad i = 1,\dots,k \end{aligned}$$
Max. ellipsoid contained in a polyhedron
Given $P = \{x : Ax \le b\}$, find an ellipsoid
$$E = \{By + d : \|y\| \le 1\}$$
contained in $P$ with maximum volume.
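Max. ellipsoid contained in a polyhedron
$$\begin{aligned} E \subseteq P &\Leftrightarrow a_i^T (By + d) \le b_i \quad \forall\, y : \|y\| \le 1 \\ &\Leftrightarrow \sup_{\|y\| \le 1} \{a_i^T B y + a_i^T d\} \le b_i \quad \forall\, i \\ &\Leftrightarrow \|B a_i\| + a_i^T d \le b_i \end{aligned}$$
$$\begin{aligned} \max_{B,d}\ & \log \det B \\ & B = B^T \succ 0 \\ & \|B a_i\| + a_i^T d \le b_i, \quad i = 1, \dots \end{aligned}$$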
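A simpler special case may help to see how the containment condition works. Restricting the ellipsoid to a ball, $B = rI$, the condition $\|B a_i\| + a_i^T d \le b_i$ becomes $r \|a_i\| + a_i^T d \le b_i$, which is linear in $(d, r)$, so the largest inscribed ball is found with one LP. This is a swapped-in simplification and not the general log-det problem; the function name is hypothetical and SciPy's `linprog` is assumed.

```python
import numpy as np
from scipy.optimize import linprog

def largest_ball(A, b):
    """Largest ball {d + r y : ||y|| <= 1} inside {x : Ax <= b} (special case B = r I)."""
    m, n = A.shape
    norms = np.linalg.norm(A, axis=1)
    cost = np.zeros(n + 1)
    cost[-1] = -1.0                            # maximize r
    A_ub = np.hstack([A, norms[:, None]])      # a_i^T d + r ||a_i|| <= b_i
    bounds = [(None, None)] * n + [(0, None)]  # d free, r >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1]

# Hypothetical example: the unit square [0, 1]^2.
A_sq = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b_sq = np.array([1.0, 0.0, 1.0, 0.0])
center, radius = largest_ball(A_sq, b_sq)
```

For the unit square the solution is the ball of radius 0.5 centered at (0.5, 0.5). The general maximum-volume ellipsoid needs a solver that handles the $\log \det$ objective.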
Difficult variants
These problems are hard:
- find a maximal volume ellipsoid contained in a polyhedron given by its vertices
- find a minimal volume ellipsoid containing a polyhedron described as a system of linear inequalities.
It is already a difficult problem to decide whether a given ellipsoid $E$ contains a polyhedron $P = \{x : Ax \le b\}$.
This problem remains difficult even when the ellipsoid is a sphere: it is equivalent to norm maximization over a polyhedron $\Rightarrow$ it is an NP-hard concave optimization problem.
Linear classification (separation)
Given two point sets $X_1, \dots, X_k$ and $Y_1, \dots, Y_h$, find a hyperplane $a^T x = t$ such that:
$$\begin{aligned} a^T X_i &\ge 1, \quad i = 1,\dots,k \\ a^T Y_j &\le -1, \quad j = 1,\dots,h \end{aligned}$$
(LP feasibility problem).
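The separation problem can be handed to an LP solver as a pure feasibility problem. The sketch below uses the constraints in the form $a^T X_i \ge 1$, $a^T Y_j \le -1$ (with no offset variable, which is an assumption about how the slide should be read), assumes SciPy's `linprog`, and uses hypothetical names and data.

```python
import numpy as np
from scipy.optimize import linprog

def separate(X, Y):
    """Find a with X a >= 1 and Y a <= -1 (rows are points); None if infeasible."""
    n = X.shape[1]
    A_ub = np.vstack([-X, Y])                  # -X a <= -1  and  Y a <= -1
    b_ub = -np.ones(X.shape[0] + Y.shape[0])
    res = linprog(np.zeros(n), A_ub=A_ub, b_ub=b_ub,
                  bounds=(None, None), method="highs")
    return res.x if res.status == 0 else None

# Hypothetical example data: two clusters on opposite sides of the origin.
X = np.array([[2.0, 2.0], [3.0, 1.0], [2.0, 3.0]])
Y = np.array([[-2.0, -1.0], [-1.0, -3.0], [-3.0, -2.0]])
a = separate(X, Y)
```

The objective is zero because only feasibility matters; any returned `a` separates the two sets with a gap between the levels -1 and 1.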
Robust separation
Find a maximal separation:
$$\max_{a :\, \|a\| \le 1} \left( \min_i a^T X_i - \max_j a^T Y_j \right)$$
equivalent to the convex problem:
$$\begin{aligned} \max\ & t_1 - t_2 \\ & a^T X_i \ge t_1 \quad \forall\, i \\ & a^T Y_j \le t_2 \quad \forall\, j \\ & \|a\| \le 1 \end{aligned}$$