Advanced Continuous Optimization
University Paris-Saclay, Master Program in Optimization
Advanced Continuous Optimization
J. Ch. Gilbert (INRIA Paris-Rocquencourt), October 17, 2017
Lectures: September 18, 2017 to November 6, 2017. Examination: November 13, 2017.

Outline
1 Background: convex analysis; nonsmooth analysis; optimization; algorithmics.
2 Optimality conditions: first order optimality conditions for (P_G); second order optimality conditions for (P_EI).
3 Linearization methods: the Josephy-Newton algorithm for inclusions; the SQP algorithm in constrained optimization.
4 Semidefinite optimization: problem definition; existence of solution and optimality conditions; an interior point algorithm.
2 / 110
Preliminaries: Outline of the course

Three parts (60h)
1 Advanced Continuous Optimization I (30h)
  Theory and algorithms in smooth nonconvex optimization.
  Goal: improve the basic knowledge in optimization with some advanced concepts and algorithms.
  Period: September 18 to November 6 (examination: November 13).
2 Advanced Continuous Optimization II (20h)
  Implementation of algorithms (in Matlab, on an easy-to-implement concrete problem).
  Goal: improve algorithm understanding.
  Period: November 20 to December 18 (examination: January ??).
3 Specialized course by Jacek Gondzio (10h).
3 / 110

Preliminaries: Some features of the course

Smooth nonconvex optimization. Dealing with the general optimization problem
  (P_G)  inf_{x∈E} f(x)  subject to  c(x) ∈ G,
where f : E → R, c : E → F, and G is a closed convex set in F. The latter includes
- the nonlinear optimization problem
  (P_EI)  inf_x f(x)  subject to  c_E(x) = 0 and c_I(x) ≤ 0,
  where f : E → R, E and I form a partition of [1:m], and c_E : E → R^{m_E} and c_I : E → R^{m_I} are smooth (possibly nonconvex) functions,
- the linear semidefinite optimization problem
  (P_SDO)  inf_{X∈S^n} ⟨C,X⟩  subject to  A(X) = b and X ⪰ 0,
  where C ∈ S^n, A : S^n → R^m is linear, and b ∈ R^m.
4 / 110
Preliminaries: Some features of the course

Finite dimension: E and F are Euclidean vector spaces.
- The unit closed ball is compact.
- A bounded sequence has a converging subsequence.
Nothing is said about optimal control problems (see other courses).
Emphasis is put on
- theory: optimality conditions of problem (P_G) and many other topics,
- algorithms: their analysis (scope, convergence, speed of convergence, strong and weak features, ...).
5 / 110

Preliminaries: Outline of the course, Advanced Continuous Optimization I

Anticipated program (we'll see...)
1 Background in convex analysis and optimization
  Convex analysis: dual cone, Farkas lemma, asymptotic cone, ...
  Nonsmooth analysis: multifunction, subdifferential, ...
  Optimization: optimality conditions, ...
2 Abstract optimality conditions
  First order optimality conditions for (P_G).
  Second order optimality conditions for (P_EI) (or (P_G)?).
3 Linearization methods (Newton)
  Josephy-Newton algorithm for nonlinear inclusions.
  SQP algorithm for (P_EI).
  Semismooth Newton algorithm for nonlinear equations.
6 / 110
4 Preliminaries Outline of the course Advanced Continuous Optimization I 4 Proximal and augmented Lagrangian methods Proximal algorithm for solving inclusions and optimization problems Augmented Lagrangian algorithm in optimization Application to convex quadratic optimization (error bound and global linear convergence) 5 Semidefinite optimization theory (optimality conditions, duality,... ), algorithm (interior point methods,... ). 7 / 110 Preliminaries Outline of the course Advanced Continuous Optimization II Goal: mastering the implementation of an algorithm Choose one of the following topics (20h) 1 Implementation of an SQP algorithm, application to the hanging chain problem 2 Implementation of an SDO algorithm, application to the OPF problem (Rank relaxation of its QCQP modelling) 8 / 110
Background: Convex analysis (notation)

R̄ := R ∪ {−∞, +∞}.
R_+ := {t ∈ R : t ≥ 0} and R_{++} := {t ∈ R : t > 0}.
B, B̄: open and closed unit balls centered at the origin; for r ≥ 0: B(x,r) = x + rB and B̄(x,r) = x + rB̄.
E, F, G usually denote Euclidean vector spaces.
10 / 110

Background: Convex analysis (relative interior I)

The affine hull of P ⊆ E is the smallest affine space containing P:
  aff P := ∩ {A : A is an affine space containing P}.
The relative interior of P ⊆ E is its interior in aff P:
  ri P := {x ∈ P : ∃ r > 0 such that [B(x,r) ∩ aff P] ⊆ P}.
In finite dimension, there holds
  C convex and nonempty ⟹ ri C ≠ ∅ and aff C = aff(ri C).
11 / 110
Background: Convex analysis (relative interior II)

Proposition (relative interior criterion). Let C be a nonempty convex set and x ∈ E. Then
- x ∈ ri C and y ∈ C̄ ⟹ [x,y[ ⊆ ri C,
- x ∈ ri C ⟺ ∀ x⁰ ∈ C (or aff C), ∃ t > 1 : (1−t)x⁰ + tx ∈ C.

Let C be a nonempty convex set. Then
- ri C is convex, C̄ is convex and aff C̄ = aff C,
- cl(ri C) = C̄ and ri(C̄) = ri C (i.e., the last operation prevails).

A point x ∈ C is said absorbing if ∀ d ∈ E, ∃ t > 0 such that x + td ∈ C.
  x ∈ int C ⟺ x is absorbing.
12 / 110

Background: Convex analysis (dual cone and Farkas lemma)

The (positive) dual cone of a set P ⊆ E is defined by
  P^+ := {d ∈ E : ⟨d,x⟩ ≥ 0, ∀ x ∈ P}.
The (negative) dual cone of a set P is P^− := −P^+.

Lemma (Farkas, generalized). Let E and F be two Euclidean spaces, A : E → F a linear map, and K a nonempty convex cone of E. Then
  cl(A(K)) = {y ∈ F : A*y ∈ K^+}^+.

- A* : F → E is defined by: ∀ (x,y) ∈ E × F, ⟨A*y, x⟩ = ⟨y, Ax⟩.
- One cannot get rid of the closure on A(K) in general.
- If K is polyhedral, then A(K) is polyhedral, hence closed.
- For K = E, one recovers R(A) = N(A*)^⊥.
13 / 110
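The polyhedral case of the lemma (K = R^n_+, so that the closure on A(K) can be dropped) is the classical Farkas alternative. A minimal numerical sketch, with made-up data (the matrix A and the points below are illustrative only): membership in A(K) is certified by some x ≥ 0 with Ax = b, and non-membership by a y with A*y ∈ K^+ yet ⟨b, y⟩ < 0.

```python
import numpy as np

# Generalized Farkas with K = R^3_+ (polyhedral, hence A(K) closed):
# b ∈ A(K)  iff  every y with A^T y >= 0 (i.e. A*y ∈ K^+) satisfies <b, y> >= 0.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])      # A : R^3 -> R^2; here A(K) is the first quadrant

b_in = np.array([2.0, 3.0])          # belongs to A(K)
x_cert = np.array([2.0, 3.0, 0.0])   # x >= 0 with A x = b_in: membership certificate

b_out = np.array([-1.0, 1.0])        # outside A(K)
y_cert = np.array([1.0, 0.0])        # A^T y >= 0 yet <b_out, y> < 0: separation certificate

member_ok = np.allclose(A @ x_cert, b_in) and bool((x_cert >= 0).all())
separation_ok = bool((A.T @ y_cert >= 0).all()) and b_out @ y_cert < 0
```

The two certificates cannot coexist for the same b: that exclusivity is exactly the content of the lemma in this polyhedral case.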
Background: Convex analysis (tangent and normal cones)

Tangent cone. Let C be a convex set of E and x ∈ C. The cone of feasible directions for C at x is
  T^f_x C := R_+(C − x).
The tangent cone to C at x is the closure of the previous one:
  T_x C ≡ T_C(x) := cl(R_+(C − x)).

Normal cone. Let C be a convex set of E and x ∈ C. The normal cone to C at x is
  N_x C ≡ N_C(x) := {d ∈ E : ⟨x' − x, d⟩ ≤ 0, ∀ x' ∈ C}.
There hold
  N_x C = (T_x C)^− and T_x C = (N_x C)^−.
14 / 110

Background: Convex analysis (asymptotic cone I)

Let E be a vector space of finite dimension and C be a nonempty closed convex set of E. The asymptotic cone of C is
  C^∞ := {d ∈ E : C + R_+ d ⊆ C} = {d ∈ E : C + d ⊆ C}.

Properties
- C^∞ is closed.
- For any x ∈ C:
  C^∞ = {d ∈ E : x + R_+ d ⊆ C} = ∩_{t>0} (C − x)/t
      = {d ∈ E : ∃ {x_k} ⊆ C, ∃ {t_k} → ∞ such that x_k/t_k → d}.
- Boundedness by calculation: C is bounded ⟺ C^∞ = {0}.
15 / 110
Background: Convex analysis (asymptotic cone II)

Two calculation rules (there are many more)
- If K ≠ ∅ is a closed convex cone, then K^∞ = K.
- For an arbitrary collection {C_i}_{i∈I} of closed convex sets C_i with nonempty intersection: (∩_{i∈I} C_i)^∞ = ∩_{i∈I} C_i^∞.

Example. Let A : E → F and B : E → G be linear maps, a ∈ F, b ∈ G, K ⊆ G be a closed convex cone, and
  P = {x ∈ E : Ax = a, Bx ∈ b + K}.
Then P^∞ = {d ∈ E : Ad = 0, Bd ∈ K}.
16 / 110

Background: Convex analysis (convex polyhedron)

A convex polyhedron in E is a set of the form P = {x ∈ E : Ax ≤ b}, where A : E → R^m is a linear map and b ∈ R^m. It is a closed set. For x ∈ P, define the active index set
  I(x) := {i ∈ [1:m] : (Ax − b)_i = 0}.

- ({x_k} → x) ⟹ I(x_k) ⊆ I(x) for large k.
- If T : E → F is linear, then T(P) is a convex polyhedron.
- If P_1 and P_2 are polyhedra, then P_1 + P_2 is a polyhedron.
- T_x P = T^f_x P = {d ∈ E : (Ad)_{I(x)} ≤ 0}.
- I(x_1) ⊆ I(x_2) ⟹ T_{x_1} P ⊇ T_{x_2} P.
- N_x P = cone{A*e_i : i ∈ I(x)}.
- I(x_1) ⊆ I(x_2) ⟹ N_{x_1} P ⊆ N_{x_2} P.
17 / 110
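For a concrete polyhedron, the active set I(x) and the tangent and normal cone formulas above are directly computable. A minimal sketch on a hypothetical example, P = {x ∈ R² : x ≥ 0, x₁ + x₂ ≤ 2}, at the vertex x = 0:

```python
import numpy as np

# P = {x : A x <= b} encoding x >= 0 and x1 + x2 <= 2.
A = np.array([[-1.0,  0.0],
              [ 0.0, -1.0],
              [ 1.0,  1.0]])
b = np.array([0.0, 0.0, 2.0])

x = np.array([0.0, 0.0])                     # a vertex of P
I = np.flatnonzero(np.isclose(A @ x, b))     # active set I(x): rows with (Ax - b)_i = 0

def in_tangent(d, tol=1e-12):
    # T_x P = {d : (A d)_{I(x)} <= 0}
    return bool((A[I] @ d <= tol).all())

tangent_ok = in_tangent(np.array([1.0, 0.5])) and not in_tangent(np.array([-1.0, 0.0]))

# N_x P = cone{A^T e_i : i in I(x)}: any nonnegative combination of active rows
nu = np.array([1.0, 2.0])                    # nu >= 0
n = A[I].T @ nu                              # an element of N_x P
# a normal direction makes an obtuse angle with every x' - x, x' ∈ P
normal_ok = all(n @ (np.array(p) - x) <= 0 for p in [(2.0, 0.0), (0.0, 2.0), (1.0, 1.0)])
```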
Background: Nonsmooth analysis (multifunction I)

A multifunction T (or set-valued mapping) between two sets E and F is a function from E to P(F), the set of the subsets of F. Notation: T : E ⊸ F : x ↦ T(x) ⊆ F. Same concept as a binary relation (i.e., the data of a part of E × F).

The graph, the domain, and the range of T : E ⊸ F are defined by
  G(T) := {(x,y) ∈ E × F : y ∈ T(x)},
  D(T) := {x ∈ E : (x,y) ∈ G(T) for some y ∈ F} = π_E G(T),
  R(T) := {y ∈ F : (x,y) ∈ G(T) for some x ∈ E} = π_F G(T).
The image of a part P ⊆ E by T is T(P) := ∪_{x∈P} T(x).
19 / 110

Background: Nonsmooth analysis (multifunction II)

The inverse of a multifunction T : E ⊸ F (it always exists!) is the multifunction T^{-1} : F ⊸ E defined by
  T^{-1}(y) := {x ∈ E : y ∈ T(x)}.
Hence y ∈ T(x) ⟺ x ∈ T^{-1}(y).

When E, F are topological/metric spaces, a multifunction T : E ⊸ F is said to be
- closed at x̄ ∈ E if ȳ ∈ T(x̄) whenever {(x_k, y_k)} ⊆ G(T) converges to (x̄, ȳ),
- closed if G(T) is closed in E × F (i.e., T is closed at any x ∈ E),
- upper semicontinuous at x ∈ E if ∀ ε > 0, ∃ δ > 0, ∀ x' ∈ x + δB, there holds T(x') ⊆ T(x) + εB.
20 / 110
Background: Nonsmooth analysis (multifunction III)

When E, F are vector spaces, a multifunction T : E ⊸ F is said to be convex if G(T) is convex in E × F. This is equivalent to saying that, ∀ (x_0, x_1) ∈ E² and ∀ t ∈ [0,1]:
  T((1−t)x_0 + tx_1) ⊇ (1−t)T(x_0) + tT(x_1).
Note that
  T convex and C convex in E ⟹ T(C) convex in F.
21 / 110

Background: Optimization (a generic problem)

One considers the generic optimization problem
  (P_X)  min f(x)  subject to  x ∈ X,
where f : E → R (E is a Euclidean vector space) and X is a set of E (possibly nonconvex).

Definitions:
- solution or (global) minimum: x* ∈ X such that ∀ x ∈ X, f(x*) ≤ f(x),
- local minimum: x* ∈ X such that ∃ V ∈ N(x*), ∀ x ∈ X ∩ V, f(x*) ≤ f(x),
- strict local/global minimum: x* such that f(x*) < f(x) above when x ≠ x*.
23 / 110
Background: Optimization (tangent cone to a nonconvex set)

A direction d ∈ E is tangent to X ⊆ E at x ∈ X (in the sense of Bouligand) if
  ∃ {x_k} ⊆ X, ∃ {t_k} ↓ 0 : (x_k − x)/t_k → d.
The tangent cone to X at x (in the sense of Bouligand) is the set of tangent directions. It is denoted by T_x X or T_X(x).

Properties. Let x ∈ X.
- T_x X is closed.
- X is convex ⟹ T_x X is convex and T_x X = cl(R_+(X − x)).
24 / 110

Background: Optimization (Peano-Kantorovich NC1)

Theorem (Peano-Kantorovich NC1). If x* is a local minimizer of (P_X) and f is differentiable at x*, then
  ∇f(x*) ∈ (T_{x*} X)^+.

The gradient of f at x is denoted by ∇f(x) ∈ E and is defined from the derivative f'(x) by
  ∀ d ∈ E : ⟨∇f(x), d⟩ = f'(x)·d.
25 / 110
Background: Optimization (NSC1 for convex problems)

Recall that a convex function f : E → R ∪ {+∞} has directional derivatives f'(x;d) ∈ R̄ for all x ∈ dom f and all d ∈ E.

Proposition (NSC1 for a convex problem). Suppose that X is convex, f is convex on X, and x* ∈ X. Then x* is a global solution to (P_X) if and only if
  ∀ x ∈ X : f'(x*; x − x*) ≥ 0.

Proof. Straightforward, using the convexity inequality
  ∀ x ∈ X : f(x) ≥ f(x*) + f'(x*; x − x*).
26 / 110

Background: Optimization (problem (P_E))

Let E, F be Euclidean vector spaces. The equality constrained problem is
  (P_E)  inf_x f(x)  subject to  c(x) = 0,
where f : E → R and c : E → F are smooth (possibly nonconvex) functions. The feasible set is denoted by
  X_E := {x ∈ E : c(x) = 0}.
(P_E) is said to be convex if f is convex and c is affine.
27 / 110
Background: Optimization (problem (P_E), Lagrange optimality conditions)

Theorem (NC1 for (P_E), Lagrange, XVIIIth). If x* is a local minimum of (P_E), if f and c are differentiable at x*, and if c is qualified for representing X_E at x* in the sense (2) below, then there exists a multiplier λ* ∈ F such that
  ∇_x l(x*, λ*) = 0,  (1a)
  c(x*) = 0.  (1b)

Some explanations.
- The constraint c is qualified for representing X_E at x ∈ X_E if
  T_x X_E = T'_x X_E := N(c'(x)).  (2)
- Qualification holds if c'(x) is surjective (sufficient condition of CQ).
- The Lagrangian of (P_E) is the function l : (x,λ) ∈ E × F ↦ l(x,λ) = f(x) + ⟨λ, c(x)⟩.
28 / 110

Background: Optimization (problem (P_E), second order optimality conditions)

Theorem (NC2 for (P_E)). If x* is a local minimum of (P_E), if f and c are twice differentiable at x*, and if (1a) holds for some λ* ∈ F, then
  ∀ d ∈ T_{x*} X_E : ⟨∇²_{xx} l(x*, λ*) d, d⟩ ≥ 0.  (3)

- Inequality in (3) is not necessarily true for d ∈ N(c'(x*)) \ T_{x*} X_E.
- ∇²_{xx} l(x*, λ*) is not necessarily positive semi-definite (even if qualification holds).

Theorem (SC2 for (P_E)). If f and c are twice differentiable at x*, if (1) holds for some λ* ∈ F, and if
  ∀ d ∈ T_{x*} X_E \ {0} : ⟨∇²_{xx} l(x*, λ*) d, d⟩ > 0,  (4)
then x* is a strict local minimum of (P_E).

- (4) is stronger (hence the conclusion holds) if the inequality holds ∀ d ∈ N(c'(x*)) \ {0}.
29 / 110
Background: Optimization (problem (P_EI))

A generic form of the nonlinear optimization problem:
  (P_EI)  inf_x f(x)  subject to  c_E(x) = 0, c_I(x) ≤ 0,
where f : E → R, E and I form a partition of [1:m], and c_E : E → R^{m_E} and c_I : E → R^{m_I} are smooth (possibly nonconvex) functions. The feasible set is denoted by
  X_EI := {x ∈ E : c_E(x) = 0, c_I(x) ≤ 0}.
We say that an inequality constraint i ∈ I is active at x ∈ X_EI if c_i(x) = 0. The set of indices of active inequality constraints is denoted by
  I⁰(x) := {i ∈ I : c_i(x) = 0}.
(P_EI) is said to be convex if f is convex, c_E is affine, and c_I is componentwise convex.
30 / 110

Background: Optimization (problem (P_EI), NC1 or KKT conditions)

Theorem (NC1 for (P_EI), Karush-Kuhn-Tucker (KKT)). If x* is a local minimum of (P_EI), if f and c = (c_E, c_I) are differentiable at x*, and if c is qualified for representing X_EI at x* in the sense (6) below, then there exists a multiplier λ* ∈ R^m such that
  ∇_x l(x*, λ*) = 0,  (5a)
  c_E(x*) = 0,  (5b)
  0 ≤ (λ*)_I ⊥ c_I(x*) ≤ 0.  (5c)

Some explanations.
- The Lagrangian of (P_EI) is the function l : (x,λ) ∈ E × R^m ↦ l(x,λ) = f(x) + λ^T c(x).
- The complementarity condition (5c) means (λ*)_I ≥ 0, (λ*)_I^T c_I(x*) = 0, and c_I(x*) ≤ 0.
31 / 110
Background: Optimization (problem (P_EI), constraint qualification)

The tangent cone T_x X_EI is always contained in the linearizing cone
  T'_x X_EI := {d ∈ E : c'_E(x)·d = 0, c'_{I⁰(x)}(x)·d ≤ 0}.
It is said that the constraint c is qualified for representing X_EI at x if
  T_x X_EI = T'_x X_EI.  (6)

Sufficient conditions of qualification: continuity and/or differentiability and one of the following
(CQ-A) c_{E∪I⁰(x)} is affine near x (Affinity),
(CQ-S) c_E is affine, c_{I⁰(x)} is componentwise convex, and ∃ x̂ ∈ X_EI such that c_{I⁰(x)}(x̂) < 0 (Slater),
(CQ-LI) ∑_{i∈E∪I⁰(x)} α_i ∇c_i(x) = 0 ⟹ α = 0 (Linear Independence),
(CQ-MF) ∑_{i∈E∪I⁰(x)} α_i ∇c_i(x) = 0 and α_{I⁰(x)} ≥ 0 ⟹ α = 0 (Mangasarian-Fromovitz).
32 / 110

Background: Optimization (problem (P_EI), more on constraint qualification)

Proposition (other forms of (CQ-MF)). Suppose that c_{E∪I⁰(x)} is differentiable at x ∈ X_EI. Then the following properties are equivalent:
i (CQ-MF) holds at x,
ii ∀ v ∈ R^m, ∃ d ∈ E: c'_E(x)·d = v_E and c'_{I⁰(x)}(x)·d ≤ v_{I⁰(x)},
iii c'_E(x) is surjective and ∃ d ∈ E: c'_E(x)·d = 0 and c'_{I⁰(x)}(x)·d < 0.
33 / 110
Background: Optimization (problem (P_EI), SC1 for convex problem)

Theorem (SC1 for convex (P_EI)). If (P_EI) is convex, if f and c are differentiable at x* ∈ X_EI, and if there is a λ* ∈ R^m such that (x*, λ*) satisfies (5a)-(5c), then x* is a global minimum of (P_EI).

- No need of constraint qualification.
- The goal of the first part of this course is to extend the previous NC1 and SC1 to a more general problem and to derive second order optimality conditions.
34 / 110

Background: Optimization (linear optimization duality)

Let c ∈ E (a Euclidean vector space), A : E → R^m and B : E → R^p linear, a ∈ R^m, and b ∈ R^p. A linear optimization problem (P_L) and its dual (D_L) read
  (P_L)  inf_{x∈E} ⟨c,x⟩  subject to  Ax = a, Bx ≤ b,
  (D_L)  sup_{(y,s)∈R^m×R^p} a^T y − b^T s  subject to  A*y − B*s = c, s ≥ 0.

Properties
- (P_L) has a solution ⟺ val(P_L) ∈ R,
- val(D_L) ≤ val(P_L) [named weak duality],
- (P_L), (D_L) feasible ⟹ Sol(P_L) ≠ ∅ and Sol(D_L) ≠ ∅.  (7)
- When (7) holds, val(D_L) = val(P_L) [named strong duality].
35 / 110
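A tiny instance of (P_L)/(D_L) with made-up data illustrates both duality properties: the primal and dual points below are feasible and their values coincide (strong duality), which in particular certifies optimality of both.

```python
import numpy as np

# (P_L) inf <c,x> s.t. Ax = a, Bx <= b;  (D_L) sup a^T y - b^T s s.t. A*y - B*s = c, s >= 0.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]]); a = np.array([1.0])   # x1 + x2 = 1
B = -np.eye(2);             b = np.zeros(2)       # -x <= 0, i.e. x >= 0

x_opt = np.array([1.0, 0.0])          # primal candidate, value 1
y_opt = np.array([1.0])               # dual variable for Ax = a
s_opt = np.array([0.0, 1.0])          # dual variables for Bx <= b

primal_feasible = np.allclose(A @ x_opt, a) and bool((B @ x_opt <= b).all())
dual_feasible = np.allclose(A.T @ y_opt - B.T @ s_opt, c) and bool((s_opt >= 0).all())
val_P = c @ x_opt                     # = 1
val_D = a @ y_opt - b @ s_opt         # = 1 as well: no duality gap
```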
Background: Algorithmics (speeds of convergence)

Let E be a normed space and {x_k} ⊆ E be a sequence converging to x*.
- {x_k} is said to converge linearly if ∃ r ∈ [0,1[ and K ∈ N such that ∀ k ≥ K, there holds
  ‖x_{k+1} − x*‖ ≤ r ‖x_k − x*‖.
  Depends on the norm of E.
- {x_k} is said to converge superlinearly if
  ‖x_{k+1} − x*‖ = o(‖x_k − x*‖).
  Independent of the norm of E. Faster than linear convergence. Typical of the quasi-Newton methods.
- {x_k} is said to converge quadratically if
  ‖x_{k+1} − x*‖ = O(‖x_k − x*‖²).
  Independent of the norm of E. Faster than superlinear convergence. Typical of Newton's method.
37 / 110

Background: Algorithmics (Dennis & Moré criterion for superlinear convergence)

Let F : E → F and consider the nonlinear system to solve in x ∈ E: F(x) = 0. A quasi-Newton algorithm locally generates a sequence {x_k} by the recurrence
  F(x_k) + M_k (x_{k+1} − x_k) = 0,  (8)
where M_k ∈ L(E,F) is an approximation of F'(x_k), generated by the algorithm.

Proposition (Dennis & Moré criterion for superlinear convergence). If
- F is differentiable at a zero x* of F,
- F'(x*) is nonsingular,
- {x_k} generated by (8) converges to x*,
then the convergence is superlinear if and only if
  (M_k − F'(x*)) (x_{k+1} − x_k) = o(‖x_{k+1} − x_k‖).
38 / 110
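The three speeds can be observed on synthetic error sequences (a sketch with arbitrary model recursions and x* = 0): the ratio e_{k+1}/e_k stays constant in the linear case, tends to 0 in the superlinear case, and e_{k+1}/e_k² stays bounded in the quadratic case.

```python
import numpy as np

def errors(step, e0=0.5, n=6):
    # iterate e_{k+1} = step(e_k), recording the error history
    e = [e0]
    for _ in range(n):
        e.append(step(e[-1]))
    return e

lin  = errors(lambda e: 0.5 * e)       # linear with ratio r = 0.5
sup  = errors(lambda e: e ** 1.5)      # superlinear: e_{k+1}/e_k = e_k^0.5 -> 0
quad = errors(lambda e: e ** 2)        # quadratic: e_{k+1}/e_k^2 = 1

lin_ratios = [e2 / e1 for e1, e2 in zip(lin, lin[1:])]
sup_ratios = [e2 / e1 for e1, e2 in zip(sup, sup[1:])]
quad_quots = [e2 / e1**2 for e1, e2 in zip(quad, quad[1:])]
```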
Background: Algorithmics (global convergence in unconstrained optimization, line-search)

Consider the problem of minimizing a function f : E → R (no constraint): min_{x∈E} f(x). Global convergence means only g_k := ∇f(x_k) → 0 (or even something weaker). It can be obtained with line-search or trust-region algorithms.

A line-search algorithm generates a sequence {x_k} as follows.
- Choice of a descent direction at x_k, i.e., d_k ∈ E such that f'(x_k)·d_k < 0 (standard examples: gradient, conjugate gradient, Newton, quasi-Newton, Gauss-Newton).
- Determination of a step-size α_k > 0 by a line-search rule to force the decrease of f along d_k (standard examples: Cauchy, Armijo, Wolfe).
- x_{k+1} := x_k + α_k d_k.

Proposition (global convergence by line-search). If {f(x_k)} is bounded below, then
  ∑_{k≥1} ‖g_k‖² cos² θ_k < +∞,
where θ_k := arccos(−⟨g_k, d_k⟩/(‖g_k‖ ‖d_k‖)) is the angle between d_k and −g_k.
39 / 110

Background: Algorithmics (global convergence in unconstrained optimization, trust-region)

A trust-region algorithm generates a sequence {x_k} as follows.
- Choice of a model ϕ of f around x_k, usually quadratic: ϕ(s) := ⟨g_k, s⟩ + ½ ⟨M_k s, s⟩ (standard models: gradient, Newton, quasi-Newton, Gauss-Newton).
- Determination of a trust radius Δ_k > 0 such that f(x_k + s_k) is sufficiently smaller than f(x_k), where
  s_k ∈ arg min {ϕ(s) : ‖s‖ ≤ Δ_k}.
- x_{k+1} := x_k + s_k.

Proposition (global convergence by trust-region). If {f(x_k)} is bounded below and {M_k} is bounded, then g_k → 0.
40 / 110
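A minimal line-search sketch, using the gradient direction and Armijo backtracking on a hypothetical strongly convex quadratic (the matrix Q and the constants are made up), exhibiting the announced global behavior g_k → 0:

```python
import numpy as np

# f(x) = 0.5 x^T Q x, minimizer x* = 0 (illustrative test problem)
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
def f(x):    return 0.5 * x @ Q @ x
def grad(x): return Q @ x

x = np.array([2.0, -1.0])
omega, tau = 1e-4, 0.5                  # Armijo slope and backtracking factor
for _ in range(200):
    g = grad(x)
    d = -g                              # descent direction: f'(x) d = -||g||^2 < 0
    alpha = 1.0
    # backtrack until the Armijo decrease condition holds
    while f(x + alpha * d) > f(x) + omega * alpha * (g @ d):
        alpha *= tau
    x = x + alpha * d

grad_norm = np.linalg.norm(grad(x))     # should be (numerically) 0
```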
Background: Algorithmics (global convergence for nonlinear systems)

Let F : E → E and consider the problem of finding x such that F(x) = 0. No algorithm with global convergence (exception, e.g., when F is polynomial, but the methods are expensive or for rather small dimensions). A less ambitious goal: minimizing the least-squares objective
  min_{x∈E} ϕ(x) := ½ ‖F(x)‖²₂.

Amazing but misleading fact: if ∇ϕ(x_k) ≠ 0, the Gauss-Newton direction
  d^GN_k ∈ arg min_{d∈E} ‖F(x_k) + F'(x_k) d‖²₂
is a descent direction of ϕ at x_k. May yield false convergence. It is better to use the trust-region approach: x_{k+1} = x_k + s_k, where
  s_k ∈ arg min_{‖s‖₂ ≤ Δ_k} ‖F(x_k) + F'(x_k) s‖²₂,
with well adapted trust radius Δ_k > 0.
41 / 110

Optimality conditions: First order optimality conditions for (P_G) (the problem) [WP.fr]

We consider the problem
  (P_G)  min f(x)  subject to  c(x) ∈ G,
where
- f : E → R (E is a Euclidean vector space),
- c : E → F (F is another Euclidean vector space),
- G is a nonempty closed convex set in F.
The feasible set is denoted by X_G := {x ∈ E : c(x) ∈ G} = c^{-1}(G).
(P_G) is said to be convex if f is convex and T : E ⊸ F : x ↦ c(x) − G is convex (then X_G is convex).
43 / 110
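A sketch of the least-squares reformulation on a small hypothetical system (a circle and a diagonal line); here the pure Gauss-Newton step with no globalization already converges, because the starting point is close enough to a zero of F:

```python
import numpy as np

# phi(x) = 0.5 ||F(x)||^2 for the 2x2 system below; zero at (1/sqrt(2), 1/sqrt(2))
def F(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0,   # unit circle
                     x[0] - x[1]])              # diagonal

def J(x):
    return np.array([[2*x[0], 2*x[1]],
                     [1.0, -1.0]])

x = np.array([1.0, 0.5])
for _ in range(20):
    # Gauss-Newton direction: minimize ||F(x_k) + F'(x_k) d||^2 over d,
    # i.e. solve the normal equations J^T J d = -J^T F
    d = np.linalg.solve(J(x).T @ J(x), -J(x).T @ F(x))
    x = x + d                                    # unit step, no line search or trust region

phi = 0.5 * np.linalg.norm(F(x))**2
```

For a square system with zero residual, as here, Gauss-Newton coincides with Newton's method, hence the fast local convergence; the slide's warning concerns what can happen far from a solution.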
Optimality conditions: First order optimality conditions for (P_G) (tangent and linearizing cones, qualification)

Proposition (tangent and linearizing cones). If c is differentiable at x ∈ X_G, then
  T_x X_G ⊆ T'_x X_G := {d ∈ E : c'(x)·d ∈ T_{c(x)} G}.
T'_x X_G is called the linearizing cone to X_G at x. The equality T_x X_G = T'_x X_G is not guaranteed.

The constraint function c is said to be qualified for representing X_G at x if
  T_x X_G = T'_x X_G,  (9a)
  c'(x)* [(T_{c(x)} G)^+] is closed.  (9b)
44 / 110

Optimality conditions: First order optimality conditions for (P_G) (NC1)

Theorem (NC1 for (P_G)). If x* is a local minimum of (P_G), if f and c are differentiable at x*, and if c is qualified for representing X_G at x* in the sense (9a)-(9b), then
1 there exists a multiplier λ* ∈ F such that
  ∇f(x*) + c'(x*)* λ* = 0,  (10a)
  λ* ∈ N_{c(x*)} G,  (10b)
2 if, furthermore, G ≡ K is a convex cone, then (10b) can be written
  K^− ∋ λ* ⊥ c(x*) ∈ K  or  c(x*) ∈ N_{λ*} K^−.  (10c)

One recognizes in (10a) the gradient with respect to x of the Lagrangian
  l : E × F → R : (x,λ) ↦ l(x,λ) = f(x) + ⟨λ, c(x)⟩
and in (10c) the complementarity conditions.
45 / 110
Optimality conditions: First order optimality conditions for (P_G) (SC1 for convex problem)

Theorem (SC1 for convex (P_G)). If (P_G) is convex, if f and c are differentiable at x* ∈ X_G, and if there is a λ* ∈ F such that (x*, λ*) satisfies (10a)-(10b), then x* is a global minimum of (P_G).

The goal of the next slides is to highlight and analyze a condition (Robinson's condition) that
1 provides an error bound for the perturbed feasible set {x ∈ E : c(x) ∈ y + G} (y small),
2 claims the stability of the feasible set X_G,
3 ensures that c is qualified for representing X_G.
46 / 110

Optimality conditions: First order optimality conditions for (P_G) (Robinson's condition) [WP.fr]

We say that Robinson's condition holds at x ∈ X_G if
  (CQ-R)  0 ∈ int( c(x) + c'(x)·E − G ).  (11)
We will see below that it is useful since it
- provides an error bound for the small perturbations {x ∈ E : c(x) ∈ y + G} of X_G,
- claims the stability of the feasible set X_G with respect to small perturbations,
- shows that the constraint function c is qualified for representing X_G at x, in the sense (9a)-(9b),
- generalizes to (P_G) the Mangasarian-Fromovitz constraint qualification (CQ-MF) for (P_EI).
47 / 110
Optimality conditions: First order optimality conditions for (P_G) (Robinson's error bound I) [16]

Theorem ((CQ-R) and metric regularity). If c is continuously differentiable near x_0 ∈ X_G, then the following properties are equivalent:
1 (CQ-R) holds at x = x_0,
2 there exists a constant µ ≥ 0 such that, for (x,y) near (x_0, 0):
  dist(x, c^{-1}(y + G)) ≤ µ dist(c(x), y + G).  (12)

Condition 2 is named metric regularity since it is equivalent to that property (see its definition below) for the multifunction T : E ⊸ F : x ↦ c(x) − G.
48 / 110

Optimality conditions: First order optimality conditions for (P_G) (Robinson's error bound II) [16]

An error bound is an estimation of the distance to a set by a quantity easier to compute.

Corollary (Robinson's error bound). If c is continuously differentiable near x_0 ∈ X_G and if (CQ-R) holds at x = x_0, then there exists a constant µ ≥ 0 such that, for x near x_0:
  dist(x, X_G) ≤ µ dist(c(x), G).  (13)

- dist(x, X_G) is often difficult to evaluate in E,
- dist(c(x), G) is often easier to evaluate in F (it is the case if c(x) is easy to evaluate and G is simple),
- useful in theory (e.g., for proving that (CQ-R) implies constraint qualification), in algorithmics (e.g., for proving [speed of] convergence) or in practice (for estimating dist(x, X_G)).
49 / 110
Optimality conditions: First order optimality conditions for (P_G) (stability wrt small perturbations)

The estimate (12) readily implies the following stability result.

Corollary (stability wrt small perturbations). If c is continuously differentiable near some x_0 ∈ X_G and if (CQ-R) holds at x_0, then, for all small y ∈ F:
  {x ∈ E : c(x) ∈ y + G} ≠ ∅.
50 / 110

Optimality conditions: First order optimality conditions for (P_G) (open multifunction theorem I) [20, 18, 5]

Let T : E ⊸ F be a multifunction and B̄_E and B̄_F be the closed unit balls of E and F.
- T is open at (x_0, y_0) ∈ G(T) with ratio ρ > 0 if there exist a neighborhood W of (x_0, y_0) and a radius r_max > 0 such that, for all (x,y) ∈ W ∩ G(T) and r ∈ [0, r_max]:
  y + ρr B̄_F ⊆ T(x + r B̄_E).
- T is metric regular at (x_0, y_0) ∈ G(T) with modulus µ > 0 if, for all (x,y) near (x_0, y_0):
  dist(x, T^{-1}(y)) ≤ µ dist(y, T(x)).
Difference: (x,y) ∈ G(T) for the openness, but not for the metric regularity.

[Figure: two panels on G(T); openness measures y + ρr B̄_F ⊆ T(x + r B̄_E) from inside the graph (slope ρ), metric regularity compares dist(x, T^{-1}(y)) with dist(y, T(x)) from outside (slope 1/µ).]

Both estimate the change in T(x) with x (but these are not infinitesimal notions), either from inside G(T) (openness) or outside (metric regularity).
51 / 110
Optimality conditions: First order optimality conditions for (P_G) (open multifunction theorem II) [20, 18, 5]

Extension of the open mapping theorem for linear (continuous) maps to (nonlinear) convex multifunctions.

Theorem (open multifunction theorem, finite dimension). If T : E ⊸ F is convex and (x_0, y_0) ∈ G(T), then the following properties are equivalent:
1 y_0 ∈ int R(T),
2 for all r > 0, y_0 ∈ int T(x_0 + r B̄_E),
3 T is open at (x_0, y_0) with ratio ρ > 0,
4 T is metric regular at (x_0, y_0) with modulus µ > 0.
One can take µ = 1/ρ in point 4 if ρ is given by point 3.
52 / 110

Optimality conditions: First order optimality conditions for (P_G) (metric regularity diffusion I) [6]

- ϕ : E → R_+ is subcontinuous if, for all sequences {x_k} → x̄ such that ϕ(x_k) → 0, there holds ϕ(x̄) = 0.
- ϕ : E → R_+ is r-steep at x ∈ E, with r ∈ [0,1[, if for all x' ∈ B̄(x, ϕ(x)/(1−r)), there exists x'' ∈ B̄(x', ϕ(x')) such that ϕ(x'') ≤ r ϕ(x').

Lemma (Lyusternik [15, 6]). If ϕ : E → R_+ is subcontinuous and r-steep at x ∈ dom ϕ, with r ∈ [0,1[, then ϕ has a zero in B̄(x, ϕ(x)/(1−r)).
53 / 110
25 Optimality conditions First order optimality conditions for (P G ) (metric regularity diffusion II) [6] Theorem (metric regularity diffusion) Suppose that c : E F a continuous function, G a closed convex set of F, T : E F : x c(x) G is µ-metric regular at (x 0,y 0 ) G(T), δ : E F is Lipschitz near x 0 with modulus L < 1/µ, T : E F : x c(x)+δ(x) G. Then T is also metric regular at (x 0,y 0 +δ(x 0 )) G( T) with modulus µ/(1 Lµ): for all (x,y) near (x 0,y 0 +δ(x 0 )), there holds No need of convexity. dist(x, T 1 (y)) µ 1 Lµ dist(y, T(x)). (14) 54 / 110 Optimality conditions First order optimality conditions for (P G ) (qualification with (CQ-R) I) Proposition (other forms of (CQ-R)) Suppose that c is differentiable at x X G. Then the following properties are equivalent i 0 int(c(x)+c (x)e G) [this is (CQ-R)], ii c (x)e T f c(x) G = F, iii c (x)e T c(x) G = F, iv c (x)e T c(x) G = F. 55 / 110
Optimality conditions: First order optimality conditions for (P_G) (qualification with (CQ-R) II)

Proposition (qualification with (CQ-R)). Suppose that c is continuously differentiable near x ∈ X_G and that (CQ-R) holds at x. Then c is qualified for representing X_G at x.

The figure below shows how (CQ-R) is used to create an appropriate sequence {x_k} in X_G := c^{-1}(G) to get qualification (the figure requires some oral explanations, though...).

[Figure: a direction d ∈ T'_x X_G, a sequence x_k → x in X_G = c^{-1}(G), and the images c(x_k), y_k in G, with c'(x)·d ∈ T_{c(x)} G.]
56 / 110

Optimality conditions: First order optimality conditions for (P_G) (qualification with (CQ-R) III)

Proposition (Gauvin's boundedness property). Suppose that f and c are differentiable at x* ∈ X_G and that the set Λ* of multipliers λ* ∈ F satisfying (10a)-(10b) is nonempty. Then
1 (Λ*)^∞ = [c'(x*)·E − T_{c(x*)} G]^+,
2 Λ* is bounded if and only if (CQ-R) holds.

The property was originally established for problem (P_EI) and (CQ-MF) [8].
57 / 110
Optimality conditions: Second order optimality conditions for (P_EI) [WP.fr]

Let x* be a local solution to (P_EI), λ* be an associated optimal multiplier, and L* := ∇²_{xx} l(x*, λ*). In view of the second order optimality conditions of the equality constrained optimization problem (P_E), it is tempting to claim that
  ∀ d ∈ T_{x*} X_EI : ⟨L* d, d⟩ ≥ 0.
But this is not guaranteed!
- The right cone is not T_{x*} X_EI but the critical cone C* (to be defined).
- The multiplier λ* must be chosen, depending on d ∈ C*.
59 / 110

Optimality conditions: Second order optimality conditions for (P_EI) (critical cone)

Critical cone at x ∈ X_EI (it is a part of T'_x X_EI):
  C(x) := {d ∈ E : c'_E(x)·d = 0, c'_{I⁰(x)}(x)·d ≤ 0, f'(x)·d ≤ 0}.

Short notation for index sets:
  I⁰ := {i ∈ I : c_i(x*) = 0} := I⁰(x*),
  I⁰⁺ := {i ∈ I : c_i(x*) = 0, (λ*)_i > 0},
  I⁰⁰ := {i ∈ I : c_i(x*) = 0, (λ*)_i = 0}.

Other forms of the critical cone at a stationary pair (x*, λ*):
  C* = {d ∈ E : c'_E(x*)·d = 0, c'_{I⁰}(x*)·d ≤ 0, f'(x*)·d = 0}
     = {d ∈ E : c'_{E∪I⁰⁺}(x*)·d = 0, c'_{I⁰⁰}(x*)·d ≤ 0}.
60 / 110
28 Optimality conditions Second order optimality conditions for (P EI ) (three instructive examples) Three instructive examples 1 Strong NC2 (not always true): λ Λ, d C : L d,d 0. { min x2 x 2 x 2 1, 2 Semi-strong NC2 (not always true): λ Λ, d C : L d,d 0. min x 2 x 2 x1 2 x x Weak NC2 (always true): d C, λ Λ : L d,d 0. min x 3 x 3 (x 1 +x 2 )(x 1 x 2 ) x 3 (x 2 + 3x 1 )(2x 2 x 1 ) x 3 (2x 2 +x 1 )(x 2 3x 1 ). 61 / 110 Optimality conditions Second order optimality conditions (NC2) Notation at a stationary point x : Λ := {λ R m : (x,λ ) satisfies the KKT system (5)}. L := 2 xx l(x,λ ) for some specified λ Λ. Theorem (NC2 for (P EI )) Suppose that x is a local minimum of (P EI ), f and c E are C 2 near x, c I 0 is twice differentiable at x, c I\I 0 is continuous at x, (CQ-MF) holds at x, then d C, λ Λ such that L d,d 0. These conditions are named weak second order necessary conditions and also read d C : max L d,d 0. (15) λ Λ 62 / 110
Optimality conditions: Second order optimality conditions (SC2)

Theorem (SC2 for (P_EI)). Suppose that
- f and c_{E∪I⁰} are twice differentiable at x*,
- (x*, λ*) verifies the KKT conditions (5),
- the following equivalent conditions hold:
  ∀ d ∈ C* \ {0}, ∃ λ* ∈ Λ* : ⟨L* d, d⟩ > 0,  (16a)
  ∃ γ̄ > 0, ∀ d ∈ C*, ∃ λ* ∈ Λ* : ⟨L* d, d⟩ ≥ γ̄ ‖d‖²,  (16b)
then, ∀ γ ∈ [0, γ̄[, there is a neighborhood V of x* such that
  ∀ x ∈ (V \ {x*}) ∩ X_EI : f(x) > f(x*) + (γ/2) ‖x − x*‖².  (17)
In particular, x* is a strict local minimum of (P_EI).
63 / 110

Optimality conditions: Second order optimality conditions (SC2)

- Condition (17) is called the quadratic growth property.
- The property
  ∃ λ* ∈ Λ*, ∀ d ∈ C* \ {0} : ⟨L* d, d⟩ > 0
  is stronger than (16a)-(16b) and is called the semi-strong SC2.
- The even stronger property
  ∀ λ* ∈ Λ*, ∀ d ∈ C* \ {0} : ⟨L* d, d⟩ > 0
  is called the strong SC2.
64 / 110
Linearization methods: Overview

Two classes of linearization methods that are used to solve systems with nonsmoothness.
1 Methods that capture much of the local behavior of the system.
  Features: expensive iteration, fast convergence, easy to globalize.
  Examples: the Josephy-Newton algorithm for functional inclusions; the SQP algorithm for (P_EI).
2 Methods that use a single piece of the local behavior of the system.
  Features: cheap iteration, fast convergence, difficult to globalize.
  Example: the semismooth Newton algorithm for nonsmooth systems of equations.
65 / 110

Linearization methods: Josephy-Newton algorithm for functional inclusions (functional inclusion I) [WP.fr]

Let E and F be Euclidean vector spaces of the same dimension, F : E → F be a smooth function, and N : E ⊸ F be a multifunction. We consider the functional inclusion problem
  (P_FI)  0 ∈ F(x) + N(x).  (18)
Interested in algorithmic issues (not theoretical ones, like existence of solution).

Examples
1 The variational problem, if N = N_X, the normal cone to X ⊆ E at x (= ∅ if x ∉ X):
  (P_V)  0 ∈ F(x) + N_X(x).  (19)
2 The variational inequality problem is (P_V) with X = C (a convex set):
  (P_VI)  x ∈ C and ⟨F(x), y − x⟩ ≥ 0, ∀ y ∈ C.  (20)
3 The complementarity problem is (P_FI) with N = N_K ∘ G (closed convex cone K ⊆ F, G : E → F):
  (P_CP)  K ∋ G(x) ⊥ F(x) ∈ K^+.  (21)
67 / 110
Linearization methods: Josephy-Newton algorithm for inclusions (functional inclusion II) [WP.fr]

Examples (continued)
4 The Peano-Kantorovich NC1 of problem min {f(x) : x ∈ X} reads 0 ∈ ∇f(x) + N_X(x).
5 The first order optimality conditions for (P_G), when G is a closed convex cone, can be written like (P_CP) with the variable (x,λ) ∈ E × F,
  F(x,λ) = ( ∇f(x) + c'(x)*λ, −c(x) )  and  K = E × G^−.  (22)
6 If N is the constant multifunction x ↦ {0_{R^{m_E}}} × R^{m_I}_+ ⊆ R^m ≡ F, where E and I make a partition of [1:m], (P_FI) becomes the system of equalities and inequalities
  F_E(x) = 0 and F_I(x) ≤ 0.
68 / 110

Linearization methods: Josephy-Newton algorithm for inclusions (the JN algorithm) [13, 14] [WP.fr]

Algorithm (Josephy-Newton algorithm for solving (P_FI)). Given x_k, compute x_{k+1} as a solution to the problem in x:
  0 ∈ F(x_k) + M_k (x − x_k) + N(x),  (23)
where M_k = F'(x_k) or an approximation to it.

Remarks
- Only F is linearized, not N (this is the reason for the chosen structure of (P_FI)).
- (23) captures more information from (P_FI) than a simple linearization.
- (23) is often a nonlinear problem, hence yielding an expensive iteration.
- Makes sense computationally if N is sufficiently simple.
69 / 110
Linearization methods: Josephy-Newton algorithm for inclusions (semistability) [3]

A solution x* to (P_FI) is said semistable if ∃ σ_1 > 0 and σ_2 > 0 such that for any (x,p) satisfying
  p ∈ F(x) + N(x) and ‖x − x*‖ ≤ σ_1,
there holds ‖x − x*‖ ≤ σ_2 ‖p‖.

Remarks
- The perturbed inclusion p ∈ F(x) + N(x) is not required to have a solution.
- The property is demanding only for small perturbations p (i.e., ‖p‖ < σ_1/σ_2).
- Semistability implies that x* is the unique solution to (P_FI) on B(x*, σ_1).
- If N ≡ {0}, then semistability of x* ⟺ injectivity of F'(x*) (hence, nonsingularity if dim E = dim F).
70 / 110

Linearization methods: Josephy-Newton algorithm for inclusions (speed of convergence) [3]

Proposition (speed of convergence of quasi-Newton methods). Suppose that F is C¹ near a semistable solution x* to (P_FI). Let {x_k} be a sequence generated by algorithm (23), converging to x*.
1 If (M_k − F'(x*))(x_{k+1} − x_k) = o(‖x_{k+1} − x_k‖), then {x_k} converges superlinearly.
2 If (M_k − F'(x*))(x_{k+1} − x_k) = O(‖x_{k+1} − x_k‖²) and F is C^{1,1} near x*, then {x_k} converges quadratically.

Corollary (speed of convergence of Newton's method). Suppose that F is C¹ near a semistable solution x* to (P_FI). Let {x_k} be a sequence generated by algorithm (23) with M_k = F'(x_k), converging to x*. Then
1 {x_k} converges superlinearly,
2 if, furthermore, F is C^{1,1} near x*, then {x_k} converges quadratically.
71 / 110
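The JN iteration (23) with M_k = F'(x_k) can be sketched in one dimension for N = N_{R_+}: each subproblem is then a scalar linear complementarity problem that can be solved in closed form. The two scalar functions below are hypothetical examples; in the second, the solution sits on the boundary of R_+.

```python
# Josephy-Newton for the 1D inclusion 0 ∈ F(x) + N_{R_+}(x), i.e. the
# complementarity problem x >= 0, F(x) >= 0, x F(x) = 0.
def jn_solve(F, dF, x0, iters=20):
    x = x0
    for _ in range(iters):
        # linearized subproblem: x >= 0, g(x) = F(x_k) + F'(x_k)(x - x_k) >= 0, x g(x) = 0
        xn = x - F(x) / dF(x)        # the zero of the linearization g
        # if xn < 0, the subproblem solution is x = 0 (valid when g(0) >= 0,
        # which holds in the examples below)
        x = xn if xn >= 0 else 0.0
    return x

# F1(x) = x^2 + x - 2: solution x* = 1, with F1(1) = 0 (interior-type solution)
x1 = jn_solve(lambda x: x * x + x - 2.0, lambda x: 2 * x + 1.0, 3.0)
# F2(x) = x + 1: solution x* = 0, with F2(0) = 1 > 0 (boundary solution)
x2 = jn_solve(lambda x: x + 1.0, lambda x: 1.0, 5.0)
```

In the first example the iterates reduce to Newton's method on F1 and converge quadratically to 1; in the second, one linearized subproblem already lands exactly on the solution 0.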
33 Linearization methods Josephy-Newton algorithm for inclusions (semistability for polyhedral VI) [3]
Proposition (semistability characterization for polyhedral VI)
Consider problem (P_VI) with a closed convex polyhedron C and F being C¹ near a solution x*. Then the following properties are equivalent:
1 x* is semistable,
2 x* is an isolated solution to F(x*) + F'(x*)(x − x*) + N_C(x) ∋ 0,
3 there holds ⟨F'(x*)(x − x*), x − x*⟩ > 0 when x ∈ C \ {x*} satisfies ⟨F(x*), x − x*⟩ = 0 and F(x*) + F'(x*)(x − x*) + N_C(x) ∋ 0,
4 x* is the unique solution to N_C(x) ⊇ N_C(x*), ⟨F(x*), x − x*⟩ = 0, R_+ F(x*) + F'(x*)(x − x*) + N_C(x) ∋ 0. 72 / 110
Linearization methods Josephy-Newton algorithm for inclusions (local convergence) [3]
A solution x* to (P_FI) is said to be hemistable if, for all α > 0, there exists β > 0 such that, for all x_0 ∈ B(x*,β), the linearized inclusion in x
F(x_0) + F'(x_0)(x − x_0) + N(x) ∋ 0
has a solution in B(x*,α).
Theorem (local convergence of JN)
Suppose that F is C¹ near a semistable and hemistable solution x* to (P_FI). Then there exists ε > 0 such that if x_1 ∈ B(x*,ε), then
1 the JN algorithm (23) with M_k = F'(x_k) can generate {x_k} ⊂ B(x*,ε),
2 any sequence {x_k} generated in B(x*,ε) by the JN algorithm converges superlinearly to x* (quadratically if F is C^{1,1}). 73 / 110
34 Linearization methods The SQP algorithm (overview)
Recall the equality and inequality constrained problem
(P_EI) inf_x f(x), c_E(x) = 0, c_I(x) ≤ 0.
Linearization methods The SQP algorithm (definition - KKT is a nonlinear complementarity problem) [WP.fr]
Similarly to Newton's method in unconstrained optimization, the SQP algorithm is conceptually interested in solutions of the first-order optimality (KKT) system in (x,λ) of (P_EI):
∇_x l(x,λ) = 0, (24a)
c_E(x) = 0, (24b)
0 ≤ λ_I ⊥ c_I(x) ≤ 0. (24c)
This system in (x,λ) can be written like (P_FI) or (P_CP), namely
F(x,λ) + N_K(x,λ) ∋ 0 or K ∋ (x,λ) ⊥ F(x,λ) ∈ K⁺, (25)
with the data
F(x,λ) = (∇_x l(x,λ), −c(x)) and K = E × (R^{m_E} × R^{m_I}_+), (26)
N_K(x,λ) = {0_E} × ({0_{R^{m_E}}} × N_{R^{m_I}_+}(λ_I)), K⁺ = {0_E} × ({0_{R^{m_E}}} × R^{m_I}_+). 76 / 110
35 Linearization methods The SQP algorithm (definition - the SQP algorithm viewed as a JN method) [17] [WP.fr]
The SQP algorithm (SQP for Sequential Quadratic Programming) for solving (P_EI) is the JN algorithm (23) on the inclusion (25)-(26):
∇_x l(x_k,λ_k) + M_k(x − x_k) + c'(x_k)*(λ − λ_k) = 0, (27a)
c_E(x_k) + c'_E(x_k)(x − x_k) = 0, (27b)
0 ≤ λ_I ⊥ (c_I(x_k) + c'_I(x_k)(x − x_k)) ≤ 0, (27c)
where M_k = L_k := ∇²_xx l(x_k,λ_k) or an approximation to it (M_k ≠ ∇²f(x_k)!).
(27) is formed of the KKT conditions of the osculating quadratic problem
(OQP) min_x ⟨∇f(x_k), x − x_k⟩ + ½ ⟨M_k(x − x_k), x − x_k⟩, (28)
c_E(x_k) + c'_E(x_k)(x − x_k) = 0,
c_I(x_k) + c'_I(x_k)(x − x_k) ≤ 0.
The primal-dual solution (x_{k+1}, λ_{k+1}) to the OQP is the new iterate. 77 / 110
Linearization methods The SQP algorithm (definition - the local SQP algorithm) [10, 11] [WP.fr]
Algorithm (local SQP) From (x_k,λ_k) to (x_{k+1},λ_{k+1}):
1 If (24) holds at (x,λ) = (x_k,λ_k), stop.
2 Compute a primal-dual stationary point (x_{k+1},λ_{k+1}) of the OQP (28).
Remarks
The OQPs are still hard to solve (not just a linear system, expensive iteration):
If M_k = L_k is not positive semidefinite, the OQP is NP-hard.
One of the good reasons for taking M_k ≈ L_k with M_k ≻ 0, updated by a quasi-Newton method; the OQP is then polynomial, but still difficult.
Other (non local) difficulties to overcome:
What if the linearized constraints are incompatible?
What if the OQP is unbounded?
Nothing is done for forcing the convergence from remote starting points. 78 / 110
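In the purely equality-constrained case (I = ∅), condition (27c) disappears and one SQP iteration amounts to solving the linear system (27a)-(27b) in the step and the new multiplier, i.e., Newton's method on the KKT system. Below is a minimal Python sketch on a hypothetical instance, min x_1 + x_2 subject to x_1² + x_2² = 2, whose solution is x* = (−1,−1) with multiplier λ* = 1/2 (the function names are mine, not from the course).

```python
import numpy as np

# Hypothetical instance: min x1 + x2  s.t.  x1^2 + x2^2 - 2 = 0,
# with solution x* = (-1, -1) and multiplier lambda* = 1/2.
def f_grad(x):  return np.array([1.0, 1.0])
def c(x):       return np.array([x[0] ** 2 + x[1] ** 2 - 2.0])
def c_jac(x):   return np.array([[2.0 * x[0], 2.0 * x[1]]])
def hess_lag(x, lam):   # Hessian in x of l = f + lam' c; here = 2*lam*I
    return 2.0 * lam[0] * np.eye(2)

def local_sqp(x, lam, iters=15):
    """Local SQP with equality constraints only: each OQP (28) reduces to
    the linear KKT system  [L_k A'; A 0] [dx; lam_new] = -[grad f; c]."""
    n, m = len(x), len(lam)
    for _ in range(iters):
        Lk, A = hess_lag(x, lam), c_jac(x)
        K = np.block([[Lk, A.T], [A, np.zeros((m, m))]])
        rhs = -np.concatenate([f_grad(x), c(x)])
        sol = np.linalg.solve(K, rhs)
        x, lam = x + sol[:n], sol[n:]   # (x_{k+1}, lambda_{k+1})
    return x, lam

x_sol, lam_sol = local_sqp(np.array([-1.1, -0.9]), np.array([0.45]))
```

Started close enough to (x*, λ*), the iterates exhibit the quadratic convergence of the local theory; handling inequalities via (27c) would instead require solving a genuine QP at each iteration, as noted in the remarks above.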
36 Linearization methods The SQP algorithm (stability result for a polyhedral (P_K) I) [17]
Let P be a vector space. For p ∈ P, consider the perturbed problem
(P^p_K) min f(x,p), c(x,p) ∈ K,
where f : E × P → R, c : E × P → F, and K ⊆ F is a convex polyhedral cone.
Optimality system at x ∈ E: ∃λ ∈ F such that
∇_x f(x,p) + c'_x(x,p)*λ = 0, K⁻ ∋ λ ⊥ c(x,p) ∈ K. (29)
The multiplier multifunction Λ : E × P ⇉ F is defined at (x,p) ∈ E × P by
Λ(x,p) := {λ ∈ F : (x,p,λ) satisfies (29)}.
The stationary multifunction St : P ⇉ E is defined at p ∈ P by
St(p) := {x ∈ E : Λ(x,p) ≠ ∅}. 79 / 110
Linearization methods The SQP algorithm (stability result for a polyhedral (P_K) II)
Assume the framework defined above.
Proposition (stability of (P_K))
If
f'_x, c and c'_x are Lipschitz on N(x_0) × N(p_0),
(x_0,λ_0) satisfies the strong SC2 for (P^{p_0}_K),
(CQ-R) holds at x_0, meaning that 0 ∈ int(c(x_0,p_0) + c'_x(x_0,p_0)E − K),
then ∃W ∈ N(p_0), ∃L ≥ 0, ∀p ∈ W:
1 St(p) ≠ ∅,
2 ∀x ∈ St(p) near x_0, ∀λ ∈ Λ(x,p): dist((x,λ), {x_0} × Λ(x_0,p_0)) ≤ L ‖p − p_0‖. 80 / 110
37 Linearization methods The SQP algorithm (local convergence - semistability and hemistability of a KKT pair) [3]
Proposition (semistability and SC2)
If (x*,λ*) satisfies KKT, then the following properties are equivalent:
1 x* is a local minimizer and (x*,λ*) is semistable,
2 Λ* = {λ*} and x* satisfies SC2.
At a local solution, semistability implies hemistability:
Proposition (SC for hemistability)
If
x* is a local minimizer of (P_EI),
(x*,λ*) satisfies the KKT conditions,
(x*,λ*) is semistable,
then (x*,λ*) is hemistable. 81 / 110
Linearization methods The SQP algorithm (local convergence - the result) [3]
Theorem (local convergence of SQP)
If
f and c are C^{2,1} near a local minimizer x* of (P_EI),
there is a unique multiplier λ* associated with x*,
SC2 is satisfied,
then there exists a neighborhood V of (x*,λ*) such that if the first iterate (x_1,λ_1) ∈ V, then
1 the SQP algorithm can generate {(x_k,λ_k)} in V,
2 any sequence {(x_k,λ_k)} generated in V by the SQP algorithm converges quadratically to (x*,λ*). 82 / 110
38 References
[1] M.F. Anjos, J.B. Lasserre (2012). Handbook on Semidefinite, Conic and Polynomial Optimization. Springer. [doi]
[2] A. Ben-Tal, A. Nemirovski (2001). Lectures on Modern Convex Optimization – Analysis, Algorithms, and Engineering Applications. MPS-SIAM Series on Optimization 2. SIAM.
[3] J.F. Bonnans (1994). Local analysis of Newton-type methods for variational inequalities and nonlinear programming. Applied Mathematics and Optimization, 29. [doi]
[4] J.F. Bonnans, J.Ch. Gilbert, C. Lemaréchal, C. Sagastizábal (2006). Numerical Optimization – Theoretical and Practical Aspects (second edition). Universitext. Springer Verlag, Berlin. [authors] [editor] [doi]
[5] J.F. Bonnans, A. Shapiro (2000). Perturbation Analysis of Optimization Problems. Springer Verlag, New York.
[6] R. Cominetti (1990). Metric regularity, tangent sets, and second-order optimality conditions. Applied Mathematics and Optimization, 21(1). [doi]
[7] A.L. Dontchev, R.T. Rockafellar (2009). Implicit Functions and Solution Mappings – A View from Variational Analysis. Springer Monographs in Mathematics. Springer.
[8] J. Gauvin (1977). A necessary and sufficient regularity condition to have bounded multipliers in nonconvex programming. Mathematical Programming, 12.
[9] J.Ch. Gilbert (2015). Éléments d'Optimisation Différentiable – Théorie et Algorithmes. Syllabus de cours à l'ENSTA, Paris. [internet]
[10] S.-P. Han (1976). Superlinearly convergent variable metric algorithms for general nonlinear programming problems. Mathematical Programming, 11.
[11] S.-P. Han (1977). A globally convergent method for nonlinear programming. Journal of Optimization Theory and Applications, 22.
[12] A.F. Izmailov, M.V. Solodov (2014). Newton-Type Methods for Optimization and Variational Problems. Springer Series in Operations Research and Financial Engineering. Springer.
[13] N.H. Josephy (1979). Newton's method for generalized equations. Technical Summary Report 1965, Mathematics Research Center, University of Wisconsin, Madison, WI, USA.
[14] N.H. Josephy (1979). Quasi-Newton's method for generalized equations. Summary Report 1966, Mathematics Research Center, University of Wisconsin, Madison, WI, USA.
[15] L. Lyusternik (1934). Conditional extrema of functionals. Math. USSR-Sb., 41.
[16] S.M. Robinson (1976). Stability theory for systems of inequalities, part II: differentiable nonlinear systems. SIAM Journal on Numerical Analysis, 13.
[17] S.M. Robinson (1982). Generalized equations and their solutions, part II: applications to nonlinear programming. Mathematical Programming Study, 19.
[18] R.T. Rockafellar, R. Wets (1998). Variational Analysis. Grundlehren der mathematischen Wissenschaften 317. Springer.
[19] H. Wolkowicz, R. Saigal, L. Vandenberghe (editors) (2000). Handbook of Semidefinite Programming – Theory, Algorithms, and Applications. International Series in Operations Research & Management Science 27. Kluwer Academic Publishers.
[20] J. Zowe, S. Kurcyusz (1979). Regularity and stability for the mathematical programming problem in Banach spaces. Applied Mathematics and Optimization, 5(1). 110 / 110
More informationOptimisation in Higher Dimensions
CHAPTER 6 Optimisation in Higher Dimensions Beyond optimisation in 1D, we will study two directions. First, the equivalent in nth dimension, x R n such that f(x ) f(x) for all x R n. Second, constrained
More informationConvex Functions. Pontus Giselsson
Convex Functions Pontus Giselsson 1 Today s lecture lower semicontinuity, closure, convex hull convexity preserving operations precomposition with affine mapping infimal convolution image function supremum
More informationc 1998 Society for Industrial and Applied Mathematics
SIAM J. OPTIM. Vol. 9, No. 1, pp. 179 189 c 1998 Society for Industrial and Applied Mathematics WEAK SHARP SOLUTIONS OF VARIATIONAL INEQUALITIES PATRICE MARCOTTE AND DAOLI ZHU Abstract. In this work we
More informationOn constraint qualifications with generalized convexity and optimality conditions
On constraint qualifications with generalized convexity and optimality conditions Manh-Hung Nguyen, Do Van Luu To cite this version: Manh-Hung Nguyen, Do Van Luu. On constraint qualifications with generalized
More informationAN ABADIE-TYPE CONSTRAINT QUALIFICATION FOR MATHEMATICAL PROGRAMS WITH EQUILIBRIUM CONSTRAINTS. Michael L. Flegel and Christian Kanzow
AN ABADIE-TYPE CONSTRAINT QUALIFICATION FOR MATHEMATICAL PROGRAMS WITH EQUILIBRIUM CONSTRAINTS Michael L. Flegel and Christian Kanzow University of Würzburg Institute of Applied Mathematics and Statistics
More informationarxiv: v1 [math.oc] 1 Jul 2016
Convergence Rate of Frank-Wolfe for Non-Convex Objectives Simon Lacoste-Julien INRIA - SIERRA team ENS, Paris June 8, 016 Abstract arxiv:1607.00345v1 [math.oc] 1 Jul 016 We give a simple proof that the
More informationUnconstrained Optimization
1 / 36 Unconstrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University February 2, 2015 2 / 36 3 / 36 4 / 36 5 / 36 1. preliminaries 1.1 local approximation
More informationDownloaded 09/27/13 to Redistribution subject to SIAM license or copyright; see
SIAM J. OPTIM. Vol. 23, No., pp. 256 267 c 203 Society for Industrial and Applied Mathematics TILT STABILITY, UNIFORM QUADRATIC GROWTH, AND STRONG METRIC REGULARITY OF THE SUBDIFFERENTIAL D. DRUSVYATSKIY
More informationInequality Constraints
Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the
More informationSECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING
Nf SECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING f(x R m g HONOUR SCHOOL OF MATHEMATICS, OXFORD UNIVERSITY HILARY TERM 5, DR RAPHAEL
More information