Active sets, steepest descent, and smooth approximation of functions


Active sets, steepest descent, and smooth approximation of functions
Dmitriy Drusvyatskiy, School of ORIE, Cornell University
Joint work with Alex D. Ioffe (Technion), Martin Larsson (EPFL), and Adrian S. Lewis (Cornell)
October 23, 2012

Outline

Outline:
- Active sets (finite identification).
- Steepest descent.
- Smooth approximation of functions.

Unifying themes:
1. Representation independence.
2. Intrinsic geometry (semi-algebraic).

Active sets

Classical active manifold

Classical problem:
$\min\, f(x)$ subject to $g_i(x) \le 0$ for all $i \in I := \{1, \ldots, m\}$,
where $f, g_1, \ldots, g_m$ are $C^2$-smooth functions on $\mathbb{R}^n$.

Fix a local minimizer $\bar x$ and the active index set $\bar I = \{i \in I : g_i(\bar x) = 0\}$.

Assuming linear independence of $\{\nabla g_i(\bar x) : i \in \bar I\}$:
- The KKT conditions hold: $-\nabla f(\bar x) \in \operatorname{cone}\{\nabla g_i(\bar x) : i \in \bar I\}$.
- The set $M = \{x \in \mathbb{R}^n : g_i(x) = 0 \text{ for each } i \in \bar I\}$ is a $C^2$ manifold.

Assuming, in addition, strict complementarity,
$-\nabla f(\bar x) \in \operatorname{ri} \operatorname{cone}\{\nabla g_i(\bar x) : i \in \bar I\}$,
the active manifold $M$ is central for algorithms and sensitivity analysis.
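To make the classical setup concrete, here is a minimal numerical sketch (added for illustration, not from the talk) on a made-up toy instance: it computes the active index set, checks linear independence of the active gradients, and recovers nonnegative KKT multipliers.

```python
import numpy as np

# Toy instance (hypothetical data): min f(x) s.t. g_i(x) <= 0, with
# f(x, y) = x + y and constraints g_1 = -x, g_2 = -y; minimizer at the origin.
f_grad = lambda x: np.array([1.0, 1.0])
g = [lambda x: -x[0], lambda x: -x[1]]
g_grad = [lambda x: np.array([-1.0, 0.0]), lambda x: np.array([0.0, -1.0])]

x_bar = np.zeros(2)

# Active index set: constraints holding with equality (up to a tolerance).
active = [i for i, gi in enumerate(g) if abs(gi(x_bar)) < 1e-8]

# Linear independence of the active gradients (the assumption on the slide).
G = np.array([g_grad[i](x_bar) for i in active])
assert np.linalg.matrix_rank(G) == len(active)

# KKT: solve G^T lam = -grad f; strict complementarity means lam > 0.
lam, *_ = np.linalg.lstsq(G.T, -f_grad(x_bar), rcond=None)
print("active set:", active, "multipliers:", lam)  # multipliers: [1. 1.]
```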

Classical active manifold

Figure: $\min\{z : z \ge x(1-x) + y^2,\ z \ge x(1+x) + y^2\}$, with active manifold $M = \{(0, y, y^2) : y \in \mathbb{R}\}$.

Generic properties

Should we expect such nice conditions to hold typically?

Rockafellar-Spingarn '79 considered the perturbed problems
$P(v, u):\ \min\, f(x) - \langle v, x \rangle$ subject to $g_i(x) \le u_i$ for all $i \in I := \{1, \ldots, m\}$.

Theorem (Rockafellar-Spingarn '79). For almost all $(v, u)$, at every local minimizer of $P(v, u)$:
- Active manifold: the active gradients are linearly independent.
- Strict complementarity: the multipliers are strictly positive.
- Quadratic growth: the objective function grows quadratically.

Goals

What is the active set in general, independent of the functional representation?

What does it generically look like (in the semi-algebraic setting)?

Subdifferentials

The subdifferential of a convex function $f: \mathbb{R}^n \to \mathbb{R}$ at $\bar x$ is
$\partial f(\bar x) := \{v \in \mathbb{R}^n : f(x) \ge f(\bar x) + \langle v, x - \bar x \rangle \text{ for all } x \in \mathbb{R}^n\}$.

The definition of $\partial f$ can be extended to nonconvex functions. Generalized critical points:
$\bar x \in \operatorname{crit} f \iff 0 \in \partial f(\bar x)$.

Example: consider again $\min f(x)$ subject to $g_i(x) \le 0$ for all $i \in I := \{1, \ldots, m\}$. Assuming $\{\nabla g_i(\bar x) : i \in \bar I\}$ are linearly independent, we have
$\partial\big(f + \delta_{\{x :\, g_i(x) \le 0 \text{ for } i \in I\}}\big)(\bar x) = \nabla f(\bar x) + \operatorname{cone}\{\nabla g_i(\bar x) : i \in \bar I\}$.
Critical points are exactly the KKT points!
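A worked instance of these definitions (added here, not on the slides): the absolute value on $\mathbb{R}$.

```latex
% For f(x) = |x| at \bar{x} = 0, the subgradient inequality
%   |x| \ge |0| + v \cdot (x - 0)  for all x
% holds exactly when |v| \le 1, so
\partial f(0) = [-1,\, 1], \qquad 0 \in \partial f(0),
% i.e., the origin is a (generalized) critical point of f.
```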

Finite identification

Many algorithms for minimizing $f: \mathbb{R}^n \to \mathbb{R}$,
- subgradient (gradient) projection methods,
- Newton-like methods,
- proximal point algorithms,
produce iterates $x_k \to \bar x$ along with criticality certificates: $v_k \to 0$ with $v_k \in \partial f(x_k)$.

What if we knew that, for some set $M$, we have $x_k \in M$ eventually? Then the problems
$\min_{x \in \mathbb{R}^n} f(x)$ and $\min_{x \in M} f(x)$
are equivalent for the algorithm.
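The proximal point algorithm, for instance, produces exactly such certificate pairs: writing $x_{k+1} = \operatorname{argmin}_y \{ f(y) + \tfrac{1}{2t}\|y - x_k\|^2 \}$, the vector $v_{k+1} := (x_k - x_{k+1})/t$ lies in $\partial f(x_{k+1})$. A minimal sketch on the illustrative choice $f = \|\cdot\|_1$ (my example, not the talk's):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal map of f(x) = ||x||_1 with step t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# Proximal point iteration: x_{k+1} = prox_{tf}(x_k), with certificate
# v_{k+1} = (x_k - x_{k+1})/t in the subdifferential of f at x_{k+1}.
x, t = np.array([2.5, -0.7]), 0.5
for k in range(8):
    x_new = soft_threshold(x, t)
    v = (x - x_new) / t             # criticality certificate
    x = x_new
    print(k, x, np.linalg.norm(v))  # x reaches the minimizer 0; then v = 0
```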

Finite identification

Finite identification was considered implicitly by a number of authors: Bertsekas '76, Rockafellar '76, Calamai '87, Burke and Moré '88, Dunn '87, Ferris '91, Wright '93, Lewis and Hare.

Finite identification

Definition (Identifiable sets). A set $M \subseteq \mathbb{R}^n$ is identifiable at a critical point $\bar x$ of $f$ if
$x_k \to \bar x,\ v_k \to 0,\ v_k \in \partial f(x_k) \ \Longrightarrow\ x_k \in M$ for all large $k$.

Example (Normal cone map). Let $f(x) = \delta_Q(x) - \langle \bar v, x \rangle$. (Figure: a convex set $Q$ with identifiable set $M$ at $\bar x$; iterates $x_k \to \bar x$ with subgradient certificates $v_k \to \bar v$ eventually land in $M$.)
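A hedged numerical illustration (the problem data below are made up): projected gradient descent over a box identifies the optimal face, an identifiable set, after finitely many iterations.

```python
import numpy as np

# min 0.5 x^T A x - b^T x over the box Q = [0,1]^2; the minimizer sits at the
# vertex (1, 0), and the iterates land exactly on that face in finite time.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([4.0, 0.5])
grad = lambda x: A @ x - b
proj = lambda x: np.clip(x, 0.0, 1.0)   # projection onto Q

x = np.array([0.9, 0.9])
for k in range(30):
    x = proj(x - 0.2 * grad(x))
# Which bound constraints are active (exactly, not just asymptotically)?
print(x, "active:", (np.isclose(x, 0.0) | np.isclose(x, 1.0)))
```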

Sensitivity analysis

What about sensitivity analysis?

Proposition (D-Lewis). Suppose $M$ is an identifiable set at $\bar x$ relative to $f$. Then:
- $\bar x$ is a (strict) local minimizer of $f$ $\iff$ $\bar x$ is a (strict) local minimizer of $f\big|_M$.
- $f$ grows quadratically near $\bar x$ $\iff$ $f\big|_M$ grows quadratically near $\bar x$.
- $f$ is tilt-stable at $\bar x$ $\iff$ $f\big|_M$ is tilt-stable at $\bar x$.

Locally minimal identifiable sets

Clearly all of $\mathbb{R}^n$ is identifiable at any $\bar x \in \operatorname{crit} f$ (not interesting). So...

Question: what are the smallest possible identifiable sets?

Definition. An identifiable set $M$ at $\bar x$ is locally minimal if
$M'$ identifiable at $\bar x$ $\Longrightarrow$ $M \subseteq M'$, locally near $\bar x$.

Locally minimal identifiable sets may fail to exist in general (e.g. $f(x, y) = x^4 + y^2$).

Further properties

Existence: locally minimal identifiable sets exist for fully amenable functions, $f(x) = g(F(x))$, where
1. $F$ is $C^2$-smooth,
2. $g$ is (convex) piecewise quadratic,
3. a qualification condition holds.
E.g. convex polyhedra, max-type functions, and standard problems of nonlinear mathematical programming.

Calculus: a strong chain rule is available for composite functions $f(x) = g(F(x))$.

Critical cones: the critical cone $K_Q(\bar x, \bar v)$ to a convex set $Q$ has a natural interpretation:
$K_Q(\bar x, \bar v) = \operatorname{cl} \operatorname{conv} T_M(\bar x)$,
where $M$ is an appropriate locally minimal identifiable set.

Identifiable manifolds

$M$ is an identifiable manifold at $\bar x$ if $M$ is identifiable, $M$ is a manifold, and $f\big|_M$ is smooth.

Proposition (D-Lewis). Identifiable manifolds $M \subseteq \operatorname{dom} f$ are automatically locally minimal.

This yields a refinement of partly smooth manifolds (Lewis '03) (and of Wright '93 in the convex case).

Lifts of identifiable manifolds

Consider $S^n := \{n \times n \text{ symmetric matrices}\}$ and the eigenvalue map $A \mapsto (\lambda_1(A), \ldots, \lambda_n(A))$, where $\lambda_1(A) \ge \cdots \ge \lambda_n(A)$.

For a permutation-invariant $f: \mathbb{R}^n \to \mathbb{R}$, form the spectral function $f \circ \lambda: S^n \to \mathbb{R}$ (e.g. $(f \circ \lambda)(A) = \lambda_n(A)$ or $(f \circ \lambda)(A) = \sum_i |\lambda_i(A)|$).

Identifiable manifolds lift (D-Lewis; Daniilidis-Malick-Sendov):
$M$ is an identifiable manifold at $\lambda(\bar X)$ relative to $f$ $\Longrightarrow$ $\lambda^{-1}(M)$ is an identifiable manifold at $\bar X$ relative to $f \circ \lambda$.
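A small sketch of the spectral construction itself (illustrative; it assumes numpy's eigvalsh as the eigenvalue map):

```python
import numpy as np

def spectral(f):
    """Lift a permutation-invariant f: R^n -> R to S^n via eigenvalues."""
    return lambda A: f(np.linalg.eigvalsh(A))  # eigenvalues of a symmetric matrix

max_eig = spectral(np.max)                          # A -> largest eigenvalue
sum_abs = spectral(lambda lam: np.abs(lam).sum())   # A -> sum_i |lambda_i(A)|

A = np.array([[2.0, 1.0], [1.0, -1.0]])
print(max_eig(A), sum_abs(A))
```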

Generic properties

Goal: eliminate representation dependence.
Cost: we consider the semi-algebraic setting.

$f: \mathbb{R}^n \to \mathbb{R}$ is semi-algebraic if $\operatorname{epi} f$ can be described by finitely many polynomial inequalities.

Theorem (D-Ioffe-Lewis). For semi-algebraic $f: \mathbb{R}^n \to \mathbb{R}$, consider the perturbed functions
$f_v(x) := f(x) - \langle v, x \rangle$.
Then for a typical $v \in \mathbb{R}^n$, at every minimizer $x_v$ of $f_v$:
- Active manifold: an identifiable manifold exists.
- Strict complementarity: $0 \in \operatorname{ri} \partial f_v(x_v)$.
- Quadratic growth: $f_v$ grows quadratically near $x_v$.
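A one-dimensional worked instance of the tilt perturbation (my illustration, not from the talk):

```latex
% Take f(x) = |x| on R and tilt it: f_v(x) = |x| - vx. For any |v| < 1 the
% unique minimizer is x_v = 0, strict complementarity holds since
0 \in \operatorname{ri} \partial f_v(0) = (-1 - v,\; 1 - v),
% and sharp (hence quadratic) growth holds near the minimizer:
f_v(x) - f_v(0) = |x| - vx \ \ge\ (1 - |v|)\,|x|.
% Only the exceptional tilts v = \pm 1 fail, consistent with "typical v".
```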

Variational analysis & semi-algebraic sets

Semi-algebraic subdifferential graphs
$\operatorname{gph} \partial f := \{(x, v) \in \mathbb{R}^n \times \mathbb{R}^n : v \in \partial f(x)\}$
are not too big, not too small, but just right:

Theorem (D-Ioffe-Lewis). For lsc, semi-algebraic $f: \mathbb{R}^n \to \mathbb{R}$, we have $\dim \operatorname{gph} \partial f = n$, even locally around any pair $(x, v) \in \operatorname{gph} \partial f$.

Metric regularity of $F: \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ at $(\bar x, \bar y)$,
$d\big(x, F^{-1}(y)\big) \le \kappa\, d\big(y, F(x)\big)$ for all $(x, y)$ near $(\bar x, \bar y)$,
is typical (a nonsmooth Sard theorem):

Theorem (Ioffe). For semi-algebraic $F: \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ and generic $y \in \mathbb{R}^m$, we have
$x \in F^{-1}(y) \Longrightarrow F$ is metrically regular at $(x, y)$.

Steepest descent

Steepest descent

Steepest descent drives algorithm design. A convergence analysis strategy: consider the analogous continuous-time dynamical system.

What are curves of steepest descent? In the smooth case, two equivalent approaches:
1. A solution to $\dot x(t) = -\nabla f(x(t))$.
2. A curve achieving the fastest instantaneous decrease in function value.

Slope

Key idea: the slope and the limiting slope,
$|\nabla f|(\bar x) := \limsup_{x \to \bar x} \dfrac{\big(f(\bar x) - f(x)\big)^+}{\|\bar x - x\|}$,
$\overline{|\nabla f|}(\bar x) := \liminf_{x \to \bar x} |\nabla f|(x)$.

If $f$ is smooth, then $|\nabla f|(\bar x) = \overline{|\nabla f|}(\bar x) = \|\nabla f(\bar x)\|$.
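A worked example of the slope (added here, not on the slides), for $f(x) = |x|$ on $\mathbb{R}$:

```latex
% At any \bar{x} \ne 0, f is smooth nearby, so the slope is the gradient norm:
|\nabla f|(\bar x) = |f'(\bar x)| = 1, \qquad \bar x \neq 0.
% At the minimizer, the numerator (f(0) - f(x))^+ = (-|x|)^+ vanishes:
|\nabla f|(0) = \limsup_{x \to 0} \frac{\big(|0| - |x|\big)^+}{|x|} = 0.
% So the slope vanishes at local minimizers even when f is nonsmooth.
```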

Steepest descent

Two types of steepest descent:

1. An absolutely continuous curve $x: [0, T) \to \mathbb{R}^n$ is a subgradient descent curve if it satisfies
$\dot x(t) \in -\partial f(x(t))$, a.e. on $[0, T)$.

2. An absolutely continuous curve $x: [0, T) \to \mathbb{R}^n$ is a curve of maximal slope if
(i) Speed: $\|\dot x(t)\| = |\nabla f|(x(t))$, a.e. on $[0, T)$;
(ii) Rate of descent: $\dfrac{d(f \circ x)}{dt}(t) = -\big(|\nabla f|(x(t))\big)^2$, a.e. on $[0, T)$.

Replacing the slope $|\nabla f|(x(t))$ with the limiting slope $\overline{|\nabla f|}(x(t))$ gives a curve of near-maximal slope (De Giorgi-Marino-Tosques '80).
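A discrete analogue of a subgradient descent curve, as a hedged sketch (the step sizes and the objective are my illustrative choices):

```python
import numpy as np

# Subgradient descent x_{k+1} = x_k - t_k g_k, with g_k in the subdifferential,
# on the nonsmooth convex function f(x, y) = |x| + y^2.
def subgrad(z):
    x, y = z
    gx = np.sign(x)            # at x = 0 this picks 0, a valid subgradient
    return np.array([gx, 2.0 * y])

f = lambda z: abs(z[0]) + z[1] ** 2
z = np.array([1.0, 2.0])
for k in range(1, 200):
    z = z - (0.1 / np.sqrt(k)) * subgrad(z)   # diminishing steps t_k
print(z, f(z))   # approaches the critical point (0, 0)
```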

Example

Figure: $f(x, y) = \max\{x + y,\ x - y\} + x(x + 1) + y(y + 1)$.

Steepest descent

Theorem (D-Ioffe-Lewis). For a Lipschitz $f: \mathbb{R}^n \to \mathbb{R}$ satisfying reasonable conditions, we have:
1. Existence: for any $\bar x \in \mathbb{R}^n$, there exists a Lipschitz steepest descent curve $x: [0, T) \to \mathbb{R}^n$, in either sense, with $x(0) = \bar x$.
2. Coincidence: the two descent notions coincide.

Examples of reasonable conditions: $f$ is convex, subdifferentially regular, or... semi-algebraic.

Furthermore...

Steepest descent

Theorem (Ioffe; D-Ioffe-Lewis; Bolte-Daniilidis-Lewis). If $f$ is semi-algebraic, then any bounded steepest descent curve (with maximal domain) has bounded length and converges to a critical point.

(The proof is based on the Kurdyka-Łojasiewicz inequality.)

Application: global convergence of first-order methods with no regularity conditions (Attouch-Bolte-Svaiter '11)!

No results for second-order methods yet...

Smooth approximation

Approximation of functions

Two standard ways to smooth a function $f: \mathbb{R}^n \to \mathbb{R}$:

1. Mollification:
$f_\epsilon(x) := (f * \varphi_\epsilon)(x) = \int_{\mathbb{R}^n} \varphi_\epsilon(x - y)\, f(y)\, dy$,
where $\varphi_\epsilon$ is a scaled standard bump function $\varphi$.

2. Moreau envelope:
$e_\lambda f(x) := \inf_{y \in \mathbb{R}^n} \Big\{ f(y) + \tfrac{1}{2\lambda} \|y - x\|^2 \Big\}$.
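The Moreau envelope is easy to probe numerically. A sketch (illustrative choice of $f$) for $f(x) = |x|$, whose envelope is the classical Huber function:

```python
import numpy as np

# e_lam f(x) = min_y { |y| + (y - x)^2 / (2 lam) }, approximated on a grid.
lam = 0.5
xs = np.linspace(-2.0, 2.0, 401)
ys = np.linspace(-3.0, 3.0, 2001)
env = np.array([np.min(np.abs(ys) + (ys - x) ** 2 / (2 * lam)) for x in xs])

# Closed form: x^2/(2 lam) for |x| <= lam, and |x| - lam/2 otherwise.
huber = np.where(np.abs(xs) <= lam, xs ** 2 / (2 * lam), np.abs(xs) - lam / 2)
print(np.max(np.abs(env - huber)))   # small, limited by the grid resolution
```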

Approximation of functions

Set-up: a set $Q \subseteq \mathbb{R}^n$ and a function $f: Q \to \mathbb{R}$. Assume $Q$ is a disjoint union of $C^\infty$ manifolds,
$Q = M_1 \cup M_2 \cup \cdots \cup M_{k-1} \cup M_k$.

Approximation of functions

Motivation: integration by parts,
$\int_Q g\, \Delta f\, dx = \int_{\partial Q} g\, \langle \nabla f, \hat n \rangle\, dS - \int_Q \langle \nabla g, \nabla f \rangle\, dx$.

Goal: approximate $f$ by a $C^2$-smooth $\tilde f$ so that $\nabla \tilde f \perp \hat n$.

Theorem (D-Larsson). Given a continuous $f: \mathbb{R}^n \to \mathbb{R}$ and any $\epsilon > 0$, there exists a $C^1$-smooth $\tilde f$ satisfying
1. Closeness: $|\tilde f(x) - f(x)| < \epsilon$ for all $x \in \mathbb{R}^n$,
2. Neumann boundary condition: $x \in M_i \Longrightarrow \nabla \tilde f(x) \in T_x M_i$,
provided $\{M_i\}$ is a Whitney stratification of $Q$.

For semi-algebraic $Q$, Whitney stratifications always exist!

Approximation of functions

What about higher orders of smoothness? We need a more stringent stratification.

Definition (Normally flat stratification). The stratification $\{M_i\}$ is normally flat if, whenever $M_i \subseteq \operatorname{cl} M_j$, there are neighborhoods $V$ of $M_i$ and $U$ of $M_j$ so that
$P_{M_i}(x) = P_{M_i}\big(P_{M_j}(x)\big)$ whenever $x \in V \cap U$.

(Figure: a point $x$, its projection $P_{M_j}(x)$ onto $M_j$, and the common projection $P_{M_i}(x) = P_{M_i}(P_{M_j}(x))$ onto $M_i$.)

If $\{M_i\}$ is normally flat, we can guarantee that $\tilde f$ is $C^\infty$-smooth!
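A numerical spot-check of the definition (illustrative; projections onto the open strata are implemented via their closures, which suffices away from boundary cases): for the partition of the quadrant $Q = \mathbb{R}^2_+$ into open faces, projecting onto an edge agrees with first projecting onto $Q$ and then onto the edge.

```python
import numpy as np

# Strata of Q = R^2_+: the open quadrant M_j and the open edge
# M_i = {(t, 0) : t > 0}, with M_i contained in cl(M_j).
def P_edge(x):       # nearest point on cl(M_i) = {(t, 0) : t >= 0}
    return np.array([max(x[0], 0.0), 0.0])

def P_quadrant(x):   # nearest point on cl(M_j) = Q
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=2)
    # Normal flatness: P_{M_i}(x) == P_{M_i}(P_{M_j}(x)).
    assert np.allclose(P_edge(x), P_edge(P_quadrant(x)))
print("normal flatness holds on all samples")
```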

Approximation of functions

Example: the partition of a polyhedron into open faces is normally flat. Any other examples?

Normally flat stratifications lift (D-Larsson; Daniilidis-Malick-Sendov '12):
a symmetric stratification $\{M_i\}$ of $Q \subseteq \mathbb{R}^n$ is normally flat $\Longrightarrow$ the stratification $\{\lambda^{-1}(M_i)\}$ of $\lambda^{-1}(Q) \subseteq S^n$ is normally flat.

Examples: $S^n_+ = \{X \in S^n : X \succeq 0\}$, $\mathbb{R}^{n \times n}_+ = \{X \in \mathbb{R}^{n \times n} : \det(X) \ge 0\}$, $B^{n \times n}_2 = \{X \in \mathbb{R}^{n \times n} : \sigma_{\max}(X) \le 1\}$.

Applications:
1. Probability: properties of the matrix-valued Bessel process (Larsson '12).
2. PDEs: geometry of Sobolev spaces.
3. More forthcoming...

Summary

Remark: the semi-algebraic results hold more generally for o-minimal structures, e.g. bounded sets of the form
$\{x \in \mathbb{R}^n : f_i(x) \le 0,\ g_j(x) = 0\}$,
where the $f_i$ and $g_j$ are real-analytic.

Summary:
- New tools blending optimization, geometry, and analysis.
- Classical ideas (active sets, steepest descent, and smooth approximation) attain new life in light of intrinsic geometry!

References

- Optimality, identifiability, and sensitivity, D-Lewis, submitted to Math. Programming, Ser. A.
- The dimension of semi-algebraic subdifferential graphs, D-Ioffe-Lewis, Nonlinear Analysis 75(3), 2012.
- Semi-algebraic functions have small subdifferentials, D-Lewis, to appear in Math. Prog., Ser. B.
- Tilt stability, uniform quadratic growth, and strong metric regularity of the subdifferential, D-Lewis, to appear in SIAM J. on Opt.
- Approximating functions on stratifiable domains, D-Larsson, submitted to the J. of the London Math. Soc.
- Generic nondegeneracy in convex optimization, D-Lewis, Proc. Amer. Math. Soc. 139 (2011).
- Trajectories of subgradient dynamical systems, D-Ioffe-Lewis, submitted to SIAM J. on Control and Opt.

Available at

Thank you.


More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

Numerical Optimization

Numerical Optimization Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,

More information

On a class of nonsmooth composite functions

On a class of nonsmooth composite functions On a class of nonsmooth composite functions Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA e-mail: ashapiro@isye.gatech.edu

More information

Introduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems

Introduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems Z. Akbari 1, R. Yousefpour 2, M. R. Peyghami 3 1 Department of Mathematics, K.N. Toosi University of Technology,

More information

Key words. saddle-point dynamics, asymptotic convergence, convex-concave functions, proximal calculus, center manifold theory, nonsmooth dynamics

Key words. saddle-point dynamics, asymptotic convergence, convex-concave functions, proximal calculus, center manifold theory, nonsmooth dynamics SADDLE-POINT DYNAMICS: CONDITIONS FOR ASYMPTOTIC STABILITY OF SADDLE POINTS ASHISH CHERUKURI, BAHMAN GHARESIFARD, AND JORGE CORTÉS Abstract. This paper considers continuously differentiable functions of

More information

Sufficient Conditions for Finite-variable Constrained Minimization

Sufficient Conditions for Finite-variable Constrained Minimization Lecture 4 It is a small de tour but it is important to understand this before we move to calculus of variations. Sufficient Conditions for Finite-variable Constrained Minimization ME 256, Indian Institute

More information

Optimality Conditions for Nonsmooth Convex Optimization

Optimality Conditions for Nonsmooth Convex Optimization Optimality Conditions for Nonsmooth Convex Optimization Sangkyun Lee Oct 22, 2014 Let us consider a convex function f : R n R, where R is the extended real field, R := R {, + }, which is proper (f never

More information

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Charles Byrne (Charles Byrne@uml.edu) http://faculty.uml.edu/cbyrne/cbyrne.html Department of Mathematical Sciences

More information

PARTIAL SECOND-ORDER SUBDIFFERENTIALS IN VARIATIONAL ANALYSIS AND OPTIMIZATION BORIS S. MORDUKHOVICH 1, NGUYEN MAU NAM 2 and NGUYEN THI YEN NHI 3

PARTIAL SECOND-ORDER SUBDIFFERENTIALS IN VARIATIONAL ANALYSIS AND OPTIMIZATION BORIS S. MORDUKHOVICH 1, NGUYEN MAU NAM 2 and NGUYEN THI YEN NHI 3 PARTIAL SECOND-ORDER SUBDIFFERENTIALS IN VARIATIONAL ANALYSIS AND OPTIMIZATION BORIS S. MORDUKHOVICH 1, NGUYEN MAU NAM 2 and NGUYEN THI YEN NHI 3 Abstract. This paper presents a systematic study of partial

More information

A Lyusternik-Graves Theorem for the Proximal Point Method

A Lyusternik-Graves Theorem for the Proximal Point Method A Lyusternik-Graves Theorem for the Proximal Point Method Francisco J. Aragón Artacho 1 and Michaël Gaydu 2 Abstract We consider a generalized version of the proximal point algorithm for solving the perturbed

More information

10. Unconstrained minimization

10. Unconstrained minimization Convex Optimization Boyd & Vandenberghe 10. Unconstrained minimization terminology and assumptions gradient descent method steepest descent method Newton s method self-concordant functions implementation

More information

Perturbation Analysis of Optimization Problems

Perturbation Analysis of Optimization Problems Perturbation Analysis of Optimization Problems J. Frédéric Bonnans 1 and Alexander Shapiro 2 1 INRIA-Rocquencourt, Domaine de Voluceau, B.P. 105, 78153 Rocquencourt, France, and Ecole Polytechnique, France

More information

REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Hölder Metric Subregularity with Applications to Proximal Point Method

Hölder Metric Subregularity with Applications to Proximal Point Method Hölder Metric Subregularity with Applications to Proximal Point Method GUOYIN LI and BORIS S. MORDUKHOVICH Revised Version: October 1, 01 Abstract This paper is mainly devoted to the study and applications

More information

Variable Metric Forward-Backward Algorithm

Variable Metric Forward-Backward Algorithm Variable Metric Forward-Backward Algorithm 1/37 Variable Metric Forward-Backward Algorithm for minimizing the sum of a differentiable function and a convex function E. Chouzenoux in collaboration with

More information