Active sets, steepest descent, and smooth approximation of functions


Active sets, steepest descent, and smooth approximation of functions
Dmitriy Drusvyatskiy, School of ORIE, Cornell University
Joint work with Alex D. Ioffe (Technion), Martin Larsson (EPFL), and Adrian S. Lewis (Cornell)
October 23, 2012

Outline

Outline:
- Active sets (finite identification).
- Steepest descent.
- Smooth approximation of functions.

Unifying themes:
1. Representation independence.
2. Intrinsic geometry (semi-algebraic).

Active sets

Classical active manifold

Classical problem:
$\min\, f(x)$ subject to $g_i(x) \le 0$ for all $i \in I := \{1, \ldots, m\}$,
where $f, g_1, \ldots, g_m$ are $C^2$-smooth functions on $\mathbb{R}^n$.

Fix a local minimizer $\bar x$ and the active index set $\bar I = \{i \in I : g_i(\bar x) = 0\}$.

Assuming linear independence of $\{\nabla g_i(\bar x) : i \in \bar I\}$:
- The KKT conditions hold: $-\nabla f(\bar x) \in \operatorname{cone}\{\nabla g_i(\bar x) : i \in \bar I\}$.
- The set $M = \{x \in \mathbb{R}^n : g_i(x) = 0 \text{ for each } i \in \bar I\}$ is a $C^2$ manifold.

Assuming, in addition, strict complementarity,
$-\nabla f(\bar x) \in \operatorname{ri} \operatorname{cone}\{\nabla g_i(\bar x) : i \in \bar I\}$,
the active manifold $M$ is central for algorithms and sensitivity analysis.
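To make the classical setup concrete, here is a minimal numerical sketch (added for illustration, not from the talk) on a made-up toy instance: it computes the active index set, checks linear independence of the active gradients, and recovers nonnegative KKT multipliers.

```python
import numpy as np

# Toy instance (hypothetical data): min f(x) s.t. g_i(x) <= 0, with
# f(x, y) = x + y and constraints g_1 = -x, g_2 = -y; minimizer at the origin.
f_grad = lambda x: np.array([1.0, 1.0])
g = [lambda x: -x[0], lambda x: -x[1]]
g_grad = [lambda x: np.array([-1.0, 0.0]), lambda x: np.array([0.0, -1.0])]

x_bar = np.zeros(2)

# Active index set: constraints holding with equality (up to a tolerance).
active = [i for i, gi in enumerate(g) if abs(gi(x_bar)) < 1e-8]

# Linear independence of the active gradients (the assumption on the slide).
G = np.array([g_grad[i](x_bar) for i in active])
assert np.linalg.matrix_rank(G) == len(active)

# KKT: solve G^T lam = -grad f; strict complementarity means lam > 0.
lam, *_ = np.linalg.lstsq(G.T, -f_grad(x_bar), rcond=None)
print("active set:", active, "multipliers:", lam)  # multipliers: [1. 1.]
```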

Classical active manifold

Figure: $\min\{z : z \ge x(1-x) + y^2,\ z \ge x(1+x) + y^2\}$, with active manifold $M = \{(0, y, y^2) : y \in \mathbb{R}\}$.

Generic properties

Should we expect such nice conditions to hold typically?

Rockafellar-Spingarn '79 considered the perturbed problems
$P(v, u):\ \min\, f(x) - \langle v, x \rangle$ subject to $g_i(x) \le u_i$ for all $i \in I := \{1, \ldots, m\}$.

Theorem (Rockafellar-Spingarn '79). For almost all $(v, u)$, at every local minimizer of $P(v, u)$:
- Active manifold: the active gradients are linearly independent.
- Strict complementarity: the multipliers are strictly positive.
- Quadratic growth: the objective function grows quadratically.

Goals

What is the active set in general, independent of the functional representation?

What does it generically look like (in the semi-algebraic setting)?

Subdifferentials

The subdifferential of a convex function $f: \mathbb{R}^n \to \mathbb{R}$ at $\bar x$ is
$\partial f(\bar x) := \{v \in \mathbb{R}^n : f(x) \ge f(\bar x) + \langle v, x - \bar x \rangle \text{ for all } x \in \mathbb{R}^n\}$.

The definition of $\partial f$ can be extended to nonconvex functions. Generalized critical points:
$\bar x \in \operatorname{crit} f \iff 0 \in \partial f(\bar x)$.

Example: consider again $\min f(x)$ subject to $g_i(x) \le 0$ for all $i \in I := \{1, \ldots, m\}$. Assuming $\{\nabla g_i(\bar x) : i \in \bar I\}$ are linearly independent, we have
$\partial\big(f + \delta_{\{x :\, g_i(x) \le 0 \text{ for } i \in I\}}\big)(\bar x) = \nabla f(\bar x) + \operatorname{cone}\{\nabla g_i(\bar x) : i \in \bar I\}$.
Critical points are exactly the KKT points!
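A worked instance of these definitions (added here, not on the slides): the absolute value on $\mathbb{R}$.

```latex
% For f(x) = |x| at \bar{x} = 0, the subgradient inequality
%   |x| \ge |0| + v \cdot (x - 0)  for all x
% holds exactly when |v| \le 1, so
\partial f(0) = [-1,\, 1], \qquad 0 \in \partial f(0),
% i.e., the origin is a (generalized) critical point of f.
```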

Finite identification

Many algorithms for minimizing $f: \mathbb{R}^n \to \mathbb{R}$,
- subgradient (gradient) projection methods,
- Newton-like methods,
- proximal point algorithms,
produce iterates $x_k \to \bar x$ along with criticality certificates: $v_k \to 0$ with $v_k \in \partial f(x_k)$.

What if we knew that, for some set $M$, we have $x_k \in M$ eventually? Then the problems
$\min_{x \in \mathbb{R}^n} f(x)$ and $\min_{x \in M} f(x)$
are equivalent for the algorithm.
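The proximal point algorithm, for instance, produces exactly such certificate pairs: writing $x_{k+1} = \operatorname{argmin}_y \{ f(y) + \tfrac{1}{2t}\|y - x_k\|^2 \}$, the vector $v_{k+1} := (x_k - x_{k+1})/t$ lies in $\partial f(x_{k+1})$. A minimal sketch on the illustrative choice $f = \|\cdot\|_1$ (my example, not the talk's):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal map of f(x) = ||x||_1 with step t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# Proximal point iteration: x_{k+1} = prox_{tf}(x_k), with certificate
# v_{k+1} = (x_k - x_{k+1})/t in the subdifferential of f at x_{k+1}.
x, t = np.array([2.5, -0.7]), 0.5
for k in range(8):
    x_new = soft_threshold(x, t)
    v = (x - x_new) / t             # criticality certificate
    x = x_new
    print(k, x, np.linalg.norm(v))  # x reaches the minimizer 0; then v = 0
```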

Finite identification

Finite identification was considered implicitly by a number of authors: Bertsekas '76, Rockafellar '76, Calamai '87, Burke and Moré '88, Dunn '87, Ferris '91, Wright '93, Lewis and Hare.

Finite identification

Definition (Identifiable sets). A set $M \subseteq \mathbb{R}^n$ is identifiable at a critical point $\bar x$ of $f$ if
$x_k \to \bar x,\ v_k \to 0,\ v_k \in \partial f(x_k) \ \Longrightarrow\ x_k \in M$ for all large $k$.

Example (Normal cone map). Let $f(x) = \delta_Q(x) - \langle \bar v, x \rangle$. (Figure: a convex set $Q$ with identifiable set $M$ at $\bar x$; iterates $x_k \to \bar x$ with subgradient certificates $v_k \to \bar v$ eventually land in $M$.)
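A hedged numerical illustration (the problem data below are made up): projected gradient descent over a box identifies the optimal face, an identifiable set, after finitely many iterations.

```python
import numpy as np

# min 0.5 x^T A x - b^T x over the box Q = [0,1]^2; the minimizer sits at the
# vertex (1, 0), and the iterates land exactly on that face in finite time.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([4.0, 0.5])
grad = lambda x: A @ x - b
proj = lambda x: np.clip(x, 0.0, 1.0)   # projection onto Q

x = np.array([0.9, 0.9])
for k in range(30):
    x = proj(x - 0.2 * grad(x))
# Which bound constraints are active (exactly, not just asymptotically)?
print(x, "active:", (np.isclose(x, 0.0) | np.isclose(x, 1.0)))
```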

Sensitivity analysis

What about sensitivity analysis?

Proposition (D-Lewis). Suppose $M$ is an identifiable set at $\bar x$ relative to $f$. Then:
- $\bar x$ is a (strict) local minimizer of $f$ $\iff$ $\bar x$ is a (strict) local minimizer of $f\big|_M$.
- $f$ grows quadratically near $\bar x$ $\iff$ $f\big|_M$ grows quadratically near $\bar x$.
- $f$ is tilt-stable at $\bar x$ $\iff$ $f\big|_M$ is tilt-stable at $\bar x$.

Locally minimal identifiable sets

Clearly all of $\mathbb{R}^n$ is identifiable at any $\bar x \in \operatorname{crit} f$ (not interesting). So...

Question: what are the smallest possible identifiable sets?

Definition. An identifiable set $M$ at $\bar x$ is locally minimal if
$M'$ identifiable at $\bar x$ $\Longrightarrow$ $M \subseteq M'$, locally near $\bar x$.

Locally minimal identifiable sets may fail to exist in general (e.g. $f(x, y) = x^4 + y^2$).

Further properties

Existence: locally minimal identifiable sets exist for fully amenable functions, $f(x) = g(F(x))$, where
1. $F$ is $C^2$-smooth,
2. $g$ is (convex) piecewise quadratic,
3. a qualification condition holds.
E.g. convex polyhedra, max-type functions, and standard problems of nonlinear mathematical programming.

Calculus: a strong chain rule is available for composite functions $f(x) = g(F(x))$.

Critical cones: the critical cone $K_Q(\bar x, \bar v)$ to a convex set $Q$ has a natural interpretation:
$K_Q(\bar x, \bar v) = \operatorname{cl} \operatorname{conv} T_M(\bar x)$,
where $M$ is an appropriate locally minimal identifiable set.

Identifiable manifolds

$M$ is an identifiable manifold at $\bar x$ if $M$ is identifiable, $M$ is a manifold, and $f\big|_M$ is smooth.

Proposition (D-Lewis). Identifiable manifolds $M \subseteq \operatorname{dom} f$ are automatically locally minimal.

This yields a refinement of partly smooth manifolds (Lewis '03) (and of Wright '93 in the convex case).

Lifts of identifiable manifolds

Consider $S^n := \{n \times n \text{ symmetric matrices}\}$ and the eigenvalue map $A \mapsto (\lambda_1(A), \ldots, \lambda_n(A))$, where $\lambda_1(A) \ge \cdots \ge \lambda_n(A)$.

For a permutation-invariant $f: \mathbb{R}^n \to \mathbb{R}$, form the spectral function $f \circ \lambda: S^n \to \mathbb{R}$ (e.g. $(f \circ \lambda)(A) = \lambda_n(A)$ or $(f \circ \lambda)(A) = \sum_i |\lambda_i(A)|$).

Identifiable manifolds lift (D-Lewis; Daniilidis-Malick-Sendov):
$M$ is an identifiable manifold at $\lambda(\bar X)$ relative to $f$ $\Longrightarrow$ $\lambda^{-1}(M)$ is an identifiable manifold at $\bar X$ relative to $f \circ \lambda$.
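A small sketch of the spectral construction itself (illustrative; it assumes numpy's eigvalsh as the eigenvalue map):

```python
import numpy as np

def spectral(f):
    """Lift a permutation-invariant f: R^n -> R to S^n via eigenvalues."""
    return lambda A: f(np.linalg.eigvalsh(A))  # eigenvalues of a symmetric matrix

max_eig = spectral(np.max)                          # A -> largest eigenvalue
sum_abs = spectral(lambda lam: np.abs(lam).sum())   # A -> sum_i |lambda_i(A)|

A = np.array([[2.0, 1.0], [1.0, -1.0]])
print(max_eig(A), sum_abs(A))
```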

Generic properties

Goal: eliminate representation dependence.
Cost: we consider the semi-algebraic setting.

$f: \mathbb{R}^n \to \mathbb{R}$ is semi-algebraic if $\operatorname{epi} f$ can be described by finitely many polynomial inequalities.

Theorem (D-Ioffe-Lewis). For semi-algebraic $f: \mathbb{R}^n \to \mathbb{R}$, consider the perturbed functions
$f_v(x) := f(x) - \langle v, x \rangle$.
Then for a typical $v \in \mathbb{R}^n$, at every minimizer $x_v$ of $f_v$:
- Active manifold: an identifiable manifold exists.
- Strict complementarity: $0 \in \operatorname{ri} \partial f_v(x_v)$.
- Quadratic growth: $f_v$ grows quadratically near $x_v$.
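A one-dimensional worked instance of the tilt perturbation (my illustration, not from the talk):

```latex
% Take f(x) = |x| on R and tilt it: f_v(x) = |x| - vx. For any |v| < 1 the
% unique minimizer is x_v = 0, strict complementarity holds since
0 \in \operatorname{ri} \partial f_v(0) = (-1 - v,\; 1 - v),
% and sharp (hence quadratic) growth holds near the minimizer:
f_v(x) - f_v(0) = |x| - vx \ \ge\ (1 - |v|)\,|x|.
% Only the exceptional tilts v = \pm 1 fail, consistent with "typical v".
```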

Variational analysis & semi-algebraic sets

Semi-algebraic subdifferential graphs
$\operatorname{gph} \partial f := \{(x, v) \in \mathbb{R}^n \times \mathbb{R}^n : v \in \partial f(x)\}$
are not too big, not too small, but just right:

Theorem (D-Ioffe-Lewis). For lsc, semi-algebraic $f: \mathbb{R}^n \to \mathbb{R}$, we have $\dim \operatorname{gph} \partial f = n$, even locally around any pair $(x, v) \in \operatorname{gph} \partial f$.

Metric regularity of $F: \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ at $(\bar x, \bar y)$,
$d\big(x, F^{-1}(y)\big) \le \kappa\, d\big(y, F(x)\big)$ for all $(x, y)$ near $(\bar x, \bar y)$,
is typical (a nonsmooth Sard theorem):

Theorem (Ioffe). For semi-algebraic $F: \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ and generic $y \in \mathbb{R}^m$, we have
$x \in F^{-1}(y) \Longrightarrow F$ is metrically regular at $(x, y)$.

Steepest descent

Steepest descent

Steepest descent drives algorithm design. A convergence analysis strategy: consider the analogous continuous-time dynamical system.

What are curves of steepest descent? In the smooth case, two equivalent approaches:
1. A solution to $\dot x(t) = -\nabla f(x(t))$.
2. A curve achieving the fastest instantaneous decrease in function value.

Slope

Key idea: the slope and the limiting slope,
$|\nabla f|(\bar x) := \limsup_{x \to \bar x} \dfrac{\big(f(\bar x) - f(x)\big)^+}{\|\bar x - x\|}$,
$\overline{|\nabla f|}(\bar x) := \liminf_{x \to \bar x} |\nabla f|(x)$.

If $f$ is smooth, then $|\nabla f|(\bar x) = \overline{|\nabla f|}(\bar x) = \|\nabla f(\bar x)\|$.
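A worked example of the slope (added here, not on the slides), for $f(x) = |x|$ on $\mathbb{R}$:

```latex
% At any \bar{x} \ne 0, f is smooth nearby, so the slope is the gradient norm:
|\nabla f|(\bar x) = |f'(\bar x)| = 1, \qquad \bar x \neq 0.
% At the minimizer, the numerator (f(0) - f(x))^+ = (-|x|)^+ vanishes:
|\nabla f|(0) = \limsup_{x \to 0} \frac{\big(|0| - |x|\big)^+}{|x|} = 0.
% So the slope vanishes at local minimizers even when f is nonsmooth.
```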

Steepest descent

Two types of steepest descent:

1. An absolutely continuous curve $x: [0, T) \to \mathbb{R}^n$ is a subgradient descent curve if it satisfies
$\dot x(t) \in -\partial f(x(t))$, a.e. on $[0, T)$.

2. An absolutely continuous curve $x: [0, T) \to \mathbb{R}^n$ is a curve of maximal slope if
(i) Speed: $\|\dot x(t)\| = |\nabla f|(x(t))$, a.e. on $[0, T)$;
(ii) Rate of descent: $\dfrac{d(f \circ x)}{dt}(t) = -\big(|\nabla f|(x(t))\big)^2$, a.e. on $[0, T)$.

Replacing the slope $|\nabla f|(x(t))$ with the limiting slope $\overline{|\nabla f|}(x(t))$ gives a curve of near-maximal slope (De Giorgi-Marino-Tosques '80).
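A discrete analogue of a subgradient descent curve, as a hedged sketch (the step sizes and the objective are my illustrative choices):

```python
import numpy as np

# Subgradient descent x_{k+1} = x_k - t_k g_k, with g_k in the subdifferential,
# on the nonsmooth convex function f(x, y) = |x| + y^2.
def subgrad(z):
    x, y = z
    gx = np.sign(x)            # at x = 0 this picks 0, a valid subgradient
    return np.array([gx, 2.0 * y])

f = lambda z: abs(z[0]) + z[1] ** 2
z = np.array([1.0, 2.0])
for k in range(1, 200):
    z = z - (0.1 / np.sqrt(k)) * subgrad(z)   # diminishing steps t_k
print(z, f(z))   # approaches the critical point (0, 0)
```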

Example

Figure: $f(x, y) = \max\{x + y,\ x - y\} + x(x + 1) + y(y + 1)$.

Steepest descent

Theorem (D-Ioffe-Lewis). For a Lipschitz $f: \mathbb{R}^n \to \mathbb{R}$ satisfying reasonable conditions, we have:
1. Existence: for any $\bar x \in \mathbb{R}^n$, there exists a Lipschitz steepest descent curve $x: [0, T) \to \mathbb{R}^n$, in either sense, with $x(0) = \bar x$.
2. Coincidence: the two descent notions coincide.

Examples of reasonable conditions: $f$ is convex, subdifferentially regular, or... semi-algebraic.

Furthermore...

Steepest descent

Theorem (Ioffe; D-Ioffe-Lewis; Bolte-Daniilidis-Lewis). If $f$ is semi-algebraic, then any bounded steepest descent curve (with maximal domain) has bounded length and converges to a critical point.

(The proof is based on the Kurdyka-Łojasiewicz inequality.)

Application: global convergence of first-order methods with no regularity conditions (Attouch-Bolte-Svaiter '11)!

No results for second-order methods yet...

Smooth approximation

Approximation of functions

Two standard ways to smooth a function $f: \mathbb{R}^n \to \mathbb{R}$:

1. Mollification:
$f_\epsilon(x) := (f * \varphi_\epsilon)(x) = \int_{\mathbb{R}^n} \varphi_\epsilon(x - y)\, f(y)\, dy$,
where $\varphi_\epsilon$ is a scaled standard bump function $\varphi$.

2. Moreau envelope:
$e_\lambda f(x) := \inf_{y \in \mathbb{R}^n} \Big\{ f(y) + \tfrac{1}{2\lambda} \|y - x\|^2 \Big\}$.
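The Moreau envelope is easy to probe numerically. A sketch (illustrative choice of $f$) for $f(x) = |x|$, whose envelope is the classical Huber function:

```python
import numpy as np

# e_lam f(x) = min_y { |y| + (y - x)^2 / (2 lam) }, approximated on a grid.
lam = 0.5
xs = np.linspace(-2.0, 2.0, 401)
ys = np.linspace(-3.0, 3.0, 2001)
env = np.array([np.min(np.abs(ys) + (ys - x) ** 2 / (2 * lam)) for x in xs])

# Closed form: x^2/(2 lam) for |x| <= lam, and |x| - lam/2 otherwise.
huber = np.where(np.abs(xs) <= lam, xs ** 2 / (2 * lam), np.abs(xs) - lam / 2)
print(np.max(np.abs(env - huber)))   # small, limited by the grid resolution
```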

Approximation of functions

Set-up: a set $Q \subseteq \mathbb{R}^n$ and a function $f: Q \to \mathbb{R}$. Assume $Q$ is a disjoint union of $C^\infty$ manifolds,
$Q = M_1 \cup M_2 \cup \cdots \cup M_{k-1} \cup M_k$.

Approximation of functions

Motivation: integration by parts,
$\int_Q g\, \Delta f\, dx = \int_{\partial Q} g\, \langle \nabla f, \hat n \rangle\, dS - \int_Q \langle \nabla g, \nabla f \rangle\, dx$.

Goal: approximate $f$ by a $C^2$-smooth $\tilde f$ so that $\nabla \tilde f \perp \hat n$.

Theorem (D-Larsson). Given a continuous $f: \mathbb{R}^n \to \mathbb{R}$ and any $\epsilon > 0$, there exists a $C^1$-smooth $\tilde f$ satisfying
1. Closeness: $|\tilde f(x) - f(x)| < \epsilon$ for all $x \in \mathbb{R}^n$,
2. Neumann boundary condition: $x \in M_i \Longrightarrow \nabla \tilde f(x) \in T_x M_i$,
provided $\{M_i\}$ is a Whitney stratification of $Q$.

For semi-algebraic $Q$, Whitney stratifications always exist!

Approximation of functions

What about higher orders of smoothness? We need a more stringent stratification.

Definition (Normally flat stratification). The stratification $\{M_i\}$ is normally flat if, whenever $M_i \subseteq \operatorname{cl} M_j$, there are neighborhoods $V$ of $M_i$ and $U$ of $M_j$ so that
$P_{M_i}(x) = P_{M_i}\big(P_{M_j}(x)\big)$ whenever $x \in V \cap U$.

(Figure: a point $x$, its projection $P_{M_j}(x)$ onto $M_j$, and the common projection $P_{M_i}(x) = P_{M_i}(P_{M_j}(x))$ onto $M_i$.)

If $\{M_i\}$ is normally flat, we can guarantee that $\tilde f$ is $C^\infty$-smooth!
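A numerical spot-check of the definition (illustrative; projections onto the open strata are implemented via their closures, which suffices away from boundary cases): for the partition of the quadrant $Q = \mathbb{R}^2_+$ into open faces, projecting onto an edge agrees with first projecting onto $Q$ and then onto the edge.

```python
import numpy as np

# Strata of Q = R^2_+: the open quadrant M_j and the open edge
# M_i = {(t, 0) : t > 0}, with M_i contained in cl(M_j).
def P_edge(x):       # nearest point on cl(M_i) = {(t, 0) : t >= 0}
    return np.array([max(x[0], 0.0), 0.0])

def P_quadrant(x):   # nearest point on cl(M_j) = Q
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=2)
    # Normal flatness: P_{M_i}(x) == P_{M_i}(P_{M_j}(x)).
    assert np.allclose(P_edge(x), P_edge(P_quadrant(x)))
print("normal flatness holds on all samples")
```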

Approximation of functions

Example: the partition of a polyhedron into open faces is normally flat. Any other examples?

Normally flat stratifications lift (D-Larsson; Daniilidis-Malick-Sendov '12):
a symmetric stratification $\{M_i\}$ of $Q \subseteq \mathbb{R}^n$ is normally flat $\Longrightarrow$ the stratification $\{\lambda^{-1}(M_i)\}$ of $\lambda^{-1}(Q) \subseteq S^n$ is normally flat.

Examples: $S^n_+ = \{X \in S^n : X \succeq 0\}$, $\mathbb{R}^{n \times n}_+ = \{X \in \mathbb{R}^{n \times n} : \det(X) \ge 0\}$, $B^{n \times n}_2 = \{X \in \mathbb{R}^{n \times n} : \sigma_{\max}(X) \le 1\}$.

Applications:
1. Probability: properties of the matrix-valued Bessel process (Larsson '12).
2. PDEs: geometry of Sobolev spaces.
3. More forthcoming...

Summary

Remark: the semi-algebraic results hold more generally for o-minimal structures, e.g. bounded sets of the form
$\{x \in \mathbb{R}^n : f_i(x) \le 0,\ g_j(x) = 0\}$,
where the $f_i$ and $g_j$ are real-analytic.

Summary:
- New tools blending optimization, geometry, and analysis.
- Classical ideas (active sets, steepest descent, and smooth approximation) attain new life in light of intrinsic geometry!

References

- Optimality, identifiability, and sensitivity, D-Lewis, submitted to Math. Programming, Ser. A.
- The dimension of semi-algebraic subdifferential graphs, D-Ioffe-Lewis, Nonlinear Analysis 75(3), 2012.
- Semi-algebraic functions have small subdifferentials, D-Lewis, to appear in Math. Prog., Ser. B.
- Tilt stability, uniform quadratic growth, and strong metric regularity of the subdifferential, D-Lewis, to appear in SIAM J. on Opt.
- Approximating functions on stratifiable domains, D-Larsson, submitted to the J. of the London Math. Soc.
- Generic nondegeneracy in convex optimization, D-Lewis, Proc. Amer. Math. Soc. 139 (2011).
- Trajectories of subgradient dynamical systems, D-Ioffe-Lewis, submitted to SIAM J. on Control and Opt.

Available at

Thank you.


More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

Numerical Optimization

Numerical Optimization Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,

More information

On a class of nonsmooth composite functions

On a class of nonsmooth composite functions On a class of nonsmooth composite functions Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA e-mail: ashapiro@isye.gatech.edu

More information

Introduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems

Introduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems Z. Akbari 1, R. Yousefpour 2, M. R. Peyghami 3 1 Department of Mathematics, K.N. Toosi University of Technology,

More information

Key words. saddle-point dynamics, asymptotic convergence, convex-concave functions, proximal calculus, center manifold theory, nonsmooth dynamics

Key words. saddle-point dynamics, asymptotic convergence, convex-concave functions, proximal calculus, center manifold theory, nonsmooth dynamics SADDLE-POINT DYNAMICS: CONDITIONS FOR ASYMPTOTIC STABILITY OF SADDLE POINTS ASHISH CHERUKURI, BAHMAN GHARESIFARD, AND JORGE CORTÉS Abstract. This paper considers continuously differentiable functions of

More information

Sufficient Conditions for Finite-variable Constrained Minimization

Sufficient Conditions for Finite-variable Constrained Minimization Lecture 4 It is a small de tour but it is important to understand this before we move to calculus of variations. Sufficient Conditions for Finite-variable Constrained Minimization ME 256, Indian Institute

More information

Optimality Conditions for Nonsmooth Convex Optimization

Optimality Conditions for Nonsmooth Convex Optimization Optimality Conditions for Nonsmooth Convex Optimization Sangkyun Lee Oct 22, 2014 Let us consider a convex function f : R n R, where R is the extended real field, R := R {, + }, which is proper (f never

More information

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem

Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Iterative Convex Optimization Algorithms; Part One: Using the Baillon Haddad Theorem Charles Byrne (Charles Byrne@uml.edu) http://faculty.uml.edu/cbyrne/cbyrne.html Department of Mathematical Sciences

More information

PARTIAL SECOND-ORDER SUBDIFFERENTIALS IN VARIATIONAL ANALYSIS AND OPTIMIZATION BORIS S. MORDUKHOVICH 1, NGUYEN MAU NAM 2 and NGUYEN THI YEN NHI 3

PARTIAL SECOND-ORDER SUBDIFFERENTIALS IN VARIATIONAL ANALYSIS AND OPTIMIZATION BORIS S. MORDUKHOVICH 1, NGUYEN MAU NAM 2 and NGUYEN THI YEN NHI 3 PARTIAL SECOND-ORDER SUBDIFFERENTIALS IN VARIATIONAL ANALYSIS AND OPTIMIZATION BORIS S. MORDUKHOVICH 1, NGUYEN MAU NAM 2 and NGUYEN THI YEN NHI 3 Abstract. This paper presents a systematic study of partial

More information

A Lyusternik-Graves Theorem for the Proximal Point Method

A Lyusternik-Graves Theorem for the Proximal Point Method A Lyusternik-Graves Theorem for the Proximal Point Method Francisco J. Aragón Artacho 1 and Michaël Gaydu 2 Abstract We consider a generalized version of the proximal point algorithm for solving the perturbed

More information

10. Unconstrained minimization

10. Unconstrained minimization Convex Optimization Boyd & Vandenberghe 10. Unconstrained minimization terminology and assumptions gradient descent method steepest descent method Newton s method self-concordant functions implementation

More information

Perturbation Analysis of Optimization Problems

Perturbation Analysis of Optimization Problems Perturbation Analysis of Optimization Problems J. Frédéric Bonnans 1 and Alexander Shapiro 2 1 INRIA-Rocquencourt, Domaine de Voluceau, B.P. 105, 78153 Rocquencourt, France, and Ecole Polytechnique, France

More information

REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Hölder Metric Subregularity with Applications to Proximal Point Method

Hölder Metric Subregularity with Applications to Proximal Point Method Hölder Metric Subregularity with Applications to Proximal Point Method GUOYIN LI and BORIS S. MORDUKHOVICH Revised Version: October 1, 01 Abstract This paper is mainly devoted to the study and applications

More information

Variable Metric Forward-Backward Algorithm

Variable Metric Forward-Backward Algorithm Variable Metric Forward-Backward Algorithm 1/37 Variable Metric Forward-Backward Algorithm for minimizing the sum of a differentiable function and a convex function E. Chouzenoux in collaboration with

More information