Optimal Value Function Methods in Numerical Optimization Level Set Methods

1 Optimal Value Function Methods in Numerical Optimization Level Set Methods James V Burke Mathematics, University of Washington, (jvburke@uw.edu) Joint work with Aravkin (UW), Drusvyatskiy (UW), Friedlander (UBC/Davis), Roy (UW) The Hong Kong Polytechnic University Applied Mathematics Colloquium February 4, 2016

2 Motivation: Optimization in Large-Scale Inference
A range of large-scale data science applications can be modeled using optimization:
- Inverse problems (medical and seismic imaging)
- High-dimensional inference (compressive sensing, LASSO, quantile regression)
- Machine learning (classification, matrix completion, robust PCA, time series)
These applications are often solved using side information:
- Sparsity or low rank of the solution
- Constraints (topography, non-negativity)
- Regularization (priors, total variation, dirty data)
We need efficient large-scale solvers for nonsmooth programs.

6 The Prototypical Problem
Sparse Data Fitting: Find sparse x with Ax ≈ b.
Example: Model Selection. Consider y = aᵀx, where y ∈ R is an observation and a ∈ Rⁿ is a vector of covariates. Suppose y is a disease classifier and a is micro-array data (n ≈ 10⁴). Given data {(yᵢ, aᵢ)}ᵢ₌₁ᵐ, find x so that yᵢ ≈ aᵢᵀx.
Since m << n, one can always find x such that yᵢ = aᵢᵀx, i = 1, ..., m. But such an x gives little insight into the role of the covariates a in determining the observations y. We prefer the most parsimonious subset of covariates that can explain the observations, that is, the sparsest model among the 2ⁿ possible models. Such models are used to further our knowledge of disease mechanisms and to develop efficient disease assays.

7 The Prototypical Problem
Sparse Data Fitting: Find sparse x with Ax ≈ b.
There are numerous other applications:
- system identification
- image segmentation
- compressed sensing
- grouped sparsity for remote sensor location
- ...

13 The Prototypical Problem
Sparse Data Fitting: Find sparse x with Ax ≈ b.
Convex approaches: use ‖x‖₁ as a sparsity surrogate (Candès-Tao-Donoho).
BPDN:                  min_x ‖x‖₁          s.t. ½‖Ax − b‖₂² ≤ σ
LASSO:                 min_x ½‖Ax − b‖₂²   s.t. ‖x‖₁ ≤ τ
Lagrangian (Penalty):  min_x ½‖Ax − b‖₂² + λ‖x‖₁
BPDN: often the most natural and transparent formulation (physical considerations guide σ).
Lagrangian: ubiquitous in practice (no constraints).
All three are (essentially) equivalent computationally!
Basis for SPGL1 (van den Berg-Friedlander 08).

16 Optimal Value or Level Set Framework
Problem class: Solve
P(σ):   min_{x ∈ X} φ(x)   s.t.   ρ(Ax − b) ≤ σ
Strategy: Consider the flipped problem
Q(τ):   v(τ) := min_{x ∈ X} ρ(Ax − b)   s.t.   φ(x) ≤ τ
Then opt-val(P(σ)) is the minimal root of the equation v(τ) = σ.
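A concrete instance, matching the BPDN/LASSO pair from the earlier slides (X = Rⁿ, φ = ‖·‖₁, ρ = ½‖·‖₂²):
P(σ):   min_x ‖x‖₁   s.t.   ½‖Ax − b‖₂² ≤ σ        Q(τ):   v(τ) = min_x ½‖Ax − b‖₂²   s.t.   ‖x‖₁ ≤ τ
Solving P(σ) then amounts to finding the smallest τ with v(τ) = σ; the root-finding schemes for doing this are developed on the next slides.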

18 Queen Dido's Problem
The intuition behind the proposed framework has a distinguished history, appearing even in antiquity. Perhaps the earliest instance is Queen Dido's problem and the fabled origins of Carthage. In short, the problem is to find the maximum area that can be enclosed by an arc of fixed length and a given line. The converse problem is to find an arc of least length that traps a fixed area between a line and the arc. Although these two problems reverse the objective and the constraint, the solution in each case is a semi-circle.
Other historical examples abound. More recently, these observations provide the basis for Markowitz mean-variance portfolio theory.

21 The Role of Convexity
Convex Sets: Let C ⊆ Rⁿ. We say that C is convex if (1 − λ)x + λy ∈ C whenever x, y ∈ C and 0 ≤ λ ≤ 1.
Convex Functions: Let f : Rⁿ → R̄ := R ∪ {+∞}. We say that f is convex if the set epi(f) := {(x, µ) : f(x) ≤ µ} is a convex set.
[Figure: the graph of a convex function lies below its chords; equivalently, f((1 − λ)x₁ + λx₂) ≤ (1 − λ)f(x₁) + λf(x₂).]

24 Convex Functions
Convex indicator functions: Let C ⊆ Rⁿ be convex. Then the function
δ_C(x) := 0 if x ∈ C,  +∞ if x ∉ C,
is a convex function.
Addition: Non-negative linear combinations of convex functions are convex: if fᵢ is convex and λᵢ ≥ 0 for i = 1, ..., k, then so is f(x) := Σᵢ₌₁ᵏ λᵢ fᵢ(x).
Infimal Projection: If f : Rⁿ × Rᵐ → R̄ is convex, then so is v(x) := inf_y f(x, y), since epi(v) = {(x, µ) : ∃ y s.t. f(x, y) ≤ µ}.

25 Convexity of v
When X, ρ, and φ are convex, the optimal value function v is a non-increasing convex function, by infimal projection:
v(τ) := min_{x ∈ X} ρ(Ax − b)  s.t.  φ(x) ≤ τ
      = min_x [ ρ(Ax − b) + δ_epi(φ)(x, τ) + δ_X(x) ].

29 Newton and Secant Methods
For f convex and non-increasing, solve f(τ) = 0.
Problem: f is often not differentiable.
Solution: use the convex subdifferential
∂f(x) := { z : f(y) ≥ f(x) + zᵀ(y − x) for all y ∈ Rⁿ }.

32 Superlinear Convergence
Let τ* := inf{τ : f(τ) ≤ 0} and g* := inf{g : g ∈ ∂f(τ*)} < 0 (non-degeneracy).
Initialization: τ₋₁ < τ₀ < τ*.
Newton:  τ_{k+1} := τ_k if f(τ_k) = 0;  otherwise τ_{k+1} := τ_k − f(τ_k)/g_k for some g_k ∈ ∂f(τ_k).
Secant:  τ_{k+1} := τ_k if f(τ_k) = 0;  otherwise τ_{k+1} := τ_k − [(τ_k − τ_{k−1}) / (f(τ_k) − f(τ_{k−1}))] f(τ_k).
If either sequence terminates finitely at some τ_k, then τ_k = τ*; otherwise
|τ* − τ_{k+1}| ≤ (1 − g*/γ_k) |τ* − τ_k|,   k = 1, 2, ...,
where γ_k = g_k (Newton) and γ_k ∈ ∂f(τ_{k−1}) (secant). In either case, γ_k → g* and τ_k → τ* globally q-superlinearly.
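A minimal numerical sketch of the two recursions (Python; the toy function f(τ) = exp(−τ) − 1/2 is convex, decreasing, and differentiable, so the subgradient is just the derivative; the function and starting points are illustrative assumptions, not part of the slides):

    import math

    def f(t):            # convex, non-increasing toy function with root at ln 2
        return math.exp(-t) - 0.5

    def df(t):           # its derivative; for smooth f the subdifferential is {f'(t)}
        return -math.exp(-t)

    def newton(t0, tol=1e-12):
        t = t0
        while abs(f(t)) > tol:
            t = t - f(t) / df(t)          # tau_{k+1} = tau_k - f(tau_k)/g_k
        return t

    def secant(tm1, t0, tol=1e-12):
        a, b = tm1, t0                     # tau_{k-1}, tau_k
        while abs(f(b)) > tol:
            a, b = b, b - (b - a) / (f(b) - f(a)) * f(b)
        return b

    print(newton(-1.0), secant(-2.0, -1.0), math.log(2))

Starting to the left of the root (here τ₋₁ = −2 < τ₀ = −1 < τ* = ln 2), both iterations increase monotonically toward τ*, mirroring the convergence statement above.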

36 Inexact Root Finding
Problem: Find a root of the inexactly known convex function v(·) − σ.
Bisection is one approach, but:
- nonmonotone iterates (bad for warm starts)
- at best linear convergence (even with perfect information)
Solution: modified secant and approximate Newton methods.
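For contrast, a bisection-style sketch with an inexact oracle (Python; the routine solve_Q is a hypothetical stand-in for an approximate solve of Q(τ), assumed to return a lower and an upper bound on v(τ); this plain bracketing loop illustrates the inexact setting, it is not the modified secant/Newton schemes of the talk):

    def bisect_level_set(solve_Q, sigma, tau_lo, tau_hi, tol=1e-6):
        """Bracket the smallest tau with v(tau) <= sigma, for v convex and non-increasing.

        solve_Q(tau) must return (lower, upper) with lower <= v(tau) <= upper.
        tau_lo, tau_hi must satisfy v(tau_lo) > sigma >= v(tau_hi).
        """
        while tau_hi - tau_lo > tol:
            tau = 0.5 * (tau_lo + tau_hi)
            lower, upper = solve_Q(tau)
            if lower > sigma:        # certainly v(tau) > sigma: the root lies to the right
                tau_lo = tau
            elif upper <= sigma:     # certainly v(tau) <= sigma: the root lies to the left (or here)
                tau_hi = tau
            else:                    # bounds straddle sigma: the oracle is not accurate enough here
                break
        return tau_hi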

37-42 Inexact Root Finding: Secant (figure-only slides)

43-48 Inexact Root Finding: Newton (figure-only slides)

49-52 Inexact Root Finding: Convergence (figure-only slides)
Key observation: C = C(τ₀) is independent of v'(τ*).

53-55 Minorants from Duality (figure-only slides)

56 Robustness: 1 ≤ u/ℓ ≤ α, where α ∈ [1, 2)
(a) k = 13, α = 1.3   (b) k = 770, α = 1.99   (c) k = 18, α =   (d) k = 9, α = 1.3   (e) k = 15, α = 1.99   (f) k = 10, α = 1.3
Figure: Inexact secant (top) and Newton (bottom) for f₁(τ) = (τ − 1)² − 10 (first two columns) and f₂(τ) = τ² (last column). Below each panel, α is the oracle accuracy and k is the number of iterations needed to converge, i.e., to reach fᵢ(τ_k) ≤ ε.

57 Sensor Network Localization (SNL)
Given a weighted graph G = (V, E, d), find a realization: points p₁, ..., p_n ∈ R² with d_ij = ‖p_i − p_j‖₂ for all ij ∈ E.

62 Sensor Network Localization (SNL)
SDP relaxation (Weinberger et al. 04, Biswas et al. 06):
max tr(X)   s.t.   ‖P_E K(X) − d²‖₂² ≤ σ,   Xe = 0,   X ⪰ 0,
where [K(X)]_{ij} = X_ii + X_jj − 2X_ij and d² denotes the vector of squared observed distances.
Intuition: X = PPᵀ, with p_i the ith row of P; then tr(X) = (1/(n+1)) Σ_{i,j=1}^n ‖p_i − p_j‖², so maximizing the trace spreads the points apart.
Flipped problem:
min ‖P_E K(X) − d²‖₂²   s.t.   tr(X) = τ,   Xe = 0,   X ⪰ 0.
Perfectly adapted to the Frank-Wolfe method.
Key point: the failure of Slater's condition (which always occurs here) is irrelevant.
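A rough Frank-Wolfe sketch for the flipped problem (Python/NumPy; the step-size rule, iteration count, and data handling are illustrative assumptions; the linear minimization step uses the standard fact that minimizing ⟨G, S⟩ over {S ⪰ 0, tr S = τ, Se = 0} is attained at τ·vvᵀ, with v a unit eigenvector for the smallest eigenvalue of the gradient restricted to the subspace orthogonal to e):

    import numpy as np

    def frank_wolfe_snl(edges, d2, n, tau, iters=200):
        """Sketch: min 0.5 * sum_{ij in E} (K(X)_ij - d_ij^2)^2  s.t. tr X = tau, Xe = 0, X >= 0.

        edges: list of index pairs (i, j); d2: squared distances for those edges.
        """
        C = np.eye(n) - np.ones((n, n)) / n               # projector onto the subspace orthogonal to e
        Q = np.linalg.svd(C)[0][:, : n - 1]               # orthonormal basis of that subspace
        X = tau * C / (n - 1)                             # feasible start: tr X = tau, Xe = 0, X >= 0
        for k in range(iters):
            G = np.zeros((n, n))                          # gradient of the least-squares objective
            for (i, j), dij2 in zip(edges, d2):
                r = X[i, i] + X[j, j] - 2 * X[i, j] - dij2
                G[i, i] += r; G[j, j] += r
                G[i, j] -= r; G[j, i] -= r
            # Linear minimization over {S >= 0, tr S = tau, Se = 0}:
            w, V = np.linalg.eigh(Q.T @ G @ Q)            # eigenvalues ascending
            v = Q @ V[:, 0]                               # eigenvector of the smallest eigenvalue
            S = tau * np.outer(v, v)
            X = X + (2.0 / (k + 2.0)) * (S - X)           # standard Frank-Wolfe step size
        return X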

63-64 Approximate Newton (figure-only slides: Pareto curve, σ = 0.25 and σ = 0)

65-66 Max-trace (figure-only slides; panels: Newton_max, Newton_min, Refine positions)

67 Observations
- Simple strategy for optimizing over complex domains
- Rigorous convergence guarantees
- Insensitivity to ill-conditioning
- Many applications:
  - Sensor Network Localization (Drusvyatskiy-Krislock-Voronin-Wolkowicz 15)
  - Sparse/Robust Estimation and Kalman Smoothing (Aravkin-B-Pillonetto 13)
  - Large-scale SDP and LP (cf. Renegar 14)
  - Chromosome reconstruction (Aravkin-Becker-Drusvyatskiy-Lozano 15)
  - Phase retrieval (Aravkin-B-Drusvyatskiy-Friedlander-Roy 16)
  - Generalized linear models (Aravkin-B-Drusvyatskiy-Friedlander-Roy 16)
  - ...

70 Conjugate Functions and Duality
Convex Indicator: For any convex set C, the convex indicator function for C is
δ(x | C) := 0 if x ∈ C,  +∞ if x ∉ C.
Support Functionals: For any set C, the support functional for C is
δ*(x | C) := sup_{z ∈ C} ⟨x, z⟩.
Convex Conjugates: For any convex function g(x), the convex conjugate is given by
g*(y) := δ*((y, −1) | epi(g)) = sup_x [⟨x, y⟩ − g(x)].
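A standard example to make the definitions concrete (a routine computation, not from the slides): for the ℓ₁ norm g(x) = ‖x‖₁, the conjugate is g*(y) = sup_x [⟨x, y⟩ − ‖x‖₁] = δ(y | B_∞), the indicator of the unit ℓ_∞ ball, since the supremum equals 0 when ‖y‖_∞ ≤ 1 and +∞ otherwise. Dually, ‖x‖₁ = δ*(x | B_∞) is the support functional of that ball.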

75 Conjugates and the Subdifferential
g*(y) = sup_x [⟨x, y⟩ − g(x)].
The Bi-Conjugate Theorem: If epi(g) is closed and dom(g) ≠ ∅, then (g*)* = g.
The Young-Fenchel Inequality: g(x) + g*(z) ≥ ⟨z, x⟩ for all x, z ∈ Rⁿ, with equality if and only if z ∈ ∂g(x) and x ∈ ∂g*(z). In particular, ∂g(x) = argmax_z [⟨z, x⟩ − g*(z)].
Maximal Monotone Operator: If epi(g) is closed and dom(g) ≠ ∅, then ∂g is a maximal monotone operator with (∂g)⁻¹ = ∂g*.
Note: The lsc hull of g is cl g := (g*)*.

77 The Perspective Function
epi(g^π) := cl cone(epi(g)) = cl( ∪_{λ>0} λ epi(g) )
g^π(z, λ) := λ g(λ⁻¹ z) if λ > 0;  g^∞(z) if λ = 0;  +∞ if λ < 0,
where g^∞ is the horizon function of g:
g^∞(z) := sup_{x ∈ dom g} [g(x + z) − g(x)].
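A quick example (a standard computation, not on the slide): for g(x) = ½x² one gets g^π(z, λ) = z²/(2λ) for λ > 0, while the horizon function is g^∞(z) = sup_x [½(x + z)² − ½x²] = sup_x [xz + ½z²], which equals 0 if z = 0 and +∞ otherwise; hence g^π(·, 0) = δ(· | {0}).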

78 Support Functions for epi(g) and lev_g(τ)
Let g : Rⁿ → R̄ be closed, proper, and convex. Then
δ*((y, µ) | epi(g)) = (g*)^π(y, −µ)
and
δ*(y | [g ≤ τ]) = cl inf_{µ ≥ 0} [τµ + (g*)^π(y, µ)],
where epi(g) := {(x, µ) : g(x) ≤ µ},  [g ≤ τ] := {x : g(x) ≤ τ},  and  δ*(z | C) := sup_{w ∈ C} ⟨z, w⟩.
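For example (a direct check, consistent with the derivative formulas later in the talk, and not itself on the slide): with g = ‖·‖₁ one has [g ≤ τ] = {x : ‖x‖₁ ≤ τ}, and δ*(y | [g ≤ τ]) = τ‖y‖_∞, the support functional of the ℓ₁ ball of radius τ.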

82 Perturbation Framework and Duality (Rockafellar (1970))
The perturbation function:
f(x, b, τ) := ρ(b − Ax) + δ((x, τ) | epi(φ)).
Its conjugate:
f*(y, u, µ) = (φ*)^π(y + Aᵀu, µ) + ρ*(u).
The Primal Problem P(b, τ): infimal projection in x,
v(b, τ) := min_x f(x, b, τ).
The Dual Problem D(b, τ):
v̂(b, τ) := sup_{u, µ} [⟨b, u⟩ + τµ − f*(0, u, µ)]
(reduced dual)  = sup_u [⟨b, u⟩ − ρ*(u) − δ*(Aᵀu | [φ ≤ τ])].
The Subdifferential: If (b, τ) ∈ int(dom v), then v(b, τ) = v̂(b, τ) and ∂v(b, τ) = argmax_{u, µ} D(b, τ).

83 Piecewise Linear-Quadratic (PLQ) Penalties
φ(x) := sup_{u ∈ U} [⟨x, u⟩ − ½ uᵀBu],
where U ⊆ Rⁿ is nonempty, closed, and convex with 0 ∈ U (not necessarily polyhedral), and B ∈ Rⁿˣⁿ is symmetric positive semi-definite.
Examples:
1. Support functionals: B = 0
2. Gauge functionals: γ(· | U) = δ*(· | U°)
3. Norms: ‖·‖ = γ(· | B), where B is the closed unit ball of ‖·‖
4. Least-squares: U = Rⁿ, B = I
5. Huber: U = [−ε, ε]ⁿ, B = I
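To see how item 5 produces the Huber function shown on the next slide (a short computation under the stated choices U = [−ε, ε]ⁿ and B = I; the threshold ε plays the role of K there), work coordinate-wise:
sup_{|u| ≤ ε} [xu − ½u²] = ½x² if |x| ≤ ε,  and  ε|x| − ½ε² if |x| > ε,
since the unconstrained maximizer u = x is clipped to ±ε when |x| > ε.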

84 PLQ Densities: Gauss, Laplace, Huber, Vapnik
Gauss:   V(x) = ½x²
Laplace (ℓ₁):   V(x) = |x|
Huber:   V(x) = −Kx − ½K² for x < −K;   V(x) = ½x² for −K ≤ x ≤ K;   V(x) = Kx − ½K² for K < x
Vapnik:   V(x) = −x − ε for x < −ε;   V(x) = 0 for −ε ≤ x ≤ ε;   V(x) = x − ε for ε ≤ x
[Figures: the four penalties plotted against x.]

85 Computing ∂v for PLQ Penalties φ
φ(x) := sup_{u ∈ U} [⟨x, u⟩ − ½ uᵀBu]
P(b, τ):   v(b, τ) := min_x ρ(b − Ax)  s.t.  φ(x) ≤ τ
Then ∂v(b, τ) = {(ū, −µ̄)}, where ū ∈ ∂ρ(b − Ax̄) for a solution x̄ satisfying 0 ∈ −Aᵀ∂ρ(b − Ax̄) + µ̄ ∂φ(x̄), and
µ̄ = max{ γ(Aᵀū | U), √(ūᵀABAᵀū / (2τ)) }.

86 A Few Special Cases
v(τ) := min_x ½‖b − Ax‖₂²  s.t.  φ(x) ≤ τ
Optimal solution: x̄;  optimal residual: r̄ := Ax̄ − b.
1. Support functionals: φ(x) = δ*(x | U), 0 ∈ U  ⟹  v'(τ) = −δ*(Aᵀr̄ | U°) = −γ(Aᵀr̄ | U)
2. Gauge functionals: φ(x) = γ(x | U), 0 ∈ U  ⟹  v'(τ) = −γ(Aᵀr̄ | U°) = −δ*(Aᵀr̄ | U)
3. Norms: φ(x) = ‖x‖  ⟹  v'(τ) = −‖Aᵀr̄‖_* (the dual norm)
4. Huber: φ(x) = sup_{u ∈ [−ε,ε]ⁿ} [⟨x, u⟩ − ½uᵀu]  ⟹  v'(τ) = −max{ ε⁻¹‖Aᵀr̄‖_∞, ‖Aᵀr̄‖₂ / √(2τ) }
5. Vapnik: φ(x) = ⟨(x − ε)₊ + (−x − ε)₊, 1⟩  ⟹  v'(τ) = −(‖Aᵀr̄‖ + ε‖Aᵀr̄‖₂)
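Case 3 with φ = ‖·‖₁ (whose dual norm is ℓ_∞) is exactly what SPGL1-style solvers exploit: each Newton step on v(τ) = σ needs only the LASSO value v(τ) and the residual r̄, since v'(τ) = −‖Aᵀr̄‖_∞. Below is a small self-contained sketch in Python/NumPy; the inner projected-gradient LASSO solver, the ℓ₁-ball projection, the iteration counts, and the stopping tolerances are all illustrative choices, not the algorithm of the cited references:

    import numpy as np

    def project_l1(z, tau):
        """Euclidean projection of z onto the l1 ball of radius tau (sort-based)."""
        if tau <= 0:
            return np.zeros_like(z)
        if np.abs(z).sum() <= tau:
            return z.copy()
        u = np.sort(np.abs(z))[::-1]
        css = np.cumsum(u)
        k = np.nonzero(u * np.arange(1, len(z) + 1) > css - tau)[0][-1]
        theta = (css[k] - tau) / (k + 1.0)
        return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

    def lasso_value(A, b, tau, iters=500):
        """Projected gradient for v(tau) = min 0.5*||Ax - b||^2  s.t.  ||x||_1 <= tau."""
        L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            x = project_l1(x - (A.T @ (A @ x - b)) / L, tau)
        r = b - A @ x
        return 0.5 * r @ r, r, x

    def bpdn_level_set(A, b, sigma, newton_iters=20):
        """Approximate BPDN  min ||x||_1 s.t. 0.5*||Ax - b||^2 <= sigma  via Newton on v(tau) = sigma."""
        tau = 0.0
        for _ in range(newton_iters):
            val, r, x = lasso_value(A, b, tau)
            if val <= sigma:
                break
            slope = -np.linalg.norm(A.T @ r, np.inf)   # v'(tau) from case 3 (assumes A^T r != 0)
            tau = tau - (val - sigma) / slope           # Newton step on v(tau) - sigma = 0
        return x, tau

    # tiny usage example with synthetic data
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))
    x_true = np.zeros(100); x_true[:5] = rng.standard_normal(5)
    b = A @ x_true + 0.01 * rng.standard_normal(40)
    x_hat, tau_hat = bpdn_level_set(A, b, sigma=0.5 * 0.01**2 * 40)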

88 Basis Pursuit with Outliers
BP_σ:   min ‖x‖₁  s.t.  ρ(b − Ax) ≤ σ
Standard least-squares: ρ(z) = ‖z‖₂ or ρ(z) = ‖z‖₂².
Quantile Huber:
ρ_{κ,τ}(r) = τ|r| − κτ²/2             if r < −τκ,
ρ_{κ,τ}(r) = r²/(2κ)                  if r ∈ [−κτ, (1 − τ)κ],
ρ_{κ,τ}(r) = (1 − τ)|r| − κ(1 − τ)²/2  if r > (1 − τ)κ.
Standard Huber when τ = 0.5.
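A direct transcription of the quantile Huber penalty as a NumPy function (an illustrative helper; the parameter names follow the formula above, and the three branches agree at the breakpoints):

    import numpy as np

    def quantile_huber(r, kappa, tau):
        """Quantile Huber penalty rho_{kappa,tau}, elementwise; standard Huber at tau = 0.5."""
        r = np.asarray(r, dtype=float)
        left  = tau * np.abs(r) - kappa * tau**2 / 2.0               # r < -tau*kappa
        mid   = r**2 / (2.0 * kappa)                                 # -kappa*tau <= r <= (1-tau)*kappa
        right = (1 - tau) * np.abs(r) - kappa * (1 - tau)**2 / 2.0   # r > (1-tau)*kappa
        return np.where(r < -tau * kappa, left, np.where(r > (1 - tau) * kappa, right, mid))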

89 Huber (figure-only slide)

90 Sparse and Robust Formulation
HBP_σ:   min ‖x‖₁  s.t.  ρ(b − Ax) ≤ σ, with ρ the Huber function.
Signal recovery problem specification:
- x: 20-sparse spike train in R⁵¹²
- b: measurements in R¹²⁰
- A: measurement matrix satisfying RIP
- ρ: Huber function
- σ: error level set at .01
- 5 outliers
Results: In the presence of outliers, the robust (Huber) formulation recovers the spike train, while the standard least-squares formulation does not.
[Figures: recovered signals (Truth, Huber, LS) and residuals.]

92 References
- Probing the Pareto frontier for basis pursuit solutions. van den Berg, Friedlander. SIAM J. Sci. Comput. 31 (2008).
- Sparse optimization with least-squares constraints. van den Berg, Friedlander. SIAM J. Optim. 21 (2011).
- Variational properties of value functions. Aravkin, Burke, Friedlander. SIAM J. Optim. 23 (2013).
- Level-set methods for convex optimization. Aravkin, Burke, Drusvyatskiy, Friedlander, Roy. Preprint, 2016.
