Majorization-Minimization Algorithm
1 Majorization-Minimization Algorithm: Theory and Applications
Ying Sun and Daniel P. Palomar
Department of Electronic and Computer Engineering
The Hong Kong University of Science and Technology
ELEC Convex Optimization, HKUST, Hong Kong
2 Acknowledgment
The slides of this lecture are largely based on the following works:
[Hun-Lan J04] D. R. Hunter and K. Lange, "A Tutorial on MM Algorithms," Amer. Statistician, vol. 58, 2004.
[Raz-Hon-Luo J13] M. Razaviyayn, M. Hong, and Z.-Q. Luo, "A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization," SIAM J. Optim., vol. 23, no. 2, 2013.
[Scu-Fac-Son-Pal-Pan J14] G. Scutari, F. Facchinei, P. Song, D. P. Palomar, and J.-S. Pang, "Decomposition by Partial Linearization: Parallel Optimization of Multi-Agent Systems," IEEE Trans. Signal Process., vol. 62, no. 3, Feb. 2014.
[Sun-Bab-Pal J17] Y. Sun, P. Babu, and D. P. Palomar, "Majorization-Minimization Algorithms in Signal Processing, Communications, and Machine Learning," IEEE Trans. Signal Process., vol. 65, no. 3, Feb. 2017.
3 Outline
1. The Majorization-Minimization Algorithm
2. Block Coordinate Descent
3. Exact Jacobi Successive Convex Approximation
   - Extensions
5 Problem Statement
Consider the following optimization problem:
$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x) \\ \text{subject to} & x \in \mathcal{X}, \end{array}$$
with $\mathcal{X}$ a closed convex set and $f(x)$ continuous.
$f(x)$ is too complicated to manipulate directly.
Idea: successively minimize an approximating function $u(x, x^k)$:
$$x^{k+1} = \arg\min_{x \in \mathcal{X}} u(x, x^k),$$
hoping that the sequence of minimizers $\{x^k\}$ converges to an optimal $x^\star$.
Question: how do we construct $u(x, x^k)$?
8 Terminology
Distance from a point to a set:
$$d(x, \mathcal{S}) = \inf_{s \in \mathcal{S}} \|x - s\|.$$
Directional derivative:
$$f'(x; d) \triangleq \liminf_{\lambda \downarrow 0} \frac{f(x + \lambda d) - f(x)}{\lambda}.$$
Stationary point: $x$ is a stationary point if
$$f'(x; d) \geq 0, \quad \forall d \text{ such that } x + d \in \mathcal{X}.$$
9 Majorization-Minimization
Construction rule:
(A1) $u(y, y) = f(y)$, $\forall y \in \mathcal{X}$
(A2) $u(x, y) \geq f(x)$, $\forall x, y \in \mathcal{X}$
(A3) $u'(x, y; d)\big|_{x=y} = f'(y; d)$, $\forall d$ with $y + d \in \mathcal{X}$
(A4) $u(x, y)$ is continuous in $x$ and $y$
Pictorially: $u(x, x^k)$ lies above $f(x)$ everywhere and touches it at $x = x^k$.
10 Algorithm
Majorization-Minimization (Successive Upper-Bound Minimization):
1: Find a feasible point $x^0 \in \mathcal{X}$ and set $k = 0$
2: repeat
3:   $x^{k+1} = \arg\min_{x \in \mathcal{X}} u(x, x^k)$ (global minimum)
4:   $k \leftarrow k + 1$
5: until some convergence criterion is met
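The loop structure is the same regardless of the surrogate. A minimal Python sketch (my own illustration, not from the slides; `f` and `minimize_surrogate` are placeholders for the problem at hand), using the toy surrogate $u(x, x^k) = x^2/(2|x^k|) + |x^k|/2 \geq |x|$:

```python
import numpy as np

def mm(x0, f, minimize_surrogate, tol=1e-8, max_iter=500):
    """Generic MM loop: minimize_surrogate(xk) returns argmin_x u(x, xk).

    If u satisfies (A1)-(A2), the iterates obey the descent chain
    f(x_{k+1}) <= u(x_{k+1}, x_k) <= u(x_k, x_k) = f(x_k).
    """
    x = x0
    for _ in range(max_iter):
        x_new = minimize_surrogate(x)
        if abs(f(x_new) - f(x)) <= tol:   # objective-based stopping rule
            return x_new
        x = x_new
    return x

# Toy problem: f(x) = |x|, majorized at xk != 0 by the quadratic
# u(x, xk) = x^2/(2|xk|) + |xk|/2 (an AM-GM bound), whose minimizer is 0.
print(mm(2.0, abs, lambda xk: 0.0))       # -> 0.0, the minimizer of |x|
```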
11 Convergence
Under assumptions (A1)-(A4), every limit point of the sequence $\{x^k\}$ is a stationary point of the original problem.
If we further assume that the level set $\mathcal{X}^0 = \{x \mid f(x) \leq f(x^0)\}$ is compact, then
$$\lim_{k \to \infty} d(x^k, \mathcal{X}^\star) = 0,$$
where $\mathcal{X}^\star$ is the set of stationary points.
13 Construct Surrogate Function
The performance of the majorization-minimization algorithm depends crucially on the surrogate function $u(x, x^k)$.
Guideline: the global minimizer of $u(x, x^k)$ should be easy to find.
Suppose $f(x) = f_1(x) + \kappa(x)$, where $f_1(x)$ is some nice function and $\kappa(x)$ is the one that needs to be approximated.
14 Construction by Convexity
Suppose $\kappa(t)$ is convex; then
$$\kappa\Big(\sum_i \alpha_i t_i\Big) \leq \sum_i \alpha_i \kappa(t_i)$$
with $\alpha_i \geq 0$ and $\sum_i \alpha_i = 1$.
15 Construction by Convexity (cont'd)
For example:
$$\kappa(w^T x) = \kappa\big(w^T (x - x^k) + w^T x^k\big) = \kappa\Big(\sum_i \alpha_i \Big(\frac{w_i (x_i - x_i^k)}{\alpha_i} + w^T x^k\Big)\Big) \leq \sum_i \alpha_i\, \kappa\Big(\frac{w_i (x_i - x_i^k)}{\alpha_i} + w^T x^k\Big)$$
If we further assume that $w$ and $x$ are positive, the choice $\alpha_i = w_i x_i^k / w^T x^k$ gives
$$\kappa(w^T x) \leq \sum_i \frac{w_i x_i^k}{w^T x^k}\, \kappa\Big(\frac{w^T x^k}{x_i^k}\, x_i\Big).$$
The surrogate function is separable across the $x_i$ (enabling a parallel algorithm).
16 Construction by Taylor Expansion
Suppose $\kappa(x)$ is concave and differentiable; then
$$\kappa(x) \leq \kappa(x^k) + \nabla\kappa(x^k)^T (x - x^k),$$
which is a linear upper bound.
Suppose $\kappa(x)$ is convex and twice differentiable; then
$$\kappa(x) \leq \kappa(x^k) + \nabla\kappa(x^k)^T (x - x^k) + \tfrac{1}{2} (x - x^k)^T M (x - x^k)$$
if $M \succeq \nabla^2 \kappa(x)$, $\forall x$.
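To make the quadratic bound concrete, a sketch of my own (not from the slides): with $M = L I$ the surrogate is minimized in closed form, so MM reduces to gradient descent with step size $1/L$. Here $\kappa$ is a logistic loss whose Hessian is bounded by $\tfrac{1}{4} A^T A \preceq L I$:

```python
import numpy as np

# Logistic loss kappa(x) = sum_i log(1 + exp(-y_i * a_i @ x)).
# Its Hessian satisfies H(x) <= (1/4) A.T A <= L*I with L below, so
# the quadratic surrogate with M = L*I applies, and its minimizer is
# exactly a gradient step of length 1/L.
A = np.array([[1.0, 2.0], [1.0, 2.0], [-1.0, 0.5], [0.3, -1.0]])
y = np.array([1.0, -1.0, -1.0, 1.0])        # non-separable toy data
L = 0.25 * np.linalg.norm(A, 2) ** 2        # spectral-norm Hessian bound

def kappa(x):
    return np.sum(np.logaddexp(0.0, -y * (A @ x)))

def grad(x):
    s = 1.0 / (1.0 + np.exp(y * (A @ x)))   # sigma(-y_i * a_i @ x)
    return -(A.T @ (y * s))

x = np.zeros(2)
for _ in range(300):
    x -= grad(x) / L                        # MM step = surrogate minimizer
print(x, kappa(x))                          # loss decreases monotonically
```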
17 Construction by Inequalities
Arithmetic-geometric mean inequality:
$$\Big(\prod_{i=1}^n x_i\Big)^{1/n} \leq \frac{1}{n} \sum_{i=1}^n x_i.$$
Cauchy-Schwarz inequality:
$$\|x\| \geq \frac{x^T x^k}{\|x^k\|}.$$
Jensen's inequality:
$$\kappa(\mathbb{E}x) \leq \mathbb{E}\kappa(x),$$
with $\kappa(\cdot)$ convex.
19 EM Algorithm
Assume the complete data set $\{x, z\}$ consists of an observed variable $x$ and a latent variable $z$.
Objective: estimate the parameter $\theta \in \Theta$ from $x$.
Maximum likelihood estimator:
$$\hat{\theta} = \arg\min_{\theta \in \Theta} \; -\log p(x \mid \theta)$$
EM (Expectation-Maximization) algorithm:
E-step: evaluate $p(z \mid x, \theta^k)$ (guess $z$ from the current estimate of $\theta$)
M-step: update $\theta$ as $\theta^{k+1} = \arg\min_{\theta \in \Theta} u(\theta, \theta^k)$, where
$$u(\theta, \theta^k) = -\mathbb{E}_{z \mid x, \theta^k} \log p(x, z \mid \theta)$$
(update $\theta$ from the guessed complete data set)
20 An MM Interpretation of EM
The objective function can be written as
$$\begin{aligned} -\log p(x \mid \theta) &= -\log \mathbb{E}_{z \mid \theta}\, p(x \mid z, \theta) \\ &= -\log \mathbb{E}_{z \mid \theta} \left( \frac{p(z \mid x, \theta^k)}{p(z \mid x, \theta^k)}\, p(x \mid z, \theta) \right) \\ &= -\log \mathbb{E}_{z \mid x, \theta^k} \left( \frac{p(x \mid z, \theta)\, p(z \mid \theta)}{p(z \mid x, \theta^k)} \right) \\ &\leq -\mathbb{E}_{z \mid x, \theta^k} \log \left( \frac{p(x \mid z, \theta)\, p(z \mid \theta)}{p(z \mid x, \theta^k)} \right) \qquad \text{(Jensen's inequality)} \\ &= \underbrace{-\mathbb{E}_{z \mid x, \theta^k} \log p(x, z \mid \theta)}_{u(\theta, \theta^k),\ \text{up to a constant}} + \mathbb{E}_{z \mid x, \theta^k} \log p(z \mid x, \theta^k). \end{aligned}$$
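To ground the E- and M-steps, a minimal sketch of my own (not from the slides): EM for a two-component 1-D Gaussian mixture with unit variances and equal weights, so $\theta = (\mu_1, \mu_2)$ and both steps are closed-form:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

mu = np.array([-1.0, 1.0])                  # initial guess for (mu1, mu2)
for _ in range(50):
    # E-step: responsibilities p(z = j | x_i, theta_k), equal weights
    log_lik = -0.5 * (x[:, None] - mu[None, :]) ** 2
    r = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: minimize u(theta, theta_k) -> closed-form weighted means
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
print(mu)    # approaches (-2, 3), up to label ordering
```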
21 Proximal Minimization
$f(x)$ is convex. Solve $\min_{x \in \mathcal{X}} f(x)$ via the equivalent problem
$$\underset{x \in \mathcal{X},\; y \in \mathcal{X}}{\text{minimize}} \quad f(x) + \frac{1}{2c} \|x - y\|^2.$$
The objective function is strongly convex in $x$ (for fixed $y$) and in $y$ (for fixed $x$).
Algorithm:
$$x^{k+1} = \arg\min_{x \in \mathcal{X}} \Big\{ f(x) + \frac{1}{2c} \|x - y^k\|^2 \Big\}, \qquad y^{k+1} = x^{k+1}.$$
An MM interpretation:
$$x^{k+1} = \arg\min_{x \in \mathcal{X}} \Big\{ f(x) + \frac{1}{2c} \|x - x^k\|^2 \Big\}.$$
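A tiny illustration (my own construction): for $f(x) = |x|$ on $\mathcal{X} = \mathbb{R}$, the MM/proximal step is the soft-thresholding operator, so each iteration shrinks $x$ toward 0 by $c$:

```python
import numpy as np

def prox_abs(v, c):
    """argmin_x |x| + (1/(2c)) (x - v)^2, i.e., soft-thresholding at c."""
    return np.sign(v) * max(abs(v) - c, 0.0)

c, x = 0.5, 4.0
for k in range(10):
    x = prox_abs(x, c)      # x^{k+1} = prox_{c f}(x^k)
print(x)                    # reaches the minimizer 0 after |x0|/c steps
```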
22 DC Programming
Consider the unconstrained problem
$$\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad f(x),$$
where $f(x) = g(x) + h(x)$ with $g(x)$ convex and $h(x)$ concave.
DC (difference of convex) programming generates $\{x^k\}$ by solving
$$\nabla g(x^{k+1}) = -\nabla h(x^k).$$
An MM interpretation:
$$x^{k+1} = \arg\min_x \big\{ g(x) + \nabla h(x^k)^T (x - x^k) \big\}.$$
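A toy DC iteration (my own example, not from the slides): $f(x) = x^4 - x^2$ with $g(x) = x^4$ convex and $h(x) = -x^2$ concave; the linearized subproblem has a closed-form solution:

```python
import numpy as np

# MM step: x_{k+1} = argmin_x { x^4 + h'(x_k)(x - x_k) }, h'(x) = -2x;
# the first-order condition 4 x^3 - 2 x_k = 0 gives the update below.
x = 2.0
for _ in range(30):
    x = np.cbrt(x / 2.0)
print(x, 1 / np.sqrt(2))    # converges to the stationary point 1/sqrt(2)
```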
24 Power Control by GP [Chi-Tan-Pal-O'Ne-Jul J07]
Problem: maximize the system throughput. Essentially we need to solve the following problem:
$$\underset{P}{\text{minimize}} \quad \prod_i \frac{\sum_{j \neq i} G_{ij} P_j + n_i}{\sum_j G_{ij} P_j + n_i}$$
The objective function is a ratio of two posynomials.
Minorize a posynomial, denoted by $g(x) = \sum_i m_i(x)$ (a sum of monomials $m_i$), by a monomial:
$$g(x) \geq \prod_i \left( \frac{m_i(x)}{\alpha_i} \right)^{\alpha_i}, \quad \text{where } \alpha_i = \frac{m_i(x^k)}{g(x^k)} \quad \text{(arithmetic-geometric mean inequality)}.$$
Solution: approximate the denominator posynomial $\sum_j G_{ij} P_j + n_i$ by a monomial.
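A quick numerical check (my own toy posynomial, not the power-control problem itself) that the AM-GM monomial touches $g$ at $x^k$ and underestimates it everywhere else:

```python
import numpy as np

rng = np.random.default_rng(0)

# Posynomial g(x) = m1 + m2 with monomials m1 = 2 x1 x2 and m2 = x1^2,
# minorized at xk by the monomial prod_i (m_i(x) / alpha_i)^alpha_i.
def m(x):
    return np.array([2 * x[0] * x[1], x[0] ** 2])

xk = np.array([1.0, 2.0])
alpha = m(xk) / m(xk).sum()          # alpha_i = m_i(xk) / g(xk)

for _ in range(5):
    x = rng.uniform(0.1, 3.0, 2)
    g = m(x).sum()
    mono = np.prod((m(x) / alpha) ** alpha)
    assert g >= mono - 1e-12         # AM-GM: global monomial underestimator
print("touches at xk:", np.isclose(m(xk).sum(),
      np.prod((m(xk) / alpha) ** alpha)))
```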
25 Reweighted $\ell_1$-norm
Sparse signal recovery problem:
$$\underset{x}{\text{minimize}} \quad \|x\|_0 \quad \text{subject to} \quad Ax = b$$
$\ell_1$-norm approximation:
$$\underset{x}{\text{minimize}} \quad \|x\|_1 \quad \text{subject to} \quad Ax = b$$
General form:
$$\underset{x}{\text{minimize}} \quad \sum_{i=1}^n \phi(|x_i|) \quad \text{subject to} \quad Ax = b$$
26 [Can-Wak-Boy J08]
Assume $\phi(t)$ is concave and nondecreasing. At $x_i^k$, $\phi(|x_i|)$ is majorized (up to a constant) by $w_i^k |x_i|$ with $w_i^k = \phi'(t)\big|_{t = |x_i^k|}$.
At each iteration a weighted $\ell_1$-norm problem is solved:
$$\underset{x}{\text{minimize}} \quad \sum_i w_i^k |x_i| \quad \text{subject to} \quad Ax = b$$
(Figure: comparison of the indicator $\|x\|_0$ with the surrogates $|x|$ and $\log(1 + |x|)$.)
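A compact sketch (my own construction, with $\phi(t) = \log(\varepsilon + t)$ so that $w_i = 1/(\varepsilon + |x_i^k|)$); each weighted $\ell_1$ subproblem is solved as an LP via the split $x = u - v$ with $u, v \geq 0$:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, n = 10, 30
A = rng.standard_normal((m, n))
x_true = np.zeros(n); x_true[[2, 11, 25]] = [1.5, -2.0, 0.7]
b = A @ x_true

w, eps = np.ones(n), 1e-2
for _ in range(6):
    # minimize sum_i w_i (u_i + v_i)  s.t.  A(u - v) = b, u, v >= 0
    res = linprog(c=np.concatenate([w, w]),
                  A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * n))
    x = res.x[:n] - res.x[n:]
    w = 1.0 / (eps + np.abs(x))      # reweight: derivative of log(eps + t)
print(np.round(x, 3))                # concentrates on the true support
```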
27 Sparse Generalized Eigenvalue Problem
$\ell_0$-norm regularized generalized eigenvalue problem:
$$\begin{array}{ll} \underset{x}{\text{maximize}} & x^T A x - \rho \|x\|_0 \\ \text{subject to} & x^T B x = 1. \end{array}$$
Replace $\|x\|_0$ by some nicely behaved function $g_p(x_i)$ applied componentwise:
- $|x_i|^p$, $0 < p \leq 1$
- $\log(1 + |x_i|/p) / \log(1 + 1/p)$, $p > 0$
- $1 - e^{-|x_i|/p}$, $p > 0$.
Take $g_p(x_i) = |x_i|^p$ as an example.
28 [Son-Bab-Pal J15a]
Majorize $g_p(x_i)$ at $x_i^k$ by a quadratic function $w_i^k x_i^2 + c_i^k$. The surrogate function for $g_p(x_i) = |x_i|^p$ is
$$u(x_i, x_i^k) = \frac{p}{2} |x_i^k|^{p-2} x_i^2 + \Big(1 - \frac{p}{2}\Big) |x_i^k|^p.$$
Solve at each iteration the following GEVP:
$$\begin{array}{ll} \underset{x}{\text{maximize}} & x^T A x - \rho\, x^T \operatorname{diag}(w^k)\, x \\ \text{subject to} & x^T B x = 1 \end{array}$$
However, as $x_i \to 0$, $w_i \to +\infty$...
29 Smooth Approximation
Smooth approximation of $g_p(x)$:
$$g_p^{\varepsilon}(x) = \begin{cases} \dfrac{p}{2}\, \varepsilon^{p-2} x^2, & |x| \leq \varepsilon \\[4pt] |x|^p - \Big(1 - \dfrac{p}{2}\Big)\varepsilon^p, & |x| > \varepsilon \end{cases}$$
When $|x| \leq \varepsilon$, the weight $w$ remains constant.
30 (Figure: objective value vs. iterations, comparing the proposed IRQM with DC programming and SGEP.)
31 Sequence Design
Complex unimodular sequence $\{x_n \in \mathbb{C}\}_{n=1}^N$, $|x_n| = 1$.
Autocorrelation:
$$r_k = \sum_{n=k+1}^{N} x_n x_{n-k}^* = r_{-k}^*, \quad k = 0, \ldots, N-1.$$
Integrated sidelobe level (ISL):
$$\mathrm{ISL} = \sum_{k=1}^{N-1} |r_k|^2.$$
Problem formulation:
$$\begin{array}{ll} \underset{\{x_n\}_{n=1}^N}{\text{minimize}} & \mathrm{ISL} \\ \text{subject to} & |x_n| = 1, \quad n = 1, \ldots, N. \end{array}$$
32 Sequence Design (cont'd)
By the Fourier transform:
$$\mathrm{ISL} = \frac{1}{4N} \sum_{p=1}^{2N} \Big[ |a_p^H x|^2 - N \Big]^2$$
with $x = [x_1, \ldots, x_N]^T$, $a_p = \big[1, e^{j\omega_p}, \ldots, e^{j\omega_p (N-1)}\big]^T$, and $\omega_p = \frac{2\pi}{2N}(p-1)$.
Equivalent problem:
$$\begin{array}{ll} \underset{x}{\text{minimize}} & \sum_{p=1}^{2N} \big( a_p^H x x^H a_p \big)^2 \\ \text{subject to} & |x_n| = 1, \; \forall n. \end{array}$$
33 [Son-Bab-Pal J15b]
Define $A = [a_1, \ldots, a_{2N}]$, $p^k = \big[ |a_1^H x^k|^2, \ldots, |a_{2N}^H x^k|^2 \big]^T$, and $\tilde{A} = A \big( \operatorname{diag}(p^k) - p_{\max}^k I \big) A^H$.
Quadratic surrogate function:
$$\text{const.} - p_{\max}^k\, x^H A A^H x + 2 \operatorname{Re}\Big( x^H \big( \tilde{A} - 2N^2 x^k (x^k)^H \big) x^k \Big)$$
Equivalent to
$$\begin{array}{ll} \underset{x}{\text{minimize}} & \|x - y\|^2 \\ \text{subject to} & |x_n| = 1, \; \forall n \end{array}$$
with
$$y = -\big( \tilde{A} - 2N^2 x^k (x^k)^H \big) x^k.$$
Closed-form solution: $x_n = e^{j \arg(y_n)}$.
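The per-iteration subproblem, finding the nearest unimodular vector to $y$, is all the algorithm needs; a small sanity check of the closed form (my own, with a random $y$):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(8) + 1j * rng.standard_normal(8)

# Closed-form solution of min ||x - y||^2 s.t. |x_n| = 1:
x_star = np.exp(1j * np.angle(y))

# Sanity check against random unimodular candidates.
for _ in range(1000):
    x = np.exp(1j * rng.uniform(0, 2 * np.pi, 8))
    assert np.linalg.norm(x_star - y) <= np.linalg.norm(x - y) + 1e-12
print("x* =", np.round(x_star, 3))
```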
34 (Figure: integrated sidelobe level (dB) vs. iteration, proposed method vs. benchmark.)
35 Covariance Estimation
$x_i \sim \text{elliptical}(0, \Sigma)$. Fit the normalized samples $s_i = x_i / \|x_i\|_2$ to the angular central Gaussian distribution
$$f(s_i) \propto \det(\Sigma)^{-1/2} \big( s_i^T \Sigma^{-1} s_i \big)^{-K/2}$$
[Sun-Bab-Pal J14] Shrinkage penalty:
$$h(\Sigma) = \log\det(\Sigma) + \operatorname{Tr}\big(\Sigma^{-1} T\big)$$
Solve the following problem:
$$\begin{array}{ll} \underset{\Sigma}{\text{minimize}} & \log\det(\Sigma) + \dfrac{K}{N} \displaystyle\sum_{i=1}^N \log\big( x_i^T \Sigma^{-1} x_i \big) + \alpha h(\Sigma) \\ \text{subject to} & \Sigma \succ 0 \end{array}$$
36 Covariance Estimation (cont'd)
At $\Sigma^k$, the objective function is majorized (up to a constant) by
$$(1 + \alpha) \log\det(\Sigma) + \frac{K}{N} \sum_{i=1}^N \frac{x_i^T \Sigma^{-1} x_i}{x_i^T (\Sigma^k)^{-1} x_i} + \alpha \operatorname{Tr}\big( \Sigma^{-1} T \big).$$
The surrogate function is convex in $\Sigma^{-1}$. Setting the gradient to zero leads to the weighted sample average
$$\Sigma^{k+1} = \frac{1}{1 + \alpha} \frac{K}{N} \sum_{i=1}^N \frac{x_i x_i^T}{x_i^T (\Sigma^k)^{-1} x_i} + \frac{\alpha}{1 + \alpha} T.$$
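A direct sketch of this fixed-point iteration on synthetic data (my own toy setup; $T = I$ as the shrinkage target and $K$ the dimension):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, alpha = 3, 100, 0.2
X = rng.standard_normal((N, K)) @ np.diag([3.0, 1.0, 0.3])
T = np.eye(K)                                    # shrinkage target

Sigma = np.eye(K)
for _ in range(100):
    Sinv = np.linalg.inv(Sigma)
    w = 1.0 / np.einsum('ij,jk,ik->i', X, Sinv, X)   # 1 / (x_i' S^-1 x_i)
    S_new = (K / N) * (X * w[:, None]).T @ X         # weighted sample average
    Sigma = (S_new + alpha * T) / (1.0 + alpha)      # shrink toward T
print(np.round(Sigma, 2))
```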
37 (Figure: objective function value vs. iteration.)
39 Problem Statement (Block Coordinate Descent)
Consider the following problem:
$$\underset{x \in \mathcal{X}}{\text{minimize}} \quad f(x)$$
The set $\mathcal{X}$ possesses a Cartesian product structure: $\mathcal{X} = \prod_{i=1}^m \mathcal{X}_i$.
Observation: the problem
$$\underset{x_i \in \mathcal{X}_i}{\text{minimize}} \quad f\big(x_1^0, \ldots, x_{i-1}^0, x_i, x_{i+1}^0, \ldots, x_m^0\big),$$
with the $x_j^0$ taking some feasible values, is easy to solve.
41 Block Coordinate Descent (BCD)
Denote $x \triangleq (x_1, \ldots, x_m)$ and $f\big(x_1^0, \ldots, x_{i-1}^0, x_i, x_{i+1}^0, \ldots, x_m^0\big) \triangleq f\big(x_i, x^0\big)$.
Block coordinate descent (nonlinear Gauss-Seidel):
1: Initialize $x^0 \in \mathcal{X}$ and set $k = 0$.
2: repeat
3:   $k = k + 1$, $i = (k \bmod m) + 1$
4:   $x_i^k = \arg\min_{x_i \in \mathcal{X}_i} f\big(x_i, x^{k-1}\big)$
5:   $x_j^k \leftarrow x_j^{k-1}$, $\forall j \neq i$
6: until some convergence criterion is met
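A minimal BCD sketch (my own example): cyclic exact minimization over scalar blocks of a strongly convex quadratic $f(x) = \tfrac{1}{2} x^T Q x - b^T x$, where each block update is closed-form:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
Q = M @ M.T + 5 * np.eye(5)           # positive definite: unique minimizers
b = rng.standard_normal(5)

x = np.zeros(5)
for k in range(200):
    i = k % 5                         # cyclic (Gauss-Seidel) block selection
    # exact scalar minimization: df/dx_i = Q[i] @ x - b[i] = 0
    x[i] = (b[i] - Q[i] @ x + Q[i, i] * x[i]) / Q[i, i]
print(np.round(x - np.linalg.solve(Q, b), 6))   # ~0: BCD reaches the optimum
```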
42 Convergence of BCD
[Ber B99] Assume that $f(x)$ is continuously differentiable over the set $\mathcal{X}$ and that each subproblem $x_i^k = \arg\min_{x_i \in \mathcal{X}_i} f(x_i, x^{k-1})$ has a unique solution. Then every limit point of the sequence $\{x^k\}$ is a stationary point.
[Gri-Sci J00] Generalizations, globally convergent when:
- $m = 2$;
- $f$ is component-wise strictly quasi-convex w.r.t. $m - 2$ components;
- $f$ is pseudo-convex.
44 BS-MM Algorithm
Combination of MM and BCD.
Block Successive Majorization-Minimization (BS-MM):
1: Initialize $x^0 \in \mathcal{X}$ and set $k = 0$.
2: repeat
3:   $k = k + 1$, $i = (k \bmod m) + 1$
4:   $\mathcal{X}^k = \arg\min_{x_i \in \mathcal{X}_i} u_i\big(x_i, x^{k-1}\big)$
5:   Set $x_i^k$ to be an arbitrary element of $\mathcal{X}^k$
6:   $x_j^k \leftarrow x_j^{k-1}$, $\forall j \neq i$
7: until some convergence criterion is met
A generalization of BCD.
45 Convergence of BS-MM
The surrogate functions $u_i(\cdot, \cdot)$ satisfy the following assumptions:
(B1) $u_i(y_i, y) = f(y)$, $\forall y \in \mathcal{X}$, $\forall i$
(B2) $u_i(x_i, y) \geq f(y_1, \ldots, y_{i-1}, x_i, y_{i+1}, \ldots, y_m)$, $\forall x_i \in \mathcal{X}_i$, $\forall y \in \mathcal{X}$, $\forall i$
(B3) $u_i'(x_i, y; d_i)\big|_{x_i = y_i} = f'(y; d)$, $\forall d = (0, \ldots, d_i, \ldots, 0)$ such that $y_i + d_i \in \mathcal{X}_i$, $\forall i$
(B4) $u_i(x_i, y)$ is continuous in $(x_i, y)$, $\forall i$
In short, $u_i(x_i, x^k)$ majorizes $f(x)$ on the $i$-th block.
46 Convergence of BS-MM (cont'd)
Under assumptions (B1)-(B4), and assuming for simplicity that $f$ is continuously differentiable:
- If $u_i(x_i, y)$ is quasi-convex in $x_i$ and each subproblem $\min_{x_i \in \mathcal{X}_i} u_i(x_i, x^{k-1})$ has a unique solution for any $x^{k-1} \in \mathcal{X}$, then every limit point of $\{x^k\}$ is a stationary point.
- If the level set $\mathcal{X}^0 = \{x \mid f(x) \leq f(x^0)\}$ is compact and each subproblem $\min_{x_i \in \mathcal{X}_i} u_i(x_i, x^{k-1})$ has a unique solution for any $x^{k-1} \in \mathcal{X}$ for at least $m - 1$ blocks, then $\lim_{k \to \infty} d(x^k, \mathcal{X}^\star) = 0$.
The assumptions are more restrictive than for MM due to the cyclic update behavior.
48 Alternating Proximal Minimization
Consider the problem
$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x_1, \ldots, x_m) \\ \text{subject to} & x_i \in \mathcal{X}_i, \end{array}$$
with $f(\cdot)$ convex in each block.
The convergence of BCD is not easy to establish since each subproblem may have multiple solutions. Alternating proximal minimization instead solves
$$\begin{array}{ll} \underset{x_i}{\text{minimize}} & f\big(x_1^k, \ldots, x_{i-1}^k, x_i, x_{i+1}^k, \ldots, x_m^k\big) + \dfrac{1}{2c}\|x_i - x_i^k\|^2 \\ \text{subject to} & x_i \in \mathcal{X}_i \end{array}$$
Strictly convex objective $\Rightarrow$ unique minimizer.
49 Proximal Splitting Algorithm
Consider the following problem:
$$\begin{array}{ll} \underset{x}{\text{minimize}} & \sum_{i=1}^m f_i(x_i) + f_{m+1}(x_1, \ldots, x_m) \\ \text{subject to} & x_i \in \mathcal{X}_i, \quad i = 1, \ldots, m \end{array}$$
with the $f_i$ convex and lower semicontinuous, and $f_{m+1}$ convex with block-Lipschitz gradient:
$$\|\nabla_{x_i} f_{m+1}(x) - \nabla_{x_i} f_{m+1}(y)\| \leq \beta_i \|x_i - y_i\|.$$
Cyclically update:
$$x_i^{k+1} = \operatorname{prox}_{\gamma f_i}\Big( x_i^k - \gamma \nabla_{x_i} f_{m+1}\big(x^k\big) \Big),$$
with the proximity operator defined as
$$\operatorname{prox}_f(x) = \arg\min_{y \in \mathcal{X}} f(y) + \frac{1}{2}\|x - y\|^2.$$
50 Proximal Splitting Algorithm (cont'd)
BS-MM interpretation:
$$u_i\big(x_i, x^k\big) = f_i(x_i) + f_{m+1}\big(x^k\big) + \nabla_{x_i} f_{m+1}\big(x^k\big)^T \big(x_i - x_i^k\big) + \frac{1}{2\gamma}\big\|x_i - x_i^k\big\|^2 + \sum_{j \neq i} f_j\big(x_j^k\big).$$
Check (descent lemma):
$$f_{m+1}\big(x_{-i}^k, x_i\big) \leq f_{m+1}\big(x^k\big) + \nabla_{x_i} f_{m+1}\big(x^k\big)^T \big(x_i - x_i^k\big) + \frac{\beta_i}{2}\big\|x_i - x_i^k\big\|^2,$$
with $\gamma \in [\varepsilon_i,\; 2/\beta_i - \varepsilon_i]$ and $\varepsilon_i \in \big(0, \min\{1, 1/\beta_i\}\big)$.
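As a concrete single-block instance (my own example, not from the slides): with $f_1 = \lambda\|\cdot\|_1$ and $f_2 = \tfrac{1}{2}\|Ax - b\|^2$, the cyclic update is exactly ISTA, and the prox is soft-thresholding:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = A @ (np.eye(50)[3] - 2 * np.eye(50)[17])      # sparse ground truth
lam = 0.5
beta = np.linalg.norm(A, 2) ** 2                  # Lipschitz const. of grad f2
gamma = 1.0 / beta                                # step in the allowed range

x = np.zeros(50)
for _ in range(500):
    g = A.T @ (A @ x - b)                         # gradient of the smooth part
    v = x - gamma * g
    x = np.sign(v) * np.maximum(np.abs(v) - gamma * lam, 0)  # prox of lam*l1
print(np.nonzero(np.round(x, 2))[0])              # typically indices 3 and 17
```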
52 Robust Estimation of Location and Scatter
$x_i \sim \text{elliptical}(\mu, R)$. [Sun-Bab-Pal J15]
Fit the $x_i$ to a Cauchy distribution with pdf
$$f(x_i) \propto \det(R)^{-1/2} \Big( 1 + (x_i - \mu)^T R^{-1} (x_i - \mu) \Big)^{-(K+1)/2}$$
Shrinkage penalty:
$$h(t, T) = K \log\big( \operatorname{Tr}(R^{-1} T) \big) + \log\det(R) + \log\Big( 1 + (t - \mu)^T R^{-1} (t - \mu) \Big)$$
Solve the following problem:
$$\underset{\mu,\; R \succ 0}{\text{minimize}} \quad \log\det(R) + \frac{K+1}{N} \sum_{i=1}^N \log\Big( 1 + (x_i - \mu)^T R^{-1} (x_i - \mu) \Big) + \alpha h(t, T)$$
53 Robust Estimation of Location and Scatter (cont'd)
BS-MM algorithm update:
$$\mu_{t+1} = \frac{(K+1) \sum_{i=1}^N w_i(\mu_t, R_t)\, x_i + N\alpha\, w_t(\mu_t, R_t)\, t}{(K+1) \sum_{i=1}^N w_i(\mu_t, R_t) + N\alpha\, w_t(\mu_t, R_t)}$$
$$R_{t+1} = \frac{K + 1}{N + N\alpha} \sum_{i=1}^N w_i\big(\mu_{t+1}, R_t\big) \big(x_i - \mu_{t+1}\big)\big(x_i - \mu_{t+1}\big)^T + \frac{N\alpha}{N + N\alpha}\, w_t\big(\mu_{t+1}, R_t\big) \big(\mu_{t+1} - t\big)\big(\mu_{t+1} - t\big)^T + \frac{N\alpha K}{N + N\alpha} \frac{T}{\operatorname{Tr}\big(T R_t^{-1}\big)}$$
where
$$w_i(\mu, R) = \frac{1}{1 + (x_i - \mu)^T R^{-1} (x_i - \mu)}, \qquad w_t(\mu, R) = \frac{1}{1 + (t - \mu)^T R^{-1} (t - \mu)}.$$
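A sketch of the $\alpha = 0$ special case (my own simplification): the updates reduce to the classical Cauchy ML fixed point, alternating a reweighted mean and a reweighted scatter:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 2, 500
X = rng.standard_t(df=1, size=(N, K)) + np.array([1.0, -2.0])  # Cauchy data

mu, R = np.median(X, axis=0), np.eye(K)
for _ in range(200):
    d = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(R), X - mu)
    w = 1.0 / (1.0 + d)                        # Cauchy weights w_i(mu, R)
    mu = (w[:, None] * X).sum(0) / w.sum()     # reweighted location update
    Xc = X - mu
    R = (K + 1) / N * (w[:, None] * Xc).T @ Xc # reweighted scatter update
print(np.round(mu, 2))                         # close to the true (1, -2)
```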
54 (Figure: objective function value vs. iterations.)
56 Problem Statement (Exact Jacobi SCA)
Consider the following problem:
$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x) \\ \text{subject to} & x_i \in \mathcal{X}_i \end{array}$$
where the $\mathcal{X}_i$ are closed convex sets and $f(x) = \sum_{l=1}^L f_l(x_1, \ldots, x_m)$.
Conditional gradient update (Frank-Wolfe): direction $d^k \triangleq \bar{x}^k - x^k$ with
$$\bar{x}_i^k = \arg\min_{x_i \in \mathcal{X}_i} \nabla_{x_i} f\big(x^k\big)^T \big(x_i - x_i^k\big), \qquad x^{k+1} = x^k + \gamma^k d^k,$$
and step size $\gamma^k \in (0, 1]$ chosen to guarantee convergence.
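A minimal Frank-Wolfe sketch (my own example): minimizing a quadratic over the probability simplex, where the linear subproblem simply picks the vertex with the smallest gradient entry:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
Q = M @ M.T + np.eye(4)
b = rng.standard_normal(4)
f = lambda x: 0.5 * x @ Q @ x - b @ x

x = np.ones(4) / 4                          # feasible start on the simplex
for k in range(200):
    g = Q @ x - b                           # gradient at x^k
    s = np.zeros(4); s[np.argmin(g)] = 1.0  # linear minimization over simplex
    gamma = 2.0 / (k + 2)                   # classic diminishing step size
    x = x + gamma * (s - x)                 # stays feasible by convexity
print(np.round(x, 4), f(x))
```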
57 Exact Jacobi SCA Algorithm
Idea:
- The conditional gradient update linearizes all the $f_l$ at $x^k$.
- Each function $f_l$ might be convex w.r.t. some block $x_i$.
- We want to preserve the convexity of $f_l\big(x_i, x_{-i}^k\big)$.
Solution: keep the convex $f_l\big(x_i, x_{-i}^k\big)$ and linearize the others.
58 Exact Jacobi SCA Algorithm (cont'd)
Define $\mathcal{C}_i$ as the set of indices $l$ such that $f_l\big(x_i, x_{-i}^k\big)$ is convex in $x_i$. Approximate $f(x)$ on the $i$-th block at the point $x^k$ by
$$\tilde{f}_i\big(x_i, x^k\big) = \underbrace{\sum_{l \in \mathcal{C}_i} f_l\big(x_i, x_{-i}^k\big)}_{\text{convex terms}} + \underbrace{\pi_i\big(x^k\big)^T \big(x_i - x_i^k\big)}_{\text{linear approx. of non-convex terms}} + \underbrace{\frac{\tau_i}{2} \big(x_i - x_i^k\big)^T H_i\big(x^k\big) \big(x_i - x_i^k\big)}_{\text{proximal term}}$$
with
$$\pi_i\big(x^k\big) = \sum_{l \notin \mathcal{C}_i} \nabla_{x_i} f_l\big(x^k\big) \quad \text{and} \quad H_i\big(x^k\big) \succeq c_{H_i} I \succ 0.$$
59 Exact Jacobi SCA Algorithm (cont'd)
Exact Jacobi SCA update:
$$\hat{x}_i\big(x^k, \tau_i\big) = \arg\min_{x_i \in \mathcal{X}_i} \tilde{f}_i\big(x_i, x^k\big), \qquad x^{k+1} = x^k + \gamma^k \big(\hat{x}\big(x^k, \tau\big) - x^k\big)$$
Step-size rules:
- a constant step size that depends on the Lipschitz constant of $\nabla f$;
- a diminishing step size.
Remark: the blocks can also be updated sequentially (Gauss-Seidel SCA algorithm).
61 Extensions
FLEXA:
- non-smooth objective function
- inexact update direction
- flexible block update choice
HyFLEXA
62 Comparison

                      BS-MM                     FLEXA
convergence           stationary point          stationary point
objective function    continuous,               continuous,
                      may not be smooth         may not be smooth
constraint set        Cartesian                 Cartesian & convex
update rule           sequential                sequential or parallel
approx. function      global upper bound        local approximation
                      unique minimizer          not required
                      can be non-convex         convex approx.
63 Summary
We have studied:
- the Majorization-Minimization algorithm
- the Block Coordinate Descent algorithm
- the Block Successive Majorization-Minimization (BS-MM) algorithm
We have briefly introduced:
- the distributed Successive Convex Approximation algorithm
64 References I
D. R. Hunter and K. Lange, "A tutorial on MM algorithms," Amer. Statistician, vol. 58, 2004.
M. Razaviyayn, M. Hong, and Z.-Q. Luo, "A unified convergence analysis of block successive minimization methods for nonsmooth optimization," SIAM J. Optim., vol. 23, no. 2, 2013.
G. Scutari, F. Facchinei, P. Song, D. P. Palomar, and J.-S. Pang, "Decomposition by partial linearization: Parallel optimization of multi-agent systems," IEEE Trans. Signal Process., vol. 62, no. 3, 2014.
Y. Sun, P. Babu, and D. P. Palomar, "Majorization-minimization algorithms in signal processing, communications, and machine learning," IEEE Trans. Signal Process., vol. 65, no. 3, 2017.
M. Chiang, C. W. Tan, D. P. Palomar, D. O'Neill, and D. Julian, "Power control by geometric programming," IEEE Trans. Wireless Commun., vol. 6, no. 7, 2007.
65 References II
E. J. Candes, M. Wakin, and S. Boyd, "Enhancing sparsity by reweighted l1 minimization," J. Fourier Anal. Appl., vol. 14, no. 5-6, 2008.
J. Song, P. Babu, and D. P. Palomar, "Sparse generalized eigenvalue problem via smooth optimization," IEEE Trans. Signal Process., vol. 63, no. 7, 2015.
J. Song, P. Babu, and D. P. Palomar, "Optimization methods for designing sequences with low autocorrelation sidelobes," IEEE Trans. Signal Process., vol. 63, no. 15, 2015.
Y. Sun, P. Babu, and D. P. Palomar, "Regularized Tyler's scatter estimator: Existence, uniqueness, and algorithms," IEEE Trans. Signal Process., vol. 62, no. 19, 2014.
D. P. Bertsekas, Nonlinear Programming. Athena Scientific, 1999.
L. Grippo and M. Sciandrone, "On the convergence of the block nonlinear Gauss-Seidel method under convex constraints," Oper. Res. Lett., vol. 26, no. 3, 2000.
66 References III
Y. Sun, P. Babu, and D. P. Palomar, "Regularized robust estimation of mean and covariance matrix under heavy-tailed distributions," IEEE Trans. Signal Process., vol. 63, no. 12, 2015.
67 Thanks
For more information visit: