Majorization Minimization (MM) and Block Coordinate Descent (BCD)

1 Majorization Minimization (MM) and Block Coordinate Descent (BCD). Wing-Kin (Ken) Ma, Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong. ELEG5481, Lecture 15. Acknowledgment: Qiang Li for helping prepare the slides.

2 Outline
Majorization Minimization (MM): Convergence, Applications
Block Coordinate Descent (BCD): Applications, Convergence
Summary

3 Majorization Minimization
Consider the following problem

min_x f(x) s.t. x ∈ X    (1)

where X is a closed convex set; f(·) may be non-convex and/or nonsmooth.

Challenge: For a general f(·), problem (1) can be difficult to solve.

Majorization Minimization: Iteratively generate {x^r} as follows

x^r ∈ arg min_x u(x, x^{r-1}) s.t. x ∈ X    (2)

where u(x, x^{r-1}) is a surrogate function of f(x), satisfying
1. u(x, x^r) ≥ f(x), ∀x^r, x ∈ X;
2. u(x^r, x^r) = f(x^r).
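
The slides give no code, but iteration (2) is easy to sketch. The minimal Python sketch below assumes the user supplies a function solve_surrogate(x_prev) (a hypothetical name) that returns a minimizer of u(·, x_prev) over X; by Property 1 on the next slide, the recorded objective values are nonincreasing.

```python
def mm_minimize(f, solve_surrogate, x0, max_iter=500, tol=1e-8):
    """Generic MM loop for problem (1) using the surrogate minimization (2).

    f               : the original objective f(x)
    solve_surrogate : returns argmin_x u(x, x_prev) s.t. x in X
    x0              : a feasible starting point
    """
    x = x0
    obj = [f(x)]
    for _ in range(max_iter):
        x = solve_surrogate(x)          # x^r = argmin_x u(x, x^{r-1})
        obj.append(f(x))                # nonincreasing, by Property 1
        if obj[-2] - obj[-1] <= tol:    # stop once the decrease stalls
            break
    return x, obj
```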

4 Figure 1: A pictorial illustration of the MM algorithm (successive surrogates u(x; x^0), u(x; x^1), ... majorize f(x)).

Property 1. {f(x^r)} is nonincreasing, i.e., f(x^r) ≤ f(x^{r-1}), r = 1, 2, ....
Proof. f(x^r) ≤ u(x^r, x^{r-1}) ≤ u(x^{r-1}, x^{r-1}) = f(x^{r-1}).

The nonincreasing property of {f(x^r)} implies that f(x^r) converges to some limit f̄. But how about the convergence of the iterates {x^r}?

5 Technical Preliminaries
Limit point: x is a limit point of {x^k} if there exists a subsequence of {x^k} that converges to x. Note that every bounded sequence in R^n has a limit point (or convergent subsequence).

Directional derivative: Let f : D → R be a function where D ⊆ R^m is a convex set. The directional derivative of f at point x in direction d is defined by

f'(x; d) ≜ liminf_{λ↓0} [f(x + λd) − f(x)] / λ.

If f is differentiable, then f'(x; d) = d^T ∇f(x).

Stationary point: x ∈ X is a stationary point of f(·) if

f'(x; d) ≥ 0, ∀d such that x + d ∈ D.    (3)

A stationary point may be a local min., a local max. or a saddle point; if D = R^n and f is differentiable, then (3) ⇔ ∇f(x) = 0.

6 Convergence of MM
Assumption 1. u(·, ·) satisfies the following conditions:

u(y, y) = f(y), ∀y ∈ X,    (4a)
u(x, y) ≥ f(x), ∀x, y ∈ X,    (4b)
u'(x, y; d)|_{x=y} = f'(y; d), ∀d with y + d ∈ X,    (4c)
u(x, y) is continuous in (x, y).    (4d)

(4c) means the first-order local behavior of u(·, x^{r-1}) is the same as that of f(·).

7 Convergence of MM
Theorem 1. [Razaviyayn-Hong-Luo] Assume that Assumption 1 is satisfied. Then every limit point of the iterates generated by the MM algorithm is a stationary point of problem (1).

Proof. From Property 1, we know that

f(x^{r+1}) ≤ u(x^{r+1}, x^r) ≤ u(x, x^r), ∀x ∈ X.

Now assume that there exists a subsequence {x^{r_j}} of {x^r} converging to a limit point z, i.e., lim_{j→∞} x^{r_j} = z. Then

u(x^{r_{j+1}}, x^{r_{j+1}}) = f(x^{r_{j+1}}) ≤ f(x^{r_j + 1}) ≤ u(x^{r_j + 1}, x^{r_j}) ≤ u(x, x^{r_j}), ∀x ∈ X.

Letting j → ∞, we obtain

u(z, z) ≤ u(x, z), ∀x ∈ X,

which implies that u'(x, z; d)|_{x=z} ≥ 0, ∀d with z + d ∈ X. Combining the above inequality with (4c) (i.e., u'(x, y; d)|_{x=y} = f'(y; d), ∀d with y + d ∈ X), we have

f'(z; d) ≥ 0, ∀d with z + d ∈ X.

8 Applications: Nonnegative Least Squares
In many engineering applications, we encounter the following problem:

(NLS) min_{x ≥ 0} ||Ax − b||_2^2    (5)

where b ∈ R^m_+, b ≠ 0, and A ∈ R^{m×n}_{++}. It's an LS problem with nonnegative constraints, so the conventional LS solution may not be feasible for (5).

A simple multiplicative updating algorithm:

x^r_l = c^r_l x^{r-1}_l, l = 1, ..., n,    (6)

where x^r_l is the lth component of x^r, and c^r_l = [A^T b]_l / [A^T A x^{r-1}]_l.

Starting with x^0 > 0, all x^r generated by (6) are nonnegative.
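
A minimal numerical sketch of update (6), assuming (as stated above) that A has strictly positive entries, b is nonnegative, and the iteration starts from x^0 > 0:

```python
import numpy as np

def nls_multiplicative(A, b, num_iter=200):
    """Multiplicative update (6) for (NLS): min_{x >= 0} ||Ax - b||_2^2."""
    x = np.ones(A.shape[1])               # x^0 > 0
    Atb = A.T @ b                         # [A^T b]_l, fixed over the iterations
    for _ in range(num_iter):
        x = x * Atb / (A.T @ (A @ x))     # x_l^r = x_l^{r-1} [A^T b]_l / [A^T A x^{r-1}]_l
    return x

# small example: the residual ||A x^r - b||_2 decreases over the iterations
A = np.random.rand(50, 10) + 0.1          # strictly positive entries
b = np.random.rand(50)
print(np.linalg.norm(A @ nls_multiplicative(A, b) - b))
```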

9 Figure 2: ||Ax^r − b||_2 vs. the number of iterations. Usually the multiplicative update converges within a few tens of iterations.

10 MM interpretation: Let f(x) ≜ ||Ax − b||_2^2. The multiplicative update essentially solves the following problem

min_{x ≥ 0} u(x, x^{r-1})

where

u(x, x^{r-1}) ≜ f(x^{r-1}) + (x − x^{r-1})^T ∇f(x^{r-1}) + (x − x^{r-1})^T Φ(x^{r-1}) (x − x^{r-1}),
Φ(x^{r-1}) = Diag( [A^T A x^{r-1}]_1 / x^{r-1}_1, ..., [A^T A x^{r-1}]_n / x^{r-1}_n ).

Observations: u(x, x^{r-1}) is a quadratic approximation of f(x) with Φ(x^{r-1}) ⪰ A^T A, which implies

u(x, x^{r-1}) ≥ f(x), ∀x ∈ R^n, and u(x^{r-1}, x^{r-1}) = f(x^{r-1}).

The multiplicative update converges to an optimal solution of NLS (by the MM convergence in Theorem 1 and convexity of NLS).

11 Applications: Convex-Concave Procedure / DC Programming
Suppose that f(x) has the following form:

f(x) = g(x) − h(x),

where g(x) and h(x) are convex and differentiable. Thus, f(x) is in general nonconvex.

DC Programming: Construct u(·, ·) as

u(x, x^r) = g(x) − ( h(x^r) + ∇h(x^r)^T (x − x^r) ),

where the subtracted term is the linearization of h at x^r. By the first-order condition of the convex function h(x), i.e., h(x) ≥ h(x^r) + ∇h(x^r)^T (x − x^r), it's easy to show that

u(x, x^r) ≥ f(x), ∀x ∈ X, and u(x^r, x^r) = f(x^r).

12 Sparse Signal Recovery by DC Programming

min_x ||x||_0 s.t. y = Ax    (7)

Apart from the popular ℓ1 approximation, consider the following concave approximation:

min_{x ∈ R^n} Σ_{i=1}^n log(1 + |x_i|/ε) s.t. y = Ax.

Figure 3: log(1 + |x|/ε) promotes more sparsity than ℓ1.

13 The problem

min_{x ∈ R^n} Σ_{i=1}^n log(1 + |x_i|/ε) s.t. y = Ax

can be equivalently written as

min_{x, z ∈ R^n} Σ_{i=1}^n log(z_i + ε) s.t. y = Ax, |x_i| ≤ z_i, i = 1, ..., n.    (8)

Problem (8) minimizes a concave objective, so it's a special case of DC programming (g(x) = 0). Linearizing the concave function at (x^r, z^r) yields

(x^{r+1}, z^{r+1}) = arg min Σ_{i=1}^n z_i / (z^r_i + ε) s.t. y = Ax, |x_i| ≤ z_i, i = 1, ..., n.

We solve a sequence of reweighted ℓ1 problems, as in the sketch below.
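
A hedged sketch of this reweighted-ℓ1 loop; it uses the cvxpy modeling package to solve each weighted ℓ1 subproblem, which is an implementation choice rather than something prescribed by the slides.

```python
import numpy as np
import cvxpy as cp

def reweighted_l1(A, y, eps=0.1, num_iter=5):
    """DC / reweighted-l1 iterations for min sum_i log(1 + |x_i|/eps) s.t. y = Ax."""
    n = A.shape[1]
    w = np.ones(n)                              # first iteration = plain l1 minimization
    x_val = np.zeros(n)
    for _ in range(num_iter):
        x = cp.Variable(n)
        prob = cp.Problem(cp.Minimize(cp.sum(cp.multiply(w, cp.abs(x)))),
                          [A @ x == y])
        prob.solve()
        x_val = x.value
        w = 1.0 / (np.abs(x_val) + eps)         # new weights w_i = 1 / (z_i^r + eps)
    return x_val
```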

14 Figure (from Candès-Wakin-Boyd, J. Fourier Anal. Appl., 2008): Sparse signal recovery through reweighted ℓ1 iterations. (a) Original length n = 512 signal x_0 with 130 spikes. (b) Scatter plot, coefficient-by-coefficient, of x_0 versus its reconstruction x^(0) using unweighted ℓ1 minimization. (c) Reconstruction x^(1) after the first reweighted iteration. (d) Reconstruction x^(2) after the second reweighted iteration.

15 Applications: ℓ2-ℓp Optimization
Many problems involve solving the following problem (e.g., basis-pursuit denoising):

min_x f(x) ≜ (1/2)||y − Ax||_2^2 + µ||x||_p    (9)

where p ≥ 1. If A = I or A is unitary, the optimal x is computed in closed form as

x* = A^T y − Proj_C(A^T y)

where C ≜ {x : ||x||_{p'} ≤ µ}, ||·||_{p'} is the dual norm of ||·||_p, and Proj_C denotes the projection operator. In particular, for p = 1,

x*_i = soft(y_i, µ), i = 1, ..., n,

where soft(u, a) ≜ sign(u) max{|u| − a, 0} denotes the soft-thresholding operation (see the sketch below). For general A, there is no simple closed-form solution for (9).
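
For concreteness, the soft-thresholding operator can be written as below (a minimal sketch; it is reused in the later sketches):

```python
import numpy as np

def soft(u, a):
    """Soft-thresholding: soft(u, a) = sign(u) * max(|u| - a, 0), applied elementwise."""
    return np.sign(u) * np.maximum(np.abs(u) - a, 0.0)
```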

16 MM for the ℓ2-ℓp Problem: Consider a modified ℓ2-ℓp problem

min_x u(x, x^r) ≜ f(x) + dist(x, x^r)    (10)

where dist(x, x^r) ≜ (c/2)||x − x^r||_2^2 − (1/2)||Ax − Ax^r||_2^2 and c > λ_max(A^T A).

dist(x, x^r) ≥ 0 ∀x ⇒ u(x, x^r) majorizes f(x).

u(x, x^r) can be re-expressed as

u(x, x^r) = (c/2)||x − x̄^r||_2^2 + µ||x||_p + const., where x̄^r = (1/c) A^T (y − Ax^r) + x^r.

The modified ℓ2-ℓp problem (10) has a simple soft-thresholding solution. Repeatedly solving problem (10) leads to an optimal solution of the ℓ2-ℓp problem (by the MM convergence in Theorem 1).
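
A minimal sketch of repeatedly solving (10) for p = 1 (i.e., the familiar iterative shrinkage/thresholding scheme), reusing soft() from the earlier sketch; the choice c = 1.01·λ_max(A^T A) is an assumption made for illustration.

```python
import numpy as np

def l2l1_mm(A, y, mu, num_iter=500):
    """MM iterations for min_x 0.5*||y - Ax||_2^2 + mu*||x||_1 via problem (10) with p = 1."""
    c = 1.01 * np.linalg.norm(A, 2) ** 2        # c > lambda_max(A^T A)
    x = np.zeros(A.shape[1])
    for _ in range(num_iter):
        x_bar = x + (A.T @ (y - A @ x)) / c     # xbar^r = (1/c) A^T (y - A x^r) + x^r
        x = soft(x_bar, mu / c)                 # closed-form minimizer of (10)
    return x
```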

17 Applications: Expectation Maximization (EM)
Consider an ML estimate of θ, given the random observation w:

θ̂_ML = arg min_θ − ln p(w|θ).

Suppose that there are some missing data or hidden variables z in the model. Then, the EM algorithm iteratively computes an ML estimate θ̂ as follows:

E-step: g(θ, θ^r) ≜ E_{z|w,θ^r}{ln p(w, z|θ)}
M-step: θ^{r+1} = arg max_θ g(θ, θ^r)

Repeat the above two steps until convergence.

The EM algorithm generates a nonincreasing sequence of {− ln p(w|θ^r)}. The EM algorithm can be interpreted by MM.

18 MM interpretation of the EM algorithm:

− ln p(w|θ) = − ln E_{z|θ}[ p(w|z, θ) ]
= − ln E_{z|θ}[ p(z|w, θ^r) p(w|z, θ) / p(z|w, θ^r) ]
= − ln E_{z|w,θ^r}[ p(z|θ) p(w|z, θ) / p(z|w, θ^r) ]    (interchange the integrations)
≤ − E_{z|w,θ^r}[ ln ( p(z|θ) p(w|z, θ) / p(z|w, θ^r) ) ]    (Jensen's inequality)
= − E_{z|w,θ^r}[ ln p(w, z|θ) ] + E_{z|w,θ^r}[ ln p(z|w, θ^r) ] ≜ u(θ, θ^r)    (11a)

u(θ, θ^r) majorizes − ln p(w|θ), and − ln p(w|θ^r) = u(θ^r, θ^r);
the E-step essentially constructs u(θ, θ^r);
the M-step minimizes u(θ, θ^r) (note that θ appears in the 1st term of (11a) only).
A concrete instance is sketched below.
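
To make the E- and M-steps concrete, here is a small sketch of EM for a two-component 1-D Gaussian mixture; the specific model (mixture weight pi, means mu1, mu2, standard deviations s1, s2, with the component label as the hidden variable z) is an illustrative assumption, not an example from the slides.

```python
import numpy as np

def em_gmm2(w, num_iter=100):
    """EM for w_i ~ pi*N(mu1, s1^2) + (1 - pi)*N(mu2, s2^2), i.i.d. observations w."""
    pi, mu1, mu2, s1, s2 = 0.5, w.min(), w.max(), w.std(), w.std()
    for _ in range(num_iter):
        # E-step: responsibilities gamma_i = p(z_i = 1 | w_i, theta^r)
        p1 = pi * np.exp(-(w - mu1) ** 2 / (2 * s1 ** 2)) / s1
        p2 = (1 - pi) * np.exp(-(w - mu2) ** 2 / (2 * s2 ** 2)) / s2
        gamma = p1 / (p1 + p2)
        # M-step: closed-form maximizer of g(theta, theta^r)
        pi = gamma.mean()
        mu1 = np.sum(gamma * w) / np.sum(gamma)
        mu2 = np.sum((1 - gamma) * w) / np.sum(1 - gamma)
        s1 = np.sqrt(np.sum(gamma * (w - mu1) ** 2) / np.sum(gamma))
        s2 = np.sqrt(np.sum((1 - gamma) * (w - mu2) ** 2) / np.sum(1 - gamma))
    return pi, mu1, mu2, s1, s2
```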

19 Outline
Majorization Minimization (MM): Convergence, Applications
Block Coordinate Descent (BCD): Applications, Convergence
Summary

20 Block Coordinate Descent
Consider the following problem:

min_x f(x) s.t. x ∈ X = X_1 × X_2 × ... × X_m ⊆ R^n    (12)

where each X_i ⊆ R^{n_i} is closed, nonempty and convex.

BCD Algorithm (a minimal code sketch is given below):
1: Find a feasible point x^0 ∈ X and set r = 0
2: repeat
3:   r = r + 1, i = ((r − 1) mod m) + 1
4:   Let x*_i ∈ arg min_{x ∈ X_i} f(x^{r-1}_1, ..., x^{r-1}_{i-1}, x, x^{r-1}_{i+1}, ..., x^{r-1}_m)
5:   Set x^r_i = x*_i and x^r_k = x^{r-1}_k, ∀k ≠ i
6: until some convergence criterion is met

Merits of BCD:
1. each subproblem is much easier to solve, or even has a closed-form solution;
2. the objective value is nonincreasing along the BCD updates;
3. it allows parallel or distributed implementations.
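
A minimal sketch of the cyclic rule above; it assumes the user supplies one solver per block (hypothetical callables), each returning an exact minimizer of f over its block with the other blocks held fixed.

```python
def bcd_minimize(block_solvers, x_blocks, num_cycles=100):
    """Cyclic BCD: x_blocks = [x_1, ..., x_m]; block_solvers[i](x_blocks) returns
    argmin over block i of f(x_1, ..., x_m) with the other blocks fixed."""
    m = len(x_blocks)
    for r in range(num_cycles * m):
        i = r % m                              # cycle through the blocks
        x_blocks[i] = block_solvers[i](x_blocks)
    return x_blocks
```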

21 Applications: ℓ2-ℓ1 Optimization Problem
Let us revisit the ℓ2-ℓ1 problem

min_{x ∈ R^n} f(x) ≜ (1/2)||y − Ax||_2^2 + µ||x||_1.    (13)

Apart from MM, BCD is another efficient approach to solve (13). Optimize x_k while fixing x_j = x^r_j, j ≠ k:

min_{x_k} f_k(x_k) ≜ (1/2)||ȳ − a_k x_k||_2^2 + µ|x_k|, where ȳ ≜ y − Σ_{j≠k} a_j x^r_j.

The optimal x_k has a closed form:

x*_k = soft( a_k^T ȳ / ||a_k||^2, µ / ||a_k||^2 ).

Cyclically update x_k, k = 1, ..., n until convergence (see the sketch below).
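
A sketch of this cyclic update (reusing soft() from the earlier sketch); the residual y − Ax is maintained incrementally so each coordinate update is cheap.

```python
import numpy as np

def l2l1_bcd(A, y, mu, num_cycles=100):
    """Cyclic coordinate descent for (13): min_x 0.5*||y - Ax||_2^2 + mu*||x||_1."""
    m, n = A.shape
    x = np.zeros(n)
    col_norm2 = np.sum(A ** 2, axis=0)                    # ||a_k||^2 for every column
    residual = y.astype(float).copy()                     # residual = y - A x
    for _ in range(num_cycles):
        for k in range(n):
            y_bar = residual + A[:, k] * x[k]             # y - sum_{j != k} a_j x_j
            x_new = soft(A[:, k] @ y_bar / col_norm2[k],  # closed-form coordinate update
                         mu / col_norm2[k])
            residual += A[:, k] * (x[k] - x_new)
            x[k] = x_new
    return x
```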

22 Applications: Iterative Water-filling for MIMO MAC Sum Capacity Maximization
MIMO Channel Capacity Maximization. MIMO received signal model:

y(t) = Hx(t) + n(t)

where x(t) ∈ C^N is the Tx signal, H ∈ C^{N×N} is the MIMO channel matrix, and n(t) ∈ C^N is standard additive Gaussian noise, i.e., n(t) ~ CN(0, I).

Figure 4: MIMO system model.

23 MIMO channel capacity:

C(Q) = log det( I + H Q H^H )

where Q = E{x(t) x(t)^H} is the covariance of the Tx signal.

MIMO channel capacity maximization:

max_{Q ⪰ 0} log det( I + H Q H^H ) s.t. Tr(Q) ≤ P

where P > 0 is the transmit power budget. The optimal Q is given by the well-known water-filling solution, i.e.,

Q* = V Diag(p*) V^H

where H = U Diag(σ_1, ..., σ_N) V^H is the SVD of H, and p* = [p*_1, ..., p*_N] is the power allocation with p*_i = max(0, µ − 1/σ_i^2) and µ ≥ 0 being the water level such that Σ_i p*_i = P. A small numerical sketch is given below.
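
A small sketch of the water-filling solution; the bisection on the water level µ is an implementation choice, not part of the slides.

```python
import numpy as np

def waterfill(sigma, P, tol=1e-10):
    """Water-filling: p_i = max(0, mu - 1/sigma_i^2) with mu chosen so that sum_i p_i = P."""
    inv = 1.0 / sigma ** 2
    lo, hi = 0.0, P + inv.max()                       # the water level lies in [lo, hi]
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if np.sum(np.maximum(mu - inv, 0.0)) > P:
            hi = mu
        else:
            lo = mu
    return np.maximum(0.5 * (lo + hi) - inv, 0.0)

def mimo_capacity_Q(H, P):
    """Optimal Q = V Diag(p*) V^H for max log det(I + H Q H^H) s.t. Tr(Q) <= P.
    Assumes H has full column rank so that all singular values are positive."""
    U, sigma, Vh = np.linalg.svd(H)
    p = waterfill(sigma, P)
    return Vh.conj().T @ np.diag(p) @ Vh
```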

24 MIMO Multiple-Access Channel (MAC) Sum-Capacity Maximization
Multiple transmitters simultaneously communicate with one receiver.

Figure 5: MIMO multiple-access channel (MAC).

Received signal model: y(t) = Σ_{k=1}^K H_k x_k(t) + n(t)

MAC sum capacity: C_MAC({Q_k}_{k=1}^K) = log det( Σ_{k=1}^K H_k Q_k H_k^H + I )

25 MAC sum capacity maximization:

max_{{Q_k}_{k=1}^K} log det( Σ_{k=1}^K H_k Q_k H_k^H + I ) s.t. Tr(Q_k) ≤ P_k, Q_k ⪰ 0, k = 1, ..., K    (14)

Problem (14) is convex w.r.t. {Q_k}, but it has no simple closed-form solution. Alternatively, we can apply BCD to (14) and cyclically update Q_k while fixing Q_j for j ≠ k:

(*) max_{Q_k} log det( H_k Q_k H_k^H + Φ ) s.t. Tr(Q_k) ≤ P_k, Q_k ⪰ 0,

where Φ = Σ_{j≠k} H_j Q_j H_j^H + I.

(*) has a closed-form water-filling solution, just like the previous single-user MIMO case (a sketch follows).
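
A sketch of iterative water-filling for (14), reusing waterfill()/mimo_capacity_Q() from the previous sketch. The subproblem (*) reduces to the single-user case by whitening: log det(H_k Q_k H_k^H + Φ) = log det Φ + log det(I + G_k Q_k G_k^H) with G_k = Φ^{-1/2} H_k. Full column rank of each H_k is assumed, as in the convergence discussion later.

```python
import numpy as np

def iterative_waterfilling(H_list, P_list, num_cycles=50):
    """BCD (iterative water-filling) for the MAC sum-capacity problem (14)."""
    K, N = len(H_list), H_list[0].shape[0]
    Q_list = [np.zeros((H.shape[1], H.shape[1]), dtype=complex) for H in H_list]
    for _ in range(num_cycles):
        for k in range(K):
            # Phi = sum_{j != k} H_j Q_j H_j^H + I  (interference-plus-noise covariance)
            Phi = np.eye(N, dtype=complex)
            for j in range(K):
                if j != k:
                    Phi += H_list[j] @ Q_list[j] @ H_list[j].conj().T
            # whiten, then water-fill on the effective channel G_k = Phi^{-1/2} H_k
            w, V = np.linalg.eigh(Phi)
            Phi_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.conj().T
            Q_list[k] = mimo_capacity_Q(Phi_inv_sqrt @ H_list[k], P_list[k])
    return Q_list
```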

26 Applications: Low-Rank Matrix Completion
In a previous lecture, we have introduced the low-rank matrix completion problem, which has huge potential in sales recommendation. For example, we would like to predict how much someone is going to like a movie based on his or her movie preferences:

M = [users × movies rating matrix, with many entries unobserved ("?")]

M is assumed to be of low rank, as only a few factors affect users' preferences.

min_{W ∈ R^{m×n}} rank(W) s.t. W_ij = M_ij, (i, j) ∈ Ω

27 An alternative low-rank matrix completion formulation [Wen-Yin-Zhang]:

(*) min_{X,Y,Z} (1/2)||XY − Z||_F^2 s.t. Z_ij = M_ij, (i, j) ∈ Ω

where X ∈ R^{M×L}, Y ∈ R^{L×N}, Z ∈ R^{M×N}, and L is an estimate of the minimum rank.

Advantage of adopting (*): when BCD is applied, each subproblem of (*) has a closed-form solution:

X^{r+1} = Z^r Y^{r,T} (Y^r Y^{r,T})^†,
Y^{r+1} = (X^{r+1,T} X^{r+1})^† (X^{r+1,T} Z^r),
[Z^{r+1}]_{i,j} = [X^{r+1} Y^{r+1}]_{i,j} for (i, j) ∉ Ω, and [Z^{r+1}]_{i,j} = M_{i,j} for (i, j) ∈ Ω,

where (·)^† denotes the pseudo-inverse. A code sketch of these updates is given below.
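
A minimal sketch of these three block updates, with np.linalg.pinv standing in for the pseudo-inverse:

```python
import numpy as np

def lrmc_bcd(M, mask, L, num_iter=200):
    """BCD for (*): min_{X,Y,Z} 0.5*||XY - Z||_F^2 s.t. Z agrees with M on the observed set.

    M    : matrix with the observed values (entries outside `mask` are ignored)
    mask : boolean array, True at the observed positions (i, j) in Omega
    L    : estimate of the minimum rank
    """
    m, n = M.shape
    X, Y = np.random.randn(m, L), np.random.randn(L, n)
    Z = np.where(mask, M, 0.0)
    for _ in range(num_iter):
        X = Z @ Y.T @ np.linalg.pinv(Y @ Y.T)      # X^{r+1}
        Y = np.linalg.pinv(X.T @ X) @ (X.T @ Z)    # Y^{r+1}
        Z = np.where(mask, M, X @ Y)               # Z^{r+1}: keep observed entries fixed
    return X @ Y
```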

28 Applications: Maximizing a Convex Quadratic Function
Consider maximizing a convex quadratic function:

(*) max_x (1/2) x^T Q x + c^T x s.t. x ∈ X

where X is a polyhedral set and Q ⪰ 0.

(*) is equivalent to the following problem¹:

(**) max_{x_1, x_2} (1/2) x_1^T Q x_2 + (1/2) c^T x_1 + (1/2) c^T x_2 s.t. (x_1, x_2) ∈ X × X

When fixing either x_1 or x_2, problem (**) is an LP, thereby efficiently solvable (see the sketch below).

¹ The equivalence is in the following sense: if x* is an optimal solution of (*), then (x*, x*) is optimal for (**); conversely, if (x*_1, x*_2) is an optimal solution of (**), then both x*_1 and x*_2 are optimal for (*).
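
A hedged sketch of the resulting BCD scheme, assuming X is bounded and given in the inequality form {x : A_ub x ≤ b_ub} accepted by scipy.optimize.linprog; each half-update is one LP.

```python
import numpy as np
from scipy.optimize import linprog

def quad_max_bcd(Q, c, A_ub, b_ub, num_cycles=50):
    """Alternating LPs for (**): max 0.5*x1^T Q x2 + 0.5*c^T x1 + 0.5*c^T x2,
    with (x1, x2) in X x X and X = {x : A_ub x <= b_ub} assumed bounded."""
    n = Q.shape[0]
    free = [(None, None)] * n                   # linprog defaults to x >= 0 otherwise
    x1 = linprog(np.zeros(n), A_ub=A_ub, b_ub=b_ub, bounds=free).x   # a feasible point
    x2 = x1.copy()
    for _ in range(num_cycles):
        # fix x2: maximize (0.5*Q x2 + 0.5*c)^T x1, i.e., minimize the negated objective
        x1 = linprog(-(0.5 * Q @ x2 + 0.5 * c), A_ub=A_ub, b_ub=b_ub, bounds=free).x
        # fix x1: maximize (0.5*Q^T x1 + 0.5*c)^T x2
        x2 = linprog(-(0.5 * Q.T @ x1 + 0.5 * c), A_ub=A_ub, b_ub=b_ub, bounds=free).x
    return x1, x2
```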

29 Applications: Nonnegative Matrix Factorization (NMF)
NMF is concerned with the following problem [Lee-Seung]:

min_{U ∈ R^{m×k}, V ∈ R^{k×n}} ||M − UV||_F^2 s.t. U ≥ 0, V ≥ 0    (15)

where M ≥ 0. Usually k ≪ min(m, n) or mk + nk ≪ mn, so NMF can be seen as a linear dimensionality reduction technique for nonnegative data.

30 Example: Image Processing
U ≥ 0 constrains the basis elements to be nonnegative; V ≥ 0 imposes an additive reconstruction; the basis elements extract facial features such as eyes, nose and lips.

Figure (from [Lee-Seung]): comparison of NMF, VQ and the original faces. The NMF basis and encodings contain a large fraction of vanishing coefficients, so both the basis images and the image encodings are sparse; the basis images are non-global parts (mouths, noses and other facial parts in different locations or forms), and the variability of a whole face is generated by combining these different parts.

31 Example: Text Mining
Basis elements allow one to recover the different topics; weights allow one to assign each text to its corresponding topics.

32 Example: Hyperspectral Unmixing
Basis elements U represent different materials; weights V indicate which pixel contains which material.

33 Let's turn back to the NMF problem:

min_{U ∈ R^{m×k}, V ∈ R^{k×n}} ||M − UV||_F^2 s.t. U ≥ 0, V ≥ 0    (16)

Without the ≥ 0 constraints, the optimal U and V can be obtained by the SVD. With the ≥ 0 constraints, problem (16) is generally NP-hard.

When fixing U (resp. V), problem (16) is convex w.r.t. V (resp. U). For example, for a given U, the ith column of V is updated by solving the following NLS problem:

min_{V(:,i) ∈ R^k} ||M(:, i) − U V(:, i)||_2^2 s.t. V(:, i) ≥ 0.    (17)

34 BCD Algorithm for NMF (a code sketch follows):
1: Initialize U = U^0, V = V^0 and r = 0;
2: repeat
3:   solve the NLS problem V* ∈ arg min_{V ∈ R^{k×n}} ||M − U^r V||_F^2 s.t. V ≥ 0;
4:   V^{r+1} = V*;
5:   solve the NLS problem U* ∈ arg min_{U ∈ R^{m×k}} ||M − U V^{r+1}||_F^2 s.t. U ≥ 0;
6:   U^{r+1} = U*;
7:   r = r + 1;
8: until some convergence criterion is met
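
A sketch of this alternating NLS scheme, solving each column-wise subproblem (17) with scipy.optimize.nnls (the row-wise U-update is the transposed problem):

```python
import numpy as np
from scipy.optimize import nnls

def nmf_bcd(M, k, num_iter=50):
    """Alternating NLS for (15)/(16): min ||M - UV||_F^2 s.t. U >= 0, V >= 0."""
    m, n = M.shape
    U = np.abs(np.random.randn(m, k))
    V = np.abs(np.random.randn(k, n))
    for _ in range(num_iter):
        for i in range(n):                     # V-update, one NLS problem (17) per column
            V[:, i], _ = nnls(U, M[:, i])
        for i in range(m):                     # U-update: min ||M[i,:] - U[i,:] V||_2^2
            U[i, :], _ = nnls(V.T, M[i, :])
    return U, V
```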

35 Outline
Majorization Minimization (MM): Convergence, Applications
Block Coordinate Descent (BCD): Applications, Convergence
Summary

36 BCD Convergence The idea of BCD is to divide and conquer. However, there is no free lunch; BCD may get stuck or converge to some point of no interest. Figure 6: BCD for smooth/non-smooth minimization.

37 BCD Convergence

min_x f(x) s.t. x ∈ X = X_1 × X_2 × ... × X_m ⊆ R^n    (18)

A well-known BCD convergence result due to Bertsekas:

Theorem 2. ([Bertsekas]) Suppose that f is continuously differentiable over the convex closed set X. Furthermore, suppose that for each i,

g_i(ξ) ≜ f(x_1, x_2, ..., x_{i-1}, ξ, x_{i+1}, ..., x_m)

is strictly convex. Let {x^r} be the sequence generated by the BCD method. Then every limit point of {x^r} is a stationary point of problem (18).

If X is (convex) compact, i.e., closed and bounded, then strict convexity of g_i(ξ) can be relaxed to having a unique optimal solution.

38 Application: Iterative water-filling for MIMO MAC sum capacity maximization:

(*) max_{{Q_k}_{k=1}^K} log det( Σ_{k=1}^K H_k Q_k H_k^H + I ) s.t. Tr(Q_k) ≤ P_k, Q_k ⪰ 0, ∀k

Iterative water-filling converges to a globally optimal solution of (*), because
1. each BCD subproblem is strictly convex (assuming full column rank of H_k);
2. each X_k is a convex closed subset;
3. (*) is a convex problem, so a stationary point is a globally optimal solution.

39 Generalization of Bertsekas' Convergence Result
Generalization 1: Relax Strict Convexity to Strict Quasiconvexity² [Grippo-Sciandrone]

Theorem 3. Suppose that the function f is continuously differentiable and strictly quasiconvex with respect to x_i on X, for each i = 1, ..., m − 2, and that the sequence {x^r} generated by the BCD method has limit points. Then, every limit point is a stationary point of problem (18).

Application: Low-Rank Matrix Completion

(*) min_{X,Y,Z} (1/2)||XY − Z||_F^2 s.t. Z_ij = M_ij, (i, j) ∈ Ω

m = 3 and (*) is strictly convex w.r.t. Z ⇒ BCD converges to a stationary point.

² f is strictly quasiconvex w.r.t. x_i ∈ X_i on X if for every x ∈ X and y_i ∈ X_i with y_i ≠ x_i we have f(x_1, ..., t x_i + (1 − t) y_i, ..., x_m) < max{ f(x), f(x_1, ..., y_i, ..., x_m) }, ∀t ∈ (0, 1).

40 Generalization 2: Without Solution Uniqueness

Theorem 4. Suppose that f is pseudoconvex³ on X and that L^0_X := {x ∈ X : f(x) ≤ f(x^0)} is compact. Then, the sequence generated by the BCD method has limit points and every limit point is a global minimizer of f.

Application: Iterative water-filling for MIMO MAC sum capacity maximization

max_{{Q_k}_{k=1}^K} log det( Σ_{k=1}^K H_k Q_k H_k^H + I ) s.t. Tr(Q_k) ≤ P_k, Q_k ⪰ 0, k = 1, ..., K

f is convex, thus pseudoconvex; {Q_k : Tr(Q_k) ≤ P_k, Q_k ⪰ 0} is compact; iterative water-filling converges to a globally optimal solution.

³ f is pseudoconvex if for all x, y ∈ X such that ∇f(x)^T (y − x) ≥ 0, we have f(y) ≥ f(x). Notice that convex ⇒ pseudoconvex ⇒ quasiconvex.

41 Generalization 3: Without Solution Uniqueness, Pseudoconvexity and Compactness

Theorem 5. Suppose that f is continuously differentiable, and that X is convex and closed. Moreover, if there are only two blocks, i.e., m = 2, then every limit point generated by BCD is a stationary point of f.

Application: NMF

min_{U ∈ R^{m×k}, V ∈ R^{k×n}} ||M − UV||_F^2 s.t. U ≥ 0, V ≥ 0

Alternating NLS converges to a stationary point of the NMF problem, since the objective is continuously differentiable, the feasible set is convex and closed, and m = 2.

42 Summary
MM and BCD have great potential in handling nonconvex problems and realizing fast/distributed implementations for large-scale convex problems;
many well-known algorithms can be interpreted as special cases of MM and BCD;
under some conditions, convergence to a stationary point can be guaranteed by MM and BCD.

43 References
M. Razaviyayn, M. Hong, and Z.-Q. Luo, "A unified convergence analysis of block successive minimization methods for nonsmooth optimization," submitted to SIAM Journal on Optimization.
L. Grippo and M. Sciandrone, "On the convergence of the block nonlinear Gauss-Seidel method under convex constraints," Operations Research Letters, vol. 26, 2000.
E. J. Candes, M. B. Wakin, and S. P. Boyd, "Enhancing sparsity by reweighted l1 minimization," J. Fourier Anal. Appl., vol. 14, 2008.
M. Zibulevsky and M. Elad, "l1-l2 optimization in signal and image processing," IEEE Signal Processing Magazine, May 2010.
D. P. Bertsekas, Nonlinear Programming, Athena Scientific, 1st Ed., 1995.
W. Yu and J. M. Cioffi, "Sum capacity of a Gaussian vector broadcast channel," IEEE Trans. Inf. Theory, vol. 50, no. 1.

44 Z. Wen, W. Yin, and Y. Zhang, "Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm," Rice CAAM Tech Report.
D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference, MIT Press.
