Reweighted Nuclear Norm Minimization with Application to System Identification

Karthik Mohan and Maryam Fazel

Electrical Engineering Department, University of Washington, Seattle. karna@uwashington.edu, mfazel@uwashington.edu. Research funded in part by NSF CAREER grant ECCS.

Abstract: The matrix rank minimization problem consists of finding a matrix of minimum rank that satisfies given convex constraints. It is NP-hard in general and has applications in control, system identification, and machine learning. Reweighted trace minimization has been considered as an iterative heuristic for this problem. In this paper, we analyze the convergence of this iterative heuristic, showing that the difference between successive iterates tends to zero. Then, after reformulating the heuristic as reweighted nuclear norm minimization, we propose an efficient gradient-based implementation that takes advantage of the new formulation and opens the way to solving large-scale problems. We apply this algorithm to the problem of low-order system identification from input-output data. Numerical examples demonstrate that reweighted nuclear norm minimization makes model order selection easier and results in lower order models compared to nuclear norm minimization without weights.

I. INTRODUCTION

A. Background

The matrix rank minimization problem consists of finding a matrix of minimum rank that satisfies given convex constraints, i.e.,

    minimize    rank(X)
    subject to  X ∈ C,                                                  (1)

where X ∈ R^{m×n} is the optimization variable and C is a convex set. When C is described by affine equality constraints, (1) is the matrix extension of the popular sparse signal recovery problem in compressed sensing. The rank minimization problem arises in a diverse set of fields, where notions of order, complexity, or dimension are expressed by means of the rank of an appropriate matrix. Applications include system identification, low-order controller design, collaborative filtering in machine learning, and Euclidean embedding problems (see [14] and references therein).

Problem (1) is in general NP-hard. A common convex heuristic [7] replaces rank with the nuclear norm (also known as the Schatten 1-norm or trace norm) of the matrix, denoted by ||X||_* = Σ_{i=1}^r σ_i(X), where σ_i(X) are the singular values and r = rank(X). The heuristic solves the convex problem

    minimize    ||X||_*
    subject to  X ∈ C.                                                  (2)

This heuristic and its variants have lately received a lot of interest. One reason is the recent progress in both the theory and the algorithms for this heuristic. Several conditions have been studied that guarantee that the heuristic yields an exact solution when the constraints are linear equalities (e.g., [14]). The development of various classes of algorithms for this heuristic that exploit specific problem structure is also an active research area (e.g., [17]). Finally, new applications have initiated interest in special cases of the rank minimization problem, e.g., the low-rank matrix completion problem [4] arising in machine learning. We also mention that matrix rank and nuclear norm minimization have a natural connection to vector sparsity and the l1-norm: the former reduces to the latter if the matrix variable is taken to be diagonal.

A variation on this basic heuristic that helps reduce the rank of the solution further is to use a weighted objective (see [3], [6] for the vector version and [8] for the matrix version). In this paper we study the reweighted trace heuristic, which is based on using a nonconvex surrogate function for the rank and solving the resulting problem locally via a sequence of convex problems. First, note that problem (1) can also be expressed in a
positive semidefinite form [9]:

    minimize    (1/2)(rank(Y) + rank(Z))
    subject to  [Y, X; X^T, Z] ⪰ 0,   X ∈ C,                            (3)

where X ∈ R^{m×n}, Y ∈ R^{m×m}, and Z ∈ R^{n×n} are the optimization variables, and [Y, X; X^T, Z] denotes the symmetric block matrix with blocks Y, X, X^T, Z. Then, replacing rank with trace, we obtain a semidefinite programming problem that is equivalent to (2) [7]. The heuristic given in [8] replaces the rank of the positive semidefinite matrices Y, Z by a smooth surrogate as follows:

    minimize    log det(Y + δI) + log det(Z + δI)
    subject to  [Y, X; X^T, Z] ⪰ 0,   X ∈ C,                            (4)

where δ > 0 is a small regularization constant. Problem (4) can be solved locally by iterative linearization of the objective. The k-th step of this algorithm solves the problem

    minimize    Tr((Y^k + δI)^{-1} Y) + Tr((Z^k + δI)^{-1} Z)
    subject to  [Y, X; X^T, Z] ⪰ 0,   X ∈ C                             (5)

to get (X^{k+1}, Y^{k+1}, Z^{k+1}). Throughout this paper, we refer to (5) as the reweighted trace heuristic (RTH). We often take Y^0 and Z^0 to be the identity, so that the first iteration of the algorithm simply minimizes Tr Y + Tr Z.
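For concreteness, one step of (5) can be prototyped as a small semidefinite program. The following Python/CVXPY sketch is illustrative and not from the paper: it assumes, purely for illustration, that C is given by affine constraints ⟨A_i, X⟩ = b_i, and the helper name rth_step and the inputs A_list and b are hypothetical.

# Sketch of one reweighted trace heuristic (RTH) step, problem (5), assuming
# C = {X : <A_i, X> = b_i}. Requires numpy and cvxpy.
import numpy as np
import cvxpy as cp

def rth_step(Yk, Zk, A_list, b, m, n, delta=1e-2):
    # Weights computed from the previous iterate.
    Wy = np.linalg.inv(Yk + delta * np.eye(m))   # (Y^k + delta I)^{-1}
    Wz = np.linalg.inv(Zk + delta * np.eye(n))   # (Z^k + delta I)^{-1}
    # A single symmetric block variable M = [Y, X; X^T, Z] constrained to be PSD.
    M = cp.Variable((m + n, m + n), symmetric=True)
    Y, X, Z = M[:m, :m], M[:m, m:], M[m:, m:]
    objective = cp.Minimize(cp.trace(Wy @ Y) + cp.trace(Wz @ Z))
    constraints = [M >> 0] + [cp.trace(Ai.T @ X) == bi for Ai, bi in zip(A_list, b)]
    cp.Problem(objective, constraints).solve()
    return X.value, Y.value, Z.value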

B. Summary of results

We examine the convergence of the reweighted trace heuristic, showing that the difference between successive iterates of the heuristic tends to zero. We give an example of other concave functions that can be used as a surrogate for rank in (3), giving rise to other heuristics that have similar convergence properties. RTH as given in (5) requires solving a semidefinite program (SDP) at each iteration; we reformulate the iterations in terms of the matrix nuclear norm, to which several first-order gradient-based methods can be applied efficiently. We apply the reweighted nuclear norm heuristic to a classic system identification problem, finding a low-order system from input-output data, and use the gradient projection method to implement the heuristic efficiently. We give a numerical example (from the system identification database [11]) and show that the reweighted nuclear norm heuristic gives a clearer description of the model order and results in lower order models compared to a simple subspace method and also to the unweighted nuclear norm minimization as in (2).

C. Related work

The papers [3], [6] study reweighted l1 minimization as a heuristic for finding the sparsest vector in a convex set (a special case of the rank minimization problem). It is shown that this heuristic works better in practice than l1 minimization [3]. RTH was first proposed in [8]. While that paper shows that the objective function value converges, it does not discuss whether the iterates themselves converge or whether the difference between successive iterates converges. That the difference between iterates goes to zero was shown for the reweighted l1 heuristic in [6], but the proof there does not extend to the matrix case. The paper [17] gives an interior point method for nuclear norm minimization and applies it to system identification as an alternative approach to subspace-based identification methods (see, e.g., [12], [5]). It shows that nuclear norm minimization determines the lowest system order better than existing methods. The interior point implementation is more efficient than generic SDP solvers; however, it does not scale as well as the first-order implementation we discuss here.

II. CONVERGENCE OF THE REWEIGHTED TRACE HEURISTIC

Let X ∈ R^{m×n}, Y ∈ R^{m×m}, Z ∈ R^{n×n}. Define the function g : R^{m×n} × S^m_+ × S^n_+ → R as

    g(X, Y, Z) = log det(Y + δI) + log det(Z + δI).

Note that g(X, Y, Z) is a continuously differentiable function over its domain. The gradient of g is given by

    ∇g(X, Y, Z) = (0, (Y + δI)^{-1}, (Z + δI)^{-1}).

Let W̄ = (W_1, W_2, W_3) and X̄ = (X, Y, Z) be two points in the domain of g. Define the inner product on the product space R^{m×n} × S^m_+ × S^n_+ as

    ⟨W̄, X̄⟩_c = ⟨W_1, X⟩ + ⟨W_2, Y⟩ + ⟨W_3, Z⟩,

where ⟨X, Y⟩ is the standard inner product between two matrices. Since g is strictly concave on its domain, we have

    g(X̄) < g(W̄) + ⟨∇g(W̄), X̄ - W̄⟩_c =: h(X̄, W̄)   for all W̄ ≠ X̄ in D,

where

    D = {(X, Y, Z) : [Y, X; X^T, Z] ⪰ 0, X ∈ C}.

Since g(X̄) < h(X̄, W̄) for all W̄ ≠ X̄ in D and g(X̄) = h(X̄, X̄) for all X̄ ∈ D, the function h majorizes g. Let X̄^k denote (X^k, Y^k, Z^k). RTH is thus a majorization-minimization (MM) algorithm:

    X̄^{k+1} = argmin_{X̄ ∈ D} h(X̄, X̄^k) = argmin_{X̄ ∈ D} ⟨∇g(X̄^k), X̄⟩_c
            = argmin_{X̄ ∈ D} Tr((Y^k + δI)^{-1} Y) + Tr((Z^k + δI)^{-1} Z).        (6)

Note that if X̄^k ≠ X̄^{k+1},

    g(X̄^{k+1}) < h(X̄^{k+1}, X̄^k) ≤ h(X̄^k, X̄^k) = g(X̄^k).                        (7)

We first define a stationary point of a function before giving the theorem on convergence.

Definition II.1. x* is a stationary point of a continuously differentiable function f over a set D if x* ∈ argmin_{y ∈ D} ∇f(x*)^T y.

Theorem II.2. Every convergent subsequence of the reweighted trace heuristic (5) converges to a stationary point of g over D. Further, the norm of the difference between successive iterates tends to zero, ||X̄^{k+1} - X̄^k||_F → 0.

Proof: Let X̄^1 = (X^1, Y^1, Z^1) be the solution to the nuclear norm minimization problem (2).
The set {X̄ : g(X̄) ≤ g(X̄^1)} is bounded, because ||Y||_F → ∞ or ||Z||_F → ∞ implies g(X̄) → ∞. Also, if ||X||_F → ∞, then ||Y||_F ||Z||_F → ∞ (since the block matrix is positive semidefinite), hence g(X̄) → ∞. Thus, {X̄ : g(X̄) ≤ g(X̄^1)} is a compact set. Therefore the iterates of RTH are bounded (since g(X̄^{k+1}) ≤ g(X̄^k) for all k), and the sequence {X̄^k} has a convergent subsequence. Let {X̄^{n_i}}, i = 1, 2, ..., denote this subsequence, with limit X̄*, and let X̄^{n_i + 1} → X̄_a (passing to a further subsequence if necessary). Assume that X̄* ≠ X̄_a. Then (7), together with the fact that g is continuously differentiable, implies that

    g(X̄*) = lim g(X̄^{n_i}) > lim g(X̄^{n_i + 1}) = g(X̄_a).                         (8)

(7) also implies that g(X̄^{i+1}) ≤ g(X̄^i) for all i. Since g is bounded below on the compact sublevel set containing the iterates, {g(X̄^i)} converges and

    lim g(X̄^i) = lim g(X̄^{n_i}) = g(X̄*) = lim g(X̄^{n_i + 1}) = g(X̄_a).            (9)

But (9) contradicts (8); hence X̄* = X̄_a. Since X̄^{n_i + 1} minimizes ⟨∇g(X̄^{n_i}), ·⟩_c over D, passing to the limit shows that X̄* minimizes ⟨∇g(X̄*), ·⟩_c over D, so by Definition II.1, X̄* is a stationary point. Now, assume that there exist a subsequence and a constant δ' > 0 such that ||X̄^{n_i} - X̄^{n_i + 1}||_F > δ', i = 1, 2, .... Let {X̄^{n_i}} → X̄* and {X̄^{n_i + 1}} → X̄_a ≠ X̄*. Then, by a similar argument as above, we arrive at a contradiction. Thus X̄* = X̄_a, and it holds that ||X̄^{k+1} - X̄^k||_F → 0.
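The monotone decrease (7) and the vanishing differences of Theorem II.2 are easy to observe numerically. The snippet below is an illustrative sketch, not from the paper: it reuses the hypothetical rth_step helper from the earlier sketch on randomly generated affine constraints and tracks g(X̄^k) and ||X^{k+1} - X^k||_F.

# Sketch: run RTH from the identity initialization and watch the surrogate g
# decrease while the difference between successive iterates shrinks.
import numpy as np

def g_surrogate(Y, Z, delta=1e-2):
    return (np.linalg.slogdet(Y + delta * np.eye(Y.shape[0]))[1]
            + np.linalg.slogdet(Z + delta * np.eye(Z.shape[0]))[1])

m, n, n_constraints = 8, 8, 10
rng = np.random.default_rng(0)
A_list = [rng.standard_normal((m, n)) for _ in range(n_constraints)]
X_true = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))   # rank 2
b = [np.trace(Ai.T @ X_true) for Ai in A_list]

Yk, Zk, Xk = np.eye(m), np.eye(n), np.zeros((m, n))   # first step: min Tr Y + Tr Z
for k in range(10):
    Xn, Yn, Zn = rth_step(Yk, Zk, A_list, b, m, n)
    print(k, g_surrogate(Yn, Zn), np.linalg.norm(Xn - Xk, 'fro'))
    Xk, Yk, Zk = Xn, Yn, Zn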

A. Convergence through the conditional gradient method

For a generic problem with a continuously differentiable objective function f : R^n → R and a convex, compact constraint set C, the conditional gradient method [1] yields the iterates

    x̄^k = argmin_{x ∈ C} ⟨∇f(x^k), x⟩,     x^{k+1} = x^k + α^k (x̄^k - x^k).

If we use the so-called limited minimization rule to pick the step size α^k, we get α^k = 1, since the objective function f is concave. Thus, the heuristic (5) is the same as the conditional gradient method applied to problem (4). It is known (e.g., [1]) that every cluster point of the conditional gradient iterates is a stationary point of f over the set C, and we thus obtain the first part of the convergence result in the previous theorem.

B. Other surrogate functions

We considered the concave surrogate function log det(X) in (4). We can similarly use other concave surrogates such as Tr(X^{1/2}) (see, e.g., [2] for a proof of concavity) and apply the conditional gradient method or the majorization-minimization algorithm to obtain other heuristics for which the same convergence results hold. For example, using the surrogate function Tr(X^{1/2}) yields the following heuristic:

    X̄^{k+1} = argmin_{X̄ ∈ D} Tr((Y^k + δI)^{-1/2} Y) + Tr((Z^k + δI)^{-1/2} Z),     (10)

with X̄ and D as defined in Section II. Comparing the performance of these heuristics is a direction for future work.

This section establishes the convergence of the difference of successive iterates of RTH and shows that the limit point of every convergent subsequence is a stationary point. However, we cannot say whether the trace heuristic achieves the global minimum of (4). We initialize the weighted iterations with the solution of the nuclear norm minimization (2); thus RTH can be thought of as improving on the solution of the nuclear norm heuristic.

III. REWEIGHTED NUCLEAR NORM HEURISTIC

Recall from (5) that in the k-th iteration of the reweighted trace heuristic we solve the following problem:

    minimize    Tr((Y^k + δI)^{-1} Y) + Tr((Z^k + δI)^{-1} Z)
    subject to  [Y, X; X^T, Z] ⪰ 0,   X ∈ C.                                      (11)

In this section, we reformulate this problem as a (reweighted) nuclear norm minimization problem. This reformulation is algorithmically beneficial: it allows us to exploit the properties of the nuclear norm and the problem structure to obtain an efficient first-order algorithm for solving the heuristic. Let W_1^k = (Y^k + δI)^{-1/2} and W_2^k = (Z^k + δI)^{-1/2}. Since W_1^k, W_2^k are positive definite for any feasible Y^k, Z^k and δ > 0, the constraint in (11) is equivalent to

    [W_1^k, 0; 0, W_2^k] [Y, X; X^T, Z] [W_1^k, 0; 0, W_2^k] ⪰ 0.

Thus, problem (11) is equivalent to

    minimize    (1/2)(Tr(W_1^k Y W_1^k) + Tr(W_2^k Z W_2^k))
    subject to  [W_1^k Y W_1^k, W_1^k X W_2^k; W_2^k X^T W_1^k, W_2^k Z W_2^k] ⪰ 0,   X ∈ C.     (12)

Using the following characterization of the nuclear norm (see, e.g., [7], [14]),

    ||X||_* = (1/2) min { Tr Y + Tr Z : [Y, X; X^T, Z] ⪰ 0 },

we can write problem (12) as

    minimize    ||W_1^k X W_2^k||_*
    subject to  X ∈ C,                                                            (13)

which is a (weighted) nuclear norm minimization. Once the optimal solution X^{k+1} is found, the weights W_1^{k+1} and W_2^{k+1} are updated as follows. Let W_1^k X^{k+1} W_2^k = U Σ V^T be the reduced singular value decomposition of W_1^k X^{k+1} W_2^k, where U ∈ R^{m×r}, Σ ∈ R^{r×r}, and V ∈ R^{n×r}. It can be checked [14] that the optimal Y^{k+1} and Z^{k+1} in (12) are given by

    Y^{k+1} = (W_1^k)^{-1} U Σ U^T (W_1^k)^{-1},     Z^{k+1} = (W_2^k)^{-1} V Σ V^T (W_2^k)^{-1},     (14)

so the weights can be updated as

    W_1^{k+1} = (Y^{k+1} + δI)^{-1/2},     W_2^{k+1} = (Z^{k+1} + δI)^{-1/2}.      (15)

The update equations (14) and (15), together with (13), describe the reweighted nuclear norm heuristic.
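In code, the updates (14) and (15) amount to one reduced SVD, two matrix inverses, and an inverse matrix square root. The following NumPy sketch is a direct transcription of the formulas; the helper name update_weights and the rank tolerance tol are ours, not the paper's.

# Sketch of the weight update (14)-(15): given W1k, W2k and the new iterate
# X_new, form Y^{k+1}, Z^{k+1} from the reduced SVD of W1k @ X_new @ W2k and
# return the next weights W1^{k+1}, W2^{k+1}.
import numpy as np

def update_weights(W1k, W2k, X_new, delta=1e-2, tol=1e-8):
    U, s, Vt = np.linalg.svd(W1k @ X_new @ W2k, full_matrices=False)
    r = int(np.sum(s > tol))                      # keep the numerical rank
    U, s, Vt = U[:, :r], s[:r], Vt[:r, :]
    W1_inv = np.linalg.inv(W1k)
    W2_inv = np.linalg.inv(W2k)
    Y_next = W1_inv @ (U * s) @ U.T @ W1_inv      # (W1k)^{-1} U S U^T (W1k)^{-1}
    Z_next = W2_inv @ (Vt.T * s) @ Vt @ W2_inv    # (W2k)^{-1} V S V^T (W2k)^{-1}
    # W^{k+1} = (. + delta I)^{-1/2} via an eigendecomposition.
    def inv_sqrt(S):
        w, Q = np.linalg.eigh(S + delta * np.eye(S.shape[0]))
        return Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T
    return inv_sqrt(Y_next), inv_sqrt(Z_next)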
If the set C in (13) is described by convex constraints f_i(X) ≤ 0, i = 1, ..., m, we can write the problem in the regularized form

    X^{k+1} = argmin_X  Σ_i λ_i f_i(X) + ||W_1^k X W_2^k||_*,                      (16)

with W_1^k, W_2^k defined above and a suitable choice of the λ_i. We note that if, in addition to X ∈ C, we have the constraint that X be positive semidefinite, then the reweighted nuclear norm heuristic reduces to

    X^{k+1} = argmin_{X ∈ C, X ⪰ 0}  Tr((X^k + δI)^{-1} X).                        (17)

In the next section, we apply this regularized reweighted nuclear norm heuristic (which we abbreviate as RRNH) to a system identification problem using an efficient first-order method.
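For small instances, one step of the regularized form (16) can also be prototyped directly with an off-the-shelf modeling tool before moving to the first-order implementation of the next section. The sketch below is illustrative only: it uses a single quadratic data-fit penalty (λ/2)||A_op(X) - b||^2 in place of Σ_i λ_i f_i(X), and the function name rrnh_step_cvxpy and the linear map A_op are our own stand-ins.

# Sketch of one regularized reweighted nuclear norm step (16) with a single
# quadratic penalty standing in for sum_i lambda_i f_i(X).
import cvxpy as cp

def rrnh_step_cvxpy(W1k, W2k, A_op, b, shape, lam):
    X = cp.Variable(shape)
    objective = cp.Minimize(0.5 * lam * cp.sum_squares(A_op(X) - b)
                            + cp.normNuc(W1k @ X @ W2k))
    cp.Problem(objective).solve()
    return X.value

# Example A_op: observe a few entries of X (matrix-completion-style data fit), e.g.
#   idx = [(0, 3), (1, 0), (2, 2)]
#   A_op = lambda X: cp.hstack([X[i, j] for i, j in idx])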

IV. EFFICIENT IMPLEMENTATION OF THE RRNH FOR SYSTEM IDENTIFICATION

Consider the problem of identifying a discrete-time, linear time-invariant state-space model,

    x(t+1) = A x(t) + B u(t)
    y(t)   = C x(t) + D u(t),

given a set of inputs u(t) ∈ R^m and noisy measured outputs y_meas(t) ∈ R^p, for t = 0, 1, ..., N-1. Here x(t) ∈ R^n is the state of the system at time t, and n is the order of the model. We would like to find the matrices A, B, C, D, the initial state x(0), and the lowest possible order n such that y(t) ≈ y_meas(t). Let

    Ŷ = [y(0), ..., y(N-1)] ∈ R^{p×N},
    Y_meas = [y_meas(0), ..., y_meas(N-1)] ∈ R^{p×N},
    Û = [u(0), ..., u(N-1)] ∈ R^{m×N}.

Define the linear operator H_r as follows:

    H_r(Ŷ) = [ y(0)   y(1)    ⋯   y(N-r-1) ]
             [ y(1)   y(2)    ⋯   y(N-r)   ]
             [  ⋮       ⋮             ⋮     ]
             [ y(r)   y(r+1)  ⋯   y(N-1)   ],                                      (18)

which is a block-Hankel matrix, with Ŷ defined as earlier. The adjoint operator H_r^* maps a matrix W, with p-dimensional block entries w_{ij} (i = 1, ..., r+1, j = 1, ..., N-r) conforming to the block structure of H_r(Ŷ), to the p×N matrix obtained by summing its block anti-diagonals:

    H_r^*(W) = [ w_{11},  w_{12} + w_{21},  w_{13} + w_{22} + w_{31},  ...,  w_{r+1,N-r} ].
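Since the first-order method below only ever applies H_r and H_r^*, it is worth noting how simple these two maps are to implement. The NumPy sketch below (the helper names hankel_op and hankel_adj are ours, not the paper's) also checks the adjoint identity ⟨H_r(Ŷ), W⟩ = ⟨Ŷ, H_r^*(W)⟩ on random data.

# Sketch of the block-Hankel operator H_r (18) and its adjoint H_r^*.
# Yhat has shape (p, N); H_r(Yhat) has r+1 block rows and N-r block columns.
import numpy as np

def hankel_op(Yhat, r):
    p, N = Yhat.shape
    return np.vstack([Yhat[:, i:i + N - r] for i in range(r + 1)])

def hankel_adj(W, p, N, r):
    # W has shape ((r+1)*p, N-r); sum the block anti-diagonals back into p x N.
    out = np.zeros((p, N))
    for i in range(r + 1):
        out[:, i:i + N - r] += W[i * p:(i + 1) * p, :]
    return out

# Adjoint check: <H_r(Y), W> == <Y, H_r^*(W)> for random Y, W.
rng = np.random.default_rng(0)
p, N, r = 2, 12, 3
Y = rng.standard_normal((p, N)); W = rng.standard_normal(((r + 1) * p, N - r))
assert np.isclose(np.sum(hankel_op(Y, r) * W), np.sum(Y * hankel_adj(W, p, N, r)))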
Define X = [x(0), x(1), ..., x(N-r-1)], U = H_r(Û), and Y = H_r(Ŷ), with X, Û, and Ŷ as defined earlier, and let

    G = [ C^T  (CA)^T  ⋯  (CA^r)^T ]^T

be the extended observability matrix, and let F be the block lower-triangular Toeplitz matrix of Markov parameters,

    F = [ D           0           0    ⋯   0 ]
        [ CB          D           0    ⋯   0 ]
        [ CAB         CB          D    ⋯   0 ]
        [  ⋮                                 ]
        [ CA^{r-1}B   CA^{r-2}B   ⋯        D ].

It is easy to see that Y = GX + FU, and thus YU⊥ = GXU⊥, where U⊥ ∈ R^{(N-r)×q} is a matrix whose columns form an orthonormal basis for the null space of U. If X has rank n and there is no rank cancellation in XU⊥, one can find the system order from the rank of YU⊥ (see, e.g., [17], [16] for more details). Liu and Vandenberghe [16], [17] propose a nuclear norm heuristic for minimizing the rank of YU⊥:

    minimize_Ŷ  ||H_r(Ŷ) U⊥||_* + (λ/2) ||Ŷ - Y_meas||²_F,                         (19)

with λ a positive parameter. We apply the RRNH to find a minimum order system, giving the following iterative minimization:

    Ŷ^{k+1} = argmin_Ŷ  (λ/2) ||Ŷ - Y_meas||²_F + ||W_1^k H_r(Ŷ) U⊥ W_2^k||_*,      (20)

where we take f_1 = (1/2)||Ŷ - Y_meas||²_F in (16) and W_1^k, W_2^k are as in (13). Once we obtain an optimal Ŷ, we compute the rank of H_r(Ŷ)U⊥ by looking at its singular values. The rule of thumb we use to obtain the rank is the number of singular values after which there is a sharp drop (differentiating the significant singular values from the insignificant ones). If we do not observe a sharp drop, the rule of thumb used is to choose the rank of the matrix to be the number of singular values that are within 10 percent of the largest singular value (as in [17]). Once we identify the rank of YU⊥, the matrices A, B, C, D of the LTI state-space model can be estimated as detailed in Section 5 of [17].

A. Problem Reformulation

To solve (20) efficiently, we reformulate it by making use of the structure of the regularized problem (19) and the fact that the nuclear norm is the dual of the spectral norm,

    ||Y||_* = max { ⟨Y, Z⟩ : ||Z|| ≤ 1 }.                                           (21)

Using (21), the primal problem (20) at the k-th iteration can be formulated as (note that switching Z to -Z does not change (21))

    min_Ŷ  max_{Z: Z^T Z ⪯ I}  (λ/2)||Ŷ - Y_meas||²_F - ⟨Z, W_1^k H_r(Ŷ) U⊥ W_2^k⟩.   (22)

The dual problem corresponding to (22) is obtained by interchanging the min and the max in the primal:

    max_{Z: Z^T Z ⪯ I}  min_Ŷ  (λ/2)||Ŷ - Y_meas||²_F - ⟨Z, W_1^k H_r(Ŷ) U⊥ W_2^k⟩.

Define the operator Φ_k : R^{p×N} → R^{(r+1)p×q}, k = 0, 1, 2, ..., with

    Φ_k(Ŷ) = W_1^k H_r(Ŷ) U⊥ W_2^k.

It is easy to check that the adjoint operator Φ_k^* : R^{(r+1)p×q} → R^{p×N} is given by Φ_k^*(Z) = H_r^*(W_1^k Z W_2^k U⊥^T). The dual problem can now be rewritten as

    max_{Z: Z^T Z ⪯ I}  min_Ŷ  (λ/2)||Ŷ - Y_meas||²_F - ⟨Φ_k^*(Z), Ŷ⟩.              (23)

Minimizing over Ŷ, the optimality condition gives

    λ(Ŷ - Y_meas) - Φ_k^*(Z) = 0.                                                   (24)

Note that the primal (22) is a convex problem and satisfies Slater's condition; hence the duality gap between (22) and (23) is zero. Thus the primal optimal solution can be obtained from the dual optimal solution, which is the basis for the implementation described later. Substituting Ŷ from (24) back into (23), the dual problem reduces to

    min_{Z: Z^T Z ⪯ I}  (1/(2λ)) ||Φ_k^*(Z)||²_F + ⟨Y_meas, Φ_k^*(Z)⟩.               (25)

Rescaling the dual variable by λ, so that the objective is independent of λ, gives

    min_{Z: λ² Z^T Z ⪯ I}  (1/2) ||Φ_k^*(Z)||²_F + ⟨Y_meas, Φ_k^*(Z)⟩.               (26)

The RRNH for the system identification application, implemented using an efficient first-order method (i.e., the gradient projection method applied to the dual; see, e.g., [15]), can be summarized as follows:

1) Set k = 0. Initialize W_1^0 = I, W_2^0 = I.
2) Solve the dual problem (26) using the gradient projection algorithm to obtain Z^{k+1}.
3) Obtain Ŷ^{k+1} = Y_meas + (1/λ) Φ_k^*(λ Z^{k+1}) (using (24)).
4) Let U Σ V^T be the reduced SVD of Ȳ^{k+1} = W_1^k H_r(Ŷ^{k+1}) U⊥ W_2^k. Set

       W_1^{k+1} = ((W_1^k)^{-1} U Σ U^T (W_1^k)^{-1} + δI)^{-1/2},
       W_2^{k+1} = ((W_2^k)^{-1} V Σ V^T (W_2^k)^{-1} + δI)^{-1/2}.

5) Stop if the termination criterion is satisfied; else set k = k + 1 and go to step 2.

Define

    D_k(Z) = (1/2) ||Φ_k^*(Z)||²_F + ⟨Y_meas, Φ_k^*(Z)⟩

to be the dual objective in (26). Step 2 of the above algorithm applies the gradient projection method to solve the dual (26). We note that the projection method works well when the step size in the gradient-descent step of the method is chosen to be inversely proportional to the Lipschitz constant of ∇D_k (see, e.g., [15]). An estimate of the Lipschitz constant of ∇D_k can be obtained as L_k = r λ²_max(W_1^k) λ²_max(W_2^k), with the details given in the Appendix.
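Concretely, each dual iteration needs one application of Φ_k^*, one of Φ_k, and a projection onto {Z : λ² Z^T Z ⪯ I} = {Z : ||Z|| ≤ 1/λ}, which is an SVD with singular values clipped at 1/λ. The following NumPy sketch of step 2 (and the primal recovery of step 3) is illustrative only; it reuses the hypothetical hankel_op and hankel_adj helpers from the earlier sketch and uses a fixed iteration count in place of a duality-gap test.

# Sketch of the dual gradient projection (step 2 of the RRNH algorithm).
# Phi_k(Yhat) = W1k @ H_r(Yhat) @ Uperp @ W2k, and the gradient of the dual
# objective D_k is Phi_k(Phi_k^*(Z) + Y_meas).
import numpy as np

def solve_dual(W1k, W2k, Uperp, Y_meas, r, lam, iters=500):
    p, N = Y_meas.shape
    Phi = lambda Yh: W1k @ hankel_op(Yh, r) @ Uperp @ W2k
    Phi_adj = lambda Z: hankel_adj(W1k @ Z @ W2k @ Uperp.T, p, N, r)
    # Step size 1/L_k with L_k = r * lmax(W1k)^2 * lmax(W2k)^2 (see the Appendix).
    L = r * np.linalg.norm(W1k, 2) ** 2 * np.linalg.norm(W2k, 2) ** 2
    Z = np.zeros((W1k.shape[0], W2k.shape[0]))
    for _ in range(iters):
        grad = Phi(Phi_adj(Z) + Y_meas)
        U, s, Vt = np.linalg.svd(Z - grad / L, full_matrices=False)
        Z = U @ np.diag(np.minimum(s, 1.0 / lam)) @ Vt   # project onto ||Z|| <= 1/lam
    return Z

# Primal recovery (step 3): Yhat = Y_meas + Phi_adj(lam * Z) / lam = Y_meas + Phi_adj(Z).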
B. Numerical results

In [16], Liu and Vandenberghe mention that the main advantage of the nuclear norm technique is that it makes the selection of an appropriate model order easier. We present an example in which the RRNH improves on the nuclear norm technique for model order selection. We apply the RRNH implementation (the algorithm summarized above) to one of the data sets (96-006) available from the DaISy database (smc/daisy/) [11]. The parameter r, which determines the number of block rows in the block-Hankel matrices U = H_r(Û) and Y = H_r(Ŷ), is chosen so that the number of rows is greater than the expected system order. We choose r sufficiently large, i.e., r such that rp = 60, where p is the size of the output of the system. The parameter λ is chosen to give approximately the smallest identification error when RRNH is run for one iteration (i.e., just the nuclear norm heuristic as in (19)). The identification error is given by

    e_I = ( ||Y_measI - Ỹ||²_F / ||Y_measI - Ȳ_I||²_F )^{1/2},                       (27)

where Ỹ = [ỹ(0), ..., ỹ(N_I - 1)] denotes the output of the identified state-space model, Ȳ_I has each of its N_I columns equal to the mean of the columns of Y_measI, and Y_measI ∈ R^{p×N_I} denotes the first N_I output measurements. Similarly, the validation error e_V can be obtained by replacing N_I with N_V in the computations in (27). The number of data points used for the identification experiment is N_I = 50, and the number used for computing the validation error is N_V = 400.

The trade-off curve between the identification error, the validation error, and the nuclear norm for the data set is shown in Fig. 1. We pick λ = 6, which corresponds to approximately the smallest identification error, as indicated in Fig. 1.

Fig. 1. Trade-off curve between identification error, validation error, and the nuclear norm of YU⊥ for the data set. The larger dot on the identification error curve corresponds to a very small identification error and to λ = 6.

Fig. 2. Normalized singular values of YU⊥, with Y = H_r(Ŷ) obtained through the nuclear norm heuristic, for the data set. A plot of the normalized singular values of Y_measI U⊥ (original) is also shown.

The normalized singular values of YU⊥ (obtained by setting the maximum singular value to 1) using just the nuclear norm heuristic as in (19) are shown in Fig. 2. As can be seen from Fig. 2, there is no sharp drop in the singular values that would clearly indicate the rank of YU⊥; therefore we use the rule of thumb described earlier to obtain the rank (and the order of the system), which is 6. The termination criterion we use for RRNH is to stop after 4 iterations, since we observe empirically that there is no significant change in the optimized variable after 4 iterations. For the dual gradient method used in each iteration of RRNH, the termination criterion is that the number of iterations is Q = min(Q_1, Q_2), where Q_1 is the number of iterations for the duality gap to fall below a tolerance of 10^{-4} and Q_2 = 4000.

Fig. 3 shows the results of RRNH for λ = 6 and different values of the parameter δ. The identification error and validation error were obtained as 0.069 and 0.154 respectively, which is comparable to the errors (0.069 and 0.12 respectively) obtained for this data set in [17]. The parameter δ, which is used as a regularization term in the weights W_1, W_2, appears to have an influence on the singular values of H_r(Ŷ)U⊥ and thus on its rank.
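Both the model-order read-out and the error metric (27) are short computations; the sketch below follows the rules of thumb and the error definition stated above (the function names and the sharp-drop threshold drop_factor are ours, not the paper's).

# Sketch: rank rule of thumb on the singular values of H_r(Yhat) @ Uperp, and
# the identification error (27).
import numpy as np

def rank_rule_of_thumb(M, drop_factor=10.0):
    s = np.linalg.svd(M, compute_uv=False)
    s = s / s[0]                                   # normalize to max singular value 1
    ratios = s[:-1] / np.maximum(s[1:], 1e-16)
    if ratios.size and ratios.max() > drop_factor: # a sharp drop exists
        return int(np.argmax(ratios)) + 1
    return int(np.sum(s >= 0.1))                   # else: within 10% of the largest

def identification_error(Y_meas, Y_model):
    Y_bar = np.tile(Y_meas.mean(axis=1, keepdims=True), (1, Y_meas.shape[1]))
    return np.sqrt(np.linalg.norm(Y_meas - Y_model, 'fro') ** 2
                   / np.linalg.norm(Y_meas - Y_bar, 'fro') ** 2)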

Fig. 3. Normalized singular values of YU⊥, with Y = H_r(Ŷ) obtained through the regularized reweighted nuclear norm heuristic (RRNH), for the data set and for different values of δ (0.05, 0.1, 0.5, 1). A plot of the normalized singular values of Y_measI U⊥ (original) is also shown.

We observe empirically that as λ increases, smaller values of δ give a clearer rank description for H_r(Ŷ)U⊥. As can be seen from Fig. 3, δ = 0.5 gives a clearer description of the rank, namely equal to 4. Smaller values of δ (less than 0.01) do not seem to provide a clear description of the rank, and this observation mirrors the observations made in [3] about the choice of ε for which the iterative reweighted l1 algorithm recovers sparse solutions. Thus, for the data set considered, we obtain a reduction in model order (from 6 to 4) as well as a much clearer description of the model order by using RRNH as compared to the nuclear norm heuristic.

V. CONCLUSIONS

We explored the convergence properties of the reweighted trace heuristic, showing that the difference between the successive iterates of this heuristic goes to zero, and that every convergent subsequence converges to a stationary point of the concave surrogate function. We gave a reformulation of this heuristic as the reweighted nuclear norm heuristic (RNH), which allows for an efficient and scalable implementation through first-order gradient methods such as the gradient projection method and the conditional gradient method, as compared to the reweighted trace formulation, which requires solving an SDP at each iteration. We applied the RRNH to a system identification application and showed that the RRNH provides a clearer description of the matrix rank (and hence the system order) through a sharp fall in the singular value plot of H_r(Ŷ)U⊥. We also observed that the RRNH gives a lower system order as compared to the nuclear norm heuristic (without weighting) for the data set considered, with identification and validation errors comparable to those obtained for this data set in [17]. We observe empirically that as λ increases, smaller values of δ give a clearer rank description for H_r(Ŷ)U⊥; it would be useful to understand precisely how δ plays a role in providing a clear rank description as λ varies. We mentioned that the RRNH allows for an efficient implementation of the reweighted trace heuristic. It would also be useful to quantify the efficiency and scalability of RRNH and compare it with the nuclear norm heuristic implemented using the interior point method detailed in [17].

ACKNOWLEDGEMENTS

We would like to thank Paul Tseng for very helpful discussions and for his implementation of the dual gradient projection method, which we adapted for our algorithm.

APPENDIX

Estimate of the Lipschitz constant. For any Z_1, Z_2 ∈ R^{(r+1)p×q},

    ||∇D_k(Z_1) - ∇D_k(Z_2)||_F = ||Φ_k Φ_k^*(Z_1 - Z_2)||_F.                         (28)

Note that H_r H_r^* and Φ_k Φ_k^* are self-adjoint operators. Φ_k Φ_k^* is a compact operator, since its range is finite dimensional, and it is positive, since ⟨Z, Φ_k Φ_k^*(Z)⟩ = ⟨Φ_k^*(Z), Φ_k^*(Z)⟩ = ||Φ_k^*(Z)||²_F ≥ 0. We apply the Rayleigh-Ritz theory for self-adjoint, positive, compact operators [13] to express the maximum eigenvalue of Φ_k Φ_k^* as

    λ_max(Φ_k Φ_k^*) = sup_{W: ||W||_F = 1} ⟨W, Φ_k Φ_k^*(W)⟩,

and get

    ||Φ_k Φ_k^*(Z_1 - Z_2)||²_F / ||Z_1 - Z_2||²_F  ≤  sup_W ||Φ_k Φ_k^*(W)||²_F / ||W||²_F  =  λ²_max(Φ_k Φ_k^*).

Thus an upper bound on λ_max can be used to find an estimate L_k of the Lipschitz constant of the gradient of the dual objective, ∇D_k. We obtain an estimate of λ_max(Φ_k Φ_k^*) below. From (18), it is easy to see, by using the properties of norms, that ||H_r^*(X)||²_F ≤ r ||X||²_F. Also, by using the properties of the trace, we have

    ||Φ_k^*(Z)||²_F ≤ r ⟨W_1^k Z W_2^k, W_1^k Z W_2^k⟩.
Let A = (W_2^k)², with eigenvalues ρ_1² ≥ ρ_2² ≥ ⋯ ≥ ρ_q², and let γ_1² ≥ γ_2² ≥ ⋯ ≥ γ_{(r+1)p}² be the eigenvalues of (W_1^k)². Using von Neumann's trace inequality (see, e.g., [10]), it can be shown that ⟨W_1^k Z W_2^k, W_1^k Z W_2^k⟩ ≤ ρ_1² γ_1² ||Z||²_F. Thus we have

    ||Φ_k Φ_k^*(Z_1 - Z_2)||²_F / ||Z_1 - Z_2||²_F  ≤  λ²_max(Φ_k Φ_k^*)  =  ( sup_W ||Φ_k^*(W)||²_F / ||W||²_F )²  ≤  r² ρ_1^4 γ_1^4.     (29)

Thus, L_k = r ρ_1² γ_1² (an upper bound on λ_max(Φ_k Φ_k^*)) is an estimate of the Lipschitz constant of ∇D_k, and since ρ_1 = λ_max(W_2^k) and γ_1 = λ_max(W_1^k), this is the estimate L_k = r λ²_max(W_1^k) λ²_max(W_2^k) quoted in Section IV.

REFERENCES

[1] D. P. Bertsekas. Nonlinear Programming. 1999.
[2] J. Brinkhuis, Z.-Q. Luo, and S. Zhang. Matrix convex functions with applications to weighted centers for semidefinite programming. 2005.
[3] E. J. Candès, M. B. Wakin, and S. Boyd. Enhancing sparsity by reweighted l1 minimization. Journal of Fourier Analysis and Applications, 14:877-905, 2008.
[4] E. J. Candès and B. Recht. Exact matrix completion via convex optimization.
[5] L. Ljung. System Identification - Theory for the User. 2002.
[6] M. S. Lobo, M. Fazel, and S. Boyd. Portfolio optimization with linear and fixed transaction costs.
[7] M. Fazel, H. Hindi, and S. Boyd. A rank minimization heuristic with application to minimum order system approximation. In Proceedings of the American Control Conference, 2001.
[8] M. Fazel, H. Hindi, and S. Boyd. Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. In Proceedings of the American Control Conference, 2003.
[9] M. Fazel, H. Hindi, and S. Boyd. Rank minimization and applications in system theory. In Proceedings of the American Control Conference, 2004.
[10] L. Mirsky. A trace inequality of John von Neumann. Monatshefte für Mathematik, 1975.

[11] B. De Moor, P. De Gersem, B. De Schutter, and W. Favoreel. DAISY: A database for the identification of systems.
[12] B. De Moor, M. Moonen, L. Vandenberghe, and J. Vandewalle. A geometrical approach for the identification of state space models with the singular value decomposition. In Proceedings of the 1988 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1988.
[13] A. W. Naylor and G. R. Sell. Linear Operator Theory in Engineering and Science. 1982.
[14] B. Recht, M. Fazel, and P. A. Parrilo. Guaranteed minimum rank solutions to linear matrix equations via nuclear norm minimization. Accepted for publication, SIAM Review.
[15] T. K. Pong, P. Tseng, S. Ji, and J. Ye. Trace norm regularization: Formulations, algorithms, and multi-task learning. 2009.
[16] Z. Liu and L. Vandenberghe. Semidefinite programming methods for system realization and identification. In Proceedings of the Conference on Decision and Control, 2009.
[17] Z. Liu and L. Vandenberghe. Interior-point method for nuclear norm approximation with application to system identification. Submitted to SIAM Journal on Matrix Analysis and Applications, 2009.
