Algorithms for tensor approximations


1  Algorithms for tensor approximations
Lek-Heng Lim
MSRI Summer Graduate Workshop
July 7–18, 2008

2  Synopsis
- Naïve: the Gauss–Seidel heuristic.
- Harmonic analysis: pursuit algorithms.
- Real algebraic geometry: semidefinite programming.
- Riemannian geometry: Grassmann–Newton method.

3  Recap: best low rank approximation of a hypermatrix
- Outer product rank: A ∈ R^{l×m×n}. Want unit vectors u_i ∈ R^l, v_i ∈ R^m, w_i ∈ R^n and σ_i ∈ R that minimize ‖A − Σ_{i=1}^r σ_i u_i ⊗ v_i ⊗ w_i‖.
- Symmetric outer product rank: A ∈ S^3(R^n). Want unit vectors v_i and λ_i ∈ R that minimize ‖A − Σ_{i=1}^r λ_i v_i ⊗ v_i ⊗ v_i‖.
- Nonnegative outer product rank: A ∈ R_+^{l×m×n}. Want unit vectors x_i ∈ R_+^l, y_i ∈ R_+^m, z_i ∈ R_+^n and δ_i ∈ R_+ that minimize ‖A − Σ_{i=1}^r δ_i x_i ⊗ y_i ⊗ z_i‖.

4  Recap: best low rank approximation of a hypermatrix
- Multilinear rank: A ∈ R^{l×m×n}. Want matrices U ∈ R^{l×r_1}, V ∈ R^{m×r_2}, W ∈ R^{n×r_3} with orthonormal columns and a core C ∈ R^{r_1×r_2×r_3} that minimize ‖A − (U, V, W) · C‖.
- Hybrid: A ∈ R^{l×m×n}. Want B_1, …, B_r ∈ R^{l×m×n} with rank_⊞(B_i) ≤ (r_1, r_2, r_3) and ‖B_i‖ = 1 that minimize ‖A − Σ_{i=1}^r σ_i B_i‖.

5  Gauss–Seidel method
The optimal solution B to argmin_{rank_⊗(B) ≤ r} ‖A − B‖_F is not easy to compute since the objective function is non-convex. A widely used strategy is a nonlinear Gauss–Seidel algorithm, better known as the Alternating Least Squares (ALS) algorithm:

Algorithm: ALS for optimal rank-r approximation
  initialize X^(0) ∈ R^{l×r}, Y^(0) ∈ R^{m×r}, Z^(0) ∈ R^{n×r}; initialize ρ^(0), ε > 0, k = 0;
  while ρ^(k+1)/ρ^(k) > ε:
    X^(k+1) ← argmin_{X ∈ R^{l×r}} ‖A − Σ_{α=1}^r x_α ⊗ y_α^(k) ⊗ z_α^(k)‖_F^2;
    Y^(k+1) ← argmin_{Y ∈ R^{m×r}} ‖A − Σ_{α=1}^r x_α^(k+1) ⊗ y_α ⊗ z_α^(k)‖_F^2;
    Z^(k+1) ← argmin_{Z ∈ R^{n×r}} ‖A − Σ_{α=1}^r x_α^(k+1) ⊗ y_α^(k+1) ⊗ z_α‖_F^2;
    ρ^(k+1) ← ‖A − Σ_{α=1}^r x_α^(k+1) ⊗ y_α^(k+1) ⊗ z_α^(k+1)‖_F^2;
    k ← k + 1;

Coordinate-cycling heuristic. May not converge.
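
Each ALS subproblem above is an ordinary linear least-squares problem in one factor matrix with the other two held fixed; in matrix form the update involves the Khatri–Rao product of the fixed factors. A minimal NumPy sketch of this scheme follows; the names (als_cp, factor_update, khatri_rao) are illustrative, and the stopping rule uses the relative decrease of the residual rather than the ratio test written on the slide.

import numpy as np

def khatri_rao(B, C):
    # Column-wise Kronecker product: column a is B[:, a] (kron) C[:, a].
    return np.einsum('ja,ka->jka', B, C).reshape(B.shape[0] * C.shape[0], -1)

def factor_update(A_unfold, B, C):
    # Least-squares update of one factor F in A_unfold ~ F @ khatri_rao(B, C).T.
    G = (B.T @ B) * (C.T @ C)                      # Gram matrix of khatri_rao(B, C)
    return np.linalg.solve(G, (A_unfold @ khatri_rao(B, C)).T).T

def als_cp(A, r, max_iter=200, tol=1e-8, seed=0):
    # Alternating least squares for a rank-r approximation of A in R^{l x m x n}.
    rng = np.random.default_rng(seed)
    l, m, n = A.shape
    X, Y, Z = (rng.standard_normal((d, r)) for d in (l, m, n))
    err_prev = np.inf
    for _ in range(max_iter):
        X = factor_update(A.reshape(l, m * n), Y, Z)
        Y = factor_update(np.moveaxis(A, 1, 0).reshape(m, l * n), X, Z)
        Z = factor_update(np.moveaxis(A, 2, 0).reshape(n, l * m), X, Y)
        err = np.linalg.norm(A - np.einsum('ia,ja,ka->ijk', X, Y, Z))
        if err_prev - err <= tol * max(err, 1.0):  # cycling may stall; no convergence guarantee
            break
        err_prev = err
    return X, Y, Z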

6  Best r-term approximation
f ≈ α_1 f_1 + α_2 f_2 + ⋯ + α_r f_r.
- Target function f ∈ H: vector space, cone, etc.
- f_1, …, f_r ∈ D ⊆ H: dictionary.
- α_1, …, α_r ∈ R or C (linear), R_+ (convex), R ∪ {−∞} (tropical).
With respect to φ : H × H → R, some measure of nearness between pairs of points (e.g. norms, metrics, volumes, expectation, entropy, Brègman divergences, etc.), want
  argmin{ φ(f, α_1 f_1 + ⋯ + α_r f_r) | f_i ∈ D }.
For concreteness, H is a separable Hilbert space; the measure of nearness is a norm, but not necessarily the one induced by its inner product.
Reference: various papers by A. Cohen, R. DeVore, V. Temlyakov.

7  Recap: dictionaries
- Discrete cosine: D = { √(2/N) cos((k + ½)(n + ½)π/N) | k ∈ [N − 1] } ⊆ C^N.
- Taylor: D = { x^n | n ∈ N ∪ {0} } ⊆ C^ω(R).
- Fourier: D = { cos(nx), sin(nx) | n ∈ Z } ⊆ L^2(−π, π).
- Peter–Weyl: D = { ⟨π(x)e_i, e_j⟩ | π ∈ Ĝ, i, j ∈ [d_π] } ⊆ L^2(G).

8  Recap: dictionaries
- Paley–Wiener: D = { sinc(x − n) | n ∈ Z } ⊆ H^2(R).
- Gabor: D = { e^{iαnx} e^{−(x − mβ)^2/2} | (m, n) ∈ Z × Z } ⊆ L^2(R).
- Wavelet: D = { 2^{n/2} ψ(2^n x − m) | (m, n) ∈ Z × Z } ⊆ L^2(R).
- Friends of wavelets: D ⊆ L^2(R^2): beamlets, brushlets, curvelets, ridgelets, wedgelets, multiwavelets.

9  Approximants
Definition. Let D ⊆ H be a dictionary. For r ∈ N, the set of r-term approximants is
  Σ_r(D) := { Σ_{i=1}^r α_i f_i ∈ H | α_i ∈ C, f_i ∈ D }.
Let f ∈ H. The error of r-term approximation is
  σ_r(f) := inf_{g ∈ Σ_r(D)} ‖f − g‖.
- A linear combination of two r-term approximants may have more than r non-zero terms, so Σ_r(D) is not a subspace of H. Hence nonlinear approximation.
- In contrast with the usual (linear) approximation, i.e. inf_{g ∈ span(D)} ‖f − g‖.

10  Small is beautiful
f ≈ Σ_{i ∈ I} α_i f_i,  f_i ∈ D.
- Want a good approximation, i.e. ‖f − Σ_{i ∈ I} α_i f_i‖ small.
- Want a sparse/concentrated representation, i.e. |I| small.
- Sparsity depends on the choice of D.
D_10 = {10^n | n ∈ Z}, D_3 = {3^n | n ∈ Z} ⊆ R:
  1/3 = [0.333…]_10 = Σ_{n=1}^∞ 3 · 10^{−n},   1/3 = [0.1]_3 = 3^{−1}.
D_fourier = {cos(nx), sin(nx) | n ∈ Z}, D_taylor = {x^n | n ∈ N ∪ {0}}:
  x/2 = sin(x) − ½ sin(2x) + ⅓ sin(3x) − ⋯,
  sin(x) = x − (1/6)x^3 + (1/120)x^5 − ⋯.
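
As a toy check of how strongly sparsity depends on the dictionary, the short script below (plain Python, purely illustrative) counts how many greedy "digit" terms are needed to represent 1/3 to a fixed tolerance using the dictionary {10^{-n}} versus {3^{-n}}.

def terms_needed(base, target=1/3, tol=1e-10, max_terms=60):
    # Greedy expansion of target in powers of 1/base; count nonzero digits used.
    approx, count = 0.0, 0
    for n in range(1, max_terms + 1):
        digit = int((target - approx) * base**n)
        if digit:
            approx += digit * base**(-n)
            count += 1
        if abs(target - approx) < tol:
            break
    return count

print(terms_needed(10))   # roughly a dozen terms: 0.333333... in base 10
print(terms_needed(3))    # a single term: 0.1 in base 3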

11  Bigger is better
- Union of dictionaries: allows for efficient (sparse) representation of different features:
  D = D_fourier ∪ D_wavelets,
  D = D_spikes ∪ D_sinusoids ∪ D_splines,
  D = D_wavelets ∪ D_curvelets ∪ D_beamlets ∪ D_ridgelets.
- Such a D is called an overcomplete or redundant dictionary.
- Trade-off: computational complexity.
- Rule of thumb: the larger and more diverse the dictionary, the more efficient/sparser the representation.
- Observation: the dictionaries D above are all zero-dimensional (at most countably infinite).
- Question: what about dictionaries that are continuously varying families of functions?

12  Dictionaries of positive dimensions
- Neural networks: D = { σ(w^T x + w_0) ∈ L^2(R^n) | (w_0, w) ∈ R × R^n }, where σ : R → R is a sigmoid function, e.g. σ(x) = [1 + exp(−x)]^{−1}.
- Exponential: D = { e^{−tx} | t ∈ R_+ } or D = { e^{−τx} | τ ∈ C }.
- Separable: D = { g ∈ L^2(R^3) | g(x, y, z) = ϑ(x)φ(y)ψ(z) }, where ϑ, φ, ψ : R → R.
- Symmetric separable: D = { g ∈ L^2(R^3) | g(x, y, z) = φ(x)φ(y)φ(z) }, where φ : R → R.

13  Same thing, different names
- The rth secant (quasiprojective) variety of the Segre variety is the set of r-term approximants: if D = Seg(R^l, R^m, R^n), then Σ_r(D) = { A ∈ R^{l×m×n} | rank_⊗(A) ≤ r }.
- Outer product decomposition: D = { u ⊗ v ⊗ w | (u, v, w) ∈ R^l × R^m × R^n } = { A ∈ R^{l×m×n} | rank_⊗(A) ≤ 1 }.
- Symmetric outer product decomposition: D = { v ⊗ v ⊗ v | v ∈ R^n } = { A ∈ S^3(R^n) | rank_S(A) ≤ 1 }.
- Nonnegative outer product decomposition: D = { x ⊗ y ⊗ z | (x, y, z) ∈ R_+^l × R_+^m × R_+^n } = { A ∈ R_+^{l×m×n} | rank_+(A) ≤ 1 }.

14  Pursuit algorithms
- Stepwise projection: g_k = argmin_{g ∈ D} { ‖f − h‖ | h ∈ span{g_1, …, g_{k−1}, g} },  f_k = proj_{span{g_1,…,g_k}}(f).
- Orthogonal matching pursuit: g_k = argmax_{g ∈ D} |⟨f − f_{k−1}, g⟩|,  f_k = proj_{span{g_1,…,g_k}}(f).
- Pure greedy: g_k = argmax_{g ∈ D} |⟨f − f_{k−1}, g⟩|,  f_k = f_{k−1} + ⟨f − f_{k−1}, g_k⟩ g_k.
- Relaxed greedy: g_k = argmin_{g ∈ D} { ‖f − h‖ | h ∈ span{f_{k−1}, g} },  f_k = α_k f_{k−1} + β_k g_k.
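
A minimal orthogonal matching pursuit sketch over a finite dictionary stored as the unit-norm columns of a matrix, assuming NumPy; the names (omp, n_terms) and the random example at the end are illustrative.

import numpy as np

def omp(D, f, n_terms):
    # Greedily select n_terms atoms and project f onto their span.
    residual = f.copy()
    selected = []
    f_k = np.zeros_like(f)
    for _ in range(n_terms):
        k = int(np.argmax(np.abs(D.T @ residual)))   # g_k = argmax_g |<f - f_{k-1}, g>|
        if k not in selected:
            selected.append(k)
        coeffs, *_ = np.linalg.lstsq(D[:, selected], f, rcond=None)
        f_k = D[:, selected] @ coeffs                # projection onto span of selected atoms
        residual = f - f_k
    return selected, f_k

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 200))
D /= np.linalg.norm(D, axis=0)                       # unit-norm atoms
f = 2.0 * D[:, 3] - 1.5 * D[:, 77]                   # a 2-term target
indices, approx = omp(D, f, n_terms=2)
print(sorted(indices), np.linalg.norm(f - approx))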

15  Recap: hypermatrices are functions on finite sets
- Totally ordered finite sets: [n] = {1 < 2 < ⋯ < n}, n ∈ N.
- Hypermatrix (order 3): f : [l] × [m] × [n] → R. If f(i, j, k) = a_{ijk}, then f is represented by A = [a_{ijk}]_{i,j,k=1}^{l,m,n} ∈ R^{l×m×n}.
- l^2([l] × [m] × [n]) = l^2([l]) ⊗ l^2([m]) ⊗ l^2([n]): for A, B ∈ R^{l×m×n}, ⟨A, B⟩ = Σ_{i,j,k=1}^{l,m,n} a_{ijk} b_{ijk}.
- Frobenius norm: ‖A‖_F^2 = Σ_{i,j,k=1}^{l,m,n} a_{ijk}^2.

16  Pursuit algorithms for tensor approximations
- Tensor approximation. Target function f : [l] × [m] × [n] → R. Dictionary of separable functions:
  D = { g : [l] × [m] × [n] → R | g(i, j, k) = ϑ(i)φ(j)ψ(k) },
  where ϑ : [l] → R, φ : [m] → R, ψ : [n] → R.
- Symmetric tensor approximation. Target function f : [n] × [n] × [n] → R with f(i, j, k) = f(j, i, k) = ⋯ = f(k, j, i). Dictionary of symmetric separable functions:
  D_S = { g : [n] × [n] × [n] → R | g(i, j, k) = ϑ(i)ϑ(j)ϑ(k) },
  where ϑ : [n] → R.

17  Pursuit algorithms for tensor approximations
- Nonnegative tensor approximation. Target function f : [l] × [m] × [n] → R_+. Dictionary of nonnegative separable functions:
  D_+ = { g : [l] × [m] × [n] → R_+ | g(i, j, k) = ϑ(i)φ(j)ψ(k) },
  where ϑ : [l] → R_+, φ : [m] → R_+, ψ : [n] → R_+.
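
For the separable dictionaries above, the pure greedy step amounts to finding a (locally) best rank-one term for the current residual and deflating. A minimal NumPy sketch, with the rank-one subproblem solved by a simple alternating iteration; all names are illustrative. Note that, unlike the matrix case, successive rank-one deflation need not produce a best rank-r approximation, which is one reason these schemes are heuristics for tensors.

import numpy as np

def best_rank_one(T, n_sweeps=50):
    # Approximate argmax over unit u, v, w of <T, u (x) v (x) w> by alternating updates.
    l, m, n = T.shape
    u, v, w = (np.ones(d) / np.sqrt(d) for d in (l, m, n))
    for _ in range(n_sweeps):
        u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)
    sigma = np.einsum('ijk,i,j,k->', T, u, v, w)
    return sigma, u, v, w

def greedy_rank_one_pursuit(A, r):
    # r steps of pure greedy pursuit with the rank-one (separable) dictionary.
    residual = A.copy()
    terms = []
    for _ in range(r):
        sigma, u, v, w = best_rank_one(residual)
        terms.append((sigma, u, v, w))
        residual = residual - sigma * np.einsum('i,j,k->ijk', u, v, w)
    return terms, residual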

18  Some history
- f a polynomial in the variables x = (x_1, …, x_N). Suppose f : R^N → R is non-negative valued, i.e. f(x) ≥ 0 for all x ∈ R^N.
- Question: can we write f as a sum of squares of polynomials, f(x) = Σ_{j=1}^M p_j(x)^2?
- Answer (Hilbert): not in general, e.g. f(w, x, y, z) = w^4 + x^2 y^2 + y^2 z^2 + z^2 x^2 − 4xyzw.
- Hilbert's 17th Problem: can we write f as a sum of squares of rational functions, f(x) = Σ_{j=1}^M (p_j(x)/q_j(x))^2?
- Answer (Artin): yes!

19  SDP based algorithms
Observation 1:
  F(x_11, …, z_nr) = ‖A − Σ_{α=1}^r x_α ⊗ y_α ⊗ z_α‖_F^2 = Σ_{i,j,k=1}^{l,m,n} (a_{ijk} − Σ_{α=1}^r x_{iα} y_{jα} z_{kα})^2
is a polynomial of total degree 6 (resp. 2k for order-k tensors) in the variables x_11, …, z_nr.
Multivariate polynomial optimization: the non-convex problem argmin F(x_11, …, z_nr) may be relaxed to a convex problem (so a global optimum of the relaxed problem is guaranteed), which can in turn be solved using semidefinite programming (SDP).
[Lasserre; 2001], [Parrilo; 2003], [Parrilo, Sturmfels; 2003].

20  How it works
Observation 2: if F − λ can be expressed as a sum of squares of polynomials,
  F(x_11, …, z_nr) − λ = Σ_{i=1}^N P_i(x_11, …, z_nr)^2,
then λ is a global lower bound for F, i.e.
  F(x_11, …, z_nr) ≥ λ
for all x_11, …, z_nr ∈ R.
Simple strategy: find the largest λ such that F − λ is a sum of squares. Then λ is often min F(x_11, …, z_nr).

21  Sketch
Write v = (1, x_11, …, z_nr, …, x_{l1} y_{m1} z_{n1}, …, z_nr^3)^T, the D-tuple of all monomials of total degree at most 3 in the variables x_11, …, z_nr, where
  D := C(r(l + m + n) + 3, 3).
Write F(x_11, …, z_nr) = Σ_k α_k m_k, where the m_k are its monomials (of total degree at most 6) and α = (α_1, α_2, …) the corresponding coefficients. Since deg(F) is even, F may also be written as
  F(x_11, …, z_nr) = v^T M v
for some M ∈ R^{D×D}. So
  F(x_11, …, z_nr) − λ = v^T (M − λE_11) v,
where E_11 = e_1 e_1^T ∈ R^{D×D}.

22  Sketch
Observation 3: the right-hand side is a sum of squares if and only if M − λE_11 is positive semidefinite (since then M − λE_11 = B^T B). Hence we have
  maximize   λ
  subject to v^T (S + λE_11) v = F,  S ⪰ 0.
As an SDP problem:
  minimize   ⟨0, S⟩ − λ
  subject to ⟨S, B_1⟩ + λ = α_1,
             ⟨S, B_k⟩ = α_k for the remaining monomials m_k,
             S ⪰ 0,  λ ∈ R.
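
A tiny worked instance of this recipe, assuming CVXPY is installed: for the illustrative univariate polynomial F(x) = x^4 − 3x^2 + 2 (not from the slides), take v = (1, x, x^2), match coefficients of F(x) − λ = v^T S v, and maximize λ over S ⪰ 0.

import cvxpy as cp

S = cp.Variable((3, 3), PSD=True)        # Gram matrix in v^T S v with v = (1, x, x^2)
lam = cp.Variable()

constraints = [
    S[0, 0] == 2 - lam,                  # constant term of F - lambda
    2 * S[0, 1] == 0,                    # coefficient of x
    2 * S[0, 2] + S[1, 1] == -3,         # coefficient of x^2
    2 * S[1, 2] == 0,                    # coefficient of x^3
    S[2, 2] == 1,                        # coefficient of x^4
]
prob = cp.Problem(cp.Maximize(lam), constraints)
prob.solve()
print(lam.value)   # about -0.25, which here equals min F = -1/4, so the bound is tight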

23  Properties
- May be solved in polynomial time.
- Like all SDP-based algorithms, duality produces a certificate that tells us whether we have arrived at a globally optimal solution: the duality gap, i.e. the difference between the values of the primal and dual objective functions, is 0 at a global minimum.
- Complexity: for rank-r approximations of order-k tensors A ∈ R^{d_1×⋯×d_k},
  D = C(r(d_1 + ⋯ + d_k) + k, k)
is large even for moderate d_i, r and k.
- Sparsity to the rescue: the polynomials we are interested in are always sparse (e.g. for k = 3, only terms of the form xyz, x^2 y^2 z^2, or uvwxyz appear).

24  Newton polytope
The Newton polytope of a polynomial f is the convex hull of the exponent vectors of the monomials in f.
Example
- The Newton polytope of f(x, y) = 3.67 x^4 y^10 + ⋯, with monomials x^4 y^10, x^3 y^3, x^3, x^2 and a constant term, is the convex hull of the points (4, 10), (3, 3), (3, 0), (2, 0), (0, 0) in R^2.
- The Newton polytope of g(x, y, z) = 1.7 x^4 y^6 z^2 + ⋯ − 3.0 y^4 + ⋯, with monomials x^4 y^6 z^2, x^3 z^5, y^4, y z^2, is the convex hull of the points (4, 6, 2), (3, 0, 5), (0, 4, 0), (0, 1, 2) in R^3.
Theorem (Reznick)
If f(x) = Σ_{i=1}^m p_i(x)^2, then the exponent vectors of the monomials in the p_i must lie in ½ Newton(f).
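
The first example can be checked directly with a convex hull routine; a small sketch assuming SciPy, with the exponent vectors taken from the slide.

import numpy as np
from scipy.spatial import ConvexHull

exponents = np.array([(4, 10), (3, 3), (3, 0), (2, 0), (0, 0)])
hull = ConvexHull(exponents)
print(exponents[hull.vertices])   # the exponent vectors that are vertices of Newton(f)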

25  Multilinear polynomial
The Newton polytope of a polynomial of the form
  f(x_11, …, z_nr) = λ + Σ_{i,j,k=1}^{l,m,n} (a_{ijk} − Σ_{α=1}^r x_{iα} y_{jα} z_{kα})^2
is spanned by 1 and the monomials of the form x_{iα}^2 y_{jα}^2 z_{kα}^2 (i.e. the exponent vectors of the monomials of the form x_{iα} y_{jα} z_{kα} and x_{iα} y_{jα} z_{kα} x_{iβ} y_{jβ} z_{kβ} lie inside it and may all be dropped).
So if f(x_11, …, z_nr) = Σ_{j=1}^N p_j(x_11, …, z_nr)^2, then only 1 and monomials of the form x_{iα} y_{jα} z_{kα} may occur in p_1, …, p_N.
In other words, we have reduced the size of the problem from C(r(l + m + n) + 3, 3) to rlmn + 1.
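
A quick count (standard library only; the sizes below are illustrative) showing how much the Newton polytope argument shrinks the monomial basis, from the generic C(r(l+m+n)+3, 3) to rlmn + 1.

from math import comb

for (l, m, n, r) in [(3, 3, 3, 2), (5, 5, 5, 3), (10, 10, 10, 5)]:
    generic = comb(r * (l + m + n) + 3, 3)   # all monomials of degree <= 3
    reduced = r * l * m * n + 1              # {1} plus the x_ia * y_ja * z_ka monomials
    print(f"l=m=n={l}, r={r}: generic {generic}, reduced {reduced}")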

26  Global convergence
If polynomials of the form
  λ + Σ_{i,j,k=1}^{l,m,n} (a_{ijk} − Σ_{α=1}^r x_{iα} y_{jα} z_{kα})^2
can always be written as a sum of squares of polynomials (we don't know), then the SDP algorithm for optimal low-rank tensor approximation will always converge globally.
Numerical experiments performed by Parrilo on general polynomials yield λ = min F in all cases.

27  Best multilinear rank approximation
Given A ∈ R^{l×m×n}, want B with rank_⊞(B) = (r_1, r_2, r_3) attaining
  min ‖A − B‖_F = min ‖A − (X, Y, Z) · C‖_F,
where C ∈ R^{r_1×r_2×r_3} and X ∈ R^{l×r_1}, Y ∈ R^{m×r_2}, Z ∈ R^{n×r_3} have orthonormal columns.
The problem is overparameterized and equivalent to
  max ‖A · (X, Y, Z)‖_F   subject to   X^T X = I, Y^T Y = I, Z^T Z = I.
The problem is defined on a product of Grassmann manifolds, since
  ‖A · (X, Y, Z)‖_F = ‖A · (XQ_1, YQ_2, ZQ_3)‖_F
for any orthogonal (Q_1, Q_2, Q_3) ∈ O(r_1) × O(r_2) × O(r_3): only the subspaces spanned by X, Y, Z matter. The problem is therefore reformulated as
  max_{(X,Y,Z) ∈ Gr(l,r_1) × Gr(m,r_2) × Gr(n,r_3)} Φ(X, Y, Z).
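
The standard alternating approach to this maximization (often called higher-order orthogonal iteration, HOOI) updates one orthonormal factor at a time from the dominant singular subspace of a partially contracted, unfolded tensor. A minimal NumPy sketch; the HOSVD-style initialization, sweep count, and names are illustrative.

import numpy as np

def leading_subspace(M, r):
    # Orthonormal basis of the dominant r-dimensional left singular subspace of M.
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def hooi(A, ranks, n_sweeps=50):
    (l, m, n), (r1, r2, r3) = A.shape, ranks
    # HOSVD-style initialization from the three unfoldings of A.
    X = leading_subspace(A.reshape(l, m * n), r1)
    Y = leading_subspace(np.moveaxis(A, 1, 0).reshape(m, l * n), r2)
    Z = leading_subspace(np.moveaxis(A, 2, 0).reshape(n, l * m), r3)
    for _ in range(n_sweeps):
        B = np.einsum('ijk,jb,kc->ibc', A, Y, Z)       # contract modes 2 and 3
        X = leading_subspace(B.reshape(l, r2 * r3), r1)
        B = np.einsum('ijk,ia,kc->jac', A, X, Z)
        Y = leading_subspace(B.reshape(m, r1 * r3), r2)
        B = np.einsum('ijk,ia,jb->kab', A, X, Y)
        Z = leading_subspace(B.reshape(n, r1 * r2), r3)
    C = np.einsum('ijk,ia,jb,kc->abc', A, X, Y, Z)     # core tensor
    return X, Y, Z, C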

28  Newton and quasi-Newton algorithms on manifolds
T_X: the tangent space at X ∈ Gr(n, r); its elements Δ ∈ R^{n×r} satisfy Δ^T X = 0.
1. Compute the Grassmann gradient ∇Φ ∈ T_(X,Y,Z).
2. Compute the Hessian, or update a Hessian approximation H : T_(X,Y,Z) → T_(X,Y,Z).
3. At (X, Y, Z) ∈ Gr(l, r_1) × Gr(m, r_2) × Gr(n, r_3), solve H Δ = −∇Φ for the search direction Δ.
4. Update the iterate (X, Y, Z): move along the geodesic from (X, Y, Z) in the direction given by Δ.
Optimize over a product of three (or more) Grassmannians.
[Gabay; 1982], [Arias, Edelman, Smith; 1999], [Eldén, Savas; 2008].
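
For the squared objective Φ(X, Y, Z) = ‖A · (X, Y, Z)‖_F^2, one first-order step of this scheme in the X block can be sketched as follows (NumPy assumed): project the Euclidean gradient onto the horizontal space Δ^T X = 0 and retract back to the manifold, here with a thin QR factorization as an inexpensive stand-in for the exact geodesic of step 4. All names are illustrative.

import numpy as np

def grassmann_gradient_step_X(A, X, Y, Z, step):
    C = np.einsum('ijk,ia,jb,kc->abc', A, X, Y, Z)        # core A . (X, Y, Z)
    G = 2.0 * np.einsum('abc,ijk,jb,kc->ia', C, A, Y, Z)  # Euclidean gradient in X
    Delta = G - X @ (X.T @ G)                             # project onto Delta^T X = 0
    Q, _ = np.linalg.qr(X + step * Delta)                 # QR retraction back to Gr(l, r1)
    return Q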

29  Picture
[Figure not reproduced in this transcription.]

30  Quasi-Newton and BFGS update
The BFGS update:
  H_{k+1} = H_k − (H_k s_k s_k^T H_k)/(s_k^T H_k s_k) + (y_k y_k^T)/(y_k^T s_k),
where s_k = x_{k+1} − x_k = t_k p_k and y_k = ∇f_{k+1} − ∇f_k.
On a Grassmann manifold these vectors are defined at different points and belong to different tangent spaces.
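
In Euclidean coordinates the BFGS update is a two-line computation; a minimal NumPy sketch (names illustrative, no damping or curvature safeguards), shown only to fix notation before the transport issue discussed on the next slides.

import numpy as np

def bfgs_update(H, s, y):
    # Return H_{k+1} from H_k, the step s_k, and the gradient difference y_k.
    Hs = H @ s
    return H - np.outer(Hs, Hs) / (s @ Hs) + np.outer(y, y) / (y @ s)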

31  Different ways of parallel transporting vectors
Let X ∈ Gr(n, r), Δ_1, Δ_2 ∈ T_X, and let X(t) be the geodesic path along Δ_1.
- Parallel transport using global coordinates:
  Δ_2(t) = T_{Δ_1}(t) Δ_2.
- We also have Δ_1 = X_⊥ D_1 and Δ_2 = X_⊥ D_2, where X_⊥ is a basis for T_X. Let X_⊥(t) be a basis for T_{X(t)}.
- Parallel transport using local coordinates:
  Δ_2(t) = X_⊥(t) D_2.

32  Parallel transport in local coordinates
All transported tangent vectors have the same coordinate representation in the basis X_⊥(t) at every point on the path X(t).
Plus: no need to transport the gradient or the Hessian.
Minus: need to compute X_⊥(t).
In global coordinates we compute
  T_{k+1} ∋ s_k = t_k T_k(t_k) p_k,
  T_{k+1} ∋ y_k = ∇f_{k+1} − T_k(t_k) ∇f_k,
  T_k(t_k) H_k T_k^{-1}(t_k) : T_{k+1} → T_{k+1},
  H_{k+1} = H_k − (H_k s_k s_k^T H_k)/(s_k^T H_k s_k) + (y_k y_k^T)/(y_k^T s_k).

33  Limited memory BFGS
Compact representation of BFGS in Euclidean space:
  H_k = H_0 + [S_k, H_0 Y_k] W_k [S_k^T; Y_k^T H_0],
where
  W_k = [ R_k^{-T}(D_k + Y_k^T H_0 Y_k) R_k^{-1},  −R_k^{-T};
          −R_k^{-1},                                0 ],
  S_k = [s_0, …, s_{k−1}],   Y_k = [y_0, …, y_{k−1}],
  D_k = diag[s_0^T y_0, …, s_{k−1}^T y_{k−1}],
and R_k is the upper triangular matrix with (R_k)_{ij} = s_{i−1}^T y_{j−1} for i ≤ j (and 0 for i > j).

34  Limited memory BFGS
Limited memory BFGS [Byrd et al.; 1994]: replace H_0 by γ_k I and keep only the m most recent s_j and y_j:
  H_k = γ_k I + [S_k, γ_k Y_k] W_k [S_k^T; γ_k Y_k^T],
where
  W_k = [ R_k^{-T}(D_k + γ_k Y_k^T Y_k) R_k^{-1},  −R_k^{-T};
          −R_k^{-1},                                0 ],
  S_k = [s_{k−m}, …, s_{k−1}],   Y_k = [y_{k−m}, …, y_{k−1}],
  D_k = diag[s_{k−m}^T y_{k−m}, …, s_{k−1}^T y_{k−1}],
and R_k is the upper triangular matrix with (R_k)_{ij} = s_{k−m+i−1}^T y_{k−m+j−1} for i ≤ j (and 0 for i > j).
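
A sketch of applying this compact representation (with H_0 = γ_k I) to a vector, assuming NumPy; the helper name and the dense triangular solves are illustrative. In the Grassmann setting of the next slide, the columns of S_k and Y_k would first be parallel transported.

import numpy as np

def lbfgs_apply(v, S, Y, gamma):
    # S, Y: n-by-m arrays holding the stored s_j and y_j (most recent last).
    SY = S.T @ Y
    R = np.triu(SY)                                  # (R_k)_{ij} = s_i^T y_j for i <= j
    D = np.diag(np.diag(SY))
    a = S.T @ v
    b = gamma * (Y.T @ v)
    Rinv_a = np.linalg.solve(R, a)
    top = np.linalg.solve(R.T, (D + gamma * (Y.T @ Y)) @ Rinv_a - b)
    bottom = -Rinv_a
    return gamma * v + S @ top + gamma * (Y @ bottom)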

35  L-BFGS on the Grassmann manifold
- In each iteration, parallel transport the vectors in S_k and Y_k to the current tangent space, i.e. perform S̄_k = T S_k, Ȳ_k = T Y_k, where T is the transport matrix.
- No need to modify R_k or D_k: ⟨u, v⟩ = ⟨T u, T v⟩, where u, v ∈ T_k and T u, T v ∈ T_{k+1}.
- H_k is nonsingular while the true Hessian is singular. This is not a problem: T_k at x_k is an invariant subspace of H_k, i.e. if v ∈ T_k then H_k v ∈ T_k.
[Savas, Lim; 2008]
