
Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems

Lars Eldén, Department of Mathematics, Linköping University
ERCIM, 1 April 2005

Multi-Linear Integral Equations of the First Kind

\int_{R_1} k(x_1, x_2, x_3, x_4)\, f(x_3, x_4)\, dx_3\, dx_4 = g(x_1, x_2), \quad (x_1, x_2) \in R_2,

where R_1 and R_2 are rectangular domains in R^2. The kernel function is smooth, e.g. k \in L^2[R_1 \times R_2].

Ill-posed problem: the solution does not depend continuously on the data.

Applications: restoration of blurred images, inverse problems, ...

Discretization

\sum_{1 \le k,l \le n} k_{ijkl} f_{kl} = g_{ij}, \quad 1 \le i, j \le n.

Corresponding least squares problem:

\min_{f_{kl}} \sum_{1 \le i,j \le n} \Big( \sum_{1 \le k,l \le n} k_{ijkl} f_{kl} - g_{ij} \Big)^2.

For simplicity: all subscripts i, j, k, l run from 1 to n.

Ill-posedness \Rightarrow the linear system is extremely ill-conditioned (if n is large enough).

Tensor \leftrightarrow Matrix

Stack the columns:

F \mapsto \mathrm{vec}(F) = f = \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix}, \qquad G \mapsto \mathrm{vec}(G) = g = \begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix}.

Organize the coefficients k_{ijkl} accordingly. Standard linear system:

Kf = g, \qquad K \in R^{n^2 \times n^2}.
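As a concrete illustration of this flattening, here is a minimal NumPy sketch (the names n, kern, F are illustrative, not from the talk). With column-major vec on both index pairs, the 4-tensor reshapes directly into the n^2 x n^2 matrix K:

```python
import numpy as np

n = 10
rng = np.random.default_rng(0)
kern = rng.standard_normal((n, n, n, n))   # coefficients k_{ijkl}
F = rng.standard_normal((n, n))

# vec: stack the columns (column-major / Fortran order).
f = F.reshape(-1, order="F")

# With column-major flattening of the row pair (i,j) and column pair (k,l),
# the 4-tensor reshapes directly into the n^2 x n^2 matrix K.
K = kern.reshape(n * n, n * n, order="F")

# Check: K @ vec(F) equals vec(G) with g_ij = sum_{k,l} k_ijkl f_kl.
G = np.einsum("ijkl,kl->ij", kern, F)
assert np.allclose(K @ f, G.reshape(-1, order="F"))
```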

Numerical Solution: SVD

Kf = g, \qquad K \in R^{n^2 \times n^2}.

Compute the singular value decomposition (SVD) of K:

K = U \Sigma V^T = \sum_{i=1}^{n^2} \sigma_i u_i v_i^T, \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{n^2} \ge 0.

Formal solution:

f = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i.

\sigma_{n^2} \approx 0 \Rightarrow unstable: small perturbations of the data cause large errors.

Truncated SVD: TSVD

Stabilize by truncating the expansion, k \ll n^2:

f_k = \sum_{i=1}^{k} \frac{u_i^T g}{\sigma_i} v_i.

Computation of the SVD of K: Cn^6 flops, C \approx 5. Prohibitively costly if n is large.

Alternative: iterative methods (exploiting sparsity).

Tensor problem: is it possible to do better using TENSOR methods?
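A minimal sketch of the TSVD solution in NumPy, assuming K and g come from the discretization above; the function name and interface are illustrative:

```python
import numpy as np

def tsvd_solve(K, g, k):
    """Truncated SVD solution: f_k = sum_{i=1}^k (u_i^T g / sigma_i) v_i."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    coeffs = (U[:, :k].T @ g) / s[:k]   # u_i^T g / sigma_i, i = 1..k
    return Vt[:k].T @ coeffs            # combine the first k right singular vectors
```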

Basic Tensor Concepts

Mode 1 multiplication of a tensor by a matrix U:

(A \times_1 U)(j, i_2, i_3, i_4) = \sum_{i_1=1}^{n} a_{i_1 i_2 i_3 i_4} u_{j i_1}.

All column vectors in the 4-tensor are multiplied by the matrix U.

Mode 2 multiplication by a matrix V: all row vectors are multiplied by V. Mode 3 and mode 4 multiplication are analogous.
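A short einsum sketch of mode multiplication for a 4-tensor (illustrative code, not from the talk):

```python
import numpy as np

def mode1(A, U):
    # (A x_1 U)(j, i2, i3, i4) = sum_{i1} a_{i1 i2 i3 i4} u_{j i1}
    return np.einsum("abcd,ja->jbcd", A, U)

def mode2(A, V):
    # (A x_2 V)(i1, j, i3, i4) = sum_{i2} a_{i1 i2 i3 i4} v_{j i2}
    return np.einsum("abcd,jb->ajcd", A, V)
```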

For K \ne M, mode K and mode M multiplication commute:

A \times_K U \times_M V = A \times_M V \times_K U.

Useful identity:

A \times_K F \times_K G = A \times_K (GF).

Matricizing the Tensor

Matricizing a 3-mode tensor A:

A^{(1)} = \big( A(:,1,:) \;\; A(:,2,:) \;\; \cdots \;\; A(:,n,:) \big) \in R^{n \times n^2},
A^{(2)} = \big( A(:,:,1)^T \;\; A(:,:,2)^T \;\; \cdots \;\; A(:,:,n)^T \big) \in R^{n \times n^2},
A^{(3)} = \big( A(1,:,:)^T \;\; A(2,:,:)^T \;\; \cdots \;\; A(n,:,:)^T \big) \in R^{n \times n^2}.

Mode M multiplication B = A \times_M M:

B^{(M)} = M A^{(M)},

followed by a reorganization of B^{(M)} into the tensor B.
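In NumPy the three unfoldings, and the identity B^{(M)} = M A^{(M)} for mode 1, can be sketched as follows; the column orderings follow the slice layout above, and all names are illustrative:

```python
import numpy as np

def unfold(A, mode):
    n = A.shape[0]
    if mode == 1:   # columns of the slices A(:, j, :)
        return A.reshape(n, -1)
    if mode == 2:   # slices A(:, :, k) transposed
        return A.transpose(1, 2, 0).reshape(n, -1)
    if mode == 3:   # slices A(i, :, :) transposed
        return A.transpose(2, 0, 1).reshape(n, -1)

# Check: mode-1 multiplication is matrix multiplication on the unfolding.
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n, n))
M = rng.standard_normal((n, n))
B = np.einsum("abc,ja->jbc", A, M)              # B = A x_1 M
assert np.allclose(unfold(B, 1), M @ unfold(A, 1))
```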

Inner Product

For two tensors A and B of the same dimensions,

\langle A, B \rangle = \sum_{i_1, i_2, i_3, i_4} a_{i_1 i_2 i_3 i_4} b_{i_1 i_2 i_3 i_4}.

This is a special case of the contracted product of two tensors. The linear system

\sum_{1 \le k,l \le n} k_{ijkl} f_{kl} = g_{ij}, \quad 1 \le i, j \le n,

can be written

\langle K, F \rangle_{\{3,4;1,2\}} = G,

where the matrices F and G are identified with tensors F and G.

Least squares problem:

\min_F \| \langle K, F \rangle_{\{3,4;1,2\}} - G \|,

where \|\cdot\| is the matrix Frobenius norm or, equivalently, the tensor Frobenius norm, \|G\| = \langle G, G \rangle^{1/2}.
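The contracted product and the Frobenius residual are one-liners in NumPy (a sketch; kern, F, G as in the discretization above):

```python
import numpy as np

def contract_34_12(kern, F):
    """<K, F>_{3,4;1,2}(i,j) = sum_{k,l} k_ijkl f_kl."""
    return np.einsum("ijkl,kl->ij", kern, F)

def residual(kern, F, G):
    """Frobenius norm of <K, F>_{3,4;1,2} - G."""
    return np.linalg.norm(contract_34_12(kern, F) - G)
```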

Some Obvious Facts

The tensor Frobenius norm is invariant under mode M multiplication by an orthogonal matrix.

Proposition 1. Let U \in R^{n \times n} be orthogonal. Then \| A \times_M U \| = \| A \|.

Tensor-matrix equivalent of the identity (SV^T)x = Sy for y = V^T x:

Proposition 2. Assume that S is a 4-tensor, U^{(3)} and U^{(4)} are square matrices, and F is a matrix. Then

\langle S \times_3 U^{(3)} \times_4 U^{(4)}, F \rangle_{\{3,4;1,2\}} = \langle S, \tilde{F} \rangle_{\{3,4;1,2\}},

where \tilde{F} = (U^{(3)})^T F U^{(4)}.

Matrix case: (US)x = U(Sx).

Proposition 3.

\langle S \times_1 U^{(1)} \times_2 U^{(2)}, F \rangle_{\{3,4;1,2\}} = U^{(1)} \big( \langle S, F \rangle_{\{3,4;1,2\}} \big) (U^{(2)})^T.
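Both propositions are easy to check numerically; a quick sketch on random data (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
S = rng.standard_normal((n, n, n, n))
F = rng.standard_normal((n, n))
U1, U2, U3, U4 = (rng.standard_normal((n, n)) for _ in range(4))

def contract(T, F):
    # <T, F>_{3,4;1,2}
    return np.einsum("ijkl,kl->ij", T, F)

# Proposition 2: modes 3 and 4 can be moved onto F.
lhs = contract(np.einsum("abcd,jc,ld->abjl", S, U3, U4), F)
assert np.allclose(lhs, contract(S, U3.T @ F @ U4))

# Proposition 3: modes 1 and 2 factor out of the contracted product.
lhs = contract(np.einsum("abcd,ja,lb->jlcd", S, U1, U2), F)
assert np.allclose(lhs, U1 @ contract(S, F) @ U2.T)
```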

Tensor SVD (HOSVD)

The singular value decomposition of a 4-tensor (equivalent to the Tucker-3 decomposition in psychometrics and chemometrics):

A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)},

where the U^{(i)} \in R^{n \times n} are orthogonal matrices. The core tensor S has the same dimensions as A, and S is all-orthogonal: slices along any mode are orthogonal. Let i \ne j; then

\langle S(i,:,:,:), S(j,:,:,:) \rangle = \langle S(:,i,:,:), S(:,j,:,:) \rangle = \langle S(:,:,i,:), S(:,:,j,:) \rangle = \langle S(:,:,:,i), S(:,:,:,j) \rangle = 0.

[Figure: HOSVD diagram, A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}.]

Singular Values

Mode 1 singular values:

\sigma_j^{(1)} = \| S(j,:,:,:) \|, \qquad j = 1, \ldots, n.

The singular values are ordered,

\sigma_1^{(i)} \ge \sigma_2^{(i)} \ge \cdots \ge \sigma_n^{(i)} \ge 0, \qquad i = 1, 2, 3, 4.

Energy

The singular values are measures of the energy of the tensor.

Proposition 4.

\| A \|^2 = \| S \|^2 = \sum_{i=1}^{n} \big( \sigma_i^{(1)} \big)^2 = \cdots = \sum_{i=1}^{n} \big( \sigma_i^{(4)} \big)^2.

The energy (mass) is concentrated at the (1,1,1,1) corner of the core tensor, so we can truncate the HOSVD (in analogy with TSVD).

Approximation

Proposition 5. Define a tensor \hat{A} by discarding the smallest singular values along each mode,

\big( \sigma_{i_1}^{(1)} \big)_{i_1 = k_1+1}^{n}, \; \ldots, \; \big( \sigma_{i_4}^{(4)} \big)_{i_4 = k_4+1}^{n},

i.e. by setting the corresponding parts of the core tensor S equal to zero. Then the approximation error is bounded:

\| A - \hat{A} \|^2 = \| S - \hat{S} \|^2 \le \sum_{i_1 = k_1+1}^{n} \big( \sigma_{i_1}^{(1)} \big)^2 + \cdots + \sum_{i_4 = k_4+1}^{n} \big( \sigma_{i_4}^{(4)} \big)^2.

There is no optimality property corresponding to the Eckart-Young theorem for matrices: the tensor \hat{A} defined in Proposition 5 is not the best rank-(k_1, \ldots, k_4) approximation of A.

Computation of HOSVD

U^{(i)} is the matrix of left singular vectors of the unfolding A^{(i)} \in R^{n \times n^3}.

1. For j = 1, 2, 3, 4:
   (a) Compute the LQ decomposition A^{(j)} = (L_j \; 0) Q_j, without forming Q_j explicitly.
   (b) Compute the SVD L_j = U^{(j)} \Sigma^{(j)} (V^{(j)})^T, without forming V^{(j)} explicitly.
2. Compute the core tensor:

S = A \times_1 (U^{(1)})^T \times_2 (U^{(2)})^T \times_3 (U^{(3)})^T \times_4 (U^{(4)})^T.
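A minimal HOSVD sketch in NumPy. For brevity it computes the SVD of each unfolding directly rather than via the LQ step above (the left singular vectors and singular values are the same either way); names are illustrative:

```python
import numpy as np

def hosvd(A):
    """Return core S, orthogonal factors U[j], and mode-j singular values."""
    U, sigma = [], []
    for j in range(A.ndim):
        Aj = np.moveaxis(A, j, 0).reshape(A.shape[j], -1)  # mode-j unfolding
        Uj, sj, _ = np.linalg.svd(Aj, full_matrices=False)
        U.append(Uj)
        sigma.append(sj)                                   # diag of Sigma^(j)
    # Core tensor: S = A x_1 U1^T x_2 U2^T x_3 U3^T x_4 U4^T.
    S = A
    for j, Uj in enumerate(U):
        S = np.moveaxis(np.tensordot(Uj.T, S, axes=(1, j)), 0, j)
    return S, U, sigma

# Check: A is recovered by multiplying the core back, and ||A|| = ||S||
# (orthogonal invariance, cf. Propositions 1 and 4).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6, 6, 6))
S, U, sigma = hosvd(A)
B = S
for j, Uj in enumerate(U):
    B = np.moveaxis(np.tensordot(Uj, B, axes=(1, j)), 0, j)
assert np.allclose(A, B)
assert np.isclose(np.linalg.norm(A), np.linalg.norm(S))
```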

Computation

The mode j singular values are the diagonal elements of \Sigma^{(j)}:

\Sigma^{(j)} = \mathrm{diag}\big( \sigma_1^{(j)}, \sigma_2^{(j)}, \ldots, \sigma_n^{(j)} \big).

Cost for computing the HOSVD: Cn^5 flops.

Discrete Ill-Posed Problems

Linear case: truncated SVD solution.

\min_f \| Kf - g \|_2, \qquad K \in R^{n^2 \times n^2}.

Solution (formally):

f = V \Sigma^{-1} U^T g = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i.

Solution for perturbed data \hat{g} = g + \delta (measurement or round-off errors):

\hat{f} = \sum_{i=1}^{n^2} \frac{u_i^T \hat{g}}{\sigma_i} v_i = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i + \sum_{i=1}^{n^2} \frac{u_i^T \delta}{\sigma_i} v_i.

The second term will explode. Stabilization by TSVD:

f \approx \hat{f}_k = \sum_{i=1}^{k} \frac{u_i^T \hat{g}}{\sigma_i} v_i.

Multi-Linear Case: Truncated HOSVD

Compute the HOSVD of K:

K = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)}.

Reduced least squares problem:

\min_{\tilde{F}} \| \langle S, \tilde{F} \rangle_{\{3,4;1,2\}} - \tilde{G} \|, \qquad \tilde{F} = (U^{(3)})^T F U^{(4)}, \quad \tilde{G} = (U^{(1)})^T G U^{(2)}.

Truncate the core tensor and solve:

\min_{\tilde{F}} \| \langle S_k, \tilde{F} \rangle_{\{3,4;1,2\}} - \tilde{G} \|,

where S_k(k+1:n, k+1:n, k+1:n, k+1:n) = 0.

[Figure: truncated HOSVD diagram, A \approx S_k \times_1 U_k^{(1)} \times_2 U_k^{(2)} \times_3 U_k^{(3)}.]

Algorithm

1. Compute the HO singular values and vectors: O(n^5) flops.
2. Truncate to level k based on the singular values, computing only S_k: O(kn^4) flops.
3. Solve

\min_{\tilde{F}_k} \| \langle S_k, \tilde{F}_k \rangle_{\{3,4;1,2\}} - \tilde{G}_k \| \;\equiv\; \min_{\tilde{f}_k} \| S_k \tilde{f}_k - \tilde{g}_k \|:

   (a) compute the SVD of S_k and solve by TSVD: O(k^6) flops;
   (b) transform back to the original coordinates: O(kn^2) flops.
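Putting the pieces together, a hedged end-to-end sketch of the algorithm, reusing hosvd() and tsvd_solve() from the sketches above. It keeps only the leading k x k x k x k block of the core (one reading of the truncation on the previous slide); this is illustrative, not the talk's implementation:

```python
import numpy as np

def solve_truncated_hosvd(kern, G, k, k0):
    """Regularized solution of min || <K, F>_{3,4;1,2} - G || via truncated HOSVD."""
    S, U, _ = hosvd(kern)                       # step 1: O(n^5)
    Gk = (U[0].T @ G @ U[1])[:k, :k]            # transformed, truncated right-hand side
    Sk = S[:k, :k, :k, :k]                      # step 2: leading core block
    # Step 3a: matricize the small k^2 x k^2 system and solve by TSVD at level k0.
    Kk = Sk.reshape(k * k, k * k, order="F")
    fk = tsvd_solve(Kk, Gk.reshape(-1, order="F"), k0)
    # Step 3b: transform back, F = U3 F~ U4^T, zero-padding the truncated block.
    Ft = np.zeros((U[2].shape[0], U[3].shape[0]))
    Ft[:k, :k] = fk.reshape(k, k, order="F")
    return U[2] @ Ft @ U[3].T
```

The back-transformation in the last two lines is exactly Propositions 2 and 3: the mode-1/2 factors move onto G and the mode-3/4 factors onto F.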

Numerical Example

Perturbation of the data: N(0, 1).
Level of truncation of the tensor: k = 30.
Level of truncation of the linear system: k_0 = 8.

[Figure: singular values, semilog scale, decaying from about 10^2 to 10^{-6} over indices 1 to 80.]

[Figure: unregularized solution, oscillations with amplitude of order 10^9 on an 80 x 80 grid.]

[Figure: regularized solution, k = 8, on an 80 x 80 grid.]

[Figure: error of the regularized solution, amplitude of order 1 on an 80 x 80 grid.]

Conclusions

One order of magnitude faster than the straightforward approach.

Open questions: what is the relation between the HO singular values and the singular values of the linear system? Other applications: tensor regression?