Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems

Lars Eldén, Department of Mathematics, Linköping University

ERCIM, 1 April 2005
Multi-Linear Integral Equations of the First Kind

$\int_{R_1} k(x_1, x_2, x_3, x_4)\, f(x_3, x_4)\, dx_3\, dx_4 = g(x_1, x_2), \quad (x_1, x_2) \in R_2,$

where $R_1$ and $R_2$ are rectangular domains in $\mathbb{R}^2$. The kernel function is smooth, e.g. $k \in L^2(R_1 \times R_2)$.

Ill-posed problem: the solution does not depend continuously on the data.

Applications: restoration of blurred images, inverse problems, ...
Discretization

$\sum_{1 \le k,l \le n} k_{ijkl} f_{kl} = g_{ij}, \quad 1 \le i, j \le n.$

Corresponding least squares problem:

$\min_{f_{kl}} \sum_{1 \le i,j \le n} \Big( \sum_{1 \le k,l \le n} k_{ijkl} f_{kl} - g_{ij} \Big)^2.$

For simplicity, all subscripts $i, j, k, l$ run from 1 to $n$.

Ill-posedness $\Rightarrow$ the linear system is extremely ill-conditioned (if $n$ is large enough).
Tensor - Matrix

Stack the columns:

$\mathrm{vec}(F) = f = \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix}, \quad \mathrm{vec}(G) = g = \begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix}.$

Organize the coefficients $k_{ijkl}$ accordingly. Standard linear system:

$Kf = g, \quad K \in \mathbb{R}^{n^2 \times n^2}.$
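The column-stacking correspondence can be sketched in NumPy (a minimal illustration with a random stand-in for the discretized kernel; the key point is that column-major, i.e. Fortran-order, reshaping realizes $\mathrm{vec}$):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
K4 = rng.standard_normal((n, n, n, n))   # k_ijkl: stand-in for the discretized kernel
F = rng.standard_normal((n, n))          # f_kl

# Multilinear map applied directly: g_ij = sum_kl k_ijkl f_kl
G = np.einsum('ijkl,kl->ij', K4, F)

# Column-major reshape realizes the same map as K vec(F) = vec(G):
# K[i + j*n, k + l*n] = k_ijkl
K = K4.reshape(n * n, n * n, order='F')
f = F.reshape(-1, order='F')             # vec(F)
g = K @ f

assert np.allclose(g, G.reshape(-1, order='F'))
```

The `order='F'` flag is essential: with the default C ordering the reshape would pair the wrong index with the wrong vectorized position.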
Numerical Solution: SVD

$Kf = g, \quad K \in \mathbb{R}^{n^2 \times n^2}.$

Compute the singular value decomposition (SVD) of $K$:

$K = U \Sigma V^T = \sum_{i=1}^{n^2} \sigma_i u_i v_i^T, \quad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{n^2} \ge 0.$

Formal solution:

$f = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i$

$\sigma_{n^2} \approx 0 \Rightarrow$ unstable: small perturbations of the data cause large errors.
Truncated SVD: TSVD

Stabilize by truncating the expansion, $k \ll n^2$:

$f_k = \sum_{i=1}^{k} \frac{u_i^T g}{\sigma_i} v_i$

Computation of the SVD of $K$: $C n^6$ flops, $C \approx 5$. Prohibitively costly if $n$ is large.

Alternative: iterative methods (exploiting sparsity).

Tensor problem: is it possible to do better using TENSOR methods?
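A generic TSVD solver is a few lines of NumPy (a sketch, not the talk's implementation; the sanity check uses a well-conditioned matrix, where full-rank TSVD reproduces the exact solution):

```python
import numpy as np

def tsvd_solve(K, g, k):
    """Truncated-SVD solution f_k = sum_{i<=k} (u_i^T g / sigma_i) v_i."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ g) / s[:k])

# Sanity check: well-conditioned K, truncation at full rank
rng = np.random.default_rng(1)
K = rng.standard_normal((6, 6))
f_true = rng.standard_normal(6)
g = K @ f_true
f = tsvd_solve(K, g, 6)
assert np.allclose(f, f_true)
```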
Basic Tensor Concepts

Mode-1 multiplication of a tensor by a matrix:

$(A \times_1 U)(j, i_2, i_3, i_4) = \sum_{i_1=1}^{n} u_{j,i_1} a_{i_1,i_2,i_3,i_4}.$

All column vectors in the 4-tensor are multiplied by the matrix $U$.

Mode-2 multiplication by a matrix $V$: all row vectors are multiplied by $V$.

Mode-3 and mode-4 multiplication are analogous.
For $K \ne M$, mode-$K$ and mode-$M$ multiplication commute:

$A \times_K U \times_M V = A \times_M V \times_K U.$

Useful identity:

$A \times_K F \times_K G = A \times_K (GF).$
Matricizing the Tensor

Matricizing of a 3-mode tensor $A$:

$A_{(1)} = \big( A(:,1,:) \;\; A(:,2,:) \;\; \cdots \;\; A(:,n,:) \big) \in \mathbb{R}^{n \times n^2},$
$A_{(2)} = \big( A(:,:,1)^T \;\; A(:,:,2)^T \;\; \cdots \;\; A(:,:,n)^T \big) \in \mathbb{R}^{n \times n^2},$
$A_{(3)} = \big( A(1,:,:)^T \;\; A(2,:,:)^T \;\; \cdots \;\; A(n,:,:)^T \big) \in \mathbb{R}^{n \times n^2}.$

Mode-$M$ multiplication $B = A \times_M M$:

$B_{(M)} = M A_{(M)},$

followed by a reorganization of $B_{(M)}$ into the tensor $B$.
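Matricizing and mode multiplication can be sketched in NumPy (the helper names `matricize` and `mode_mult` are my own; the unfolding below fixes one particular ordering of the remaining modes, which is all the identity $B_{(M)} = M A_{(M)}$ requires, as long as folding and unfolding use the same convention):

```python
import numpy as np

def matricize(A, mode):
    """Mode-m unfolding: mode-m fibers become the columns."""
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_mult(A, M, mode):
    """B = A x_mode M, computed as B_(mode) = M A_(mode), then refolded."""
    rest = [A.shape[i] for i in range(A.ndim) if i != mode]
    Bm = M @ matricize(A, mode)
    return np.moveaxis(Bm.reshape([M.shape[0]] + rest), 0, mode)

# Checks on a small 3-mode tensor
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4, 5))
U = rng.standard_normal((2, 3))
V = rng.standard_normal((6, 4))

# Mode-1 multiplication agrees with the entrywise definition
assert np.allclose(mode_mult(A, U, 0), np.einsum('ji,iab->jab', U, A))

# Multiplications along different modes commute
B1 = mode_mult(mode_mult(A, U, 0), V, 1)
B2 = mode_mult(mode_mult(A, V, 1), U, 0)
assert np.allclose(B1, B2)
```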
Inner Product

For two tensors $A$ and $B$ of the same dimensions:

$\langle A, B \rangle = \sum_{i_1,i_2,i_3,i_4} a_{i_1 i_2 i_3 i_4} b_{i_1 i_2 i_3 i_4}.$

This is a special case of the contracted product of two tensors.

The linear system $\sum_{1 \le k,l \le n} k_{ijkl} f_{kl} = g_{ij}$, $1 \le i, j \le n$, becomes

$\langle K, F \rangle_{\{3,4;1,2\}} = G,$

where the matrices $F$ and $G$ are identified with tensors $F$ and $G$.
Least squares problem:

$\min_F \| \langle K, F \rangle_{\{3,4;1,2\}} - G \|,$

where $\| \cdot \|$ is the matrix Frobenius norm or, equivalently, the tensor Frobenius norm,

$\| G \| = \langle G, G \rangle^{1/2}.$
Some Obvious Facts

The tensor Frobenius norm is invariant under mode-$M$ multiplication by an orthogonal matrix:

Proposition 1. Let $U \in \mathbb{R}^{n \times n}$ be orthogonal. Then $\| A \times_M U \| = \| A \|$.
Tensor-matrix equivalent of the identity $(SV^T)x = S(V^T x) = Sy$, for $y = V^T x$:

Proposition 2. Assume that $S$ is a 4-tensor, $U^{(3)}$ and $U^{(4)}$ are square matrices, and $F$ is a matrix. Then

$\langle S \times_3 U^{(3)} \times_4 U^{(4)}, F \rangle_{\{3,4;1,2\}} = \langle S, \tilde{F} \rangle_{\{3,4;1,2\}},$

where $\tilde{F} = (U^{(3)})^T F U^{(4)}$.
Matrix case: $(US)x = U(Sx)$.

Proposition 3.

$\langle S \times_1 U^{(1)} \times_2 U^{(2)}, F \rangle_{\{3,4;1,2\}} = U^{(1)} \big( \langle S, F \rangle_{\{3,4;1,2\}} \big) (U^{(2)})^T.$
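Propositions 2 and 3 are easy to verify numerically (a small NumPy check; the contraction $\langle \cdot, \cdot \rangle_{\{3,4;1,2\}}$ is written out with `einsum`):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
S = rng.standard_normal((n, n, n, n))
F = rng.standard_normal((n, n))
U1, U2, U3, U4 = (np.linalg.qr(rng.standard_normal((n, n)))[0] for _ in range(4))

def contract(S, F):
    # <S, F>_{3,4;1,2}: contract modes 3, 4 of S with modes 1, 2 of F
    return np.einsum('ijkl,kl->ij', S, F)

# Proposition 2: U3, U4 move from the tensor onto F
lhs2 = contract(np.einsum('ijkl,ak,bl->ijab', S, U3, U4), F)
rhs2 = contract(S, U3.T @ F @ U4)
assert np.allclose(lhs2, rhs2)

# Proposition 3: U1, U2 factor out of the contraction
lhs3 = contract(np.einsum('ijkl,ai,bj->abkl', S, U1, U2), F)
rhs3 = U1 @ contract(S, F) @ U2.T
assert np.allclose(lhs3, rhs3)
```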
Tensor SVD (HOSVD)

The singular value decomposition of a 4-tensor:¹

$A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)},$

where the $U^{(i)} \in \mathbb{R}^{n \times n}$ are orthogonal matrices. The core tensor $S$ has the same dimensions as $A$; $S$ is all-orthogonal: slices along any mode are orthogonal. Let $i \ne j$; then

$\langle S(i,:,:,:), S(j,:,:,:) \rangle = \langle S(:,i,:,:), S(:,j,:,:) \rangle = \langle S(:,:,i,:), S(:,:,j,:) \rangle = \langle S(:,:,:,i), S(:,:,:,j) \rangle = 0.$

¹ Equivalent to the Tucker-3 decomposition in psychometrics and chemometrics.
HOSVD

[Figure: the decomposition $A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$, drawn as a block diagram.]
Singular Values

Mode-1 singular values:

$\sigma_j^{(1)} = \| S(j,:,:,:) \|, \quad j = 1, \ldots, n.$

The singular values are ordered:

$\sigma_1^{(i)} \ge \sigma_2^{(i)} \ge \cdots \ge \sigma_n^{(i)} \ge 0, \quad i = 1, 2, 3, 4.$
Energy

The singular values are measures of the energy of the tensor.

Proposition 4.

$\| A \|^2 = \| S \|^2 = \sum_{i=1}^{n} \big( \sigma_i^{(1)} \big)^2 = \cdots = \sum_{i=1}^{n} \big( \sigma_i^{(4)} \big)^2.$

The energy (mass) is concentrated at the $(1,1,1,1)$ corner of the tensor.

We can truncate the HOSVD (in analogy to TSVD).
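Proposition 4 follows because the mode-$j$ singular values of $A$ are the singular values of the unfolding $A_{(j)}$, and every unfolding has the same Frobenius norm as $A$. A quick numerical check:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 6, 7, 8))
normA2 = np.sum(A ** 2)              # ||A||^2

for mode in range(4):
    # Mode-j unfolding; its singular values are the mode-j singular values of A
    Aj = np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)
    s = np.linalg.svd(Aj, compute_uv=False)
    # Sum of squared mode-j singular values equals the tensor energy
    assert np.isclose(np.sum(s ** 2), normA2)
```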
Approximation

Proposition 5. Define a tensor $\hat{A}$ by discarding the smallest singular values along each mode,

$\big( \sigma_{i_1}^{(1)} \big)_{i_1 = k_1+1}^{n}, \; \ldots, \; \big( \sigma_{i_4}^{(4)} \big)_{i_4 = k_4+1}^{n},$

i.e. by setting the corresponding parts of the core tensor $S$ equal to zero. Then the approximation error is bounded:

$\| A - \hat{A} \|^2 = \| S - \hat{S} \|^2 \le \sum_{i_1=k_1+1}^{n} \big( \sigma_{i_1}^{(1)} \big)^2 + \cdots + \sum_{i_4=k_4+1}^{n} \big( \sigma_{i_4}^{(4)} \big)^2.$

There is no optimality property corresponding to the Eckart-Young theorem for matrices: the tensor $\hat{A}$ defined in Proposition 5 is not the best rank-$(k_1, \ldots, k_4)$ approximation of $A$.
Computation of HOSVD

The $U^{(i)}$ are the left singular matrices of $A_{(i)} \in \mathbb{R}^{n \times n^3}$.

1. For $j = 1, 2, 3, 4$:
   (a) Compute the LQ decomposition $A_{(j)} = (L_j \;\; 0) Q_j$, without forming $Q_j$ explicitly.
   (b) Compute the SVD $L_j = U^{(j)} \Sigma^{(j)} (V^{(j)})^T$, without forming $V^{(j)}$ explicitly.
2. Compute the core tensor:
   $S = A \times_1 (U^{(1)})^T \times_2 (U^{(2)})^T \times_3 (U^{(3)})^T \times_4 (U^{(4)})^T$
Computation

The mode-$j$ singular values are the diagonal elements of $\Sigma^{(j)}$:

$\Sigma^{(j)} = \mathrm{diag}\big( \sigma_1^{(j)}, \sigma_2^{(j)}, \ldots, \sigma_n^{(j)} \big).$

Cost for computing the HOSVD: $C n^5$ flops.
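The algorithm above can be sketched in NumPy (a simplified version: the efficiency-motivated LQ step is skipped and each unfolding is given directly to a dense SVD; `unfold`, `mode_mult`, and `hosvd` are my own helper names):

```python
import numpy as np

def unfold(A, mode):
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_mult(A, M, mode):
    rest = [A.shape[i] for i in range(A.ndim) if i != mode]
    return np.moveaxis((M @ unfold(A, mode)).reshape([M.shape[0]] + rest), 0, mode)

def hosvd(A):
    """U^(j) = left singular vectors of A_(j); core S = A x_j (U^(j))^T."""
    Us = [np.linalg.svd(unfold(A, j), full_matrices=False)[0] for j in range(A.ndim)]
    S = A
    for j, U in enumerate(Us):
        S = mode_mult(S, U.T, j)
    return S, Us

# Check: the factors reconstruct A exactly
rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4, 4, 4))
S, Us = hosvd(A)
B = S
for j, U in enumerate(Us):
    B = mode_mult(B, U, j)
assert np.allclose(A, B)
```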
Discrete Ill-Posed Problems

Linear case: truncated SVD solution

$\min_f \| Kf - g \|_2, \quad K \in \mathbb{R}^{n^2 \times n^2}.$

Solution (formally):

$f = V \Sigma^{-1} U^T g = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i.$
Solution for perturbed data $\hat{g} = g + \delta$ (measurement or round-off errors):

$\hat{f} = \sum_{i=1}^{n^2} \frac{u_i^T \hat{g}}{\sigma_i} v_i = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i + \sum_{i=1}^{n^2} \frac{u_i^T \delta}{\sigma_i} v_i.$

The second term will explode. Stabilization by TSVD:

$f \approx \hat{f}_k = \sum_{i=1}^{k} \frac{u_i^T \hat{g}}{\sigma_i} v_i.$
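The explosion of the noise term is easy to demonstrate on a synthetic ill-conditioned matrix (a stand-in with rapidly decaying singular values, not the talk's kernel; the truncation level 8 is chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20
# Synthetic ill-conditioned K: orthogonal factors, sigma_i = 10^{-(i-1)}
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** (-np.arange(n, dtype=float))
K = U @ np.diag(s) @ V.T

f_true = rng.standard_normal(n)
g = K @ f_true
ghat = g + 1e-8 * rng.standard_normal(n)   # small data perturbation

# Unregularized: the terms u_i^T delta / sigma_i explode for small sigma_i
f_naive = V @ ((U.T @ ghat) / s)

# TSVD keeps only the well-conditioned directions
k = 8
f_tsvd = V[:, :k] @ ((U[:, :k].T @ ghat) / s[:k])

print(np.linalg.norm(f_naive - f_true))    # enormous
print(np.linalg.norm(f_tsvd - f_true))     # moderate
```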
Multi-Linear Case: Truncated HOSVD

Compute the HOSVD of the kernel tensor:

$K = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)}.$

Reduced least squares problem:

$\min_{\tilde{F}} \| \langle S, \tilde{F} \rangle_{\{3,4;1,2\}} - \tilde{G} \|, \quad \tilde{F} = (U^{(3)})^T F U^{(4)}, \quad \tilde{G} = (U^{(1)})^T G U^{(2)}.$

Truncate the core tensor and solve:

$\min_{\tilde{F}} \| \langle S_k, \tilde{F} \rangle_{\{3,4;1,2\}} - \tilde{G} \|,$

where $S_k$ is $S$ with $S(k+1{:}n,\, k+1{:}n,\, k+1{:}n,\, k+1{:}n) = 0$.
Truncated HOSVD

[Figure: the truncated decomposition $A \approx S_k \times_1 U_k^{(1)} \times_2 U_k^{(2)} \times_3 U_k^{(3)}$, drawn as a block diagram.]
Algorithm

1. Compute the HO singular values and vectors: $O(n^5)$ flops.
2. Truncate to $k$ based on the singular values; compute only $S_k$: $O(k n^4)$ flops.
3. Solve
   $\min_{\tilde{F}_k} \| \langle S_k, \tilde{F}_k \rangle_{\{3,4;1,2\}} - \tilde{G}_k \| \quad \Longleftrightarrow \quad \min_{\tilde{f}_k} \| S_k \tilde{f}_k - \tilde{g}_k \|$
   (a) Compute the SVD of $S_k$ and solve by TSVD: $O(k^6)$ flops.
   (b) Transform back to the original coordinates: $O(k n^2)$ flops.
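The whole pipeline can be sketched end to end (a minimal NumPy version under simplifying assumptions: plain SVDs instead of the LQ preprocessing, the same truncation level $k$ along all four modes, and my own helper names; the sanity check runs with no truncation, where the exact solution must be recovered):

```python
import numpy as np

def unfold(A, mode):
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_mult(A, M, mode):
    rest = [A.shape[i] for i in range(A.ndim) if i != mode]
    return np.moveaxis((M @ unfold(A, mode)).reshape([M.shape[0]] + rest), 0, mode)

def hosvd(A):
    Us = [np.linalg.svd(unfold(A, j), full_matrices=False)[0] for j in range(A.ndim)]
    S = A
    for j, U in enumerate(Us):
        S = mode_mult(S, U.T, j)
    return S, Us

def hosvd_tsvd_solve(K4, G, k, k0):
    """Steps 1-3: HOSVD of the kernel, truncation to k, TSVD (level k0)
    on the small core system, then transform back to original coordinates."""
    n = K4.shape[0]
    S, (U1, U2, U3, U4) = hosvd(K4)
    Gt = (U1.T @ G @ U2)[:k, :k]                  # transformed right-hand side
    # Small matrix system: M[i + j*k, a + b*k] = S_ijab (column-major vec)
    M = S[:k, :k, :k, :k].reshape(k * k, k * k, order='F')
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    ft = Vt[:k0].T @ ((U[:, :k0].T @ Gt.reshape(-1, order='F')) / s[:k0])
    Ft = np.zeros((n, n))
    Ft[:k, :k] = ft.reshape(k, k, order='F')
    return U3 @ Ft @ U4.T                          # F = U^(3) Ftilde (U^(4))^T

# Sanity check: with no truncation the exact solution is recovered
rng = np.random.default_rng(5)
n = 3
K4 = rng.standard_normal((n, n, n, n))
F_true = rng.standard_normal((n, n))
G = np.einsum('ijkl,kl->ij', K4, F_true)
assert np.allclose(hosvd_tsvd_solve(K4, G, n, n * n), F_true)
```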
Numerical Example

Perturbation of the data: $N(0, 1)$
Level of truncation of the tensor: $k = 30$
Level of truncation of the linear system: $k_0 = 8$
Singular Values

[Figure: singular values on a semilogarithmic scale, indices 0-80, decaying from roughly $10^2$ to $10^{-6}$.]
Unregularized Solution

[Figure: surface plot of the unregularized solution on an $80 \times 80$ grid; amplitude of order $10^9$.]
Regularized Solution, $k_0 = 8$

[Figure: surface plot of the regularized solution on an $80 \times 80$ grid.]
Error

[Figure: surface plot of the error on an $80 \times 80$ grid; amplitude of order 1.]
Conclusions

One order of magnitude faster than the straightforward approach.

Open question: what is the relation between the HO singular values and the singular values of the linear system?

Other applications: tensor regression?