Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems

Lars Eldén, Department of Mathematics, Linköping University

ERCIM, 1 April 2005
Multi-Linear Integral Equations of the First Kind

$\int_{R_1} k(x_1, x_2, x_3, x_4)\, f(x_3, x_4)\, dx_3\, dx_4 = g(x_1, x_2), \quad (x_1, x_2) \in R_2,$

where $R_1$ and $R_2$ are rectangular domains in $\mathbb{R}^2$. The kernel function is smooth, e.g. $k \in L^2(R_1 \times R_2)$.

Ill-posed problem: the solution does not depend continuously on the data.

Applications: restoration of blurred images, inverse problems, ...
Discretization

$\sum_{1 \le k,l \le n} k_{ijkl} f_{kl} = g_{ij}, \quad 1 \le i, j \le n.$

Corresponding least squares problem:

$\min_{f_{kl}} \sum_{1 \le i,j \le n} \Big( \sum_{1 \le k,l \le n} k_{ijkl} f_{kl} - g_{ij} \Big)^2.$

For simplicity, all subscripts $i, j, k, l$ run from 1 to $n$.

Ill-posedness $\Rightarrow$ the linear system is extremely ill-conditioned (if $n$ is large enough).
Tensor - Matrix

Stack the columns:

$\mathrm{vec}(F) = f = \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix}, \quad \mathrm{vec}(G) = g = \begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix}.$

Organize the coefficients $k_{ijkl}$ accordingly. Standard linear system:

$Kf = g, \quad K \in \mathbb{R}^{n^2 \times n^2}.$
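The column-stacking correspondence can be sketched in NumPy (a minimal illustration with a random stand-in for the discretized kernel; the key point is that column-major, i.e. Fortran-order, reshaping realizes $\mathrm{vec}$):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
K4 = rng.standard_normal((n, n, n, n))   # k_ijkl: stand-in for the discretized kernel
F = rng.standard_normal((n, n))          # f_kl

# Multilinear map applied directly: g_ij = sum_kl k_ijkl f_kl
G = np.einsum('ijkl,kl->ij', K4, F)

# Column-major reshape realizes the same map as K vec(F) = vec(G):
# K[i + j*n, k + l*n] = k_ijkl
K = K4.reshape(n * n, n * n, order='F')
f = F.reshape(-1, order='F')             # vec(F)
g = K @ f

assert np.allclose(g, G.reshape(-1, order='F'))
```

The `order='F'` flag is essential: with the default C ordering the reshape would pair the wrong index with the wrong vectorized position.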
Numerical Solution: SVD

$Kf = g, \quad K \in \mathbb{R}^{n^2 \times n^2}.$

Compute the singular value decomposition (SVD) of $K$:

$K = U \Sigma V^T = \sum_{i=1}^{n^2} \sigma_i u_i v_i^T, \quad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{n^2} \ge 0.$

Formal solution:

$f = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i$

$\sigma_{n^2} \approx 0 \Rightarrow$ unstable: small perturbations of the data cause large errors.
Truncated SVD: TSVD

Stabilize by truncating the expansion, $k \ll n^2$:

$f_k = \sum_{i=1}^{k} \frac{u_i^T g}{\sigma_i} v_i$

Computation of the SVD of $K$: $C n^6$ flops, $C \approx 5$. Prohibitively costly if $n$ is large.

Alternative: iterative methods (exploiting sparsity).

Tensor problem: is it possible to do better using TENSOR methods?
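A generic TSVD solver is a few lines of NumPy (a sketch, not the talk's implementation; the sanity check uses a well-conditioned matrix, where full-rank TSVD reproduces the exact solution):

```python
import numpy as np

def tsvd_solve(K, g, k):
    """Truncated-SVD solution f_k = sum_{i<=k} (u_i^T g / sigma_i) v_i."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ g) / s[:k])

# Sanity check: well-conditioned K, truncation at full rank
rng = np.random.default_rng(1)
K = rng.standard_normal((6, 6))
f_true = rng.standard_normal(6)
g = K @ f_true
f = tsvd_solve(K, g, 6)
assert np.allclose(f, f_true)
```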
Basic Tensor Concepts

Mode-1 multiplication of a tensor by a matrix:

$(A \times_1 U)(j, i_2, i_3, i_4) = \sum_{i_1=1}^{n} u_{j,i_1} a_{i_1,i_2,i_3,i_4}.$

All column vectors in the 4-tensor are multiplied by the matrix $U$.

Mode-2 multiplication by a matrix $V$: all row vectors are multiplied by $V$.

Mode-3 and mode-4 multiplication are analogous.
For $K \ne M$, mode-$K$ and mode-$M$ multiplication commute:

$A \times_K U \times_M V = A \times_M V \times_K U.$

Useful identity:

$A \times_K F \times_K G = A \times_K (GF).$
Matricizing the Tensor

Matricizing of a 3-mode tensor $A$:

$A_{(1)} = \big( A(:,1,:) \;\; A(:,2,:) \;\; \cdots \;\; A(:,n,:) \big) \in \mathbb{R}^{n \times n^2},$
$A_{(2)} = \big( A(:,:,1)^T \;\; A(:,:,2)^T \;\; \cdots \;\; A(:,:,n)^T \big) \in \mathbb{R}^{n \times n^2},$
$A_{(3)} = \big( A(1,:,:)^T \;\; A(2,:,:)^T \;\; \cdots \;\; A(n,:,:)^T \big) \in \mathbb{R}^{n \times n^2}.$

Mode-$M$ multiplication $B = A \times_M M$:

$B_{(M)} = M A_{(M)},$

followed by a reorganization of $B_{(M)}$ into the tensor $B$.
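Matricizing and mode multiplication can be sketched in NumPy (the helper names `matricize` and `mode_mult` are my own; the unfolding below fixes one particular ordering of the remaining modes, which is all the identity $B_{(M)} = M A_{(M)}$ requires, as long as folding and unfolding use the same convention):

```python
import numpy as np

def matricize(A, mode):
    """Mode-m unfolding: mode-m fibers become the columns."""
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_mult(A, M, mode):
    """B = A x_mode M, computed as B_(mode) = M A_(mode), then refolded."""
    rest = [A.shape[i] for i in range(A.ndim) if i != mode]
    Bm = M @ matricize(A, mode)
    return np.moveaxis(Bm.reshape([M.shape[0]] + rest), 0, mode)

# Checks on a small 3-mode tensor
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4, 5))
U = rng.standard_normal((2, 3))
V = rng.standard_normal((6, 4))

# Mode-1 multiplication agrees with the entrywise definition
assert np.allclose(mode_mult(A, U, 0), np.einsum('ji,iab->jab', U, A))

# Multiplications along different modes commute
B1 = mode_mult(mode_mult(A, U, 0), V, 1)
B2 = mode_mult(mode_mult(A, V, 1), U, 0)
assert np.allclose(B1, B2)
```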
Inner Product

For two tensors $A$ and $B$ of the same dimensions:

$\langle A, B \rangle = \sum_{i_1,i_2,i_3,i_4} a_{i_1 i_2 i_3 i_4} b_{i_1 i_2 i_3 i_4}.$

This is a special case of the contracted product of two tensors.

The linear system $\sum_{1 \le k,l \le n} k_{ijkl} f_{kl} = g_{ij}$, $1 \le i, j \le n$, becomes

$\langle K, F \rangle_{\{3,4;1,2\}} = G,$

where the matrices $F$ and $G$ are identified with tensors $F$ and $G$.
Least squares problem:

$\min_F \| \langle K, F \rangle_{\{3,4;1,2\}} - G \|,$

where $\| \cdot \|$ is the matrix Frobenius norm or, equivalently, the tensor Frobenius norm,

$\| G \| = \langle G, G \rangle^{1/2}.$
Some Obvious Facts

The tensor Frobenius norm is invariant under mode-$M$ multiplication by an orthogonal matrix:

Proposition 1. Let $U \in \mathbb{R}^{n \times n}$ be orthogonal. Then $\| A \times_M U \| = \| A \|$.
Tensor-matrix equivalent of the identity $(SV^T)x = S(V^T x) = Sy$, for $y = V^T x$:

Proposition 2. Assume that $S$ is a 4-tensor, $U^{(3)}$ and $U^{(4)}$ are square matrices, and $F$ is a matrix. Then

$\langle S \times_3 U^{(3)} \times_4 U^{(4)}, F \rangle_{\{3,4;1,2\}} = \langle S, \tilde{F} \rangle_{\{3,4;1,2\}},$

where $\tilde{F} = (U^{(3)})^T F U^{(4)}$.
Matrix case: $(US)x = U(Sx)$.

Proposition 3.

$\langle S \times_1 U^{(1)} \times_2 U^{(2)}, F \rangle_{\{3,4;1,2\}} = U^{(1)} \big( \langle S, F \rangle_{\{3,4;1,2\}} \big) (U^{(2)})^T.$
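Propositions 2 and 3 are easy to verify numerically (a small NumPy check; the contraction $\langle \cdot, \cdot \rangle_{\{3,4;1,2\}}$ is written out with `einsum`):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
S = rng.standard_normal((n, n, n, n))
F = rng.standard_normal((n, n))
U1, U2, U3, U4 = (np.linalg.qr(rng.standard_normal((n, n)))[0] for _ in range(4))

def contract(S, F):
    # <S, F>_{3,4;1,2}: contract modes 3, 4 of S with modes 1, 2 of F
    return np.einsum('ijkl,kl->ij', S, F)

# Proposition 2: U3, U4 move from the tensor onto F
lhs2 = contract(np.einsum('ijkl,ak,bl->ijab', S, U3, U4), F)
rhs2 = contract(S, U3.T @ F @ U4)
assert np.allclose(lhs2, rhs2)

# Proposition 3: U1, U2 factor out of the contraction
lhs3 = contract(np.einsum('ijkl,ai,bj->abkl', S, U1, U2), F)
rhs3 = U1 @ contract(S, F) @ U2.T
assert np.allclose(lhs3, rhs3)
```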
Tensor SVD (HOSVD)

The singular value decomposition of a 4-tensor:¹

$A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)},$

where the $U^{(i)} \in \mathbb{R}^{n \times n}$ are orthogonal matrices. The core tensor $S$ has the same dimensions as $A$; $S$ is all-orthogonal: slices along any mode are orthogonal. Let $i \ne j$; then

$\langle S(i,:,:,:), S(j,:,:,:) \rangle = \langle S(:,i,:,:), S(:,j,:,:) \rangle = \langle S(:,:,i,:), S(:,:,j,:) \rangle = \langle S(:,:,:,i), S(:,:,:,j) \rangle = 0.$

¹ Equivalent to the Tucker-3 decomposition in psychometrics and chemometrics.
HOSVD

[Figure: the decomposition $A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$, drawn as a block diagram.]
Singular Values

Mode-1 singular values:

$\sigma_j^{(1)} = \| S(j,:,:,:) \|, \quad j = 1, \ldots, n.$

The singular values are ordered:

$\sigma_1^{(i)} \ge \sigma_2^{(i)} \ge \cdots \ge \sigma_n^{(i)} \ge 0, \quad i = 1, 2, 3, 4.$
Energy

The singular values are measures of the energy of the tensor.

Proposition 4.

$\| A \|^2 = \| S \|^2 = \sum_{i=1}^{n} \big( \sigma_i^{(1)} \big)^2 = \cdots = \sum_{i=1}^{n} \big( \sigma_i^{(4)} \big)^2.$

The energy (mass) is concentrated at the $(1,1,1,1)$ corner of the tensor.

We can truncate the HOSVD (in analogy to TSVD).
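Proposition 4 follows because the mode-$j$ singular values of $A$ are the singular values of the unfolding $A_{(j)}$, and every unfolding has the same Frobenius norm as $A$. A quick numerical check:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 6, 7, 8))
normA2 = np.sum(A ** 2)              # ||A||^2

for mode in range(4):
    # Mode-j unfolding; its singular values are the mode-j singular values of A
    Aj = np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)
    s = np.linalg.svd(Aj, compute_uv=False)
    # Sum of squared mode-j singular values equals the tensor energy
    assert np.isclose(np.sum(s ** 2), normA2)
```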
Approximation

Proposition 5. Define a tensor $\hat{A}$ by discarding the smallest singular values along each mode,

$\big( \sigma_{i_1}^{(1)} \big)_{i_1 = k_1+1}^{n}, \; \ldots, \; \big( \sigma_{i_4}^{(4)} \big)_{i_4 = k_4+1}^{n},$

i.e. by setting the corresponding parts of the core tensor $S$ equal to zero. Then the approximation error is bounded:

$\| A - \hat{A} \|^2 = \| S - \hat{S} \|^2 \le \sum_{i_1=k_1+1}^{n} \big( \sigma_{i_1}^{(1)} \big)^2 + \cdots + \sum_{i_4=k_4+1}^{n} \big( \sigma_{i_4}^{(4)} \big)^2.$

There is no optimality property corresponding to the Eckart-Young theorem for matrices: the tensor $\hat{A}$ defined in Proposition 5 is not the best rank-$(k_1, \ldots, k_4)$ approximation of $A$.
Computation of HOSVD

The $U^{(i)}$ are the left singular matrices of $A_{(i)} \in \mathbb{R}^{n \times n^3}$.

1. For $j = 1, 2, 3, 4$:
   (a) Compute the LQ decomposition $A_{(j)} = (L_j \;\; 0) Q_j$, without forming $Q_j$ explicitly.
   (b) Compute the SVD $L_j = U^{(j)} \Sigma^{(j)} (V^{(j)})^T$, without forming $V^{(j)}$ explicitly.
2. Compute the core tensor:
   $S = A \times_1 (U^{(1)})^T \times_2 (U^{(2)})^T \times_3 (U^{(3)})^T \times_4 (U^{(4)})^T$
Computation

The mode-$j$ singular values are the diagonal elements of $\Sigma^{(j)}$:

$\Sigma^{(j)} = \mathrm{diag}\big( \sigma_1^{(j)}, \sigma_2^{(j)}, \ldots, \sigma_n^{(j)} \big).$

Cost for computing the HOSVD: $C n^5$ flops.
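The algorithm above can be sketched in NumPy (a simplified version: the efficiency-motivated LQ step is skipped and each unfolding is given directly to a dense SVD; `unfold`, `mode_mult`, and `hosvd` are my own helper names):

```python
import numpy as np

def unfold(A, mode):
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_mult(A, M, mode):
    rest = [A.shape[i] for i in range(A.ndim) if i != mode]
    return np.moveaxis((M @ unfold(A, mode)).reshape([M.shape[0]] + rest), 0, mode)

def hosvd(A):
    """U^(j) = left singular vectors of A_(j); core S = A x_j (U^(j))^T."""
    Us = [np.linalg.svd(unfold(A, j), full_matrices=False)[0] for j in range(A.ndim)]
    S = A
    for j, U in enumerate(Us):
        S = mode_mult(S, U.T, j)
    return S, Us

# Check: the factors reconstruct A exactly
rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4, 4, 4))
S, Us = hosvd(A)
B = S
for j, U in enumerate(Us):
    B = mode_mult(B, U, j)
assert np.allclose(A, B)
```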
Discrete Ill-Posed Problems

Linear case: truncated SVD solution

$\min_f \| Kf - g \|_2, \quad K \in \mathbb{R}^{n^2 \times n^2}.$

Solution (formally):

$f = V \Sigma^{-1} U^T g = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i.$
Solution for perturbed data $\hat{g} = g + \delta$ (measurement or round-off errors):

$\hat{f} = \sum_{i=1}^{n^2} \frac{u_i^T \hat{g}}{\sigma_i} v_i = \sum_{i=1}^{n^2} \frac{u_i^T g}{\sigma_i} v_i + \sum_{i=1}^{n^2} \frac{u_i^T \delta}{\sigma_i} v_i.$

The second term will explode. Stabilization by TSVD:

$f \approx \hat{f}_k = \sum_{i=1}^{k} \frac{u_i^T \hat{g}}{\sigma_i} v_i.$
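The explosion of the noise term is easy to demonstrate on a synthetic ill-conditioned matrix (a stand-in with rapidly decaying singular values, not the talk's kernel; the truncation level 8 is chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20
# Synthetic ill-conditioned K: orthogonal factors, sigma_i = 10^{-(i-1)}
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** (-np.arange(n, dtype=float))
K = U @ np.diag(s) @ V.T

f_true = rng.standard_normal(n)
g = K @ f_true
ghat = g + 1e-8 * rng.standard_normal(n)   # small data perturbation

# Unregularized: the terms u_i^T delta / sigma_i explode for small sigma_i
f_naive = V @ ((U.T @ ghat) / s)

# TSVD keeps only the well-conditioned directions
k = 8
f_tsvd = V[:, :k] @ ((U[:, :k].T @ ghat) / s[:k])

print(np.linalg.norm(f_naive - f_true))    # enormous
print(np.linalg.norm(f_tsvd - f_true))     # moderate
```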
Multi-Linear Case: Truncated HOSVD

Compute the HOSVD of the kernel tensor:

$K = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)} \times_4 U^{(4)}.$

Reduced least squares problem:

$\min_{\tilde{F}} \| \langle S, \tilde{F} \rangle_{\{3,4;1,2\}} - \tilde{G} \|, \quad \tilde{F} = (U^{(3)})^T F U^{(4)}, \quad \tilde{G} = (U^{(1)})^T G U^{(2)}.$

Truncate the core tensor and solve:

$\min_{\tilde{F}} \| \langle S_k, \tilde{F} \rangle_{\{3,4;1,2\}} - \tilde{G} \|,$

where $S_k$ is $S$ with $S(k+1{:}n,\, k+1{:}n,\, k+1{:}n,\, k+1{:}n) = 0$.
Truncated HOSVD

[Figure: the truncated decomposition $A \approx S_k \times_1 U_k^{(1)} \times_2 U_k^{(2)} \times_3 U_k^{(3)}$, drawn as a block diagram.]
Algorithm

1. Compute the HO singular values and vectors: $O(n^5)$ flops.
2. Truncate to $k$ based on the singular values; compute only $S_k$: $O(k n^4)$ flops.
3. Solve
   $\min_{\tilde{F}_k} \| \langle S_k, \tilde{F}_k \rangle_{\{3,4;1,2\}} - \tilde{G}_k \| \quad \Longleftrightarrow \quad \min_{\tilde{f}_k} \| S_k \tilde{f}_k - \tilde{g}_k \|$
   (a) Compute the SVD of $S_k$ and solve by TSVD: $O(k^6)$ flops.
   (b) Transform back to the original coordinates: $O(k n^2)$ flops.
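The whole pipeline can be sketched end to end (a minimal NumPy version under simplifying assumptions: plain SVDs instead of the LQ preprocessing, the same truncation level $k$ along all four modes, and my own helper names; the sanity check runs with no truncation, where the exact solution must be recovered):

```python
import numpy as np

def unfold(A, mode):
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def mode_mult(A, M, mode):
    rest = [A.shape[i] for i in range(A.ndim) if i != mode]
    return np.moveaxis((M @ unfold(A, mode)).reshape([M.shape[0]] + rest), 0, mode)

def hosvd(A):
    Us = [np.linalg.svd(unfold(A, j), full_matrices=False)[0] for j in range(A.ndim)]
    S = A
    for j, U in enumerate(Us):
        S = mode_mult(S, U.T, j)
    return S, Us

def hosvd_tsvd_solve(K4, G, k, k0):
    """Steps 1-3: HOSVD of the kernel, truncation to k, TSVD (level k0)
    on the small core system, then transform back to original coordinates."""
    n = K4.shape[0]
    S, (U1, U2, U3, U4) = hosvd(K4)
    Gt = (U1.T @ G @ U2)[:k, :k]                  # transformed right-hand side
    # Small matrix system: M[i + j*k, a + b*k] = S_ijab (column-major vec)
    M = S[:k, :k, :k, :k].reshape(k * k, k * k, order='F')
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    ft = Vt[:k0].T @ ((U[:, :k0].T @ Gt.reshape(-1, order='F')) / s[:k0])
    Ft = np.zeros((n, n))
    Ft[:k, :k] = ft.reshape(k, k, order='F')
    return U3 @ Ft @ U4.T                          # F = U^(3) Ftilde (U^(4))^T

# Sanity check: with no truncation the exact solution is recovered
rng = np.random.default_rng(5)
n = 3
K4 = rng.standard_normal((n, n, n, n))
F_true = rng.standard_normal((n, n))
G = np.einsum('ijkl,kl->ij', K4, F_true)
assert np.allclose(hosvd_tsvd_solve(K4, G, n, n * n), F_true)
```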
Numerical Example

Perturbation of the data: $N(0, 1)$
Level of truncation of the tensor: $k = 30$
Level of truncation of the linear system: $k_0 = 8$
Singular Values

[Figure: singular values on a semilogarithmic scale, indices 0-80, decaying from roughly $10^2$ to $10^{-6}$.]
Unregularized Solution

[Figure: surface plot of the unregularized solution on an $80 \times 80$ grid; amplitude of order $10^9$.]
Regularized Solution, $k_0 = 8$

[Figure: surface plot of the regularized solution on an $80 \times 80$ grid.]
Error

[Figure: surface plot of the error on an $80 \times 80$ grid; amplitude of order 1.]
Conclusions

One order of magnitude faster than the straightforward approach.

Open question: what is the relation between the HO singular values and the singular values of the linear system?

Other applications: tensor regression?