TENSOR APPROXIMATION TOOLS FREE OF THE CURSE OF DIMENSIONALITY

Eugene Tyrtyshnikov, Institute of Numerical Mathematics, Russian Academy of Sciences (joint work with Ivan Oseledets)

WHAT ARE TENSORS? Tensors = d-dimensional arrays: A = [a_{ij...k}], i ≤ I, j ≤ J, ..., k ≤ K. Tensor A has: dimensionality (order) d = number of indices (modes, axes, directions, ways); size n_1 × ... × n_d (number of nodes along each axis).

WHAT IS THE PROBLEM? NUMBER OF TENSOR ELEMENTS = n^d GROWS EXPONENTIALLY IN d. WATER AND UNIVERSE: an H2O molecule has 18 electrons, and each electron has 3 coordinates, so we have 18 × 3 = 54 axes. If we take 32 nodes on each axis, we obtain 32^54 ≈ 10^81 points, which is close to the number of atoms in the universe. CURSE OF DIMENSIONALITY
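
A one-line check (my own, not part of the talk) of the arithmetic behind this example, in Python:

import math
# 32^54 grid points: the base-10 exponent is 54 * log10(32) ~ 81.3, i.e. about 2 * 10^81
print(54 * math.log10(32))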

WE SURVIVE WITH: compact (low-parametric) representations for tensors, and methods for computations in these compact representations.

TUCKER DECOMPOSITION
a(i_1,...,i_d) = Σ_{α_1=1}^{r_1} ... Σ_{α_d=1}^{r_d} g(α_1,...,α_d) q_1(i_1,α_1) ... q_d(i_d,α_d)
L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika, V. 31, P. 279-311 (1966).
COMPONENTS: 2D arrays q_1,...,q_d with dnr entries; a d-dimensional array g(α_1,...,α_d) with r^d entries.
CURSE OF DIMENSIONALITY REMAINS
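
A minimal NumPy sketch (my own, with assumed toy sizes) of reconstructing a third-order tensor from its Tucker components and counting parameters; the r^d core is what keeps the curse of dimensionality alive:

import numpy as np

n, r, d = 20, 3, 3                                   # mode size, Tucker rank, order (toy values)
q = [np.random.randn(n, r) for _ in range(d)]        # factor matrices q_k(i_k, alpha_k)
g = np.random.randn(r, r, r)                         # core tensor g(alpha_1, alpha_2, alpha_3)

# a(i1,i2,i3) = sum over alphas of g(a1,a2,a3) q1(i1,a1) q2(i2,a2) q3(i3,a3)
a = np.einsum('abc,ia,jb,kc->ijk', g, q[0], q[1], q[2])

full_params = n ** d                                 # n^d entries of the full tensor
tucker_params = d * n * r + r ** d                   # dnr factor entries plus r^d core entries
print(a.shape, full_params, tucker_params)           # the r^d term still grows exponentially in d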

CANONICAL DECOMPOSITION (PARAFAC, CANDECOMP)
a(i_1,...,i_d) = Σ_{α=1}^{R} u_1(i_1,α) ... u_d(i_d,α)
The number of defining parameters is dRn.
DRAWBACKS: INSTABILITY (cf. Lim, de Silva). Take linearly independent x_1,...,x_d, y_1,...,y_d and set
a = Σ_{t=1}^{d} z_1^t ⊗ ... ⊗ z_d^t,   where z_k^t = x_k for k ≠ t and z_k^t = y_k for k = t.
Then
a = (1/ε)(x_1 + εy_1) ⊗ ... ⊗ (x_d + εy_d) − (1/ε) x_1 ⊗ ... ⊗ x_d + O(ε),
so this rank-d tensor is a limit of rank-2 tensors and has no best rank-2 approximation.
EVENTUAL LACK OF ROBUST ALGORITHMS
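
A small numerical confirmation (my own sketch, d = 3 and random vectors) of the instability example above: the rank-2 tensors approach the rank-d tensor as ε → 0.

import numpy as np

n, d = 5, 3
rng = np.random.default_rng(0)
x = [rng.standard_normal(n) for _ in range(d)]
y = [rng.standard_normal(n) for _ in range(d)]

def outer(vectors):
    # tensor (outer) product v_1 x v_2 x v_3 of three vectors
    return np.einsum('i,j,k->ijk', *vectors)

# a = sum over t of z_1^t x ... x z_d^t with z_k^t = x_k for k != t and y_k for k = t
a = sum(outer([y[k] if k == t else x[k] for k in range(d)]) for t in range(d))

for eps in (1e-1, 1e-3, 1e-5):
    # rank-2 tensor: (1/eps) * [(x_1 + eps y_1) x ... x (x_d + eps y_d) - x_1 x ... x x_d]
    b = (outer([x[k] + eps * y[k] for k in range(d)]) - outer(x)) / eps
    print(eps, np.linalg.norm(a - b))    # the error decreases like O(eps)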

a(i_1,...,i_d) = Σ_{α_1=1}^{r_1} ... Σ_{α_d=1}^{r_d} g(α_1,...,α_d) q_1(i_1,α_1) ... q_d(i_d,α_d)   (TUCKER DECOMPOSITION)

a(i_1,...,i_d) = Σ_{α=1}^{R} u_1(i_1,α) ... u_d(i_d,α)   (CANONICAL DECOMPOSITION: PARAFAC, CANDECOMP)

a(i_1,...,i_d) = Σ_{α_1,...,α_{d-1}} g_1(i_1,α_1) g_2(α_1,i_2,α_2) ... g_{d-1}(α_{d-2},i_{d-1},α_{d-1}) g_d(α_{d-1},i_d)   (TENSOR-TRAIN DECOMPOSITION)

TENSORS AND MATRICES. Let A = [a_{ijklm}]. Take a pair of mutually complementary long indices: (ij) and (klm), or (kl) and (ijm), ... Tensor A gives rise to unfolding matrices B_1 = [b_{(ij),(klm)}], B_2 = [b_{(kl),(ijm)}], ... By definition, b_{(ij),(klm)} = b_{(kl),(ijm)} = ... = a_{ijklm}.
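
A small NumPy sketch (toy sizes of my choosing): grouping indices into two long indices is just a reshape, possibly after a transpose.

import numpy as np

ni, nj, nk, nl, nm = 2, 3, 4, 5, 6
A = np.random.randn(ni, nj, nk, nl, nm)                       # A = [a_{ijklm}]

# B1 = [b_{(ij),(klm)}]: rows indexed by (i,j), columns by (k,l,m)
B1 = A.reshape(ni * nj, nk * nl * nm)

# B2 = [b_{(kl),(ijm)}]: bring axes k,l to the front, then reshape
B2 = np.transpose(A, (2, 3, 0, 1, 4)).reshape(nk * nl, ni * nj * nm)

print(B1.shape, B2.shape)                                     # both hold exactly the entries a_{ijklm}
print(np.linalg.matrix_rank(B1))                              # ranks of unfoldings bound the TT ranks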

DIMENSIONALITY CAN BE DECREASED
a(i_1,...,i_d) = a(i_1,...,i_k; i_{k+1},...,i_d) = Σ_{s=1}^{r} u(i_1,...,i_k; s) v(i_{k+1},...,i_d; s)
Dimension d reduces to dimensions k + 1 and d − k + 1. Proceed by recursion; a binary tree arises.

TUCKER VIA RECURSION
[Figure: recursive binary splitting of the index set {1,2,3,4,5}, introducing one auxiliary index α_k per spatial index]
a(i_1,i_2,i_3,i_4,i_5) = Σ_{α_1,α_2,α_3,α_4,α_5} g(α_1,α_2,α_3,α_4,α_5) q_1(i_1,α_1) q_2(i_2,α_2) q_3(i_3,α_3) q_4(i_4,α_4) q_5(i_5,α_5)

BINARY TREE IMPLIES: any auxiliary index belongs to exactly two leaf tensors; the tensor is the sum over all auxiliary indices of the product of elements of the leaf tensors.
HOW TO AVOID r^d PARAMETERS: let any leaf tensor have at most one spatial index, and let any leaf tensor have at most two (three) auxiliary indices.

TREE WITHOUT TUCKER
[Figure: a linear (degenerate) binary tree over the indices 1,...,5, with auxiliary indices α_1,...,α_4 on the edges]
TENSOR-TRAIN DECOMPOSITION
a(i_1,i_2,i_3,i_4,i_5) = Σ_{α_1,α_2,α_3,α_4} g_1(i_1,α_1) g_2(α_1,i_3,α_3) g_3(α_3,i_5,α_4) g_4(α_4,i_4,α_2) g_5(α_2,i_2)

HOW MANY PARAMETERS?
NUMBER OF TT PARAMETERS = 2nr + (d − 2)nr^2
EXTENDED TT DECOMPOSITION
[Figure: a tree in which every spatial index gets its own leaf factor and the interior carriages carry only auxiliary indices]
NUMBER OF EXTENDED TT PARAMETERS = dnr + (d − 2)r^3

THE TREE IS NOT NEEDED! EVERYTHING IS DEFINED BY A PERMUTATION OF THE SPATIAL INDICES.
TENSOR-TRAIN DECOMPOSITION
a(i_1,i_2,i_3,i_4,i_5) = Σ_{β_1,β_2,β_3,β_4} g_1(i_σ(1),β_1) g_2(β_1,i_σ(2),β_2) g_3(β_2,i_σ(3),β_3) g_4(β_3,i_σ(4),β_4) g_5(β_4,i_σ(5))
TT = Tree Tucker, yet neither Tree nor Tucker: TENSOR TRAIN

MINIMAL TT DECOMPOSITION
Let 1 ≤ β_k ≤ r_k. What are the minimal values of the compression ranks r_k?
r_k ≥ rank A_k^σ,   A_k^σ = [ a^σ(i_σ(1),...,i_σ(k); i_σ(k+1),...,i_σ(d)) ],
where a^σ(i_σ(1),...,i_σ(k); i_σ(k+1),...,i_σ(d)) = a(i_1,...,i_d).

GENERAL PROPERTIES
THEOREM 1. Assume that a tensor a(i_1,...,i_d) possesses a canonical decomposition with R terms. Then a(i_1,...,i_d) admits a TT decomposition of rank R or less.
THEOREM 2. Assume that for any small ε > 0 some ε-perturbation of the tensor a(i_1,...,i_d) possesses a canonical decomposition with R terms. Then a(i_1,...,i_d) admits a TT decomposition of rank R or less.

FROM CANONICAL TO TENSOR TRAIN
a(i_1,...,i_d) = Σ_{s=1}^{R} u_1(i_1,s) ... u_d(i_d,s)
= Σ_{α_1,...,α_{d-1}} u_1(i_1,α_1) δ(α_1,α_2) u_2(i_2,α_2) ... δ(α_{d-2},α_{d-1}) u_{d-1}(i_{d-1},α_{d-1}) u_d(i_d,α_{d-1})
FREE!
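
A short sketch (my own NumPy code, toy sizes) of this "free" conversion: the interior TT cores are the canonical factors with a Kronecker delta on the auxiliary indices.

import numpy as np

def cp_to_tt(factors):
    # factors[k] has shape (n_k, R); returns TT cores of shapes (1,n_1,R), (R,n_k,R), ..., (R,n_d,1)
    d, R = len(factors), factors[0].shape[1]
    cores = []
    for k, U in enumerate(factors):
        n = U.shape[0]
        if k == 0:
            cores.append(U.reshape(1, n, R))                  # g_1(i_1, alpha_1)
        elif k == d - 1:
            cores.append(U.T.reshape(R, n, 1))                # g_d(alpha_{d-1}, i_d)
        else:
            core = np.zeros((R, n, R))
            core[np.arange(R), :, np.arange(R)] = U.T         # delta(alpha_{k-1}, alpha_k) u_k(i_k, alpha_k)
            cores.append(core)
    return cores

factors = [np.random.randn(4, 3) for _ in range(4)]           # a random canonical tensor, d = 4, R = 3
a_cp = np.einsum('ia,ja,ka,la->ijkl', *factors)
a_tt = np.einsum('aib,bjc,ckd,dle->ijkl', *cp_to_tt(factors))
print(np.linalg.norm(a_cp - a_tt))                            # ~1e-15: the conversion is exact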

EFFECTIVE RANK OF A TENSOR
ERank(a) = lim sup_{ε→+0} min { rank(b) : ||b − a|| ≤ ε, b ∈ C(n_1,...,n_d) }
F(n_1,...,n_d): all tensors of size n_1 × ... × n_d with entries from F. Let a ∈ F(n_1,...,n_d) ⊂ C(n_1,...,n_d). Then the canonical rank over F depends on F, while the effective rank does not. The notion is close to the border-rank concept (Bini, Capovani), which still depends on F.
THEOREM 2 (reformulated). Let a ∈ F(n_1,...,n_d). Then for this tensor there exists a TT decomposition of rank r ≤ ERank(a) with the entries of all tensors (cores) belonging to F.

EXAMPLE 1. A d-dimensional tensor in the matrix form
A = Λ ⊗ I ⊗ ... ⊗ I + I ⊗ Λ ⊗ ... ⊗ I + ... + I ⊗ ... ⊗ I ⊗ Λ
P(h) ≡ ⊗_{s=1}^{d} (I + hΛ) = I + hA + O(h^2),   so   A = (P(h) − P(0)) / h + O(h)
ERank(A) = 2

EXAMPLE 2. A real-valued tensor F is generated by the function f(x_1,...,x_d) = sin(x_1 + ... + x_d) on some 1D grids for x_1,...,x_d. Beylkin et al.: the canonical rank of F over R does not exceed d (and is likely to be exactly d). However, sin x = (exp(ix) − exp(−ix)) / (2i), so ERank(F) = 2.

EXAMPLE 3. A d-dimensional tensor A arises from the discretization of the operator
A = Σ_{1 ≤ i ≤ j ≤ d} a_{ij} ∂^2/(∂x_i ∂x_j)
on a tensor grid for the variables x_1,...,x_d. The canonical rank is about d^2/2. However, ERank(A) ≤ (3/2)d + 1 (N. Zamarashkin, I. Oseledets, E. Tyrtyshnikov).

TENSOR TRAIN DECOMPOSITION
a(i_1,...,i_d) = Σ_{α_0,...,α_d} g_1(α_0,i_1,α_1) g_2(α_1,i_2,α_2) ... g_d(α_{d-1},i_d,α_d)
MATRIX FORM
a(i_1,...,i_d) = G_1^{i_1} G_2^{i_2} ... G_d^{i_d}
MINIMAL TT COMPRESSION RANKS:
r_k = rank A_k,   A_k = [a_{(i_1...i_k),(i_{k+1}...i_d)}],   0 ≤ k ≤ d,   size(G_k^{i_k}) = r_{k-1} × r_k
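
A tiny sketch (my own, random toy cores) of the matrix form: one entry of the tensor is a product of d small matrices, so it never requires the n^d entries.

import numpy as np

d, n, r = 6, 4, 3
# cores g_k(alpha_{k-1}, i_k, alpha_k) with boundary ranks r_0 = r_d = 1
cores = [np.random.randn(1 if k == 0 else r, n, 1 if k == d - 1 else r) for k in range(d)]

def tt_entry(cores, idx):
    # evaluate a(i_1,...,i_d) = G_1^{i_1} ... G_d^{i_d} by multiplying the core slices
    res = np.eye(1)
    for core, i in zip(cores, idx):
        res = res @ core[:, i, :]            # G_k^{i_k} is an r_{k-1} x r_k matrix
    return res[0, 0]

print(tt_entry(cores, (0, 1, 2, 3, 0, 1)))   # cost O(d r^2) per entry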

THE KEY TO EVERYTHING: THE PROBLEM OF RECOMPRESSION. Given a tensor train with large ranks, find in its ε-vicinity a tensor train with smaller compression ranks.
METHOD OF TT RECOMPRESSION (I. V. Oseledets): THE NUMBER OF OPERATIONS IS LINEAR IN THE DIMENSIONALITY d AND THE MODE SIZE n, AND THE RESULT HAS GUARANTEED APPROXIMATION ACCURACY.

METHOD OF TENSOR TRAIN RECOMPRESSION
The minimal TT compression ranks are the ranks of the unfolding matrices A_k. The matrices A_k are of size n^k × n^{d−k} but never appear as full arrays of n^d elements. Nevertheless, the SVDs of the A_k are constructed, with the orthogonal (unitary) factors kept in a compact factorized form. When the smallest singular values are neglected, the accuracy is GUARANTEED. To show the idea, consider a TT decomposition
a(i_1,i_2,i_3) = Σ_{α_1,α_2} g_1(i_1,α_1) g_2(α_1,i_2,α_2) g_3(α_2,i_3)

TENSOR TRAIN RECOMPRESSION: RIGHT TO LEFT by QR
a(i_1,i_2,i_3) = Σ_{α_1,α_2} g_1(i_1,α_1) g_2(α_1,i_2,α_2) g_3(α_2; i_3)
= Σ_{α_1,α_2'} g_1(i_1,α_1) ĝ_2(α_1,i_2; α_2') q_3(α_2'; i_3)
= Σ_{α_1',α_2'} ĝ_1(i_1; α_1') q_2(α_1'; i_2,α_2') q_3(α_2'; i_3)
The matrices q_2(α_1'; i_2,α_2') and q_3(α_2'; i_3) obtain orthonormal rows.
QR:  g_3(α_2; i_3) = Σ_{α_2'} r_3(α_2; α_2') q_3(α_2'; i_3),   ĝ_2(α_1,i_2; α_2') = Σ_{α_2} g_2(α_1,i_2; α_2) r_3(α_2, α_2')
QR:  ĝ_2(α_1; i_2,α_2') = Σ_{α_1'} r_2(α_1; α_1') q_2(α_1'; i_2,α_2'),   ĝ_1(i_1; α_1') = Σ_{α_1} g_1(i_1; α_1) r_2(α_1; α_1')

TENSOR TRAIN RECOMPRESSION: LEFT TO RIGHT by SVD
a(i_1,i_2,i_3) = Σ_{α_1,α_2} ĝ_1(i_1; α_1) q_2(α_1; i_2,α_2) q_3(α_2; i_3)
= Σ_{α_1,α_2} z_1(i_1; α_1) ĝ_2(α_1; i_2,α_2) q_3(α_2; i_3)
= Σ_{α_1,α_2} z_1(i_1; α_1) z_2(α_1; i_2,α_2) ĝ_3(α_2; i_3)
The matrices z_1(i_1; α_1) and z_2(α_1,i_2; α_2) obtain orthonormal columns.

LEMMA ON ORTHONORMALITY
Let k ≤ l and let the matrices q_k(α_{k-1}; i_k,α_k), ..., q_l(α_{l-1}; i_l,α_l) have orthonormal rows. Then the matrix
Q_k(α_{k-1}; i) ≡ Q_k(α_{k-1}; i_k,...,i_l,α_l) = Σ_{α_k,...,α_{l-1}} q_k(α_{k-1}; i_k,α_k) ... q_l(α_{l-1}; i_l,α_l)
has orthonormal rows as well.
PROOF BY INDUCTION. Since Q_k(α_{k-1}; i_k, i) = Σ_{α_k} q_k(α_{k-1}; i_k,α_k) Q_{k+1}(α_k; i), we have
Σ_{i_k,i} Q_k(α; i_k,i) Q_k(β; i_k,i) = Σ_{i_k,i} Σ_{µ,ν} q_k(α; i_k,µ) Q_{k+1}(µ; i) q_k(β; i_k,ν) Q_{k+1}(ν; i)
= Σ_{i_k} Σ_{µ,ν} q_k(α; i_k,µ) q_k(β; i_k,ν) δ(µ,ν) = Σ_{i_k,α_k} q_k(α; i_k,α_k) q_k(β; i_k,α_k) = δ(α,β).

TENSOR TRAIN RECOMPRESSION
a(i_1,i_2,i_3) = Σ_{α_1,α_2} ĝ_1(i_1,α_1) q_2(α_1,i_2,α_2) q_3(α_2,i_3)
= Σ_{α_1,α_2} z_1(i_1,α_1) ĝ_2(α_1,i_2,α_2) q_3(α_2,i_3)
= Σ_{α_1,α_2} z_1(i_1,α_1) z_2(α_1,i_2,α_2) ĝ_3(α_2,i_3)
rank A_1 = rank [ĝ_1(α_0,i_1; α_1)],   rank A_2 = rank [ĝ_2(α_1,i_2; α_2)],   rank A_3 = rank [ĝ_3(α_2,i_3; α_3)]
The complexity of computing the compression ranks is linear in d; truncation is performed in the SVDs of small matrices.
NUMBER OF OPERATIONS = O(dnr^3)
GUARANTEED ACCURACY ≤ √(d−1) · ε (in the Frobenius norm)
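
The following NumPy sketch is my own compact implementation of the recompression (rounding) procedure just described, under the assumption that cores are stored as arrays of shape (r_{k-1}, n_k, r_k): a right-to-left QR orthogonalization followed by a left-to-right truncated SVD.

import numpy as np

def tt_round(cores, eps):
    d = len(cores)
    cores = [c.copy() for c in cores]
    # right-to-left sweep: make cores 2..d row-orthonormal, push triangular factors to the left
    for k in range(d - 1, 0, -1):
        r0, n, r1 = cores[k].shape
        q, r = np.linalg.qr(cores[k].reshape(r0, n * r1).T)           # unfold and orthonormalize rows
        cores[k] = q.T.reshape(-1, n, r1)
        cores[k - 1] = np.einsum('aib,bc->aic', cores[k - 1], r.T)    # absorb r into the previous core
    # left-to-right sweep: truncated SVD of each core with per-step tolerance delta
    delta = eps / np.sqrt(max(d - 1, 1))
    for k in range(d - 1):
        r0, n, r1 = cores[k].shape
        u, s, vt = np.linalg.svd(cores[k].reshape(r0 * n, r1), full_matrices=False)
        tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]                 # tail[j] = norm of s[j:]
        rank = max(1, int(np.sum(tail > delta)))                      # smallest rank with tail <= delta
        cores[k] = u[:, :rank].reshape(r0, n, rank)
        cores[k + 1] = np.einsum('ab,bic->aic', s[:rank, None] * vt[:rank], cores[k + 1])
    return cores

# usage: a random train with inflated ranks shrinks to its true ranks, the accuracy being kept
d, n = 5, 4
cores = [np.random.randn(1 if k == 0 else 6, n, 1 if k == d - 1 else 6) for k in range(d)]
print([c.shape for c in tt_round(cores, 1e-10)])

Splitting the tolerance as ε/√(d−1) per truncation keeps the total Frobenius-norm error below ε, in line with the guarantee quoted above.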

TT APPROXIMATION FOR THE LAPLACIAN
d     TT recompression time   Canonical rank   Compression rank
10    0.01 sec                10               2
20    0.09 sec                20               2
40    0.78 sec                40               2
80    13 sec                  80               2
160   152 sec                 160              2
200   248 sec                 200              2
The 1D grids are of size 32; the tensor has modes of size n = 1024.

WHAT CAN WE DO WITH TENSOR TRAINS?
a(i_1,...,i_d) = Σ_{α_1,...,α_{d-1}} g_1(i_1,α_1) g_2(α_1,i_2,α_2) ... g_d(α_{d-1},i_d)
RECOMPRESSION: given a tensor train with TT-ranks r, we can approximate it by another tensor train with guaranteed accuracy using O(dnr^3) operations.
QUASI-OPTIMALITY OF RECOMPRESSION: ERROR ≤ √(d−1) × BEST APPROXIMATION ERROR WITH THE SAME TT-RANKS.
EFFICIENT APPROXIMATE MATRIX OPERATIONS

CANONICAL VERSUS TENSOR-TRAIN
Operation                   Canonical             Tensor-Train
Number of parameters        O(dnR)                O(dnr + (d−2)r^3)
Matrix-by-vector            O(dn^2 R^2)           O(dn^2 r^2 + dr^6)
Addition                    O(dnR)                O(dnr)
Recompression               O(dnR^2 + d^3 R^3)    O(dnr^2 + dr^4)
Tensor-vector contraction   O(dnR)                O(dnr + dr^3)

TENSOR-VECTOR CONTRACTION
γ = Σ_{i_1,...,i_d} a(i_1,...,i_d) x_1(i_1) ... x_d(i_d)
ALGORITHM: compute the matrices Z_k = Σ_{i_k} g_k(α_{k-1}, i_k, α_k) x_k(i_k), then multiply: γ = Z_1 Z_2 ... Z_d.
NUMBER OF OPERATIONS = O(dnr^2)
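
A short sketch (my own, random toy data) of this contraction; the cost is one small matrix per carriage.

import numpy as np

d, n, r = 8, 10, 4
cores = [np.random.randn(1 if k == 0 else r, n, 1 if k == d - 1 else r) for k in range(d)]
xs = [np.random.randn(n) for _ in range(d)]

def tt_contract(cores, xs):
    res = np.eye(1)
    for core, x in zip(cores, xs):
        Z = np.einsum('aib,i->ab', core, x)     # Z_k = sum_i g_k(:, i, :) x_k(i), an r x r matrix
        res = res @ Z
    return res[0, 0]

print(tt_contract(cores, xs))                   # O(d n r^2) operations in total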

RECOVER A d-DIMENSIONAL TENSOR FROM A SMALL PORTION OF ITS ELEMENTS
We are given a procedure that computes any prescribed element a(i_1,...,i_d). We need to choose the right elements and use them to construct a TT approximation of this tensor. A TT decomposition with maximal compression rank r can be constructed from some O(dnr^2) elements.

HOW THIS PROBLEM IS SOLVED FOR MATRICES
Let A be close to a matrix of rank r: σ_{r+1}(A) ≤ ε. Then there exists a cross of r columns C and r rows R such that
|(A − C G^{-1} R)_{ij}| ≤ (r + 1) ε,
where G is the r × r matrix on the intersection of C and R. Take G of maximal volume among all r × r submatrices of A.
S. A. Goreinov, E. E. Tyrtyshnikov: The maximal-volume concept in approximation by low-rank matrices, Contemporary Mathematics, Vol. 208 (2001), 47-51.
S. A. Goreinov, E. E. Tyrtyshnikov, N. L. Zamarashkin: A theory of pseudo-skeleton approximations, Linear Algebra Appl. 261: 1-21 (1997). Doklady RAS (1995).

GOOD INSTEAD OF BEST: PSEUDO-MAX-VOLUME
Given A of size n × r, find a row permutation that moves a good submatrix into the upper r × r block. Since a right-side multiplication scales the volumes of all r × r submatrices by the same factor, assume that the upper block is the identity:
A = [ I ; Ã ],   Ã = [a_{ij}],   r+1 ≤ i ≤ n,   1 ≤ j ≤ r.
NECESSARY FOR MAX-VOL: |a_{ij}| ≤ 1 for r+1 ≤ i ≤ n, 1 ≤ j ≤ r. Let this define a good submatrix. Then here is an algorithm: if some |a_{ij}| ≥ 1 + δ, swap rows i and j; restore I in the first r rows by a right-side multiplication; check the new a_{ij}; quit if all are less than 1 + δ, otherwise repeat.
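
A compact sketch (my own NumPy code; the initialization is one simple choice among many) of this pseudo-max-volume search for a dominant r × r submatrix of a tall matrix.

import numpy as np

def maxvol(A, tol=1.05, max_iters=100):
    # returns row indices of a dominant r x r submatrix of the n x r matrix A
    n, r = A.shape
    # initial guess: greedy row-pivoted Gram-Schmidt (any nonsingular start would do)
    Q, rows = A.copy(), []
    for _ in range(r):
        i = int(np.argmax(np.sum(Q ** 2, axis=1)))
        rows.append(i)
        q = Q[i] / np.linalg.norm(Q[i])
        Q = Q - np.outer(Q @ q, q)
    # swap rows until no entry of A * inv(A[rows]) exceeds 1 + delta in modulus
    for _ in range(max_iters):
        B = A @ np.linalg.inv(A[rows])
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:
            break
        rows[j] = i                       # each swap multiplies the volume by |B[i, j]| > 1
    return rows

A = np.random.randn(200, 5)
rows = maxvol(A)
print(sorted(rows), np.max(np.abs(A @ np.linalg.inv(A[rows]))))   # all coefficients below ~1.05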

MATRIX CROSS ALGORITHM
Assume we are given some initial column indices j_1,...,j_r. Find maximal-volume row indices i_1,...,i_r in these columns. Find maximal-volume column indices in the rows i_1,...,i_r. Proceed choosing columns and rows until the skeleton cross approximations stabilize.
E. E. Tyrtyshnikov, Incomplete cross approximation in the mosaic-skeleton method, Computing 64, no. 4 (2000), 367-380.
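
A sketch (my own code, reusing the maxvol helper above; the entry function f and the test matrix are illustrative choices) of the alternating cross algorithm: only a few columns and rows of the matrix are ever evaluated.

import numpy as np

def matrix_cross(f, n, m, r, sweeps=4):
    # approximate the n x m matrix A, A[i, j] = f(i, j), by a skeleton C G^{-1} R
    cols = list(np.random.choice(m, r, replace=False))               # initial column indices
    for _ in range(sweeps):
        C = np.array([[f(i, j) for j in cols] for i in range(n)])    # the chosen r columns
        rows = maxvol(C)                                             # good rows inside them
        R = np.array([[f(i, j) for j in range(m)] for i in rows])    # the chosen r rows
        cols = maxvol(R.T)                                           # good columns inside them
    return C, C[rows, :], R                                          # C, G (intersection block), R

n, m, r = 300, 400, 6
f = lambda i, j: 1.0 / (i + j + 1)                                   # a Hilbert-type test matrix
C, G, R = matrix_cross(f, n, m, r)
A = np.array([[f(i, j) for j in range(m)] for i in range(n)])        # full matrix, only for checking
print(np.linalg.norm(A - C @ np.linalg.solve(G, R)) / np.linalg.norm(A))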

TENSOR-TRAIN CROSS INTERPOLATION
Given a(i_1,i_2,i_3,i_4), consider the unfoldings and r-column sets:
A_1 = [a(i_1; i_2,i_3,i_4)],   J_1 = { i_2^(β_1) i_3^(β_1) i_4^(β_1) }
A_2 = [a(i_1,i_2; i_3,i_4)],   J_2 = { i_3^(β_2) i_4^(β_2) }
A_3 = [a(i_1,i_2,i_3; i_4)],   J_3 = { i_4^(β_3) }
Successively choose good rows:
I_1 = { i_1^(α_1) } in a(i_1; i_2,i_3,i_4):   a = Σ_{α_1} g_1(i_1; α_1) a_2(α_1; i_2,i_3,i_4)
I_2 = { i_1^(α_2) i_2^(α_2) } in a_2(α_1,i_2; i_3,i_4):   a_2 = Σ_{α_2} g_2(α_1,i_2; α_2) a_3(α_2,i_3; i_4)
I_3 = { i_1^(α_3) i_2^(α_3) i_3^(α_3) } in a_3(α_2,i_3; i_4):   a_3 = Σ_{α_3} g_3(α_2,i_3; α_3) g_4(α_3; i_4)
Finally
a = Σ_{α_1,α_2,α_3} g_1(i_1,α_1) g_2(α_1,i_2,α_2) g_3(α_2,i_3,α_3) g_4(α_3,i_4)

TT-CROSS INTERPOLATION OF A TENSOR
A tensor A of size n_1 × n_2 × ... × n_d with compression ranks r_k = rank A_k, A_k = A(i_1 i_2 ... i_k; i_{k+1} ... i_d), is recovered from the elements of the TT-cross
C_k(α_{k-1}, i_k, β_k) = A(i_1^(α_{k-1}), i_2^(α_{k-1}), ..., i_{k-1}^(α_{k-1}), i_k, j_{k+1}^(β_k), ..., j_d^(β_k)).
The TT-cross is defined by the index sets
I_k = { i_1^(α_k) ... i_k^(α_k) },  1 ≤ α_k ≤ r_k,   J_k = { j_{k+1}^(β_k) ... j_d^(β_k) },  1 ≤ β_k ≤ r_k,
with a nested property for the α-sets. Require nonsingularity of the r_k × r_k matrices
Â_k(α_k, β_k) = A(i_1^(α_k), i_2^(α_k), ..., i_k^(α_k); j_{k+1}^(β_k), ..., j_d^(β_k)),   α_k, β_k = 1,...,r_k.

FORMULA FOR TT-INTERPOLATION
A(i_1,i_2,...,i_d) = Σ_{α_1,...,α_{d-1}} Ĉ_1(α_0,i_1,α_1) Ĉ_2(α_1,i_2,α_2) ... Ĉ_d(α_{d-1},i_d,α_d)
Ĉ_k(α_{k-1},i_k,α_k) = Σ_{α_k'} C_k(α_{k-1},i_k,α_k') Â_k^{-1}(α_k',α_k),   k = 1,...,d,   Â_d = I

TENSOR-TRAIN CROSS ALGORITHM
Assume we are given r_k initial column indices j_{k+1}^(β_k),...,j_d^(β_k) in the unfolding matrices A_k. Find r_k maximal-volume rows in the submatrices of A_k of the form a(i_1^(α_{k-1}),...,i_{k-1}^(α_{k-1}), i_k; j_{k+1}^(β_k),...,j_d^(β_k)). Use the row indices obtained and do the same from right to left to find new column indices. Proceed with these sweeps from left to right and from right to left. Stop when the tensor trains stabilize.

EXAMPLE OF TT-CROSS APPROXIMATION: HILBERT TENSOR
a(i_1,i_2,...,i_d) = 1 / (i_1 + i_2 + ... + i_d),   d = 60, n = 32
r_max   Time    Iterations   Relative accuracy
2       1.37    5            1.897278e+00
3       4.22    7            5.949094e-02
4       7.19    7            2.226874e-02
5       15.42   9            2.706828e-03
6       21.82   9            1.782433e-04
7       29.62   9            2.151107e-05
8       38.12   9            4.650634e-06
9       48.97   9            5.233465e-07
10      59.14   9            6.552869e-08
11      72.14   9            7.915633e-09
12      75.27   8            2.814507e-09

COMPUTATION OF d-DIMENSIONAL INTEGRALS: EXAMPLE 1
I(d) = ∫_{[0,1]^d} sin(x_1 + x_2 + ... + x_d) dx_1 dx_2 ... dx_d = Im ∫_{[0,1]^d} e^{i(x_1+x_2+...+x_d)} dx_1 dx_2 ... dx_d = Im( ((e^i − 1)/i)^d )
Use the Chebyshev (Clenshaw-Curtis) quadrature with n = 11 nodes. All n^d values are NEVER COMPUTED! Instead, we find a TT-cross and construct a TT approximation of this tensor.
d      I                 Relative accuracy   Time
10     -6.299353e-01     1.409952e-15        0.14
100    -3.926795e-03     2.915654e-13        0.77
500    -7.287664e-10     2.370536e-12        4.64
1000   -2.637513e-19     3.482065e-11        11.60
2000   2.628834e-37      8.905594e-12        33.05
4000   9.400335e-74      2.284085e-10        105.49
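
A quick check (my own, feasible only for small d since it builds the full grid) of the closed-form reference value Im(((e^i − 1)/i)^d) against a brute-force tensor-product Gauss-Legendre quadrature.

import numpy as np

def reference(d):
    return (((np.exp(1j) - 1) / 1j) ** d).imag

def brute_force(d, n=11):
    # full tensor-product Gauss-Legendre quadrature on [0,1]^d (n^d points, small d only)
    x, w = np.polynomial.legendre.leggauss(n)
    x, w = (x + 1) / 2, w / 2                          # map nodes and weights from [-1,1] to [0,1]
    grids = np.meshgrid(*([x] * d), indexing='ij')
    vals = np.sin(sum(grids))
    weight = np.ones_like(vals)
    for axis in range(d):
        shape = [1] * d
        shape[axis] = n
        weight = weight * w.reshape(shape)
    return float(np.sum(vals * weight))

for d in (2, 3, 4):
    print(d, reference(d), brute_force(d))             # the two values agree to machine precision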

COMPUTATION OF d-DIMENSIONAL INTEGRALS: EXAMPLE 2
I(d) = ∫_{[0,1]^d} √(x_1^2 + x_2^2 + ... + x_d^2) dx_1 dx_2 ... dx_d,   d = 100
The Chebyshev quadrature with n = 41 nodes plus a TT-cross of size r_max = 32 gives a reference solution. For comparison, take n = 11 nodes:
r_max   Relative accuracy   Time
2       1.747414e-01        1.76
4       2.823821e-03        11.52
8       4.178328e-05        42.76
10      3.875489e-07        66.28
12      2.560370e-07        94.39
14      4.922604e-08        127.60
16      9.789895e-10        167.02
18      1.166096e-10        211.09
20      2.706435e-11        260.13

INCREASE DIMENSIONALITY (TENSORS INSTEAD OF MATRICES)
A matrix is a 2-way array. A d-level matrix is naturally viewed as a 2d-way array:
A(i, j) = A(i_1,i_2,...,i_d; j_1,j_2,...,j_d),   i ≡ (i_1...i_d),   j ≡ (j_1...j_d).
It is important to consider the related reshaped array
B(i_1 j_1, ..., i_d j_d) = A(i_1,i_2,...,i_d; j_1,j_2,...,j_d).
Matrix A is represented by tensor B.
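
A small sketch (my own, toy sizes) of this index interleaving in NumPy: the row and column indices are split into level digits and each pair (i_k, j_k) is merged into one long index.

import numpy as np

d, m = 3, 2                                          # d levels, each level of size m (A is m^d x m^d)
A = np.random.randn(m ** d, m ** d)

T = A.reshape([m] * d + [m] * d)                     # axes: i_1,...,i_d, j_1,...,j_d
perm = [ax for k in range(d) for ax in (k, d + k)]   # interleave to i_1, j_1, i_2, j_2, ...
B = np.transpose(T, perm).reshape([m * m] * d)       # B(i_1 j_1, ..., i_d j_d)

print(A.shape, B.shape)                              # (8, 8) -> (4, 4, 4): matrix A as a d-way tensor B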

MINIMAL TENSOR TRAINS
a(i_1...i_d; j_1...j_d) = Σ_{1 ≤ α_k ≤ r_k} g_1(i_1 j_1, α_1) g_2(α_1, i_2 j_2, α_2) ... g_{d-1}(α_{d-2}, i_{d-1} j_{d-1}, α_{d-1}) g_d(α_{d-1}, i_d j_d)
The minimal possible values of the compression ranks r_k equal the ranks of specific unfolding matrices:
r_k = rank A_k,   A_k = [A(i_1 j_1, ..., i_k j_k; i_{k+1} j_{k+1}, ..., i_d j_d)]
If all r_k = 1, then A = G_1 ⊗ ... ⊗ G_d. In general,
A = Σ_{α_1,α_2,α_3,...} G_{1,α_1} ⊗ G_{2,α_1 α_2} ⊗ G_{3,α_2 α_3} ⊗ ...

NO CURSE OF DIMENSIONALITY
Let 1 ≤ i_k, j_k ≤ n and r_k = r. Then the number of representation parameters is dn^2 r^2. The dependence on d is linear!
SO LET US MAKE d AS LARGE AS POSSIBLE BY ADDING FICTITIOUS AXES
Assume we had d_0 levels. If n = 2^{d_1}, then set d = d_0 d_1. Then
memory = 4dr^2,   d = log_2(size(A)):
LOGARITHMIC IN THE SIZE OF THE MATRIX

CAUCHY-TOEPLITZ EXAMPLE
A = [ 1 / (i − j + 1/2) ]
Relative accuracy   Compression ranks for A and A^{-1}
1.e-5               3 7 8 8 8 7 7 7 3
1.e-7               3 7 9 10 10 9 9 7 3
1.e-9               3 7 11 11 11 11 11 7 3
1.e-11              3 7 12 13 13 13 12 7 3
1.e-13              3 7 14 14 15 14 14 7 3
n = 1024, d_0 = 1, d_1 = 10

INVERSES TO BANDED TOEPLITZ MATRICES
Let A be a banded Toeplitz matrix: A = [a(i − j)] with a_k = 0 for |k| > s, where s is the half-bandwidth.
THEOREM. Let size(A) = 2^d × 2^d and det A ≠ 0. Then
r_k(A^{-1}) ≤ 4s^2 + 1,   k = 1,...,d−1,
and the estimate is sharp.
COROLLARY. The inverse of a banded Toeplitz matrix A of size 2^d × 2^d with half-bandwidth s has a TT representation with O(s^4 log_2 n) parameters.
Using a Newton iteration with approximations, we obtain an inversion algorithm of complexity O(log_2 n).

AVERAGE COMPRESSION RANK
The average compression rank r is defined by memory = 4dr^2, i.e. r = √(memory / (4d)).
INVERSION OF A d_0-DIMENSIONAL LAPLACIAN BY A MODIFIED NEWTON METHOD, d_1 = 10
Physical dimensionality d_0                          1      3      5      10     30     50
Average compression rank of A                        2.8    3.5    3.6    3.7    3.8    3.8
Average compression rank of the approx. to A^{-1}    7.3    18.6   19.2   17.4   16.1   16.5
Time (sec)                                           2.     10.    17.    23.    27.    33.
||AX − I|| / ||I||                                   1.e-2  6.e-3  2.e-3  5.e-5  4.e-5  4.e-5
The last matrix size is 2^100.

INVERSION OF A 10-DIMENSIONAL LAPLACIAN VIA THE INTEGRAL REPRESENTATION BY THE STENGER FORMULA
∫_0^∞ exp(−At) dt ≈ (h/τ) Σ_{k=−M}^{M} w_k exp(−(t_k/τ) A),   h = π/√M,   w_k = t_k = exp(hk),   λ_min(A/τ) ≥ 1
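
A quick numerical check (my own, on a small dense 1D Laplacian; the quadrature constants follow the formula above) that this exponential-sum quadrature indeed approximates A^{-1} = ∫_0^∞ exp(−At) dt without any linear solves.

import numpy as np

n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)     # 1D Laplacian, symmetric positive definite
lam, Q = np.linalg.eigh(A)                               # eigendecomposition, used for expm below
tau = lam[0]                                             # scaling so that lambda_min(A / tau) = 1

def expmA(c):
    # exp(c * A) for symmetric A via its eigendecomposition
    return (Q * np.exp(c * lam)) @ Q.T

M = 40
h = np.pi / np.sqrt(M)
approx = np.zeros_like(A)
for k in range(-M, M + 1):
    t_k = np.exp(h * k)
    approx += t_k * expmA(-t_k / tau)                    # weights w_k = t_k
approx *= h / tau

print(np.linalg.norm(approx @ A - np.eye(n)))            # small, and decreasing as M grows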

CONCLUSIONS AND PERSPECTIVES
Tensor-train decompositions and the corresponding algorithms (see http://pub.inm.ras.ru) provide excellent approximation tools for vectors and matrices. A TT-toolbox for Matlab is available: http://spring.inm.ras.ru/osel. The memory needed depends on the matrix size logarithmically; this is a terrific advantage when the compression ranks are small, which is exactly the case in many applications. Approximate inverses can be computed in the tensor-train format, generally with complexity logarithmic in the size of the matrix. Applications include huge-scale matrices (with size up to 2^100) as well as typical large-scale and even modest-scale matrices (like images). The key to efficient tensor-train operations is the recompression algorithm, with complexity O(dnr^6), and the reliability of the SVD. A modified Newton method with truncations and integral representations of matrix functions are viable in the tensor-train format.

GOOD PERSPECTIVES
Multi-variate interpolation (construction of tensor trains from a small portion of all elements; tensor cross methods using the maximal-volume concept). Fast computation of integrals in d dimensions (no Monte Carlo). Approximate matrix operations (e.g. inversion) with complexity O(log_2 n): linear in d means linear in log_2 n. A new direction in data compression and image processing (movies). Statistical interpretation of tensor trains. Applications to quantum chemistry, multi-parametric optimization, stochastic PDEs, data mining, etc.

MORE DETAILS and WORK IN PROGRESS
I. V. Oseledets and E. E. Tyrtyshnikov, Breaking the curse of dimensionality, or how to use SVD in many dimensions, Research Report 09-03, Hong Kong: ICM HKBU, 2009 (www.math.hkbu.edu.hk/icm/pdf/09-03.pdf); SIAM J. Sci. Comput., 2009.
I. Oseledets, Compact matrix form of the d-dimensional tensor decomposition, SIAM J. Sci. Comput., 2009.
I. V. Oseledets, Tensors inside matrices give logarithmic complexity, SIAM J. Matrix Anal. Appl., 2009.
I. V. Oseledets, TT-Cross Approximation for Multidimensional Arrays, Research Report 09-11, Hong Kong: ICM HKBU, 2009 (www.math.hkbu.edu.hk/icm/pdf/09-11.pdf); Linear Algebra Appl., 2009.
I. Oseledets, E. E. Tyrtyshnikov, On a recursive decomposition of multi-dimensional tensors, Doklady RAS, vol. 427, no. 2 (2009).
I. Oseledets, On a new tensor decomposition, Doklady RAS, vol. 427, no. 3 (2009).
I. Oseledets, On approximation of matrices with logarithmic number of parameters, Doklady RAS, vol. 427, no. 4 (2009).
N. Zamarashkin, I. Oseledets, E. Tyrtyshnikov, Tensor structure of the inverse to a banded Toeplitz matrix, Doklady RAS, vol. 427, no. 5 (2009).
In preparation: efficient ranks of tensors and stability of TT approximations; TTM for image processing; TT approximations in electronic structure calculations.