
This work has been submitted to ChesterRep, the University of Chester's online research repository: http://chesterrep.openrepository.com

Author(s): Daniel Tock
Title: Tensor decomposition and its applications
Date: September 2010
Originally published as: University of Chester MSc dissertation
Example citation: Tock, D. (2010). Tensor decomposition and its applications. (Unpublished master's thesis). University of Chester, United Kingdom.
Version of item: Submitted version
Available at: http://hdl.handle.net/10034/123074

Tensor Decomposition and its Applications

Thesis submitted in accordance with the requirements of the University of Chester for the degree of Master of Science by Daniel Tock, September 2010

Acknowledgments

I would like to thank my Programme Leader, Dr Jason Roberts, whose encouragement, guidance and support throughout my thesis helped me to develop a wide understanding of the subject. I would also like to thank my family and friends for their moral support and encouragement throughout the past year.

Abstract

In this thesis I review classical vector and tensor analysis, building up to the techniques required to decompose a tensor into a tensor train and to reconstruct the original tensor from it with minimal error. The tensor train decomposition expresses a tensor of dimensionality d as a train of d third-order tensors, whose sizes depend upon the rank and the chosen error bound. I review the required operations of matricization and of tensor-matrix, tensor-vector and tensor-tensor multiplication needed to compute this decomposition. I then analyse the tensor train decomposition by applying it to different types of tensor, of differing dimensionality, with a variety of accuracy bounds, to investigate their influence on the time taken to complete the decomposition and on the final absolute error. Finally, I explore a method for computing a d-dimensional integral from the tensor train, which allows larger tensors to be integrated since the memory required is dramatically reduced once the tensor is decomposed. I apply this technique to two tensors of different ranks and compare the efficiency and accuracy of integrating directly from the tensor with that of integrating from the tensor train decomposition.

Contents

1 Introduction
2 Review of classical Tensor Analysis
  2.1 Vectors and Scalars
    2.1.1 Vector Algebra
    2.1.2 Laws of Vector Algebra
    2.1.3 Cartesian Coordinate System
    2.1.4 Contravariant and Covariant Vectors
  2.2 Tensor Analysis
    2.2.1 The Einstein Summation Convention
    2.2.2 The Kronecker Delta Symbol
    2.2.3 The Levi-Civita Symbol
    2.2.4 Contravariant, Covariant and Mixed Tensors
    2.2.5 Raising and Lowering Indices
    2.2.6 Matricization
  2.3 Tensor Multiplication
    2.3.1 Tensor-Matrix Multiplication
    2.3.2 Tensor-Vector Multiplication
    2.3.3 Tensor-Tensor Multiplication
  2.4 Multilinear Algebra
    2.4.1 The Algebra of Matrices
    2.4.2 Fundamental Operations with Tensors
  2.5 Tensor Rank and Decomposition
    2.5.1 The CP (CANDECOMP/PARAFAC) decomposition
    2.5.2 The TT-decomposition
    2.5.3 Computing the TT-decomposition
    2.5.4 Numerical Experiments
    2.5.5 Results
    2.5.6 Applications of the TT-decomposition
3 High-dimensional integration using the TT-decomposition
  3.1 Computation of high-dimensional integration
    3.1.1 Example 1
    3.1.2 Example 2
4 Conclusions and Further Work
5 Appendix
  5.1 M-files
    5.1.1 tensor matrix multiplication
    5.1.2 tensor vector multiplication
    5.1.3 truncated SVD
    5.1.4 tensor inner product
    5.1.5 tensor outer product
    5.1.6 sparse tensor generator
    5.1.7 tensor contracted product
    5.1.8 tensor train decomposition
    5.1.9 multiplying out a tensor train
    5.1.10 TT contraction (integration)

1 Introduction

In today's society vector analysis, or more precisely tensor analysis, is one of the most important mathematical techniques used in the physical sciences and in the modelling of real-life situations. The concept of a vector was first introduced by the Egyptians and Babylonians, but the first 2-dimensional and 3-dimensional vectorial systems were not published until 1799, by Wessel [6]; hence the idea of a tensor (or multidimensional array) is a fairly new mathematical concept.

Tensors, or multidimensional arrays, are encountered in many real-life applications, although due to the unmanageable amount of memory required we are unable to perform anything but basic operations upon them, since the number of operations required grows exponentially with the dimensionality d [12]. In order to reduce the required number of operations it is possible to express a tensor as the product of smaller tensors, hence reducing the memory required.

There are many forms of decomposition, the most popular of which is the canonical decomposition or parallel factors model, the CANDECOMP/PARAFAC decomposition [5]. The problem with this model is that it often suffers from degeneracy (diverging components), and only has efficient algorithms for 3-dimensional data. For higher-dimensional arrays the Tucker decomposition [5] is used, which is effectively an extension of the CANDECOMP/PARAFAC decomposition. This involves decomposing a d-dimensional tensor into a tensor of dimensionality d - 1 and d 2-dimensional tensors (matrices). Again the issue of memory arises: when decomposing a tensor of large d we still acquire a tensor of d - 1 dimensions, which still requires a large amount of memory.

In an attempt to remove the issue of exponential memory requirements, I will be looking at the recently suggested tensor train decomposition by E. Tyrtyshnikov and I. Oseledets [11], [12]. It can be computed via standard decompositions such as the singular value decomposition (SVD) and the QR decomposition, and only produces 3-dimensional tensors, which are easily manageable to perform operations on.

The tensor train decomposition algorithm can be considered as an adaptive interpolation algorithm for a multivariate function given on a tensor grid [12], and so we are able to apply it to high-dimensional integration. I will be comparing the accuracy and efficiency of using the original tensor to compute the integration against using the tensor train.

2 Review of classical Tensor Analysis

A tensor is a multidimensional array. More formally, an N-way or Nth-order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system. This notion of tensors is not to be confused with tensors in physics and engineering (such as stress tensors), which are generally referred to as tensor fields in mathematics [9].

2.1 Vectors and Scalars

Definition 2.1. A vector is a quantity having both magnitude and direction [13]. Displacement, velocity, acceleration and force are all examples of vectors. Graphically a vector is represented by a directed line OP, which defines the direction; the magnitude of the vector is the length of the line. The starting point of the line, O, is called the origin, or initial point, of the vector. The head of the line, P, is called the terminal point or terminus. Analytically a vector can be written as $\mathbf{A}$ or $\vec{A}$, with magnitude $|\mathbf{A}|$, $A$ or $|\vec{A}|$. I will use the bold-faced notation throughout this thesis; hence the vector from O to P will be denoted $\mathbf{OP}$, with magnitude $|\mathbf{OP}|$.

Definition 2.2. A scalar is a quantity having only magnitude and no direction [13]. Mass, volume, speed, energy, length, time and temperature are all examples of scalars. Scalars are denoted by letters, and their operations follow the same rules as in elementary algebra.

2.1.1 Vector Algebra

With the following definitions we are able to extend the operations of addition, subtraction and multiplication to vector algebra.

Definition 2.3. Two vectors $\mathbf{A}$ and $\mathbf{B}$ are equal if they have the same magnitude and direction, regardless of the position of their initial points, as seen in Figure 1.

Figure 1: Equal vectors

Definition 2.4. A vector having direction opposite to $\mathbf{A}$ but having the same magnitude is denoted by $-\mathbf{A}$, as seen in Figure 2.

Figure 2: Opposite vectors

Definition 2.5. The sum of two vectors $\mathbf{A}$ and $\mathbf{B}$ is the vector $\mathbf{C}$ formed by placing the terminal point of $\mathbf{A}$ on the initial point of $\mathbf{B}$ and then joining the initial point of $\mathbf{A}$ to the terminal point of $\mathbf{B}$, as seen in Figure 3.

Figure 3: Sum of two vectors

Definition 2.6. The difference of two vectors $\mathbf{A}$ and $\mathbf{B}$, written $\mathbf{A}-\mathbf{B}$, is equivalent to $\mathbf{A}+(-\mathbf{B})$. If $\mathbf{A} = \mathbf{B}$, then $\mathbf{A}-\mathbf{B}$ yields $\mathbf{0}$, the null or zero vector, which has zero magnitude and no specific direction. A vector which is not null is defined as a proper vector. All vectors are assumed to be proper unless otherwise stated [13].

Definition 2.7. The product of a vector $\mathbf{A}$ by a scalar m is a vector $m\mathbf{A}$ with magnitude $|m||\mathbf{A}|$ and with direction the same as, or opposite to, that of $\mathbf{A}$, according to whether m is positive or negative. If m = 0, then $m\mathbf{A}$ is the null vector.

2.1.2 Laws of Vector Algebra

If $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ are vectors and m and n are scalars, then:

1. $\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}$ (addition is commutative),
2. $\mathbf{A} + (\mathbf{B} + \mathbf{C}) = (\mathbf{A} + \mathbf{B}) + \mathbf{C}$ (addition is associative),
3. $m\mathbf{A} = \mathbf{A}m$ (multiplication by a scalar is commutative),
4. $m(n\mathbf{A}) = (mn)\mathbf{A}$ (multiplication by scalars is associative),
5. $(m + n)\mathbf{A} = m\mathbf{A} + n\mathbf{A}$ (distributive law),
6. $m(\mathbf{A} + \mathbf{B}) = m\mathbf{A} + m\mathbf{B}$ (distributive law).

Definition 2.8. A unit vector has unit magnitude, i.e. a magnitude of 1. Let $\mathbf{A}$ be a proper vector; then $\mathbf{A}/|\mathbf{A}|$ is a unit vector with the same direction as $\mathbf{A}$.

2.1.3 Cartesian Coordinate System

Any vector in 3-dimensional space can be described in terms of the i, j and k unit vectors, which relate to the x, y and z axes respectively.

Figure 4: Cartesian coordinate system

2.1.4 Contravariant and Covariant Vectors

If N quantities $A^1, A^2, \ldots, A^N$ in a coordinate system $(x^1, x^2, \ldots, x^N)$ are related to N other quantities $\bar{A}^1, \bar{A}^2, \ldots, \bar{A}^N$ in another coordinate system $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N)$ by the transformation equations

$$\bar{A}^p = \sum_{q=1}^{N} \frac{\partial \bar{x}^p}{\partial x^q} A^q, \qquad p = 1, 2, \ldots, N, \tag{1}$$

then they are called components of a contravariant vector. Similarly, if N quantities $A_1, A_2, \ldots, A_N$ in a coordinate system $(x^1, x^2, \ldots, x^N)$ are related to N other quantities $\bar{A}_1, \bar{A}_2, \ldots, \bar{A}_N$ in another coordinate system $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N)$ by the transformation equations

$$\bar{A}_p = \sum_{q=1}^{N} \frac{\partial x^q}{\partial \bar{x}^p} A_q, \qquad p = 1, 2, \ldots, N, \tag{2}$$

then they are called components of a covariant vector.

2.2 Tensor Analysis

Tensors are a further extension of ideas we already use when defining quantities such as scalars and vectors. The order of a tensor is the dimensionality of the array required to represent it. A scalar is a 0-dimensional array and can therefore be represented by a tensor of order 0. A vector is a 1-dimensional array and can therefore be represented by a 1st-order tensor. A square matrix is a 2-dimensional array, which represents a 2nd-order tensor. In general, an n-dimensional array is required to represent an nth-order tensor.

2.2.1 The Einstein Summation Convention

This is a compact method of notation which omits the summation sign corresponding to any index which occurs twice in a single term. A repeated index in any term therefore implies a summation over that index [7]. We reserve the lowercase letters i, j, k for indices ranging over the values 1, 2, 3 and shall not indicate their range explicitly. For example, the expression $a_1 x_1 + a_2 x_2 + \ldots + a_N x_N$ can be written as $\sum_{j=1}^{N} a_j x_j$. The Einstein summation convention shortens this yet further to $a_j x_j$, where $j = 1, 2, \ldots, N$.

2.2.2 The Kronecker Delta Symbol

The 2-dimensional Kronecker delta symbol, written $\delta_{jk}$, is defined by

$$\delta_{jk} = \begin{cases} 0 & \text{if } k \neq j \\ 1 & \text{if } k = j \end{cases}$$

Properties of the Kronecker delta symbol, from [2]:

$$\delta_{ij} = \delta_{ji}, \qquad \delta^i_{\ j} = \delta_j^{\ i} = \delta^{ji} = \delta_{ji}, \qquad \delta^i_{\ i} = 3, \qquad \delta^j_{\ i} A_j = A_i, \qquad \delta_{ij} A_i B_j = A_i B_i = \mathbf{A} \cdot \mathbf{B}.$$

2.2.3 The Levi-Civita Symbol

The 3-dimensional Levi-Civita symbol $\epsilon_{ijk}$ is defined as follows:

$$\epsilon_{ijk} = \begin{cases} +1 & \text{when } i, j, k \text{ is an even permutation of } 1, 2, 3, \\ -1 & \text{when } i, j, k \text{ is an odd permutation of } 1, 2, 3, \\ 0 & \text{when any two indices are identical.} \end{cases}$$

The order i, j, k is an even/odd permutation of 1, 2, 3 if an even/odd number of transpositions is required to bring i, j, k into the order 1, 2, 3.

2.2.4 Contravariant, Covariant and Mixed Tensors

If $N^2$ quantities $A^{qs}$ in a coordinate system $(x^1, x^2, \ldots, x^N)$ are related to $N^2$ other quantities $\bar{A}^{pr}$ in another coordinate system $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N)$ by the transformation equations

$$\bar{A}^{pr} = \frac{\partial \bar{x}^p}{\partial x^q} \frac{\partial \bar{x}^r}{\partial x^s} A^{qs}, \tag{3}$$

then, by the Einstein summation convention, they are called contravariant components of a tensor of rank 2. Similarly, the $N^2$ quantities $A_{qs}$ are called covariant components of a tensor of rank 2 if

$$\bar{A}_{pr} = \frac{\partial x^q}{\partial \bar{x}^p} \frac{\partial x^s}{\partial \bar{x}^r} A_{qs}. \tag{4}$$

We can also extend this idea to mixed tensors of higher ranks. For example, $A^{qst}_{kl}$ are the components of a mixed tensor of rank 5, contravariant of order 3 and covariant of order 2, if they transform according to the relation:

$$\bar{A}^{prm}_{ij} = \frac{\partial \bar{x}^p}{\partial x^q} \frac{\partial \bar{x}^r}{\partial x^s} \frac{\partial \bar{x}^m}{\partial x^t} \frac{\partial x^k}{\partial \bar{x}^i} \frac{\partial x^l}{\partial \bar{x}^j} A^{qst}_{kl}. \tag{5}$$

In any tensor equation an index can appear only once as a single index, or twice as a repeated index; e.g. $A^i_{\ ii}$ is impossible. If an index is repeated, it may only be repeated once, i.e. $A^{ii}_{ii}$ is impossible. A single index must be either covariant or contravariant throughout the whole equation; it cannot be covariant in one term and contravariant in another [2].

2.2.5 Raising and Lowering Indices

The metric tensor can be used to raise and lower the indices of a tensor.

Definition 2.9. Metric tensor. In Cartesian coordinates the square of the distance between two arbitrarily close points in space, one with coordinates $x^i$ and the other $x^i + dx^i$, is

$$(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2 = (dx^i)^2, \tag{6}$$

which can be written as

$$(ds)^2 = \delta_{ij}\, dx^i dx^j. \tag{7}$$

Hence the distance between these two points determines a tensor of rank 2, known as the metric tensor $\delta_{ij}$. Thus the covariant components of the metric tensor in Cartesian coordinates are given by the Kronecker delta symbol [13]. For example, if $A^i$ are the contravariant components of a vector then its covariant components are $A_i = \delta_{ij} A^j$; likewise $A^i = \delta^{ij} A_j$. This operation, called raising and lowering indices, can be applied to any tensor.

2.2.6 Matricization

Matricization, also known as unfolding or flattening, is the process of reordering the elements of an N-dimensional array into a matrix [9]. For instance, a 4 x 3 x 2 tensor can be written as a 4 x 6 matrix, or an 8 x 3 matrix, and so on. For example, consider the tensor $A \in \mathbb{R}^{4 \times 3 \times 2}$ seen in Figure 5.

Figure 5: A 3rd-order, 4 x 3 x 2 tensor

For ease, I will display 3rd-order tensors as matrices, where the divisions represent the layers from the front face of the tensor to the back. Hence A becomes:

$$A = \left[\begin{array}{ccc|ccc} 1 & 5 & 9 & 13 & 17 & 21 \\ 2 & 6 & 10 & 14 & 18 & 22 \\ 3 & 7 & 11 & 15 & 19 & 23 \\ 4 & 8 & 12 & 16 & 20 & 24 \end{array}\right]$$

There are three mode-n unfoldings of a 3rd-order tensor; using A above as an example, we get:

$$A_{(1)} = \begin{bmatrix} 1 & 5 & 9 & 13 & 17 & 21 \\ 2 & 6 & 10 & 14 & 18 & 22 \\ 3 & 7 & 11 & 15 & 19 & 23 \\ 4 & 8 & 12 & 16 & 20 & 24 \end{bmatrix}$$

$$A_{(2)} = \begin{bmatrix} 1 & 2 & 3 & 4 & 13 & 14 & 15 & 16 \\ 5 & 6 & 7 & 8 & 17 & 18 & 19 & 20 \\ 9 & 10 & 11 & 12 & 21 & 22 & 23 & 24 \end{bmatrix}$$

$$A_{(3)} = \begin{pmatrix} 1 & 2 & 3 & 4 & \ldots & 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 & \ldots & 21 & 22 & 23 & 24 \end{pmatrix}$$

Different papers sometimes use a different ordering of the columns for the mode-n unfolding. In general, the specific permutation of columns is not important so long as it is consistent across related calculations.
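To make the unfoldings concrete, the following MATLAB sketch (my own illustration, not one of the M-files listed in the appendix) builds the tensor of Figure 5 and computes its three mode-n unfoldings with permute and reshape; the helper name unfold is hypothetical.

\begin{verbatim}
% Build the 4x3x2 tensor of Figure 5 (elements 1..24 in column-major order).
A = reshape(1:24, [4 3 2]);

% Mode-n unfolding: bring mode n to the front, then flatten the rest.
% (Hypothetical helper written for this example.)
unfold = @(T, n) reshape(permute(T, [n, setdiff(1:ndims(T), n)]), size(T, n), []);

A1 = unfold(A, 1);   % 4 x 6  matrix, equals A_(1) above
A2 = unfold(A, 2);   % 3 x 8  matrix, equals A_(2) above
A3 = unfold(A, 3);   % 2 x 12 matrix, equals A_(3) above

vecA = A(:);         % vectorization: 24 x 1 column vector
\end{verbatim}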

It is also possible to vectorize a tensor. Once again, the ordering of the elements is not important as long as it is consistent:

$$\mathrm{Vec}(A) = \begin{bmatrix} 1 \\ 2 \\ \vdots \\ 24 \end{bmatrix}$$

Another way to imagine matricization is lining up the tensor fibres from a particular dimension, as done by Kolda and Bader [8, 9].

Figure 6: Fibres of a 3rd-order tensor: (a) $A_{(1)}$, (b) $A_{(2)}$, (c) $A_{(3)}$

The column fibres of a 3rd-order tensor comprise the columns of $A_{(1)}$, as shown in Figure 6a; the row fibres comprise the columns of $A_{(2)}$, as shown in Figure 6b; and the tube fibres comprise the columns of $A_{(3)}$, as shown in Figure 6c.

2.3 Tensor Multiplication

Tensors can be multiplied together, though obviously the notation and symbols for this are much more complex than for matrices.

2.3.1 Tensor-Matrix Multiplication

The n-mode product of a tensor $A \in \mathbb{R}^{x_1 \times x_2 \times \ldots \times x_N}$ with a matrix $U \in \mathbb{R}^{J \times x_n}$ is denoted by $A \times_n U$ and is of size $x_1 \times \ldots \times x_{n-1} \times J \times x_{n+1} \times \ldots \times x_N$. Element-wise we have:

$$(A \times_n U)_{i_1,\ldots,i_{n-1},j,i_{n+1},\ldots,i_N} = \sum_{i_n=1}^{x_n} a_{i_1,i_2,\ldots,i_N}\, u_{j i_n}. \tag{8}$$

It is important to note that

$$B = A \times_n U \tag{9}$$

relates to tensors; therefore, after matricization, we use the following:

$$B_{(n)} = U A_{(n)}. \tag{10}$$

As an example, consider the tensor A defined above in Figure 5 and let

$$U = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.$$

Then the product of $A \in \mathbb{R}^{4\times3\times2}$ with $U \in \mathbb{R}^{2\times3}$ gives $B = A \times_2 U \in \mathbb{R}^{4\times2\times2}$:

$$B_{(2)} = \begin{pmatrix} 61 & 70 & 79 & 88 & 169 & 178 & 187 & 196 \\ 76 & 88 & 100 & 112 & 220 & 232 & 244 & 256 \end{pmatrix}.$$

This is clearly not a 4 x 2 x 2 tensor; to obtain the tensor we must reverse-matricize B. It is of the form mode-2, due to the mode-2 multiplication that took place; therefore we know that the columns of $B_{(2)}$ comprise the row fibres of the tensor. Hence

$$B = \left[\begin{array}{cc|cc} 61 & 76 & 169 & 220 \\ 70 & 88 & 178 & 232 \\ 79 & 100 & 187 & 244 \\ 88 & 112 & 196 & 256 \end{array}\right].$$
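A minimal MATLAB sketch of the mode-n product just described, assuming the same unfolding convention as above (again my own illustration rather than the tensor matrix multiplication M-file from the appendix):

\begin{verbatim}
% Mode-2 product B = A x_2 U for the running example.
A = reshape(1:24, [4 3 2]);          % the 4x3x2 tensor of Figure 5
U = [1 3 5; 2 4 6];                  % 2x3 matrix

n = 2;                               % mode to multiply along
order = [n, setdiff(1:ndims(A), n)]; % bring mode n to the front
An = reshape(permute(A, order), size(A, n), []);   % mode-n unfolding A_(n)

Bn = U * An;                         % B_(n) = U * A_(n), as in Eq. (10)

% Fold back: the result has size(U,1) in mode n and the old sizes elsewhere.
newsize = size(A);  newsize(n) = size(U, 1);
B = ipermute(reshape(Bn, newsize(order)), order);  % the 4x2x2 tensor B
\end{verbatim}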

There are a few important identities to note about n-mode multiplication:

Identity 1: $A \times_n (UV) = (A \times_n V) \times_n U$,

Identity 2: $A \times_m U \times_n V = A \times_n V \times_m U$ for $m \neq n$,

Identity 3: Let U be a p x q matrix with full column rank. If $B = A \times_n U$ then $A = B \times_n Z$, where Z is the q x p left inverse of U. For example, if U has orthonormal columns, then $B = A \times_n U \Rightarrow A = B \times_n U^T$.

2.3.2 Tensor-Vector Multiplication

The n-mode product of a tensor $A \in \mathbb{R}^{x_1 \times x_2 \times \ldots \times x_N}$ with a vector $P \in \mathbb{R}^{x_n}$ is again denoted by $A \times_n P$, or sometimes as $A \,\bar{\times}_n\, P$. The result is of order N - 1 and is of size $x_1 \times \ldots \times x_{n-1} \times x_{n+1} \times \ldots \times x_N$. Element-wise we have:

$$(A \times_n P)_{i_1,\ldots,i_{n-1},i_{n+1},\ldots,i_N} = \sum_{i_n=1}^{x_n} a_{i_1,i_2,\ldots,i_N} P_{i_n}. \tag{11}$$

The general idea is to compute the inner product of each mode-n fibre with the vector. Hence the equation

$$C = A \times_n P, \tag{12}$$

which relates to the tensors, requires altering after matricization. As we are computing the inner product we use the following equation:

$$C_{(n)} = P^T A_{(n)}. \tag{13}$$

As an example, again consider the tensor A defined above in Figure 5 and let

$$P = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}.$$

Then the product of $A \in \mathbb{R}^{4\times3\times2}$ with $P \in \mathbb{R}^{4}$ gives $C = A \times_1 P \in \mathbb{R}^{3\times2}$:

$$C_{(1)} = \begin{pmatrix} 30 & 70 & 110 & 150 & 190 & 230 \end{pmatrix}$$

Again we must reverse-matricize C to achieve the correct solution. It is of the form mode-1; therefore we know that the columns of $C_{(1)}$ comprise the column fibres of the tensor. Hence

$$C = \begin{pmatrix} 30 & 70 & 110 & 150 & 190 & 230 \end{pmatrix},$$

which is equivalent to

$$C = \begin{bmatrix} 30 & 150 \\ 70 & 190 \\ 110 & 230 \end{bmatrix}.$$

When it comes to mode-n vector multiplication, the following rule applies, where A is a tensor and b and c are vectors:

$$A \times_n b \times_m c = (A \times_n b) \times_{m-1} c = (A \times_m c) \times_n b \quad \text{for } m > n. \tag{14}$$

2.3.3 Tensor-Tensor Multiplication

The last category of tensor multiplication to consider is the product of two tensors. There are three general types of multiplication to consider for tensor-tensor multiplication: the outer product, the contracted product and the inner product.

Tensor-Tensor Outer Product

The outer product of two tensors is often just referred to as the tensor product. If we first consider it in terms of vectors, the outer product of two vectors $b \in \mathbb{R}^3$ and $c \in \mathbb{R}^5$ is defined by $d = b \circ c \in \mathbb{R}^{3\times5}$. Element-wise we get:

$$b \circ c = bc^T, \qquad (b \circ c)_{ij} = b_i c_j.$$

For example, if we let

$$b = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \quad \text{and} \quad c = \begin{bmatrix} 4 \\ 5 \\ 6 \\ 7 \\ 8 \end{bmatrix},$$

then their outer product is

$$b \circ c = bc^T = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \begin{bmatrix} 4 & 5 & 6 & 7 & 8 \end{bmatrix} = \begin{bmatrix} 4 & 5 & 6 & 7 & 8 \\ 8 & 10 & 12 & 14 & 16 \\ 12 & 15 & 18 & 21 & 24 \end{bmatrix}.$$

Similarly, the outer product of two tensors $U \in \mathbb{R}^{u_1 \times \ldots \times u_N}$ and $V \in \mathbb{R}^{v_1 \times \ldots \times v_M}$ gives $A = U \circ V \in \mathbb{R}^{u_1 \times \ldots \times u_N \times v_1 \times \ldots \times v_M}$. Element-wise we get:

$$(U \circ V)_{u_1 \ldots u_N v_1 \ldots v_M} = U_{u_1 \ldots u_N} V_{v_1 \ldots v_M}.$$

For example, if we let

$$U = \left[\begin{array}{cc|cc} 1 & 2 & 5 & 6 \\ 3 & 4 & 7 & 8 \end{array}\right] \quad \text{and} \quad V = \left[\begin{array}{cc|cc} 9 & 10 & 13 & 14 \\ 11 & 12 & 15 & 16 \end{array}\right],$$

which are two 2 x 2 x 2 3rd-order tensors, then their outer product is going to be a 2 x 2 x 2 x 2 x 2 x 2 6th-order tensor. This I will display as a set of 3-dimensional slices labelled by their corresponding 4th, 5th and 6th indices:

$$C_{(:,:,:,1,1,1)} = \left[\begin{array}{cc|cc} 9 & 18 & 45 & 54 \\ 27 & 36 & 63 & 72 \end{array}\right] \qquad C_{(:,:,:,2,1,1)} = \left[\begin{array}{cc|cc} 11 & 22 & 55 & 66 \\ 33 & 44 & 77 & 88 \end{array}\right]$$

$$C_{(:,:,:,1,2,1)} = \left[\begin{array}{cc|cc} 10 & 20 & 50 & 60 \\ 30 & 40 & 70 & 80 \end{array}\right] \qquad C_{(:,:,:,2,2,1)} = \left[\begin{array}{cc|cc} 12 & 24 & 60 & 72 \\ 36 & 48 & 84 & 96 \end{array}\right]$$

$$C_{(:,:,:,1,1,2)} = \left[\begin{array}{cc|cc} 13 & 26 & 65 & 78 \\ 39 & 52 & 91 & 104 \end{array}\right] \qquad C_{(:,:,:,2,1,2)} = \left[\begin{array}{cc|cc} 15 & 30 & 75 & 90 \\ 45 & 60 & 105 & 120 \end{array}\right]$$

$$C_{(:,:,:,1,2,2)} = \left[\begin{array}{cc|cc} 14 & 28 & 70 & 84 \\ 42 & 56 & 98 & 112 \end{array}\right] \qquad C_{(:,:,:,2,2,2)} = \left[\begin{array}{cc|cc} 16 & 32 & 80 & 96 \\ 48 & 64 & 112 & 128 \end{array}\right].$$
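The outer product of two arbitrary tensors can be formed in MATLAB with a single reshape of a rank-one matrix product. This is an illustrative sketch (the appendix lists the author's own tensor outer product M-file, which is not reproduced here):

\begin{verbatim}
% Outer product C = U o V of two tensors, giving a 6th-order tensor here.
U = cat(3, [1 2; 3 4], [5 6; 7 8]);        % the 2x2x2 tensor U from the example
V = cat(3, [9 10; 11 12], [13 14; 15 16]); % the 2x2x2 tensor V from the example

% C(i,j,k,l,m,n) = U(i,j,k) * V(l,m,n): vectorize, take the rank-one
% product, and reshape to the concatenated sizes.
C = reshape(U(:) * V(:).', [size(U), size(V)]);

C(:,:,:,1,1,1)    % equals 9*U, the (1,1,1) slice displayed above
\end{verbatim}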

Tensor-Tensor Inner Product

The inner product is effectively the opposite of the outer product, although it requires the tensors to have equal dimensions [3]. For two tensors $A, B \in \mathbb{R}^{x_1 \times x_2 \times \ldots \times x_N}$, the inner product is defined by

$$\langle A, B \rangle = A_{a_1 a_2 \ldots a_N} B_{a_1 a_2 \ldots a_N},$$

where the repeated indices are summed over. For vectors, the inner product is defined by $\langle b, c \rangle = b^T c$; element-wise we obtain $\langle b, c \rangle = b_i c_i$. For example, if we let

$$b = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \quad \text{and} \quad c = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix},$$

then their inner product is

$$\langle b, c \rangle = b^T c = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = 32.$$

For tensors, the inner product is the sum of the products of corresponding elements. Hence if

$$A = \left[\begin{array}{cc|cc} 1 & 2 & 5 & 6 \\ 3 & 4 & 7 & 8 \end{array}\right], \qquad B = \left[\begin{array}{cc|cc} 9 & 10 & 13 & 14 \\ 11 & 12 & 15 & 16 \end{array}\right],$$

then their inner product is

$$\langle A, B \rangle = (1 \times 9) + (2 \times 10) + \ldots + (7 \times 15) + (8 \times 16) = 492.$$
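In MATLAB the tensor inner product reduces to one line, since it is simply the sum of element-wise products (a sketch, not the tensor inner product M-file from the appendix):

\begin{verbatim}
A = cat(3, [1 2; 3 4], [5 6; 7 8]);        % 2x2x2 tensor A from the example
B = cat(3, [9 10; 11 12], [13 14; 15 16]); % 2x2x2 tensor B from the example

ip = sum(A(:) .* B(:));                    % inner product <A,B> = 492
\end{verbatim}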

Tensor Tensor Contracted Product The contracted product of two tensors is a generalization of the tensor vector and tensor matrix multiplications. The key distinction is that the modes to be multiplied and the ordering of the resulting modes is handled specially in the matrix and vector cases. In this general case, let A R x 1... x N j 1... j P and B R x 1... x N k 1... k M. We can multiply both tensors along the first M modes and the result is a tensor of size j 1... j P k 1... k M. Element wise we have A, B (1,...,N;1,...,N) j1,...,j P,k 1,...,k M = A x1,...,x N,j 1,...,j P B x1,...,x N,k 1,...,k M. With this notation, the modes to be multiplied are specified in the subscripts that follow the angle brackets. The remaining modes are ordered such that those from A come before B, which is different from the tensor-matrix product case considered previously, where the leftover matrix dimension of B replaces x n, rather than being moved to the end [3]. If we first consider two second order tensors A R 2 4 and B R 3 2, i.e. matrices. Let [ ] 9 10 1 2 3 4 A =, B = 11 12. 5 6 7 8 13 14 Then their contracted product A, B (1;2) R 4 3 is given by which gives A, B (1;2) ij = 2 A ki B jk k=1 59 71 83 78 94 110 97 117 137 116 140 164 This is clearly different from that of a normal matrix multiplication encountered before, although if this method were applied to two matrices that 18.

satisfied the normal matrix multiplication rules, the outputs would be identical. If we now consider two 3rd order tensors A R 2 3 2 and B R 2 3 3, where [ ] 1 2 3 7 8 9 A = 4 5 6 10 11 12 and B = [ 13 14 15 19 20 21 25 26 27 16 17 18 22 23 24 28 29 30 then there are three possible contracted products that we can calculate: A, B (1;1), A, B (2;2) and A, B (1,2;1,2) ; two single contractions and one double contraction. If we first consider A, B (1;1), then we will get a 4th order tensor of size 3 2 3 3, whose elements are defined by A, B (1;1) ijlm = 2 A kij B klm giving the output, where the corresponding 4th dimension is displayed next to each matrix: 77 251 82 268 87 285 106 280 113 299 120 318, 135 309 144 330 153 351 (1) 107 353 112 370 117 387 148 394 155 413 162 432, 189 435 198 456 207 477 (2) 137 455 142 472 147 489 190 508 197 527 204 546. 243 561 252 582 261 603 If we now consider the double contraction A, B (1,2;1,2) R 2 3, we calculate the elements as follows: k=1 (3) ] A, B (1,2;1,2) ij = 2 3 A kmi B kmj k=1 m=1 19

The output is the following 2 3 matrix: [ 343 469 595 901 1243 1585 It should be noted that the inner product of two tensors is the same as the contracted product, where the answer is a scalar. 2.4 Multilinear Algebra Multilinear algebra is a generalization of linear algebra, since a linear function is also multilinear in one variable. Multilinear algebra implies that we are dealing with functions of several variables that are linear in each variable separately. I will therefore firstly consider the algebra of matrices and extend the algebra, where possible, to tensors for multilinear algebra. ] 2.4.1 The Algebra of Matrices In general I will be considering an m n matrix: x 11 x 12 x 13... x 1n x 21 x 22 x 23... x 2n x 31 x 32 x 33... x 3n....... x m1 x m2 x m3... x mn Addition of Matrices Definition 2.10. Given two m n matrices A and B, we define the sum C = A + B to be the m n matrix whose elements are C ij = A ij + B ij [4]. It is important to note that the sum of two matrices is defined only when they are of equal size. The sum is simply obtained by adding corresponding elements, therefore resulting in an equal sized output matrix. Theorem 2.1. (Addition of matrices is commutative) For two matrices A and B, both of equal size, we have A + B = B + A. 20

Proof. From [4]. If A and B are each of size m n then A + B and B + A are also of size m n and by the above definition, element wise we have: A + B = [a ij + b ij ], B + A = [b ij + a ij ]. Since the addition of numbers is commutative, we have a ij + b ij = b ij + a ij for all i, j and so, we conclude that A + B = B + A. Theorem 2.2. (Addition of matrices is associative) For the matrices A, B and C, all of equal size, we have A + (B + C) = (A + B) + C. Proof. From [4]. If A, B and C are each of size m n then A + (B + C) and (A + B) + C are also of size m n. Element wise we have: A + (B + C) = [a ij + (b ij + c ij )], (A + B) + C = [(a ij + b ij ) + c ij ]. Since the addition of numbers is commutative, we have a ij +(b ij +c ij ) = (a ij + b ij ) + c ij for all i, j and so, we conclude that A + (B + C) = (A + B) + C. Zero and inverse matrices Theorem 2.3. (Zero matrix) There is a unique m n matrix M such that, for every m n matrix A, A + M = A. Proof. From [4]. Consider the matrix M whose entries are all zeros, i.e. m ij = 0 i, j. Hence element wise we have: A + M = a ij + m ij = a ij + 0 = a ij = A To establish the uniqueness of this matrix, assume we have another matrix B such that A + B = A for every m n matrix A. Then in particular we have M + B = M. But, taking B instead of A in the property for M we have B + M = B. It now follows by Theorem 2.1 that B = M. Definition 2.11. The unique matrix arising in Theorem 2.3 is called the m n zero matrix and will be denoted simply as 0, or if confusion is caused 0 ij. Theorem 2.4. (Additive inverse) For every m n matrix A there is a unique matrix B such that A + B = 0. 21

Proof. From [4]. Given A = a ij, consider the matrix B whose elements are the additive inverse of A. Hence element wise we have: A + B = a ij + b ij = a ij + ( a ij ) = 0 To establish the uniqueness of such a matrix B, suppose there exists a matrix C such that A+C = 0. Then for all i, j we have a ij +c ij = 0 and consequently c ij = a ij, which means that C = B. Definition 2.12. The unique matrix B arising in Theorem 2.4 is called the additive inverse of A and will be denoted by A. Thus A is the matrix whose elements are the additive inverse of the corresponding elements of A. Given two scalars a and b, their difference a b is defined to be a + ( b). The same principle applies for matrices. A and B both of equal size. A B is written as A+( B), where A and B are of the same size. The - operation is also defined as matrix subtraction. Matrix - scalar multiplication Definition 2.13. Given a matrix A and a scalar λ, we define their product to be the matrix λa. Whose elements are obtained by multiplying each element of A by λ, defined by λa = λa ij. The principal properties of multiplying a matrix by a scalar are as follows. Theorem 2.5. (Properties of multiplying a matrix by a scalar ) (1) λ(a + B) = λa + λb, (2) (λ + µ)a = λa + µa, (3) λ(µa) = (λµ)a, (4) ( 1)A = A, (5) 0A = 0 ij. 22

Matrix - matrix multiplication key properties. Matrix multiplication has the following Theorem 2.6. (Matrix multiplication is associative) Given three matrices A, B and C, who have compatible sizes for multiplication, then A(BC) = (AB)C. Proof. From [4]. For A(BC) to be defined we require their respective sizes to be m n, n p, p q, in which case (AB)C is also defined. Computing the elements of these products we obtain: [A(BC)] ij = a ik [BC] kj = a ik (b kt c tj ) = a ik b kt c tj [(AB)C] ij = [AB] it c tj = (a ik b kt )c tj = a ik b kt c tj Clearly, we can see that A(BC) = (AB)C. Theorem 2.7. (Matrix multiplication distributive law) Given three matrices A, B and C, who have compatible sizes for multiplication, then: A(B + C) = AB + AC, (B + C)A = BA + CA Proof. From [4]. For the first equality we require A to be of size m n and B, C to be of size n p, in which case: [A(B+C)] ij = a ik (b kj +c kj ) = a ik b kj +a ik c kj = [AB] ij +[AC] ij = [AB+AC] ij it follows that A(B + C) = AB + AC. Theorem 2.8. (Matrix multiplication with scalars) If AB is defined then for all scalars λ we have: λ(ab) = (λa)b = A(λB) Definition 2.14. A matrix is said to be square if it is of size n n. Theorem 2.9. (Identity matrix) There is a unique n n matrix M with the property that for every n n matrix A, AM = A = MA. M is defined as the matrix whose elements are all zero for i j and 1 for i = j. i.e: 23

1 0 0... 0 0 1 0... 0 M = 0 0 1... 0....... 0 0 0... 1 This is also defined as the Kronecker delta symbol δ ij which we have already encountered. 2.4.2 Fundamental Operations with Tensors 1. Addition The sum of two or more tensors of the same rank and type yields a tensor of the same rank and type. i.e. A mp q + Bq mp = Cq mp. Addition is commutative and associative. 2. Subtraction The difference of two vectors of the same rank and type is also a tensor of the same rank and type. i.e. A mp q Bq mp = Dq mp. 3. Contraction If one contravariant and one covariant index of a tensor are set equal, the result indicates that a summation over the equal indices is to be taken according to the summation convention. This resulting sum is a tensor of rank two less than the original tensor. This process is called contraction. i.e. The tensor of rank 5, Eqr mps, set r = s to obtain Eqs mps = Bq mp, a tensor of rank 3. 4. Outer Product The product of two tensors is a tensor whose rank is the sum of the ranks of the given tensors. This product which involves ordinary multiplication of the components of the tensor is called the outer product. A mp q Br s = Eqr mps. 5. Contracted Product By the process of outer multiplication, followed by a contraction, we obtain a new tensor called a contracted product of the given tensors. i.e. Given A mp q and Bst, r the outer product is A mp q Bst. r Letting q = r, we obtain the contracted product A mp r Bst. r 6. Inner Product The inner product of two tensors is obtained when the contracted product of two equally sized tensors is calculated, and the product is contracted to a scalar. i.e. Given A mp q and Bst, r hence letting s = m, t = p and q = r we get a scalar. This process is called inner multiplication. 24

7. Quotient Law Suppose it is not known if a quantity X is a tensor or not. If an inner product of X with an arbitrary tensor is itself a tensor, then X is also a tensor. This is known as the quotient law. Addition of Tensors Definition 2.15. Given two x 1 x 2... x N tensors A and B, we define the sum C = A + B to be the x 1 x 2... x N tensor whose elements are C x1 x 2...x N = A x1 x 2...x N + B x1 x 2...x N [4]. Similarly to matrices, it is important to note that the sum of two tensors is defined only when they are of equal size. The sum is simply obtained by adding corresponding elements, therefore resulting in an equal sized output. Theorem 2.10. (Addition of tensors is commutative) For two tensors A and B, both of equal size, we have A + B = B + A. Proof. If A and B are each of size x 1 x 2... x N then A + B and B + A are also of size x 1 x 2... x N and by the above definition, element wise we have: [A + B] x1 x 2...x N = [a x1 x 2...x N + b x1 x 2...x N ], [B + A] x1 x 2...x N = [b x1 x 2...x N + a x1 x 2...x N ] Since the addition of numbers is commutative, we have a x1 x 2...x N +b x1 x 2...x N = b x1 x 2...x N + a x1 x 2...x N for all x 1, x 2,..., x N and so we conclude that A + B = B + A. Theorem 2.11. (Addition of tensors is associative) For the tensors A, B and C, all of equal size, we have A + (B + C) = (A + B) + C. Proof. If A, B and C are each of size x 1 x 2... x N then A + (B + C) and (A + B) + C are also of size x 1 x 2... x N. Element wise we have: A + (B + C) = [a x1 x 2...x N + (b x1 x 2...x N + c x1 x 2...x N )], (A + B) + C = [(a x1 x 2...x N + b x1 x 2...x N ) + c x1 x 2...x N ]. Since the addition of numbers is commutative, we have a x1 x 2...x N +(b x1 x 2...x N + c x1 x 2...x N ) = (a x1 x 2...x N + b x1 x 2...x N ) + c x1 x 2...x N for all x 1, x 2,..., x N and so we conclude that A + (B + C) = (A + B) + C. 25

Zero and inverse tensors Theorem 2.12. (Zero tensor) There is a unique x 1 x 2... x N tensor T such that, for every x 1 x 2... x N tensor A, A + T = A. Proof. Consider the tensor T whose entries are all zeros, i.e. t x1 x 2...x N = 0 x 1, x 2,..., x N, hence element wise we have A + T = a x1 x 2...x N + t x1 x 2...x N = a x1 x 2...x N + 0 = a x1 x 2...x N = A To establish the uniqueness of this tensor, assume we have another tensor S such that A + S = A for every x 1 x 2... x N tensor A. Then in particular we have T +S = T. But, taking S instead of A in the property for T we have S + T = S. It now follows by Theorem 2.10 that S = T. Hence unique. Definition 2.16. The unique tensor arising in Theorem 2.12 is called the x 1 x 2... x N zero tensor and will be denoted simply as 0, or if confusion is caused 0 x1 x 2...x N. Theorem 2.13. (Additive inverse of a tensor) For every x 1 x 2... x N tensor A there is a unique tensor B such that A + B = 0. Proof. Given A = a x1 x 2...x N, consider the tensor B whose elements are the additive inverse of A. Hence element wise we have: A + B = a x1 x 2...x N + b x1 x 2...x N = a x1 x 2...x N + ( a x1 x 2...x N ) = 0 To establish the uniqueness of such a tensor B, suppose there exists a tensor C such that A+C = 0. Then for all x 1, x 2,..., x N we have a x1 x 2...x N +c x1 x 2...x N = 0 and consequently c x1 x 2...x N = a x1 x 2...x N, which means that C = B. Hence unique. Definition 2.17. The unique tensor B arising in Theorem 2.13 is called the additive inverse of A and will be denoted by A. Thus A is the tensor whose elements are the additive inverse of the corresponding elements of A. As with matrices, we defined their difference as A+( B). This is the same for tensors. 26

Tensor - scalar multiplication Definition 2.18. Given a tensor A and a scalar λ, we define their product to be the tensor λa. Whose elements are obtained by multiplying each element of A by λ, defined by λa = λa x1 x 2...x N. The principal properties of multiplying a tensor by a scalar are the same as that of multiplying a matrix by a scalar, as seen in Theorem 2.5. Tensor multiplication Theorem 2.14. (Tensor inner product commutative) Given two equally sized tensors A and B then there inner product is defined by A, B = B, A. Proof. Given A = a x1 x 2...x N and b x1 x 2...x N, and using the fact that multiplication of scalars is commutative: A, B = a x1 x 2...x N b x1 x 2...x N = b x1 x 2...x N a x1 x 2...x N = B, A The outer product of tensors isn t commutative but is associative. Theorem 2.15. (Tensor Outer product associative) Given three tensors A, B and C of sizes x 1... x N, y 1... y M and v 1... v K respectively then: (A B) C = A (B C) Proof. [(A B) C] x1,...,x N,y 1,...,y M,v 1,...,v K = [A B] x1,...,x N,y 1,...,y M c v1,...,v K = a x1,...,x N b y1,...,y M c v1,...,v K [A (B C)] x1,...,x N,y 1,...,y M,v 1,...,v K = a x1,...,x N [B C] y1,...,y M,v 1,...,v K = a x1,...,x N b y1,...,y M c v1,...,v K 27

The contracted product, which is effectively a generalised version of matrix multiplication, is also associative.

Theorem 2.16. (Tensor contracted product associative) Given three tensors A, B and C with sizes $x_1 \times \ldots \times x_i \times \ldots \times x_N$, $x_1 \times \ldots \times x_i \times \ldots \times x_j \times \ldots \times x_N$ and $x_1 \times \ldots \times x_j \times \ldots \times x_N$ respectively, then:

$$\big\langle \langle A, B \rangle_{(x_i;x_i)},\, C \big\rangle_{(x_j;x_j)} = \big\langle A,\, \langle B, C \rangle_{(x_j;x_j)} \big\rangle_{(x_i;x_i)}$$

Proof.

$$\big\langle \langle A, B \rangle_{(x_i;x_i)},\, C \big\rangle_{(x_j;x_j)} = \big\langle A_{x_1 x_2 \ldots k \ldots x_N} B_{x_1 x_2 \ldots k \ldots x_j \ldots x_N},\, C \big\rangle_{(x_j;x_j)} = A_{x_1 x_2 \ldots k \ldots x_N} B_{x_1 x_2 \ldots k \ldots p \ldots x_N} C_{x_1 x_2 \ldots p \ldots x_N}$$

$$\big\langle A,\, \langle B, C \rangle_{(x_j;x_j)} \big\rangle_{(x_i;x_i)} = \big\langle A,\, B_{x_1 x_2 \ldots x_i \ldots p \ldots x_N} C_{x_1 x_2 \ldots p \ldots x_N} \big\rangle_{(x_i;x_i)} = A_{x_1 x_2 \ldots k \ldots x_N} B_{x_1 x_2 \ldots k \ldots p \ldots x_N} C_{x_1 x_2 \ldots p \ldots x_N}$$

2.5 Tensor Rank and Decomposition

Definition 2.19. Rank-one tensors. An N-way tensor $X \in \mathbb{R}^{x_1 \times x_2 \times \ldots \times x_N}$ is a rank-one tensor if it can be written as the outer product of N vectors, i.e.

$$X = a^{(1)} \circ a^{(2)} \circ \ldots \circ a^{(N)}.$$

This means that each element of the tensor is the product of the corresponding vector elements [9]. For example:

$$\left[\begin{array}{ccc|ccc|ccc} 28 & 35 & 42 & 32 & 40 & 48 & 36 & 45 & 54 \\ 56 & 70 & 84 & 64 & 80 & 96 & 72 & 90 & 108 \\ 84 & 105 & 126 & 96 & 120 & 144 & 108 & 135 & 162 \end{array}\right] = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \circ \begin{bmatrix} 4 & 5 & 6 \end{bmatrix} \circ \begin{bmatrix} 7 & 8 & 9 \end{bmatrix}$$
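A rank-one tensor like the one above can be built directly from its factor vectors; the sketch below uses the same outer-product reshape trick as earlier (illustrative code, not taken from the appendix):

\begin{verbatim}
a = [1; 2; 3];  b = [4; 5; 6];  c = [7; 8; 9];

% X(i,j,k) = a(i)*b(j)*c(k): form a o b first, then take the outer
% product of the result with c and fold into a 3x3x3 tensor.
X = reshape(reshape(a * b', [], 1) * c', [3 3 3]);

X(:,:,1)    % first frontal slice: 7*(a*b') = [28 35 42; 56 70 84; 84 105 126]
\end{verbatim}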

Definition 2.20. Tensor Rank The rank of a tensor X, denoted rank(x), is defined as the smallest number of rank-one tensors that generate X as their sum [9]. Hence the rank of X is R for: X = R i=1 a (1) i a (2) i... a (N) i If you consider the above example of a rank one tensor, but alter the (1,1,1) element from 28 to 30, then the tensor is no longer of rank 1. It is now of rank 2: = 30 35 42 32 40 48 36 45 54 56 70 84 64 80 96 72 90 108 84 105 126 96 120 144 108 135 162 28 35 42 32 40 48 36 45 54 56 70 84 64 80 96 72 90 108 84 105 126 96 120 144 108 135 162 + = + 1 2 3 2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [ 4 5 6 ] [ 7 8 9 ] [ 1 0 0 ] [ 1 0 0 ] Unfortunately there is no straight forward algorithm to determine the rank of a specific given tensor. In practice, the rank of a tensor is determined numerically by fitting various rank-r CANDECOMP/PARAFAC models [12]. Kruskal [10] discusses the case of 2 2 2 tensors which have typical ranks of two and three over R. In fact, Monte Carlo experiments (which randomly draw each entry of the tensor from a normal distribution with mean zero and standard deviation one) reveal that the set of 2 2 2 tensors of rank two fills about 79% of the space while those of rank three fill 21%. Rank-one tensors are possible but occur with almost zero probability [9]. 29

2.5.1 The CP (CANDECOMP/PARAFAC) decomposition

As mentioned previously, there is no finite algorithm to determine the rank of a tensor. The most popular model used in determining a tensor's decomposition is the CP decomposition, from [9], defined as:

$$A_{x_1 x_2 \ldots x_d} = \sum_{r=1}^{R} a_1(x_1, r)\, a_2(x_2, r) \cdots a_d(x_d, r).$$

The first issue that arises in computing the CP decomposition is how to choose the number R of rank-one components. One technique is to fit multiple CP decompositions with differing numbers of rank-one components until one is good. We could also calculate the CP decomposition for 1, 2, 3, ... rank-one components until we get the correct solution; however, there are many problems with this procedure, as real-life data is often very large, chaotic and noisy. We may have approximations of a lower rank that are arbitrarily close in terms of fit, but which cause problems in practice compared to the approximations at a higher rank.

2.5.2 The TT-decomposition

Recently a new decomposition, named the tensor train, or TT-decomposition, was proposed for compact representation and approximation of high-dimensional tensors. It can be computed via standard decompositions (such as the SVD and QR) but does not suffer from the curse of dimensionality. By the curse of dimensionality we mean that the memory required to store an array with d indices, and the number of operations required to perform basic operations with such an array, grow exponentially in the dimensionality d. Therefore, direct manipulation of general d-dimensional arrays seems to be impossible. The TT-decomposition is written as a tensor train of the form:

$$A(i_1, i_2, \ldots, i_d) \approx \sum_{\alpha_1, \ldots, \alpha_{d-1}} G_1(i_1, \alpha_1)\, G_2(\alpha_1, i_2, \alpha_2) \cdots G_{d-1}(\alpha_{d-2}, i_{d-1}, \alpha_{d-1})\, G_d(\alpha_{d-1}, i_d),$$

with tensor carriages $G_1, \ldots, G_d$, where any two neighbours have a common summation index. The summation indices $\alpha_k$ run from 1 to $r_k$ ($r_k$ is the compression rank) and are referred to as auxiliary indices, in contrast to the initial indices $i_k$, which are called spatial indices. The tensor carriages $G_k$ have sizes $r_{k-1} \times n_k \times r_k$, except for k = 1 and k = d, where they have sizes

$n_1 \times r_1$ and $r_{d-1} \times n_d$, respectively. It is sometimes convenient to assume that $G_1$ and $G_d$ are not two-dimensional but in fact three-dimensional, with sizes $1 \times n_1 \times r_1$ and $r_{d-1} \times n_d \times 1$ [12].

Theorem 2.17. (TT-decomposition [12]) For any tensor $A = [A(i_1, \ldots, i_d)]$ there exists a TT approximation $T = [T(i_1, \ldots, i_d)]$ with compression ranks $r_k$ such that:

$$\|A - T\|_F \le \sqrt{\sum_{k=1}^{d-1} \epsilon_k^2},$$

where $\epsilon_k$ is the distance (in the Frobenius norm) from $A_k$ to its best rank-$r_k$ approximation:

$$\epsilon_k = \min_{\mathrm{rank}\,B \le r_k} \|A_k - B\|_F.$$

The proof can be found in [12], and it gives us the following algorithm.

TT-decomposition algorithm. Given a tensor $A \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_d}$ and an accuracy bound $\epsilon$:

1. Compute the Frobenius norm $\mathrm{nrm} = \|A\|_F$,

2. Compute the size of the first unfolded matrix $A_1$: $N_l = n_1$, $N_r = \prod_{k=2}^{d} n_k$,

3. Create a temporary tensor M = A,

4. Unfold the tensor into the calculated dimensions $M = M \in \mathbb{R}^{N_l \times N_r}$,

5. Compute the truncated SVD of $M \approx USV$ so that the rank r satisfies:

$$\sum_{k=r+1}^{\min(N_l, N_r)} \sigma_k^2 \le \frac{\epsilon \cdot \mathrm{nrm}}{\sqrt{d-1}},$$

6. Set the first carriage $G_1 = U$, recalculate $M = SV^T = (VS)^T$ and let $r_1 = r$,

7. Process the remaining modes from k = 2 to d - 1:

8. Calculate the dimensions of the next carriage: $N_l = n_k$, $N_r = N_r / n_k$,

9. Unfold to the correct size $M = M \in \mathbb{R}^{r N_l \times N_r}$,

10. Compute the truncated SVD of $M \approx USV$ so that the rank r satisfies:

$$\sum_{k=r+1}^{\min(N_l, N_r)} \sigma_k^2 \le \frac{\epsilon \cdot \mathrm{nrm}}{\sqrt{d-1}},$$

11. Reshape U into the k-th carriage: $G_k = U \in \mathbb{R}^{r_{k-1} \times n_k \times r_k}$,

12. Recalculate M = SV,

13. Once the first d - 1 carriages have been calculated, $G_d = M^T$.

2.5.3 Computing the TT-decomposition

To demonstrate the TT-decomposition algorithm, consider the tensor $A \in \mathbb{R}^{2\times2\times2}$ with error bound 0.1:

$$A = \left[\begin{array}{cc|cc} 1 & 2 & 5 & 6 \\ 3 & 4 & 7 & 8 \end{array}\right]$$

The Frobenius norm of A is $\|A\|_F = 14.2829$, and the error bound we acquire with d = 3 is 1.01. The first unfolding M is of size $2 \times (2 \times 2) = 2 \times 4$:

$$M = \begin{bmatrix} 1 & 2 & 5 & 6 \\ 3 & 4 & 7 & 8 \end{bmatrix}$$

Computing the SVD of M gives

$$U = \begin{bmatrix} 0.5667 & 0.8239 \\ 0.8239 & 0.5667 \end{bmatrix}, \qquad S = \begin{bmatrix} 14.2358 & 0 & 0 & 0 \\ 0 & 1.1585 & 0 & 0 \end{bmatrix},$$

$$V = \begin{bmatrix} 0.2134 & 0.7564 & 0.4314 & 0.4430 \\ 0.3111 & 0.5344 & 0.3905 & 0.6820 \\ 0.6042 & 0.1316 & 0.5952 & 0.5133 \\ 0.7019 & 0.3536 & 0.5542 & 0.2742 \end{bmatrix}.$$

Hence, from S we can see that the rank of M is $r_1 = 2$. Because $r_1 = 2$ we must adjust the U, S, V matrices to contain 2 orthonormal columns of U, 2 orthonormal rows of $V^T$, and 2 columns and rows of S:

$$U = \begin{bmatrix} 0.5667 & 0.8239 \\ 0.8239 & 0.5667 \end{bmatrix}, \qquad S = \begin{bmatrix} 14.2358 & 0 \\ 0 & 1.1585 \end{bmatrix}, \qquad V = \begin{bmatrix} 0.2134 & 0.7564 \\ 0.3111 & 0.5344 \\ 0.6042 & 0.1316 \\ 0.7019 & 0.3536 \end{bmatrix}.$$

We now set $G_1 = U$ and recalculate $M = (VS)^T$:

$$G_1 = \begin{bmatrix} 0.5667 & 0.8239 \\ 0.8239 & 0.5667 \end{bmatrix}$$

$$M = \begin{bmatrix} 3.0384 & 4.4291 & 8.6010 & 9.9916 \\ 0.8763 & 0.6191 & 0.1525 & 0.4097 \end{bmatrix}$$

Reshape to size $M \in \mathbb{R}^{r N_l \times N_r} = (2 \cdot 2) \times 2 = 4 \times 2$:

$$M = \begin{bmatrix} 3.0384 & 8.6010 \\ 0.8763 & 0.1525 \\ 4.4291 & 9.9916 \\ 0.6191 & 0.4097 \end{bmatrix}$$

We now compute the SVD of M to get

$$U = \begin{bmatrix} 0.6405 & 0.3341 & 0.6835 & 0.1046 \\ 0.0132 & 0.6914 & 0.4160 & 0.5906 \\ 0.7678 & 0.2746 & 0.5750 & 0.0664 \\ 0.0103 & 0.5788 & 0.1705 & 0.7974 \end{bmatrix}, \qquad S = \begin{bmatrix} 14.2274 & 0 \\ 0 & 1.2573 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad V = \begin{bmatrix} 0.3762 & 0.9266 \\ 0.9266 & 0.3762 \end{bmatrix}.$$

Here M is of rank 2, and we again adjust the U, S, V matrices accordingly:

$$U = \begin{bmatrix} 0.6405 & 0.3341 \\ 0.0132 & 0.6914 \\ 0.7678 & 0.2746 \\ 0.0103 & 0.5788 \end{bmatrix}, \qquad S = \begin{bmatrix} 14.2274 & 0 \\ 0 & 1.2573 \end{bmatrix},$$

$$V = \begin{bmatrix} 0.3762 & 0.9266 \\ 0.9266 & 0.3762 \end{bmatrix}.$$

We now set $G_2 = U \in \mathbb{R}^{r_1 \times n_2 \times r_2} = 2 \times 2 \times 2$ and recalculate $M = SV$:

$$G_2 = \left[\begin{array}{cc|cc} 0.6405 & 0.7678 & 0.3341 & 0.2746 \\ 0.0132 & 0.0103 & 0.6914 & 0.5788 \end{array}\right], \qquad M = \begin{bmatrix} 5.3519 & 13.1824 \\ 1.1650 & 0.4730 \end{bmatrix}.$$

Now we must finally calculate $G_3 = M^T$:

$$G_3 = \begin{bmatrix} 5.3519 & 1.1650 \\ 13.1824 & 0.4730 \end{bmatrix}.$$

Hence $A \approx G_1 G_2 G_3$:

$$\left[\begin{array}{cc|cc} 1 & 2 & 5 & 6 \\ 3 & 4 & 7 & 8 \end{array}\right] \approx \begin{bmatrix} 0.5667 & 0.8239 \\ 0.8239 & 0.5667 \end{bmatrix} \left[\begin{array}{cc|cc} 0.6405 & 0.7678 & 0.3341 & 0.2746 \\ 0.0132 & 0.0103 & 0.6914 & 0.5788 \end{array}\right] \begin{bmatrix} 5.3519 & 1.1650 \\ 13.1824 & 0.4730 \end{bmatrix} = \left[\begin{array}{cc|cc} 1.0002 & 1.9998 & 5.0005 & 5.9996 \\ 3.0001 & 4.0000 & 6.9999 & 8.0000 \end{array}\right].$$
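The whole procedure above can be expressed compactly in MATLAB. The sketch below is my own condensed illustration of the algorithm of Section 2.5.2 (the author's full tensor train decomposition and truncated SVD M-files are listed in the appendix); it assumes the truncation rule stated in step 5, the function name tt_sketch is hypothetical, and the carriages are returned in a cell array G.

\begin{verbatim}
function G = tt_sketch(A, tol)
% TT_SKETCH  Illustrative TT-decomposition of a dense tensor A with
% accuracy bound tol, following the algorithm of Section 2.5.2.
n   = size(A);                      % mode sizes n_1, ..., n_d
d   = numel(n);
nrm = norm(A(:));                   % Frobenius norm of A
bnd = tol * nrm / sqrt(d - 1);      % per-step truncation bound (step 5)
G   = cell(1, d);

M = A;
r = 1;
for k = 1:d-1
    M = reshape(M, r * n(k), []);   % unfold to (r_{k-1} n_k) x N_r
    [U, S, V] = svd(M, 'econ');
    s = diag(S);
    % smallest rank whose discarded sum of squared singular values
    % stays within the bound
    rnew = find(cumsum(s.^2, 'reverse') > bnd, 1, 'last');
    if isempty(rnew), rnew = 1; end
    U = U(:, 1:rnew);  S = S(1:rnew, 1:rnew);  V = V(:, 1:rnew);
    G{k} = reshape(U, r, n(k), rnew);   % k-th tensor carriage
    M = S * V';                         % carry the remainder to the next mode
    r = rnew;
end
G{d} = reshape(M, r, n(d));             % final carriage G_d
end
\end{verbatim}

For the 2 x 2 x 2 example above, G = tt_sketch(cat(3, [1 2; 3 4], [5 6; 7 8]), 0.1) yields three carriages of the sizes stated in the text (up to the sign ambiguity of the SVD and the transpose used for $G_3$).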

Definition 2.21. A Hilbert tensor is a d-dimensional tensor of size $i_1 \times i_2 \times \ldots \times i_d$ containing the elements

$$A_{i_1 i_2 \ldots i_d} = \frac{1}{i_1 + i_2 + \ldots + i_d}.$$

Hence the 2-dimensional Hilbert tensor of size 2 x 2 is

$$\begin{bmatrix} 1/2 & 1/3 \\ 1/3 & 1/4 \end{bmatrix}.$$

The 3-dimensional Hilbert tensor of size 3 x 3 x 3 is

$$\left[\begin{array}{ccc|ccc|ccc} 1/3 & 1/4 & 1/5 & 1/4 & 1/5 & 1/6 & 1/5 & 1/6 & 1/7 \\ 1/4 & 1/5 & 1/6 & 1/5 & 1/6 & 1/7 & 1/6 & 1/7 & 1/8 \\ 1/5 & 1/6 & 1/7 & 1/6 & 1/7 & 1/8 & 1/7 & 1/8 & 1/9 \end{array}\right].$$

The 3-dimensional Hilbert tensor of size 5 x 2 x 2 is

$$\left[\begin{array}{cc|cc} 1/3 & 1/4 & 1/4 & 1/5 \\ 1/4 & 1/5 & 1/5 & 1/6 \\ 1/5 & 1/6 & 1/6 & 1/7 \\ 1/6 & 1/7 & 1/7 & 1/8 \\ 1/7 & 1/8 & 1/8 & 1/9 \end{array}\right],$$

and so on.

2.5.4 Numerical Experiments

I will now run the TT-decomposition algorithm on random dense tensors, random sparse tensors and Hilbert tensors of increasing dimensions to compare the accuracy and the time taken (in seconds) to complete. In all of these tensors each dimension will be of size 3, so a d-dimensional tensor will have $3^d$ elements. As the algorithm requires the use of the singular value decomposition, I will only be able to compute up to, and including, 9 dimensions. This is because an unfolded 10-dimensional tensor requires more memory than MATLAB has available.
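A Hilbert tensor of the kind used in these experiments can be generated with a few lines of MATLAB. This is an illustrative sketch rather than the author's test script, and the function name hilbert_tensor is my own:

\begin{verbatim}
function A = hilbert_tensor(sz)
% HILBERT_TENSOR  Tensor of size sz with entries 1/(i_1 + i_2 + ... + i_d).
subs = cell(1, numel(sz));
[subs{:}] = ind2sub(sz, (1:prod(sz))');   % subscripts of every element
A = reshape(1 ./ sum([subs{:}], 2), sz);  % 1 / (sum of the indices)
end
\end{verbatim}

For example, hilbert_tensor([3 3 3]) reproduces the 3 x 3 x 3 tensor displayed above, and hilbert_tensor(3*ones(1,d)) gives d-dimensional test tensors of the kind used in Tables 4 and 5.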

To calculate the absolute error, we recompute the tensor from the tensor train to get the approximate solution and then sum the absolute differences of all the corresponding elements.

Dimensions   Time (s)    Absolute Error
3            0.000599    5.481726184086710e-015
4            0.000950    3.160249839595508e-013
5            0.001400    3.229430611817463e-013
6            0.003586    1.219748260727904e-012
7            0.012798    6.101206345698884e-012
8            0.190649    3.496403297095774e-011
9            1.868366    4.497736081146564e-011

Table 1: Dense tensor with accuracy 0.00001.

Dimensions   Time (s)    Absolute Error
3            0.000595    5.481726184086710e-015
4            0.000949    3.160249839595508e-013
5            0.001390    3.229430611817463e-013
6            0.003524    1.219748260727904e-012
7            0.012970    6.101206345698884e-012
8            0.191436    3.496403297095774e-011
9            1.868051    4.497736081146564e-011

Table 2: Dense tensor with accuracy 0.001.

Tables 1, 2 and 3 show the time taken and the accuracy of the TT-decomposition algorithm on randomly generated tensors. Note that the different levels of accuracy were applied to the same tensor, so as to help in comparing results. You can see that with accuracies 0.00001 and 0.001, in Tables 1 and 2, identical results are obtained. As expected, the times taken grow exponentially as the dimensions increase, due to a d-dimensional tensor having $3^d$ elements; even so, it requires only 1.87 seconds to compute a 9-dimensional tensor, which contains 19683 elements, and the absolute error is 4.4977e-011.
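The absolute errors reported in Tables 1-7 follow the recipe described above. In MATLAB this amounts to multiplying out the train and comparing element-wise; the sketch below assumes the carriages G{1},...,G{d} produced by the tt_sketch function shown earlier (the author's own "multiplying out a tensor train" M-file serves the same purpose):

\begin{verbatim}
% Reconstruct the full tensor B from the carriages and measure the error.
d = numel(G);
B = 1;
for k = 1:d
    rk = size(G{k}, 1);                 % rank index shared with the previous carriage
    B  = B * reshape(G{k}, rk, []);     % contract over that index
    B  = reshape(B, [], size(G{k}, 3)); % columns: the next rank index alpha_k
end
B = reshape(B, size(A));                % fold back to the original shape

abs_err = sum(abs(A(:) - B(:)));        % absolute error as used in the tables
\end{verbatim}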

Dimensions   Time (s)    Absolute Error
3            0.000462    0.548172618408671
4            0.000922    1.222504139360485
5            0.001388    3.229430611817463
6            0.003230    11.97544192366226
7            0.010927    36.17063990114404
8            0.189168    103.0903731752137
9            1.783017    409.5933577147941

Table 3: Dense tensor with accuracy 0.1.

When the accuracy was lowered to 0.1 (Table 3), the times taken to complete the algorithm improve, although not greatly. The cost of this improved time is a vastly increased absolute error: the 9-dimensional tensor now has an error of 409.59, which is an average element error of only 0.0208, but is far worse than with accuracy levels 0.001 and 0.00001.

Dimensions   Time (s)    Absolute Error
3            0.000608    2.664535259100376e-015
4            0.000856    8.658229781327576e-007
5            0.001238    3.633284856321972e-006
6            0.002217    9.628497013636217e-006
7            0.009952    2.162607546331297e-005
8            0.169623    1.059060077813001e-003
9            1.765740    3.356218074425516e-003

Table 4: Hilbert tensor with accuracy 0.00001.

Tables 4 and 5 show the time taken and the accuracy of the TT-decomposition algorithm on Hilbert tensors. If we compare the results for the Hilbert tensors, which have an ordered structure, to those for the random tensors, the random tensors gave much more accurate results. In fact, to achieve a similar error with the 9-dimensional Hilbert tensor to that of the random tensor we must increase the accuracy level to 0.0000000000001, which takes the same amount of time to compute. This is most likely down to the random tensors having a much higher rank

Dimensions Time(s) Absolute Error 3 0.000586 0.003892207201136 4 0.000798 0.005054637681228 5 0.001130 0.012985484130547 6 0.002040 0.023844744642908 7 0.009184 0.046796732618198 8 0.160419 0.094324369187404 9 1.648268 0.201164134157645 Table 5: Hilbert Tensor with accuracy 0.001. to that of the Hilbert tensors. Table 5 has the most efficient times to compute, which of course leads to an increased error. Although a much smaller one than observed with the random tensors and a accuracy level of 0.1. Dimensions Time(s) Absolute Error 3 0.000581 4.506840310874860e-015 4 0.000956 3.698752148568742e-014 5 0.001253 5.314365469582339e-013 6 0.003487 3.219748260727904e-012 7 0.013145 2.1684692243965474-012 8 0.200458 7.1625748959623314-011 9 1.752219 3.563275864215860e-011 Table 6: Sparse Tensor with accuracy 0.00001. The decomposition of sparse tensors as seen in tables 6 and 7, which I generated using my sparse tensor MATLAB program. Similarly to the random tensors, errors of 0.00001 and 0.001 give almost identical results. Again the times taken grow exponentially as the dimensions increase, due to a d dimensional tensor having 3 d elements. When the accuracy was lowered from 0.00001 to 0.001, the times taken to complete the algorithm change very little. Although the absolute error is 39

Dimensions   Time (s)    Absolute Error
3            0.000582    4.7582758063489e-015
4            0.000954    3.1688198340326e-013
5            0.001218    2.6846912027582e-013
6            0.003389    1.7586692758790e-012
7            0.013102    6.1684695698884e-011
8            0.189921    5.4964032427583e-011
9            1.698478    3.27586120634325e-011

Table 7: Sparse tensor with accuracy 0.001.

2.5.5 Results

Increasing the accuracy level decreases the absolute error, with only a small effect on the time taken to compute the decomposition.

2.5.6 Applications of the TT-decomposition

The TT-decomposition has many potential applications, including high-dimensional integration and multivariate function approximation. I will be focusing on high-dimensional integration in the next section. Multivariate function approximation has a wide scope of applications in areas such as telecommunications, as it provides the ability to compress a large amount of data to a fraction of its size. This improves the efficiency of moving the data around, and the data can then be reconstructed easily.

3 High-dimensional integration using the TT-decomposition

The TT-decomposition algorithm can be applied to high-dimensional integration [12]. Suppose we have a function f of d variables and are required to calculate a d-dimensional integral of the form: