On the Convergence of Alternating Least Squares Optimisation in Tensor Format Representations


PREPRINT, May 2015

On the Convergence of Alternating Least Squares Optimisation in Tensor Format Representations

Mike Espig*, Wolfgang Hackbusch†, Aram Khachatryan*

* RWTH Aachen University, Institut für Geometrie und Praktische Mathematik, Templergraben 55, 52062 Aachen, Germany. Address: RWTH Aachen University, Department of Mathematics, IGPM, Pontdriesch 14-16, 52062 Aachen, Germany. E-mail: mike.espig@alopax.de
† Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany

28th May 2015

Abstract. The approximation of tensors is important for the efficient numerical treatment of high-dimensional problems, but it remains an extremely challenging task. One of the most popular approaches to tensor approximation is the alternating least squares (ALS) method. In our study, the convergence of the alternating least squares algorithm is considered. The analysis is done for arbitrary tensor format representations and is based on the multilinearity of the tensor format. In tensor format representation techniques, tensors are approximated by multilinear combinations of objects of lower dimensionality. The resulting reduction of dimensionality not only reduces the amount of required storage but also the computational effort.

Keywords: tensor format, tensor representation, tensor network, alternating least squares optimisation, orthogonal projection method.
MSC: 15A69, 49M20, 65K05, 68W25, 90C26.

1 Introduction

During the last years, tensor format representation techniques were successfully applied to the solution of high-dimensional problems like stochastic and parametric partial differential equations [6, 11, 14, 20, 24, 26, 27]. With standard techniques it is impossible to store all entries of the discretised high-dimensional objects explicitly. The reason is that the computational complexity and the storage cost grow exponentially with the number of dimensions. Besides storage, one should also solve these high-dimensional problems in a reasonable (e.g. linear) time and obtain a solution in some compressed (low-rank/sparse) tensor format. Among other prominent problems, the efficient solution of linear systems is one of the most important tasks in scientific computing. We consider a minimisation problem on the tensor space V = ⊗_{ν=1}^d R^{m_ν} equipped with the Euclidean inner product ⟨·,·⟩. The objective function f : V → R of the optimisation task is quadratic,

f(v) := (1/‖b‖²) [ ½ ⟨Av, v⟩ − ⟨b, v⟩ ],   (1)

where A ∈ R^{(m_1⋯m_d) × (m_1⋯m_d)} is a positive definite matrix (A > 0, Aᵀ = A) and b ∈ V. A tensor u ∈ V is represented in a tensor format. A tensor format U : P_1 × ⋯ × P_L → V is a multilinear map from the cartesian
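For illustration only (this sketch is ours and not part of the original paper; the random test data and variable names are assumptions), the following NumPy snippet evaluates the normalised quadratic objective of Eq. (1) on a full tensor treated as a flat vector and checks that the exact solution A⁻¹b is a stationary point of f.

```python
import numpy as np

def f(v, A, b):
    """Quadratic objective f(v) = (1/||b||^2) * (0.5*<Av, v> - <b, v>), cf. Eq. (1)."""
    return (0.5 * v @ (A @ v) - b @ v) / (b @ b)

# Small full-format illustration: A symmetric positive definite, b arbitrary.
rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # positive definite
b = rng.standard_normal(n)

v_star = np.linalg.solve(A, b)       # minimiser A^{-1} b
grad = (A @ v_star - b) / (b @ b)    # gradient of f at the minimiser
print(np.allclose(grad, 0))          # True: v_star is a stationary point of f
```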

product of parameter spaces P_1, …, P_L into the tensor space V. An L-tuple of vectors (p_1, …, p_L) ∈ P := P_1 × ⋯ × P_L is called a representation system of u if u = U(p_1, …, p_L). The precise definition of tensor format representations is given in Section 2. The solution A⁻¹b = argmin_{v ∈ V} f(v) is approximated by elements from the range set of the tensor format U, i.e. we are looking for a representation system (p*_1, …, p*_L) ∈ P such that for

F := f ∘ U : P_1 × ⋯ × P_L → R   (2)

with

F(p_1, …, p_L) = (1/‖b‖²) [ ½ ⟨A U(p_1, …, p_L), U(p_1, …, p_L)⟩ − ⟨b, U(p_1, …, p_L)⟩ ]

we have

F(p*_1, …, p*_L) = inf_{(p_1, …, p_L) ∈ P} F(p_1, …, p_L).

The alternating least squares (ALS) algorithm [2, 3, 10, 18, 21, 29, 31] is iteratively defined. Suppose that the k-th iterate p_k = (p^k_1, …, p^k_L) and the first µ − 1 components p^{k+1}_1, …, p^{k+1}_{µ−1} of the (k+1)-th iterate p_{k+1} have been determined. The basic step of the ALS algorithm is to compute the minimum norm solution

p^{k+1}_µ := argmin_{q_µ ∈ P_µ} F(p^{k+1}_1, …, p^{k+1}_{µ−1}, q_µ, p^k_{µ+1}, …, p^k_L).

Thus, in order to obtain p_{k+1} from p_k, we have to solve successively L ordinary least squares problems. The ALS algorithm is a nonlinear Gauss–Seidel method. The local convergence of the nonlinear Gauss–Seidel method to a stationary point p* ∈ P follows from the convergence of the linear Gauss–Seidel method applied to the Hessian F″(p*) at the limit point p*. If the linear Gauss–Seidel method converges R-linearly, then there exists a neighbourhood B(p*) of p* such that for every initial guess p_0 ∈ B(p*) the nonlinear Gauss–Seidel method converges R-linearly with the same rate as the linear Gauss–Seidel method. We refer the reader to Ortega and Rheinboldt for a description of the nonlinear Gauss–Seidel method [28, Section 7.4] and its convergence analysis [28, Thm. 10.3.5, Thm. 10.3.4, and Thm. 10.1.3]. A representation system of a represented tensor is not unique, since the tensor representation U is multilinear. Consequently, the matrix F″(p*) is not positive definite. Therefore, convergence of the linear Gauss–Seidel method is in general not ensured. However, if the Hessian matrix at p* is positive semidefinite, then the linear Gauss–Seidel method still converges for sequences orthogonal to the kernel of F″(p*), see e.g. [19, 23]. Under useful assumptions on the null space of F″(p*), Uschmajew et al. [33, 36] showed local convergence of the ALS method. These assumptions are related to the non-uniqueness of a representation system and are meaningful in the context of a nonlinear Gauss–Seidel method. However, for tensor format representations the assumptions are not true in general, see the counterexample of Mohlenkamp [25, Section 2.5] and the discussion in [36, Section 3.4].

The current analysis is not based on the mathematical techniques developed for the nonlinear Gauss–Seidel method, but on the multilinearity of the tensor representation U. This fact is in contrast to previous works. The present article is partially related to the study by Mohlenkamp [25]. For example, the statement of Lemma 4.14 is already described there for the canonical tensor format. Section 2 contains a unified mathematical description of tensor formats. The relation between an orthogonal projection method and the ALS algorithm is explained in Section 3. The convergence of the ALS method is analysed in Section 4, where we consider global convergence. Further, the rate of convergence is described in detail, and explicit examples for all kinds of convergence rates are given: the ALS method can, for all tensor formats of practical interest, converge sublinearly, Q-linearly, and even Q-superlinearly. We illustrate our theoretical results on numerical examples in Section 5. We refer the reader to [28] for details concerning convergence speed.

2 Unified Description of Tensor Format Representations

A tensor format representation for tensors in V is described by a parameter space P = ×_{µ=1}^L P_µ and a multilinear map U : P → V from the parameter space into the tensor space. For the numerical treatment of high-dimensional problems by means of tensor formats it is essential to distinguish between a tensor u ∈ V and a representation system p ∈ P of u, where u = U(p). The data size of a representation system is often proportional to d. Thanks to the multilinearity of U, the numerical cost of standard operations like matrix-vector multiplication, addition, and computation of scalar products is also proportional to d, see e.g. [10, 15, 17, 30, 32].

Notation 2.1 (N_n). The set N_n of natural numbers not larger than n ∈ N is denoted by N_n := {j ∈ N : 1 ≤ j ≤ n}.

Definition 2.2 (Parameter Space, Tensor Format Representation, Representation System). Let L ≥ d and, for µ ∈ N_L, let P_µ be a finite dimensional vector space equipped with an inner product ⟨·,·⟩_{P_µ}. The parameter space P is the following cartesian product:

P := ×_{µ=1}^L P_µ.   (3)

A multilinear map U from the parameter space P into the tensor space V is called a tensor format representation,

U : ×_{µ=1}^L P_µ → ⊗_{ν=1}^d R^{m_ν}.   (4)

We say u ∈ V is represented in the tensor format representation U if u ∈ range U. A tuple (p_1, …, p_L) ∈ P is called a representation system of u if u = U(p_1, …, p_L).

Remark 2.3. Due to the multilinearity of U, a representation system of a given tensor u ∈ range(U) is not uniquely determined.

Example 2.4. For the canonical tensor format representation with r terms we have L = d and P_µ = R^{m_µ × r}. The canonical tensor format representation with r terms is the following multilinear map:

U_CF : ×_{µ=1}^d R^{m_µ × r} → V,   (p_1, …, p_d) ↦ U_CF(p_1, …, p_d) := Σ_{j=1}^r ⊗_{µ=1}^d p_{µ,j},

where p_{µ,j} denotes the j-th column of the matrix p_µ ∈ R^{m_µ × r}. For recent algorithms in the canonical tensor format we refer to [7, 8, 9, 10, 12]. The tensor train (TT) format representation discussed in [30] is, for d = 3 and representation ranks r_1, r_2 ∈ N, defined by the multilinear map

U_TT : R^{m_1 × r_1} × R^{m_2 × r_1 × r_2} × R^{m_3 × r_2} → R^{m_1} ⊗ R^{m_2} ⊗ R^{m_3},
(p_1, p_2, p_3) ↦ U_TT(p_1, p_2, p_3) := Σ_{i=1}^{r_1} Σ_{j=1}^{r_2} p_{1,i} ⊗ p_{2,i,j} ⊗ p_{3,j}.
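As a concrete illustration of Definition 2.2 and Example 2.4, the following NumPy sketch (our own illustrative code; the function and variable names are not taken from the paper) realises the canonical map U_CF and the three-dimensional TT map U_TT and verifies multilinearity numerically in the first argument.

```python
import numpy as np

def U_cf(factors):
    """Canonical format U_CF(p_1,...,p_d) = sum_j p_{1,j} x ... x p_{d,j}.
    Each p_mu is an (m_mu x r) matrix; column j holds the j-th elementary factor."""
    r = factors[0].shape[1]
    full = 0
    for j in range(r):
        term = factors[0][:, j]
        for p in factors[1:]:
            term = np.tensordot(term, p[:, j], axes=0)   # outer product, mode by mode
        full = full + term
    return full

def U_tt(p1, p2, p3):
    """TT format for d = 3: sum_{i,j} p_{1,i} x p_{2,i,j} x p_{3,j},
    with p1 in R^{m1 x r1}, p2 in R^{m2 x r1 x r2}, p3 in R^{m3 x r2}."""
    return np.einsum('ai,bij,cj->abc', p1, p2, p3)

rng = np.random.default_rng(1)
m, r1, r2 = 4, 2, 3
p1, q1 = rng.standard_normal((m, r1)), rng.standard_normal((m, r1))
p2 = rng.standard_normal((m, r1, r2))
p3 = rng.standard_normal((m, r2))

# Multilinearity in the first argument: U(a*p1 + q1, p2, p3) = a*U(p1, p2, p3) + U(q1, p2, p3).
a = 0.7
print(np.allclose(U_tt(a * p1 + q1, p2, p3), a * U_tt(p1, p2, p3) + U_tt(q1, p2, p3)))  # True

# The canonical map agrees with a direct sum of outer products.
print(np.allclose(U_cf([p1, p1, p1]), np.einsum('ai,bi,ci->abc', p1, p1, p1)))          # True
```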

3 Orthogonal Projection Method and Alternating Least Squares Algorithm

It is shown in the following that the ALS algorithm is an orthogonal projection method on subspaces of V = ⊗_{ν=1}^d R^{m_ν}. For a better understanding, we briefly repeat the description of projection methods; see e.g. [4, 34] for a detailed description. An orthogonal projection method for solving the linear system Av = b is defined by means of a sequence (K_k)_{k∈N} of subspaces of V and the construction of a sequence (v_k)_{k∈N} ⊂ V such that v_{k+1} ∈ K_k and r_{k+1} = b − A v_{k+1} ⊥ K_k. A prototype of a projection method is given in Algorithm 1.

Algorithm 1 Prototype Projection Method
1: while Stop Condition do
2:   Compute an orthonormal basis V_k = [u^k_1, …, u^k_{m_k}] of K_k
3:   r_k := b − A v_k
4:   v_{k+1} = v_k + V_k (V_kᵀ A V_k)⁻¹ V_kᵀ r_k
5:   k ← k + 1
6: end while

Notation 3.1 (L(A, B)). Let A, B be two arbitrary vector spaces. The vector space of linear maps from A to B is denoted by L(A, B) := {ϕ : A → B : ϕ is linear}.

In the following, let U : P → V be a tensor format representation, see Definition 2.2. We need to define subspaces of V in order to show that the ALS algorithm is an orthogonal projection method. The multilinearity of U and the special form of the ALS micro-step are important for the definition of these subspaces. Let µ ∈ N_L and let v ∈ V be a tensor represented in the tensor format U, i.e. there is (p_1, …, p_{µ−1}, p_µ, p_{µ+1}, …, p_L) ∈ P such that v = U(p_1, …, p_{µ−1}, p_µ, p_{µ+1}, …, p_L). Since the tensor format representation U is multilinear, we can define a linear map W_µ(p_1, …, p_{µ−1}, p_{µ+1}, …, p_L) ∈ L(P_µ, V) such that v = W_µ(p_1, …, p_{µ−1}, p_{µ+1}, …, p_L) p_µ. The map W_µ depends multilinearly on the parameters p_1, …, p_{µ−1}, p_{µ+1}, …, p_L. The linear subspace range(W_µ(p_1, …, p_{µ−1}, p_{µ+1}, …, p_L)) ⊆ V is of great importance for the ALS method. For the rest of the article, we identify linear maps with their canonical matrix representations.

Definition 3.2. Let µ ∈ N_L. For a given representation system p = (p_1, …, p_L) ∈ P we write p_[µ] := (p_1, …, p_{µ−1}, p_{µ+1}, …, p_L) and define

W_{µ, p_[µ]} : P_µ → V,   p̃_µ ↦ W_{µ, p_[µ]} p̃_µ := U(p_1, …, p_{µ−1}, p̃_µ, p_{µ+1}, …, p_L).   (5)

We simply write W_µ for W_{µ, p_[µ]}, i.e. W_µ := W_{µ, p_[µ]}, if it is clear from the context which representation system is considered.

Proposition 3.3. Let µ ∈ N_L and p = (p_1, …, p_L) ∈ P. The following holds:

(i) W_{µ, p_[µ]} is a linear map and range(W_{µ, p_[µ]}) is a linear subspace of V.
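A minimal NumPy sketch of one step of Algorithm 1 (our own illustration, with made-up test data) shows the defining property of an orthogonal projection method: the new residual is orthogonal to the subspace K_k.

```python
import numpy as np

def projection_step(v, A, b, Vk):
    """One step of the prototype projection method (Algorithm 1):
    given an orthonormal basis Vk (columns) of the subspace K_k,
    return v_{k+1} = v_k + Vk (Vk^T A Vk)^{-1} Vk^T r_k with r_k = b - A v_k."""
    r = b - A @ v
    Ak = Vk.T @ A @ Vk                                  # projected (Ritz) matrix
    return v + Vk @ np.linalg.solve(Ak, Vk.T @ r)

rng = np.random.default_rng(2)
n, m = 10, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                             # positive definite
b = rng.standard_normal(n)
v = rng.standard_normal(n)
Vk, _ = np.linalg.qr(rng.standard_normal((n, m)))       # orthonormal basis of a random subspace
v1 = projection_step(v, A, b, Vk)
print(np.allclose(Vk.T @ (b - A @ v1), 0))              # True: r_{k+1} is orthogonal to K_k
```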

6 ( ) (ii) We have rank W µ, p [µ] dim(p µ ). ( ) ( ) (iii) range W µ, p [µ] range(u), i.e. for all v range W µ, p [µ] there exist p µ P µ such that v = U(p,..., p µ, p µ, p µ+,..., p L ). (iv) Set H µ := W T W µ, p [µ] µ, p [µ] R dim Pµ dim Pµ and let H µ = ŨµD µ Ũµ T be the diagonalisation of the square matrix H µ, where D µ = diag(δ i,µ ) i=,...,dim Pµ with δ,µ δ 2,µ δ dim Pµ,µ. Define further D µ = diag( δ i,µ ) i=,...,rank(w. Then the columns of µ, p [µ]) ( ) build an orthonormal basis of range W µ, p [µ] and V µ p µ = W µ, p [µ] V µ := W µ, p [µ]ũµ D 2 µ (6) ( ) Ũ µ D 2 µ p µ = U(p,..., p µ, Ũµ D 2 µ p µ, p µ+,..., p L ). (7) (v) The map is multilinear. W µ : P P µ P µ+..., P L L(P µ, V) p µ W µ, p [µ] Proof. Note that W µ, p [µ] is linear, since the tensor format U is multilinear. The rest of the assertions follows after short calculations, where the last assertion (v) is a direct consequence of the multilinearity of U. Remark 3.4. In chemistry the definition of V µ in Proposition 3.3 (iv) is often called Löwdin transformation, see [35, Section 3.4.5]. Nevertheless, the construction can be found in several proofs for the existence of the singular value decomposition, see e.g. [6, Lemma 2.9]. Definition 3.5. Let µ N L, p = (p,..., p L ) P, and F : P R as defined in Eq. (2). We define F µ,p [µ] : P µ R (8) p µ F µ,p [µ]( p µ ) := F (p,..., p µ, p µ, p µ+,..., p L ). We write for convenience F µ := F µ,p [µ] if it is clear from the context which representation system is considered. Lemma 3.6. Let µ N L and p = (p,..., p L ) P. We have (i) F µ(q µ ) = W T µ (b AW µ q µ ), (ii) (V T µ AV µ ) V T µ b = argmin qµ P µ F µ (q µ ), where V µ is defined in Eq. (6). Proof. (i): Let q µ P µ. We have f(w µ q µ ) = F µ (q µ ) for all µ N L and F µ (q µ ) = [ ] b 2 2 AW µq µ, W µ q µ b, W µ q µ = [ ] W T b 2 2 µ AW µ q µ, q µ W T µ b, q µ. 5

7 Since Wµ T AW µ is symmetric, we have F µ(q µ ) = Wµ T AW µ q µ Wµ T b = Wµ T (b AW µ q µ ). (ii): For V µ q µ range(w µ ) we can write f(v µ q µ ) = [ ] V T b 2 2 µ AV µ q µ, q µ V T µ b, q µ. Since V µ is a basis of range(w µ ), we have that V T µ AV µ is positive definite and therefore p µ = argmin qµ P µ F µ (q µ ) V T µ AV µ p µ V T µ b = p µ = (V T µ AV µ ) V T µ b. Theorem 3.7. Let µ N L and p = (p,..., p L ) P. We have where V µ is from Eq. (6). p µ = argmin qµ P µ F µ (q µ ) b A V µ p µ rangew µ, Proof. Follows from Lemma 3.6 and orthogonal projection theorem. Algorithm 2 Alternating Least Squares (ALS) Algorithm : Set k := and choose an initial guess p = (p,..., p L ) P, p, := p, and v := U(p ). 2: while Stop Condition do 3: v k, := v k 4: for µ L do 5: Compute an orthonormal basis V of the range space of W := W [µ] µ,p k, µ 6: p k+, see e.g. Eq. (5) and (6). µ := Ũk, µ D 2 k, µ (V T AV ) D 2 k, µũ T T Wµ (p k+,..., p k+ µ }{{}, pk µ+,..., p k L)b (9) G + k, µ := p + := (p k+,..., p k+ µ, pk+ µ, p k µ+,..., p k L) r := b Av k, µ v + = v + V (V T ) V T = V (V T ) V T + ) where Ũk, µ D 2 k, µ is from Eq. (6) 7: end for 8: p k+ := p k,l and v k+ := U(p k+ ) 9: k k + : end while Remark 3.8. From the definition of p k+ µ, it follows directly that p k+ µ kernel(w µ,k ) and p k+ µ is the vector with smallest norm that fulfils the normal equation G p k+ µ = W T b. This is very important for the convergence analysis of the ALS method and we like to point out that our results are based on this condition. We must give special attention in a correct implementation of an ALS micro-step in order to fulfil this essential property. 6
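The following sketch of a single ALS micro-step is our own illustration for the canonical format with small mode sizes (it builds W_µ as an explicit matrix, which is only feasible for tiny examples); it is not the implementation used by the authors. The essential point of Remark 3.8 is reflected in the use of the pseudoinverse, which returns the minimum-norm solution of the possibly singular normal equations.

```python
import numpy as np

def elementary(vectors):
    """Flat vector of the elementary (rank-one) tensor of the given vectors."""
    t = vectors[0]
    for v in vectors[1:]:
        t = np.tensordot(t, v, axes=0)
    return t.ravel()

def build_W(factors, mu):
    """Explicit matrix of the linear map W_mu (Definition 3.2) for the canonical format.
    Column (j, i) is the elementary tensor with the unit vector e_i placed in column j
    of the mu-th factor; vec(q) is ordered as (j, i) -> j*m_mu + i."""
    d = len(factors)
    m_mu, r = factors[mu].shape
    cols = []
    for j in range(r):
        for i in range(m_mu):
            vecs = [factors[nu][:, j] if nu != mu else np.eye(m_mu)[:, i] for nu in range(d)]
            cols.append(elementary(vecs))
    return np.column_stack(cols)

def als_micro_step(factors, mu, A, b_vec):
    """One ALS micro-step: replace the mu-th factor by the minimum-norm solution of the
    normal equations W^T A W q = W^T b (cf. Remark 3.8); A acts on vectorised tensors."""
    W = build_W(factors, mu)
    G = W.T @ A @ W
    q = np.linalg.pinv(G) @ (W.T @ b_vec)        # minimum-norm least-squares solution
    m_mu, r = factors[mu].shape
    factors[mu] = q.reshape(r, m_mu).T           # undo the (j, i) ordering
    return factors

# Toy run: a few ALS sweeps on a random rank-2 right-hand side with A = identity.
rng = np.random.default_rng(3)
d, m, r = 3, 4, 2
b_vec = elementary([rng.standard_normal(m) for _ in range(d)]) \
      + elementary([rng.standard_normal(m) for _ in range(d)])
A = np.eye(m ** d)
factors = [rng.standard_normal((m, r)) for _ in range(d)]
for sweep in range(20):
    for mu in range(d):
        factors = als_micro_step(factors, mu, A, b_vec)
approx = sum(elementary([factors[nu][:, j] for nu in range(d)]) for j in range(r))
print(np.linalg.norm(b_vec - approx))            # final residual after the ALS sweeps
```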

Figure 1: Graphical illustration of an ALS micro-step for the case A = id. At the current iteration step, we define the linear map W_{k,µ} ∈ L(P_µ, V) by means of v_{k,µ−1} and the multilinearity of U, cf. Definition 3.2. The successor v_{k,µ} is then the best approximation of b on the subspace range(W_{k,µ}) ⊆ V.

4 Convergence Analysis

We consider global convergence of the ALS method. The convergence analysis for an arbitrary tensor format representation U : ×_{µ=1}^L P_µ → V is a quite challenging task. The objective function F from Eq. (2) is highly nonlinear. Even the existence of a minimum is in general not ensured, see [5] and [22]. We need further assumptions on the sequence from the ALS method. In order to justify our assumptions, let us study an example from Lim and de Silva [5], where it is shown that the tensor

b = x ⊗ x ⊗ y + x ⊗ y ⊗ x + y ⊗ x ⊗ x

with tensor rank 3 has no best tensor rank 2 approximation. Lim and de Silva explained this by constructing a sequence (v_k)_{k∈N} of rank 2 tensors

v_k = (x + (1/k) y) ⊗ (x + (1/k) y) ⊗ (k x + y) − x ⊗ x ⊗ (k x)  →  b   (k → ∞).

The linear map W_{1,k} from Definition 3.2 and the first component vector p^k_1 of the parameter system have the following form:

9 W,k = p k = (x + k ) y (x + k ) y, x x Id R n, }{{} column vectors of the matrix ( ) ( ) (kx + y) + kx. It is easy to verify that the equation W,k p k = v k holds. Furthermore, we have lim k pk =, W = lim W,k = ( x x x x ) Id R n. k Obviously, the rank of W is equal to n but rank(w,k ) = 2n for all k N. This example shows already that we need assumptions on the boundedness of the parameter system and on the dimension of the subspace span(w µ,k ). Definition 4. (Critical Points). The set M of critical points is defined by M := { v V : p P : v = U(p) F (p) = }. () In our context, critical points are tensors that can be represented in our tensor format U and there exists a parameter system p such that (f U) (p) =, i.e. p is a stationary point of F = f U. A representation system of a tensor v = U(p) is never uniquely defined since the tensor format is a multilinear map. The following remark shows that the non uniqueness of a parameter system has even more subtle effects, in particular when the parameter system of v = U(p) is also a stationary point of F. Remark 4.2. In general, v M does not imply F (ˆp) = for any parameter system ˆp of v, i.e. there exist a tensor format Ũ and two different p, ˆp P such that Ũ(p) = Ũ(ˆp) and = F (p) F (ˆp). Proof. Let Ũ : R 2 R 2 R 2 R 2 R 4 (x, y) Obviously, Ũ is a bilinear map. Further, let b = and e 2 the canonical vectors in R 2, i.e. Then the following holds e = ( Ũ(x, y) := x y + x 2 y x y + x 2 y x y 2 x 2 y 2., A = Id in the definition of F from Eq. (2), and e ) (, e 2 = ). 8

10 a) Ũ(e, e ) = Ũ(e 2, e ), b) F (e, e ) =, c) F (e 2, e ). Elementary calculations result in Ũ(e, e ) = Ũ(e 2, e ) =. The definition of F from Eq. (2) gives Then and F (x, y) = 3 ( 2 (2(y (x + x 2 )) 2 + (x y 2 ) 2 + (x 2 y 2 ) 2 ) 2(x + x 2 )y x 2 y 2 ). F x(x, e ) = F (x, e ) = 3 ((x + x 2 ) 2 2(x + x 2 )), F (e, y) = 3 (y2 + 2 y2 2 2y ) F (e 2, y) = 3 (y2 + 2 y2 2 2y y 2 ). ( 2(x +x 2 ) 3 2(x +x 2 ) 3 One verifies that F (e, e ) = ), F y(e, y) = and F (e 2, e ) = ( 2(y ) 3 y ) ( 2(y ), F y(e 2, y) = 3 y 2 3 ). For a convenient understanding, let us briefly repeat the notations from the ALS method, see Algorithm 2. Let µ N L {}, k N, and p = (p k+,..., p k+ µ, pk µ,... p k L) P, () v = U(p ) = U(p k+,..., p k+ µ, pk µ,... p k L) V be the elements of the sequences (p ) k N and (v ) k N from the ALS algorithm. Note that p k = p k, = (p k,..., pk L ) and v k = U(p k ). Definition 4.3 (A(v k )). The set of accumulation points of (v k ) k N is denoted by A(v k ), i.e. A(v k ) := {v V : v is an accumulation point of (v k ) k N }. (2) We demonstrate in Theorem 4.3 that every accumulation point of (v k ) k N is a critical point, i.e. A(v k ) M. This is an existence statement on the parameter space P. Lemma 4.5 shows us a candidate for such a parameter system. 9

11 Remark 4.4. Obviously, if the sequence of parameter (p k ) k N is bounded, then the set of accumulation points of (p k ) k N is not empty. Consequently, the set A(v k ) is not empty, since the tensor format U is a continuous map. Lemma 4.5. Let the sequence (p k ) k N from the ALS method be bounded and define for J N the following set of accumulation points: A J := L µ= There exists p = (p,..., p L ) A J such that {p P : p is an accumulation point of (p ) k J }. p = min p A J p. Proof. Since the sequence (p k ) k N is bounded, it follows from the definition in Eq. () that (p ) k N is } also bounded. Therefore the set {p P : p is an accumulation point of (p ) k J is not empty and compact. Hence A is a compact and non-empty set. We are now ready to establish our main assumptions on the sequence from the ALS method. Assumption 4.6. During the article, we say that (p k ) k N satisfies assumption A or assumption A2 if the following holds true: (A:) The sequence (p k ) k N is bounded. (A2:) The sequence (p k ) k N is bounded and for J N we have µ N L : k J : k J : k k rank(w ) = rank(w µ), (3) where p = (p,..., p L ) A J is a accumulation point form Lemma 4.5 and W = W µ (p k+,... p k+ µ, pk µ+,..., p k L), Wµ = W µ (p,... p µ, p µ+,..., p L). Remark 4.7. In the proof of Theorem 4.3, assumption A2 ensures that the ALS method depends continuously on the parameter system p, i.e. G + W T b k G+ µ W T µ b for a convergent subsequence (p ) k J. Using the notations and definitions from Section 3, we define further for k N and µ N L. A := V T AV, (4) For the ALS method there is an explicit formula for the decay of the values between f(v k, µ+ ) and f(v k, µ ). The relation between the function values from Eq. (5) is crucial for the convergence analysis of the ALS method. Lemma 4.8. Let k N, µ N L. We have f(v k, µ+ ) f(v ) = V A V T r, r 2 b 2 (5)

12 Proof. From Algorithm 2, we have that v + = v +, where := V A V T r. Elementary calculations give f(v + ) = b 2 [A(v + ), v + b, v + ] = f(v ) + r k, µ, k, µ + 2 A k, µ, k, µ b 2 V A V T r k, µ, r k, µ + 2 AV A V T r, V A V T r = f(v ) + V A V T r k, µ, r k, µ + 2 = f(v ) + b 2 = f(v ) 2 b 2 V A V T r k, µ, r k, µ. b 2 V A V T r k, µ, r k, µ Corollary 4.9. (f(v k )) k N R is a descending sequence and there exists α R such that f(v k ) k α. Proof. Let k N and µ N L. From Lemma 4.8 it follows that f(v k+ ) f(v k ) = f(v k,l ) f(v k, ) = = 2 b 2 L µ= L f(v k, µ ) f(v ) µ= A V T r, V T r, since the matrices A are positive definite. This shows that (f(v k)) k N R is a descending sequence. The sequence of function values (f(v k )) k N is bounded from below, since the matrix A in the definition of f is positive definite. Therefore, there exists an α R such that f(v k ) α. k Lemma 4.. Let (v ) k N,µ NL V be the sequence from Algorithm 2. We have f(v ) = 2 b 2 v, b = 2 b 2 v 2 A (6) for all k N, µ N L, where v, w A := Av, w and v A := v, v A. Proof. Let k N and µ N L. We have ( ) v, v A = AV V T ( ) AV V T b, V V T AV V T = V T b, ( V T AV ) V T b = v, b. The rest follows from the definition of f, see Eq. (). b Corollary 4.. Let (v ) k N,µ NL V be the sequence of represented tensors from the ALS algorithm. The following holds:

13 (a) f(v + ) f(v ), (b) v + 2 A v 2 A, (c) v +, b v, b. Proof. Follows from Lemma 4.9 and Lemma 4.. Lemma 4.2. Let the sequence (p k ) k N P fulfil the assumption A. Then we have max F µ(p k µ) µ L k. Proof. According to Lemma 3.6 and Lemma 4.8, we have f(v k ) f(v k+ ) = = = L f(v ) f(v k, µ ) = L 2 b 2 µ= µ= L ( 2 b 2 A Ũ D 2 kµ 2 b 2 µ= L µ= A ) T W T r, A V T r, V T r ( ) T Ũ D 2 kµ W T r ( ) T ( ) T Ũ D 2 kµ F µ(p k µ), Ũ D 2 kµ F µ(p k µ) L 2 b 2 λ min (A )λ min (Ũ D kµ ) Ũ T F µ (p k µ) µ= L 2λ max (A) b 2 µ= λ min (Ũ D 2 kµ ) Ũ T F µ (p k µ) 2, where V = W Ũ D 2 kµ is from Eq. (6). In the last estimate, we have used that the Ritz values are bounded by the smallest and largest eigenvalue of A, i.e λ min (A) λ min (A ) λ max (A ) λ max (A). Since the tensor format U is continues and the sequence (p k ) k N is bounded, it follows from the theorem of ) Gershgorin and Cauchy-Schwarz inequality that there is γ > such that λ max (W T W γ, recall that W T W = ŨD Ũ T, see Proposition 3.3. Therefore, we have f(v k ) f(v k+ ) 2λ max (A)γ b 2 L F µ(p k µ) 2 µ= Further, it follows from Corollary 4.9 that = lim f(vk ) f(v k+ ) = lim k 2λ max (A)γ b 2 max k µ L F µ(p k µ) max µ L. F µ(p k µ) 2. Theorem 4.3. Let (v k ) k N be the sequence of represented tensors and suppose that the sequence of parameter (p k ) k N P from the ALS method fulfils assumption A2. Every accumulation point of (v k ) k N is a critical point, i.e. A(v k ) M. Further, we have dist (v k, M) k. 2

14 Proof. Let v A(v k ) be an accumulation point and (v k ) k J N U(P ) a subsequence in the range set of the tensor format U with v k v. Then there exists (p k) k J N P with v k = U(p k ) for all k J. Further, k let µ N L and define for all k J g := (p k+,..., p k+ µ, p k µ+,..., p k L) P. Let µ N L and (g ) k J J with g p := argmax p AJ p P, see Lemma 4.5. Without loss k of generality, let us assume that µ =. This assumption makes the the notations not more complicated then necessary. Since (g ) k J is bounded, there exists p [µ] P and a corresponding subsequence (g ) k Jµ J such that p k = g k, = (p k, p k 2,..., p k L) k p = (p, p 2,..., p L ), g k, = (p k+, p k 2,..., p k L) k p[] = ( p, p 2,..., p L ), g = (p k+,..., p k+ µ, p k µ+,..., p k L) k p[µ] = ( p,..., p µ, p µ+,..., p L ), g k,l = (p k+,..., p k+ L ) k p[l] = ( p,..., p L ). From Lemma 4.5 and U(p k ) k v it follows that v = lim k U(g ) = U(p[µ] ) f.a. µ N L, (7) where k J. Furthermore, we have To show this, assume that M := p = p [] = = p [L]. { µ N L : p [µ] p } and define ν := min M N L. From assumption A2 it follows p k+ ν = G + k,ν W T k,ν b k G+ ν W T ν b = p ν. Thus we have in particular that p ν kernelw ν, see the Definition of G + ν and Proposition 3.3. Since U(p ) = U(p [ν] ) W ν p ν = W ν p ν, it follows further that δ ν := p ν p ν kernelw ν and p ν 2 = p ν 2 + δ ν 2. Lemma 3.6 and Lemma 4.2 show that From the definition of ν, we have then note that for ν = min M W T k,ν (AW k,νp k ν b) k. W T ν (AW ν p ν b) =, g k,ν = (p k+,..., p k+ ν, p k ν+,..., p k L) k p[ν] = (p,..., p ν, p ν, p ν+,..., p L ) holds. Since p ν = G + ν W T ν b and p = argmin p AJ p, it follows p ν = p ν. Hence, we have p ν = p ν, because p ν 2 = p ν 2 + δ ν 2 implies then δ ν =. But p ν = p ν contradicts the definition of ν = min M. Consequently, we have p = p [] = = p [L]. 3

15 From Eq. (7) and the definition of p k+ µ it follows then and v = U(p ) = lim k F µ(g ) = F µ(p ) for all µ N L, i.e. A(v k ) M. Now, let δ k = inf v M v k v and suppose that there exists a subsequence (δ k ) k J N with lim k δ k = δ R +. Then (v k ) k J has a convergent subsequence. Since this subsequence must have its limit point in M, it follows that δ =. Which proves dist (v k, M) k by contradiction. Lemma 4.4. Let (v k ) k N V be the sequence of represented tensors from the ALS method. It holds Proof. Let k N. We have v k+ v k 2 L A = v v µ= v k+ v k A k. 2 A L v v A µ= Since v + v = V A V T r, it follows further v + v 2 A = V A V T r 2 = AV A A = A V T r, V T AV A V T = V A V T r, r. Combining this with Eq. (5) and (8) gives V T r 2 L L v + v 2 A.(8) µ= r, V A V T r = A V T r, V T r L v k+ v k 2 A 2L b 2 (f(v ) f(v + )) = 2L b 2 (f(v k ) f(v k+ )). µ= From Corollary 4.9 it follows (f(v k ) f(v k )) k. Therefore, we have v k+ v k A k. Lemma 4.5. Let (v k ) k N V be the sequence of tensors from the ALS method and v V with lim k v k = v. Further, let µ N L and (v ) k N V as defined in Algorithm 2. We have lim v = v for all µ N L. k Proof. Define v k, := v k (like in Algorithmus 2) and assume that M := { } µ N L : lim v v. k 4

16 Furthermore, set µ := min M N L. From Lemma 4.4 it follows v v A v v µ,k A + v v A k. But this contradicts the definition of µ. In the following, the dimension of the tensor space V = d µ= Rmµ is denoted by N N, i.e. N = d µ= m µ. The statement of Lemma 4.2 delivers an explicit recursion formula for the tangent of the angle between iteration points and an arbitrary tensor. This result is important for the rate of convergence of the ALS algorithm. Lemma 4.6. Let µ, ν N L, ν µ, and p = (p,..., p ν,..., p µ,..., p L ) P. There exists a multilinear map M µ,ν : P P ν P ν+ P µ P µ+ P L V L(P ν, P µ ) such that W T µ (p,..., g ν,..., p µ, p µ+,..., p L )b = M µ,ν (p,..., p ν, p ν+,..., p µ, p µ+,..., p L, b)g ν (9) for all g ν P ν. Moreover, we have M µ,ν (p,..., p ν, p ν+,..., p µ, p µ+,..., p L, b) = M T ν,µ(p,..., p ν, p ν+,..., p µ, p µ+,..., p L, b). Proof. Follows form Proposition 3.3 (v) and definition of W T µ (p,..., g ν,..., p µ, p µ+,..., p L )b. Corollary 4.7. Let µ N L, k 2, and p = (p k+,..., p k+ µ, pk µ, p k µ+,..., pk L ) form Algorithm 2. There exists a multilinear map M µ : P P µ 2 P µ+ P L V L(P µ, P µ ) such that p k+ µ = G + k, µ M µ(p k+,..., p k+ µ 2, pk µ+,..., p k L, b)p k+ µ, (2) p k+ µ = G + k, µ M T µ (p k+,..., p k+ µ 2, pk µ+,..., p k L, b)p k µ, (2) i.e. p k+ µ = G + k, µ M µ(p k+,..., p k+ µ 2, pk µ+,..., p k L, b) G + k, µ M T µ (p k+,..., p k+ µ 2, pk µ+,..., p k L, b)p k µ, where G + k, µ and G+ k, µ are defined in Algorithm 2. Proof. Follows form Eq. (9) and Lemma 4.6. The following example shows a concrete realisation of the matrix M µ for the tensor rank-one approximation problem. Example 4.8. The approximation of b V by a rank one tensor is considered. Let v k = p k pk 2... pk d and t t d d b = b µ,iµ, i = β (i,...,i d ) i d = µ= i.e. the tensor b is given in the Tucker decomposition. From Eq. (9) it follows p k+ t t d d = d µ=2 p k µ 2 β (i,...,i d ) b µ,iµ, p k µ b,i i = i d = µ=2 t t d t 2 t d = d b p k,i β (i d 2,...,i d ) = µ=2 p k µ d µ=2 p k µ p k d 2 i = i d = i 2 = B Γ,k Bd T k pd }{{}, M (p k 2,...,pk d )= 5 i d = d µ=2 bµ,iµ, p k µ p k µ b T d,i d p k d

17 where B µ = ( ) b µ,iµ : i µ t µ R n µ t µ, Bµ T B µ = Id R tµ, and the entries of the matrix Γ,k R t t d are defined by bµ,iµ, p k µ [Γ,k ] i,i d = t 2 i 2 = t d d β (i,...,i d ) i d = µ=2 p k µ ( i t, i d t d ). Note that Γ,k is a diagonal matrix if the coefficient tensor β d µ= Rtµ is super- diagonal, see the example in [3]. For p k d it follows further p k d = and finally t d µ= p k 2 µ i = t d d β (i,...,i d ) i d = µ= p k+ = b µ,iµ, p k µ b d,id = d µ= p k µ 2 B Γ,k Γ T,k BT p k. p k 2 d B d Γ T p k µ,k BT p k Lemma 4.9. Let µ N L, (p ) k N P, and (v ) k N V be the sequences from Algorithm 2. Furthermore, define M := M µ (p k+,..., p k+ µ 2, pk µ+,..., p k L, b), H := W T W, N := W G + M H + W T, where we have used the notations from Algorithm 2. A micro-step of the ALS method is described by the following recursion formula: v + = N v for all k 2, (22) i.e. U(p k+,..., p k+ µ, pk+ µ, p k µ+,..., p k L) = N U(p k+,..., p k+ µ, pk µ, p k µ+,..., p k L). Proof. According to Corollary 4.7, Remark 3.8, and definition of v +, we have that N v = W G + M H + W T W p k+ µ = W G + M H + H p k+ µ = W G + M p k+ µ = W G + W T b = W p k+ µ = v +. Lemma 4.2. Let (N ) k N, µ NL be the sequence of matrices from Lemma 4.9, v V \ {}, and R R N N an orthogonal matrix with span( v) = range(r), i.e. the column vectors of R form an orthonormal basis of the linear space span( v). Assume further that c := vt v v R \ {} and s := R T v R N \ {} holds true. Then we have the following recursion formula for the tangent of the angles: q (s) tan [ v, v + ] = tan [ v, v ], where q (s) := q (c) := q (c) R T N v c + R T N R s, s v T N v c + v T N R s. c µ=2 6

18 Proof. The block matrix V := [ v R ] R N N, ( v := v/ v ). is orthogonal, i.e. the columns of the matrix V build an orthonormal basis of the tensor space V. The tensor v and the matrix N are represented with respect to the basis V, i.e v = V V T v = [ v R ] ( v T ) v R T = [ v R ] ( ) c v s and N = V ( V T N V ) V T = [ v R ] [ v T N v v T N R R T N v R T N R ] [ v R ] T. The recursion formula (22) leads to the recursion of the coefficient vector ( ) [ ck+,µ v = T N v v T ] ( ) ( N R c v R T N v R T = T N v c + v T N R s N R R T N v c + R T N R s s k+,µ s ). Since s and c we have tan 2 [ v, v + ] = = RR T v +, v + vv T v +, v + = RT v + 2 (v T v + ) 2 = s + 2 (c + ) 2 = q(s) q (c) 2 R T v 2 (v T v ) 2 = q(s) q (c) 2 tan 2 [ v, v ]. ( ( ) q (s) 2 q (c) ) 2 s 2 (c ) 2 Theorem 4.2. Suppose that the sequence (p k ) k N P from Algorithm 2 fulfils assumption A. If one accumulation point v A(v k ) is isolated, then we have v k k Furthermore, we have that either the ALS method converges after finitely many iteration steps or tan [ v, v + ] q µ tan [ v, v ], v. where q µ := lim sup k q (s) q (c). Proof. Let ε > such that v is the only accumulation point in Ū := {v V : v v A ε}. Assuming that the sequence (v k ) k N V from the ALS algorithm does not converge to v and let I N be a subset with v v k A ε for all k I. Since v is the only accumulation in Ū and (v k) k N does not converge to v the following set I k is for all k I well-defined and finite: { I k := k N : v v k A ε for all k k k }. 7

The definition of the map k₀ : I → N, k ↦ k₀(k) := max I_k implies that ‖v* − v_{k₀(k)}‖_A ≤ ε and ‖v* − v_{k₀(k)+1}‖_A > ε for all k ∈ I. Since v* is the only accumulation point of (v_k)_{k∈N} in Ū, it follows that the subsequence (v_{k₀(k)})_{k∈I} converges to v*. Therefore, we have ‖v* − v_{k₀(k)}‖_A ≤ ε/2 and

‖v_{k₀(k)+1} − v_{k₀(k)}‖_A ≥ ‖v* − v_{k₀(k)+1}‖_A − ‖v* − v_{k₀(k)}‖_A ≥ ε/2

for sufficiently large k ∈ I. But this contradicts the statement ‖v_{k+1} − v_k‖_A → 0 from Lemma 4.14. The inequality for the rate of convergence of an ALS micro-step, tan∠(v*, v_{k,µ+1}) ≤ q_µ tan∠(v*, v_{k,µ}), follows directly from Lemma 4.20 and the definition of q_µ. Note that in Lemma 4.20, c_{k,µ} ≠ 0 since lim_{k→∞} v_{k,µ} = v*. If s_{k,µ} = 0 for some k ∈ N, then the ALS method converges after finitely many iteration steps.

Corollary 4.22. Suppose that the sequence (p_k)_{k∈N} ⊂ P fulfils A2 and assume that the set of critical points M is discrete (in topology, a set which is made up only of isolated points is called discrete). Then the sequence of represented tensors (v_k)_{k∈N} from the ALS method is convergent.

Proof. Follows directly from Theorem 4.21 and Theorem 4.13.

Remark 4.23. The convergence rate for an entire ALS iteration step is given by q := Π_{µ=1}^L q_µ, since

tan∠(v*, v_{k+1}) = tan∠(v*, v_{k,L}) ≤ q_L tan∠(v*, v_{k,L−1}) ≤ ⋯ ≤ (Π_{µ=1}^L q_µ) tan∠(v*, v_{k,0}) = q tan∠(v*, v_k).

Without further assumptions on the tensor b from Eq. (1), one cannot say more about the rate of convergence. But the ALS method can converge sublinearly, Q-linearly, and even Q-superlinearly. We refer the reader to [28] for a detailed description of convergence speed.

- If q = 0, then the sequence (tan∠(v*, v_k))_{k∈N} converges Q-superlinearly.
- If q < 1, then the sequence (tan∠(v*, v_k))_{k∈N} converges at least Q-linearly.
- If q = 1, then the sequence (tan∠(v*, v_k))_{k∈N} converges sublinearly.

A specific tensor format U has practically no impact on the different convergence rates, since we can find explicit examples for all cases already for rank-one tensors. Please note that the representation of rank-one tensors is included in all tensor formats of practical interest. In the following, we give a brief overview of our results about the convergence rates for the tensor rank-one approximation; please see [13] for proofs and a detailed description. The multilinear map that describes the representation of rank-one tensors is given by

U : ×_{µ=1}^d R^n → ⊗_{µ=1}^d R^n,   (p_1, …, p_d) ↦ U(p_1, …, p_d) = ⊗_{µ=1}^d p_µ.

A tensor b is called totally orthogonally decomposable if there exists r ∈ N with

b = Σ_{j=1}^r λ_j ⊗_{µ=1}^d b_{µ,j}   (b_{µ,j} ∈ R^n)

such that for all µ ∈ N_d and j_1, j_2 ∈ N_r the following holds: ⟨b_{µ,j_1}, b_{µ,j_2}⟩ = δ_{j_1,j_2}. The set of all totally orthogonally decomposable tensors is denoted by

TO := { b ∈ V : b is totally orthogonally decomposable } ⊆ V.

It is shown in [13] that the tensor rank-one approximation of every b ∈ TO by means of the ALS method converges Q-superlinearly, i.e. q = 0. For examples of Q-linear and sublinear convergence, we consider the tensor b_λ ∈ V given by

b_λ = ⊗_{µ=1}^3 p + λ ( p ⊗ q ⊗ q + q ⊗ p ⊗ q + q ⊗ q ⊗ p )

for some λ ∈ R and p, q ∈ R^n with ‖p‖ = ‖q‖ = 1, ⟨p, q⟩ = 0. If λ ≤ 1/2, it is shown in [13] that v* = ⊗_{µ=1}^3 p is the unique best rank-one approximation of b_λ. Furthermore, for the rate of convergence we have the following two cases:

a) For λ = 1/2 it holds q = 1, i.e. the sequence (tan∠(v*, v_k))_{k∈N} converges sublinearly.

b) For λ < 1/2 the ALS method converges Q-linearly with the convergence rate

q_λ = (λ/2) [ 3λ + λ² + √((3λ + λ²)² + 4λ) ].

This example is not restricted to d = 3. The extension to higher dimensions is straightforward, see [13] for details.

5 Numerical Experiments

In this section, we observe the convergence behaviour of the ALS method by using data from interesting examples and, more importantly, from real applications. In all cases, we focus particularly on the convergence rate.

5.1 Example 1

We consider an example introduced by Mohlenkamp in [25, Section 4.3.5]. Here we have A = id and

b = 2 e₁ ⊗ e₁ ⊗ e₁ + 9 e₂ ⊗ e₂ ⊗ e₂ =: 2 b₁ + 9 b₂,   e₁ := (1, 0)ᵀ, e₂ := (0, 1)ᵀ,

see Eq. (1). The tensor b is totally orthogonally decomposable. Although the example is rather simple, it is of great theoretical interest. It follows from Theorem 4.21 and [13] that the rate of convergence for an ALS micro-step is

q_µ = limsup_{k→∞} q^(s)_{k,µ} / q^(c)_{k,µ} = 0.

Here the ALS method converges Q-superlinearly. For τ > 0, our initial guess is defined by

v_0(τ) := (τ, 1)ᵀ ⊗ (τ, 1)ᵀ ⊗ (τ, 1)ᵀ.

A comparison of the overlaps of v_0(τ) with the two terms of b shows that for τ < 1/2 the initial guess v_0(τ) is dominated by b₂; therefore the ALS iteration converges to b₂, see [13] for details. In our numerical test, the tangent of the angle between the current iteration point and the corresponding parameter of the dominant term b_l (1 ≤ l ≤ 2) is plotted in Figure 2, i.e.

tan φ_{k,l} = √(1 − cos² φ_{k,l}) / cos φ_{k,l},   (23)

where cos φ_{k,l} = ⟨p^k_1, e_l⟩ / ‖p^k_1‖.

Figure 2: The tangent tan φ_{k,2} from Eq. (23) is plotted over the iteration number k for τ ∈ {0.4, 0.495, 0.4999}.

5.2 Example 2

Most algorithms in ab initio electronic structure theory compute quantities in terms of one- and two-electron integrals. In [1] we considered the low-rank approximation of the two-electron integrals. In order to illustrate

the convergence of the ALS method on an example of practical interest, we use the two-electron integrals in the so-called AO basis for the CH₄ molecule. We refer the reader to [1] for a detailed description of our example. The ALS method converges here Q-linearly, see Figure 3.

Figure 3: The approximation of the two-electron integrals for methane is considered. The tangent of the angle between the current iteration point and the limit point is shown with respect to the iteration number.

5.3 Example 3

We consider the tensor

b_λ = ⊗_{µ=1}^3 p + λ ( p ⊗ q ⊗ q + q ⊗ p ⊗ q + q ⊗ q ⊗ p )

from Remark 4.23. The vectors p and q are arbitrarily generated orthogonal vectors with norm 1. The values of tan φ_{k,1} are plotted, where φ_{k,1} is the angle between p^k_1 and the limit point p. For the case λ = 0.5 the convergence is sublinear, whereas for λ < 0.5 it is Q-linear. According to Theorem 4.21 and [13], the rate of convergence for an ALS micro-step is given by

q_λ = limsup_{k→∞} q^(s,λ)_{k,1} / q^(c,λ)_{k,1} = (λ/2) [ 3λ + λ² + √((3λ + λ²)² + 4λ) ].

For λ = 0.46, we have the convergence rate q_{0.46} = 0.847. In Figure 6 the ratio tan φ_{k+1,1} / tan φ_{k,1} is plotted; it matches q_{0.46} = 0.847 perfectly. This plot shows on an example that the analytical description of the convergence rate is precise.
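The behaviour reported in Example 3 can be reproduced with a few lines of NumPy. The sketch below is our own illustration: it builds b_λ for λ = 0.46, runs the rank-one ALS with A = id (each micro-step is then the exact least-squares update of one factor), and compares the observed per-iteration reduction of tan φ_{k,1} with the rate q_λ stated above. All names, the dimension n, and the initial guess are assumptions.

```python
import numpy as np

def outer3(a, b, c):
    return np.einsum('i,j,k->ijk', a, b, c)

def q_rate(lam):
    """Asymptotic micro-step rate q_lambda for b_lambda (lambda < 1/2), as stated above."""
    s = 3 * lam + lam ** 2
    return 0.5 * lam * (s + np.sqrt(s ** 2 + 4 * lam))

rng = np.random.default_rng(5)
n, lam = 6, 0.46
p = rng.standard_normal(n); p /= np.linalg.norm(p)
q = rng.standard_normal(n); q -= (q @ p) * p; q /= np.linalg.norm(q)   # orthonormal to p

b = outer3(p, p, p) + lam * (outer3(p, q, q) + outer3(q, p, q) + outer3(q, q, p))

# Rank-one ALS with A = id: each micro-step is the exact least-squares update of one factor.
u = [p + 0.3 * q, p + 0.3 * q, p + 0.3 * q]           # initial guess close to the limit p x p x p
tangents = []
for it in range(60):
    for mu in range(3):
        others = [u[nu] for nu in range(3) if nu != mu]
        contraction = np.einsum('ijk,j,k->i', np.moveaxis(b, mu, 0), *others)
        u[mu] = contraction / (np.linalg.norm(others[0]) ** 2 * np.linalg.norm(others[1]) ** 2)
    c = abs(u[0] @ p) / np.linalg.norm(u[0])
    tangents.append(np.sqrt(1 - c ** 2) / c)           # tan of the angle between u[0] and p

ratios = [tangents[k + 1] / tangents[k] for k in range(40, 50)]
print(q_rate(lam), np.mean(ratios))   # the observed ratio approaches the predicted rate ~0.847
```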

(a) The tangent tan φ_{k,1} for λ ∈ {0.1, 0.2, 0.3}.   (b) The tangent tan φ_{k,1} for λ ∈ {0.44, 0.46, 0.48}.

Figure 4: The approximation of b_λ from Remark 4.23 is considered. The tangent of the angle between the current iteration point and the limit point is plotted with respect to the iteration number. For λ < 0.5 the sequence converges Q-linearly with a convergence rate q_λ = (λ/2) ( 3λ + λ² + √((3λ + λ²)² + 4λ) ) < 1.

References

[1] U. Benedikt, A. Auer, M. Espig, and W. Hackbusch. Tensor decomposition in post-Hartree-Fock methods. I. Two-electron integrals and MP2. The Journal of Chemical Physics, 134(5), 2011.

[2] G. Beylkin and M. J. Mohlenkamp. Numerical operator calculus in higher dimensions. Proceedings of the National Academy of Sciences, 99(16):10246–10251, 2002.

[3] G. Beylkin and M. J. Mohlenkamp. Algorithms for numerical analysis in high dimensions. SIAM Journal on Scientific Computing, 26(6):2133–2159, 2005.

[4] C. Brezinski. Projection Methods for Systems of Equations. Studies in Computational Mathematics. Elsevier Science, 1997.

[5] V. de Silva and L.-H. Lim. Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM J. Matrix Anal. Appl., 30:1084–1127, 2008.

[6] A. Doostan, A. Validi, and G. Iaccarino. Non-intrusive low-rank separated approximation of high-dimensional stochastic models. Comput. Methods Appl. Mech. Engrg., 263:42–55, 2013.

[7] M. Espig. Effiziente Bestapproximation mittels Summen von Elementartensoren in hohen Dimensionen. PhD thesis, Universität Leipzig, 2008.

[8] M. Espig, L. Grasedyck, and W. Hackbusch. Black box low tensor rank approximation using fibre-crosses. Constructive Approximation, 2009.

[9] M. Espig and W. Hackbusch. A regularized Newton method for the efficient approximation of tensors represented in the canonical tensor format. Numerische Mathematik, 2012.

[10] M. Espig, W. Hackbusch, S. Handschuh, and R. Schneider. Optimization problems in contracted tensor networks. Computing and Visualization in Science, 14(6):271–285, 2011.

Figure 5: The approximation of b_λ from Remark 4.23 is considered. The tangent of the angle between the current iteration point and the limit point is plotted with respect to the iteration number. For λ = 0.5 we have sublinear convergence, since q_{0.5} = 1.

[11] M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies, and E. Zander. Efficient analysis of high dimensional data in tensor formats. Accepted for Springer Lecture Notes in Computational Science and Engineering.

[12] M. Espig, W. Hackbusch, T. Rohwedder, and R. Schneider. Variational calculus with sums of elementary tensors of fixed rank. Numerische Mathematik, 2012.

[13] M. Espig and A. Khachatryan. Convergence of alternating least squares optimisation for rank-one approximation to high order tensors. Preprint.

[14] M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies, and P. Wähnert. Efficient low-rank approximation of the stochastic Galerkin matrix in tensor formats. Computers & Mathematics with Applications, 2012.

[15] L. Grasedyck. Hierarchical singular value decomposition of tensors. SIAM J. Matrix Analysis Applications, 31(4):2029–2054, 2010.

[16] W. Hackbusch. Tensor Spaces and Numerical Tensor Calculus. Springer, 2012.

[17] W. Hackbusch and S. Kühn. A new scheme for the tensor representation. The Journal of Fourier Analysis and Applications, 15(5):706–722, 2009.

[18] S. Holtz, T. Rohwedder, and R. Schneider. The alternating linear scheme for tensor optimization in the tensor train format. SIAM J. Sci. Comput., 34(2):A683–A713, March 2012.

[19] H. Keller. On the solution of singular and semidefinite linear systems by iteration. Journal of the Society for Industrial and Applied Mathematics Series B Numerical Analysis, 2(2):281–290, 1965.

Figure 6: The ratio tan φ_{k+1,1} / tan φ_{k,1} is plotted for λ = 0.46. The rate of convergence from Theorem 4.21 is for this example equal to 0.847. The plot illustrates that the description of the convergence rate is accurate and sharp.

[20] B. N. Khoromskij and C. Schwab. Tensor-structured Galerkin approximation of parametric and stochastic elliptic PDEs. SIAM J. Sci. Comput., 33(1), February 2011.

[21] T. G. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455–500, 2009.

[22] J. M. Landsberg, Y. Qi, and K. Ye. On the geometry of tensor network states. Quantum Information and Computation, 12(3-4), 2012.

[23] Y.-J. Lee, J. Wu, J. Xu, and L. Zikatanov. On the convergence of iterative methods for semidefinite linear systems. SIAM J. Matrix Anal. Appl., 28(3):634–641, August 2006.

[24] H. G. Matthies and E. Zander. Solving stochastic systems with low-rank tensor compression. Linear Algebra Appl., 436(10), May 2012.

[25] M. J. Mohlenkamp. Musings on multilinear fitting. Linear Algebra Appl., 438(2), 2013.

[26] A. Nouy. A generalized spectral decomposition technique to solve a class of linear stochastic partial differential equations. Comput. Methods Appl. Mech. Engrg., 196(45-48), 2007.

[27] A. Nouy. Proper generalized decompositions and separated representations for the numerical solution of high dimensional stochastic problems. Arch. Comput. Methods Eng., 17(4):403–434, 2010.

[28] J. M. Ortega and W. C. Rheinboldt. Iterative Solution of Nonlinear Equations in Several Variables. Society for Industrial and Applied Mathematics, 1970.

[29] I. V. Oseledets. DMRG approach to fast linear algebra in the TT-format. Comput. Meth. in Appl. Math., 11(3), 2011.

[30] I. V. Oseledets. Tensor-train decomposition. SIAM J. Scientific Computing, 33(5), 2011.

[31] I. V. Oseledets and S. V. Dolgov. Solution of linear systems and matrix inversion in the TT-format. SIAM J. Scientific Computing, 34(5), 2012.

[32] I. V. Oseledets and E. Tyrtyshnikov. Breaking the curse of dimensionality, or how to use SVD in many dimensions. SIAM J. Scientific Computing, 31(5), 2009.

[33] T. Rohwedder and A. Uschmajew. On local convergence of alternating schemes for optimization of convex problems in the tensor train format. SIAM Journal on Numerical Analysis, 51(2):1134–1162, 2013.

[34] Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics.

[35] A. Szabó and N. S. Ostlund. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory. Dover Books on Chemistry Series. Dover Publications, 1996.

[36] A. Uschmajew. Local convergence of the alternating least squares algorithm for canonical tensor approximation. SIAM Journal on Matrix Analysis and Applications, 33(2), 2012.


More information

Where is matrix multiplication locally open?

Where is matrix multiplication locally open? Linear Algebra and its Applications 517 (2017) 167 176 Contents lists available at ScienceDirect Linear Algebra and its Applications www.elsevier.com/locate/laa Where is matrix multiplication locally open?

More information

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan Wir müssen wissen, wir werden wissen. David Hilbert We now continue to study a special class of Banach spaces,

More information

Principal Component Analysis

Principal Component Analysis Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used

More information

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES JOEL A. TROPP Abstract. We present an elementary proof that the spectral radius of a matrix A may be obtained using the formula ρ(a) lim

More information

Lecture 2: Linear Algebra Review

Lecture 2: Linear Algebra Review EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1

More information

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )

More information

Max Planck Institute Magdeburg Preprints

Max Planck Institute Magdeburg Preprints Thomas Mach Computing Inner Eigenvalues of Matrices in Tensor Train Matrix Format MAX PLANCK INSTITUT FÜR DYNAMIK KOMPLEXER TECHNISCHER SYSTEME MAGDEBURG Max Planck Institute Magdeburg Preprints MPIMD/11-09

More information

Functional Analysis. James Emery. Edit: 8/7/15

Functional Analysis. James Emery. Edit: 8/7/15 Functional Analysis James Emery Edit: 8/7/15 Contents 0.1 Green s functions in Ordinary Differential Equations...... 2 0.2 Integral Equations........................ 2 0.2.1 Fredholm Equations...................

More information

Review and problem list for Applied Math I

Review and problem list for Applied Math I Review and problem list for Applied Math I (This is a first version of a serious review sheet; it may contain errors and it certainly omits a number of topic which were covered in the course. Let me know

More information

CHAPTER 11. A Revision. 1. The Computers and Numbers therein

CHAPTER 11. A Revision. 1. The Computers and Numbers therein CHAPTER A Revision. The Computers and Numbers therein Traditional computer science begins with a finite alphabet. By stringing elements of the alphabet one after another, one obtains strings. A set of

More information

Matrix assembly by low rank tensor approximation

Matrix assembly by low rank tensor approximation Matrix assembly by low rank tensor approximation Felix Scholz 13.02.2017 References Angelos Mantzaflaris, Bert Juettler, Boris Khoromskij, and Ulrich Langer. Matrix generation in isogeometric analysis

More information

sublinear time low-rank approximation of positive semidefinite matrices Cameron Musco (MIT) and David P. Woodru (CMU)

sublinear time low-rank approximation of positive semidefinite matrices Cameron Musco (MIT) and David P. Woodru (CMU) sublinear time low-rank approximation of positive semidefinite matrices Cameron Musco (MIT) and David P. Woodru (CMU) 0 overview Our Contributions: 1 overview Our Contributions: A near optimal low-rank

More information

für Mathematik in den Naturwissenschaften Leipzig

für Mathematik in den Naturwissenschaften Leipzig ŠܹÈÐ Ò ¹ÁÒ Ø ØÙØ für Mathematik in den Naturwissenschaften Leipzig Quantics-TT Approximation of Elliptic Solution Operators in Higher Dimensions (revised version: January 2010) by Boris N. Khoromskij,

More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information

Exercise Sheet 1.

Exercise Sheet 1. Exercise Sheet 1 You can download my lecture and exercise sheets at the address http://sami.hust.edu.vn/giang-vien/?name=huynt 1) Let A, B be sets. What does the statement "A is not a subset of B " mean?

More information

The tensor algebra of power series spaces

The tensor algebra of power series spaces The tensor algebra of power series spaces Dietmar Vogt Abstract The linear isomorphism type of the tensor algebra T (E) of Fréchet spaces and, in particular, of power series spaces is studied. While for

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Dynamical low-rank approximation

Dynamical low-rank approximation Dynamical low-rank approximation Christian Lubich Univ. Tübingen Genève, Swiss Numerical Analysis Day, 17 April 2015 Coauthors Othmar Koch 2007, 2010 Achim Nonnenmacher 2008 Dajana Conte 2010 Thorsten

More information

Categories and Quantum Informatics: Hilbert spaces

Categories and Quantum Informatics: Hilbert spaces Categories and Quantum Informatics: Hilbert spaces Chris Heunen Spring 2018 We introduce our main example category Hilb by recalling in some detail the mathematical formalism that underlies quantum theory:

More information

ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH

ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH V. FABER, J. LIESEN, AND P. TICHÝ Abstract. Numerous algorithms in numerical linear algebra are based on the reduction of a given matrix

More information

PRODUCT OF OPERATORS AND NUMERICAL RANGE

PRODUCT OF OPERATORS AND NUMERICAL RANGE PRODUCT OF OPERATORS AND NUMERICAL RANGE MAO-TING CHIEN 1, HWA-LONG GAU 2, CHI-KWONG LI 3, MING-CHENG TSAI 4, KUO-ZHONG WANG 5 Abstract. We show that a bounded linear operator A B(H) is a multiple of a

More information

Linear Algebra. Paul Yiu. Department of Mathematics Florida Atlantic University. Fall A: Inner products

Linear Algebra. Paul Yiu. Department of Mathematics Florida Atlantic University. Fall A: Inner products Linear Algebra Paul Yiu Department of Mathematics Florida Atlantic University Fall 2011 6A: Inner products In this chapter, the field F = R or C. We regard F equipped with a conjugation χ : F F. If F =

More information

4. Multi-linear algebra (MLA) with Kronecker-product data.

4. Multi-linear algebra (MLA) with Kronecker-product data. ect. 3. Tensor-product interpolation. Introduction to MLA. B. Khoromskij, Leipzig 2007(L3) 1 Contents of Lecture 3 1. Best polynomial approximation. 2. Error bound for tensor-product interpolants. - Polynomial

More information

Institut für Mathematik

Institut für Mathematik U n i v e r s i t ä t A u g s b u r g Institut für Mathematik Martin Rasmussen, Peter Giesl A Note on Almost Periodic Variational Equations Preprint Nr. 13/28 14. März 28 Institut für Mathematik, Universitätsstraße,

More information

INF-SUP CONDITION FOR OPERATOR EQUATIONS

INF-SUP CONDITION FOR OPERATOR EQUATIONS INF-SUP CONDITION FOR OPERATOR EQUATIONS LONG CHEN We study the well-posedness of the operator equation (1) T u = f. where T is a linear and bounded operator between two linear vector spaces. We give equivalent

More information

Characterization of half-radial matrices

Characterization of half-radial matrices Characterization of half-radial matrices Iveta Hnětynková, Petr Tichý Faculty of Mathematics and Physics, Charles University, Sokolovská 83, Prague 8, Czech Republic Abstract Numerical radius r(a) is the

More information

2 Two-Point Boundary Value Problems

2 Two-Point Boundary Value Problems 2 Two-Point Boundary Value Problems Another fundamental equation, in addition to the heat eq. and the wave eq., is Poisson s equation: n j=1 2 u x 2 j The unknown is the function u = u(x 1, x 2,..., x

More information

Linear algebra for MATH2601: Theory

Linear algebra for MATH2601: Theory Linear algebra for MATH2601: Theory László Erdős August 12, 2000 Contents 1 Introduction 4 1.1 List of crucial problems............................... 5 1.2 Importance of linear algebra............................

More information

Chapter 8 Integral Operators

Chapter 8 Integral Operators Chapter 8 Integral Operators In our development of metrics, norms, inner products, and operator theory in Chapters 1 7 we only tangentially considered topics that involved the use of Lebesgue measure,

More information

Large Scale Data Analysis Using Deep Learning

Large Scale Data Analysis Using Deep Learning Large Scale Data Analysis Using Deep Learning Linear Algebra U Kang Seoul National University U Kang 1 In This Lecture Overview of linear algebra (but, not a comprehensive survey) Focused on the subset

More information

Solution of Linear Equations

Solution of Linear Equations Solution of Linear Equations (Com S 477/577 Notes) Yan-Bin Jia Sep 7, 07 We have discussed general methods for solving arbitrary equations, and looked at the special class of polynomial equations A subclass

More information

Mathematical Optimisation, Chpt 2: Linear Equations and inequalities

Mathematical Optimisation, Chpt 2: Linear Equations and inequalities Mathematical Optimisation, Chpt 2: Linear Equations and inequalities Peter J.C. Dickinson p.j.c.dickinson@utwente.nl http://dickinson.website version: 12/02/18 Monday 5th February 2018 Peter J.C. Dickinson

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information

Tensors. Notes by Mateusz Michalek and Bernd Sturmfels for the lecture on June 5, 2018, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra

Tensors. Notes by Mateusz Michalek and Bernd Sturmfels for the lecture on June 5, 2018, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra Tensors Notes by Mateusz Michalek and Bernd Sturmfels for the lecture on June 5, 2018, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra This lecture is divided into two parts. The first part,

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU Preconditioning Techniques for Solving Large Sparse Linear Systems Arnold Reusken Institut für Geometrie und Praktische Mathematik RWTH-Aachen OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative

More information

October 25, 2013 INNER PRODUCT SPACES

October 25, 2013 INNER PRODUCT SPACES October 25, 2013 INNER PRODUCT SPACES RODICA D. COSTIN Contents 1. Inner product 2 1.1. Inner product 2 1.2. Inner product spaces 4 2. Orthogonal bases 5 2.1. Existence of an orthogonal basis 7 2.2. Orthogonal

More information

A PROJECTED HESSIAN GAUSS-NEWTON ALGORITHM FOR SOLVING SYSTEMS OF NONLINEAR EQUATIONS AND INEQUALITIES

A PROJECTED HESSIAN GAUSS-NEWTON ALGORITHM FOR SOLVING SYSTEMS OF NONLINEAR EQUATIONS AND INEQUALITIES IJMMS 25:6 2001) 397 409 PII. S0161171201002290 http://ijmms.hindawi.com Hindawi Publishing Corp. A PROJECTED HESSIAN GAUSS-NEWTON ALGORITHM FOR SOLVING SYSTEMS OF NONLINEAR EQUATIONS AND INEQUALITIES

More information

We describe the generalization of Hazan s algorithm for symmetric programming

We describe the generalization of Hazan s algorithm for symmetric programming ON HAZAN S ALGORITHM FOR SYMMETRIC PROGRAMMING PROBLEMS L. FAYBUSOVICH Abstract. problems We describe the generalization of Hazan s algorithm for symmetric programming Key words. Symmetric programming,

More information

Uniqueness of the Solutions of Some Completion Problems

Uniqueness of the Solutions of Some Completion Problems Uniqueness of the Solutions of Some Completion Problems Chi-Kwong Li and Tom Milligan Abstract We determine the conditions for uniqueness of the solutions of several completion problems including the positive

More information

Applied Linear Algebra

Applied Linear Algebra Applied Linear Algebra Peter J. Olver School of Mathematics University of Minnesota Minneapolis, MN 55455 olver@math.umn.edu http://www.math.umn.edu/ olver Chehrzad Shakiban Department of Mathematics University

More information

Analysis-3 lecture schemes

Analysis-3 lecture schemes Analysis-3 lecture schemes (with Homeworks) 1 Csörgő István November, 2015 1 A jegyzet az ELTE Informatikai Kar 2015. évi Jegyzetpályázatának támogatásával készült Contents 1. Lesson 1 4 1.1. The Space

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

A 3 3 DILATION COUNTEREXAMPLE

A 3 3 DILATION COUNTEREXAMPLE A 3 3 DILATION COUNTEREXAMPLE MAN DUEN CHOI AND KENNETH R DAVIDSON Dedicated to the memory of William B Arveson Abstract We define four 3 3 commuting contractions which do not dilate to commuting isometries

More information

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint

More information

Low Rank Approximation Lecture 7. Daniel Kressner Chair for Numerical Algorithms and HPC Institute of Mathematics, EPFL

Low Rank Approximation Lecture 7. Daniel Kressner Chair for Numerical Algorithms and HPC Institute of Mathematics, EPFL Low Rank Approximation Lecture 7 Daniel Kressner Chair for Numerical Algorithms and HPC Institute of Mathematics, EPFL daniel.kressner@epfl.ch 1 Alternating least-squares / linear scheme General setting:

More information

Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning

Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Anton Rodomanov Higher School of Economics, Russia Bayesian methods research group (http://bayesgroup.ru) 14 March

More information

Tensorization of Electronic Schrödinger Equation and Second Quantization

Tensorization of Electronic Schrödinger Equation and Second Quantization Tensorization of Electronic Schrödinger Equation and Second Quantization R. Schneider (TUB Matheon) Mariapfarr, 2014 Introduction Electronic Schrödinger Equation Quantum mechanics Goal: Calculation of

More information

A Recursive Trust-Region Method for Non-Convex Constrained Minimization

A Recursive Trust-Region Method for Non-Convex Constrained Minimization A Recursive Trust-Region Method for Non-Convex Constrained Minimization Christian Groß 1 and Rolf Krause 1 Institute for Numerical Simulation, University of Bonn. {gross,krause}@ins.uni-bonn.de 1 Introduction

More information

Non-Intrusive Solution of Stochastic and Parametric Equations

Non-Intrusive Solution of Stochastic and Parametric Equations Non-Intrusive Solution of Stochastic and Parametric Equations Hermann G. Matthies a Loïc Giraldi b, Alexander Litvinenko c, Dishi Liu d, and Anthony Nouy b a,, Brunswick, Germany b École Centrale de Nantes,

More information

Sampling and Low-Rank Tensor Approximations

Sampling and Low-Rank Tensor Approximations Sampling and Low-Rank Tensor Approximations Hermann G. Matthies Alexander Litvinenko, Tarek A. El-Moshely +, Brunswick, Germany + MIT, Cambridge, MA, USA wire@tu-bs.de http://www.wire.tu-bs.de $Id: 2_Sydney-MCQMC.tex,v.3

More information

ADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS

ADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS J. OPERATOR THEORY 44(2000), 243 254 c Copyright by Theta, 2000 ADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS DOUGLAS BRIDGES, FRED RICHMAN and PETER SCHUSTER Communicated by William B. Arveson Abstract.

More information

Simple Examples on Rectangular Domains

Simple Examples on Rectangular Domains 84 Chapter 5 Simple Examples on Rectangular Domains In this chapter we consider simple elliptic boundary value problems in rectangular domains in R 2 or R 3 ; our prototype example is the Poisson equation

More information

Max-Planck-Institut fur Mathematik in den Naturwissenschaften Leipzig H 2 -matrix approximation of integral operators by interpolation by Wolfgang Hackbusch and Steen Borm Preprint no.: 04 200 H 2 -Matrix

More information

A sensitivity result for quadratic semidefinite programs with an application to a sequential quadratic semidefinite programming algorithm

A sensitivity result for quadratic semidefinite programs with an application to a sequential quadratic semidefinite programming algorithm Volume 31, N. 1, pp. 205 218, 2012 Copyright 2012 SBMAC ISSN 0101-8205 / ISSN 1807-0302 (Online) www.scielo.br/cam A sensitivity result for quadratic semidefinite programs with an application to a sequential

More information

Math 671: Tensor Train decomposition methods II

Math 671: Tensor Train decomposition methods II Math 671: Tensor Train decomposition methods II Eduardo Corona 1 1 University of Michigan at Ann Arbor December 13, 2016 Table of Contents 1 What we ve talked about so far: 2 The Tensor Train decomposition

More information