Math 671: Tensor Train decomposition methods


Math 671. Eduardo Corona, University of Michigan at Ann Arbor. December 8, 2016.

Table of Contents
1 Preliminaries and goal
2 Unfolding matrices for tensorized arrays
  The Tensor Train decomposition
3 Model problem: integral equations
  Problem formulation

Preliminaries and goal. Motivation: low rank approximation.
- Optimal low rank approximation using a truncated SVD with k terms.
- Quasi-optimal low rank approximation using the ID.
- Block low-rank structure: the Fast Multipole Method and others (e.g. the Butterfly algorithm).

Preliminaries and goal. What is a tensor?
A tensor is a generalization of a vector to multiple dimensions. In the most abstract setting, a tensor T is an element of the tensor product of vector spaces V_1 ⊗ V_2 ⊗ ... ⊗ V_d. In the context of vectors and matrices (V_i = R or R^n), you can also think of tensors as multidimensional arrays of d dimensions:
- A vector v is a 1-tensor, with elements v(i).
- A matrix A is a 2-tensor, with elements A(i, j).
- A tensor T of dimension d has elements T(i_1, i_2, ..., i_d).

Preliminaries and goal. Going from a vector or matrix to a tensor.
It is possible to go from a tensor to a vector by flattening or vectorizing it. It is also possible to create extra dimensions in a vector to tensorize it. You might have already done this with matrices; in fact, all arrays in your computer are really just vectors (Matlab commands: reshape and permute). When we merge tensor indices, we will indicate it by placing a bar on top of the merged index: i = i_1 i_2 ... i_d.
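As a quick illustration (a small sketch added here, not from the lecture), the Matlab commands mentioned above are enough to move between the vector and tensor views of the same data:

% Tensorize a length-8 vector into a 2 x 2 x 2 array and flatten it back.
% Matlab stores arrays column-major, so the first index varies fastest.
v = (1:8)';                % a 1-tensor (vector)
T = reshape(v, [2 2 2]);   % a 3-tensor T(i1, i2, i3)
w = reshape(T, [], 1);     % merge all indices back into one: recovers v
disp(norm(v - w));         % prints 0
P = permute(T, [2 1 3]);   % reorder dimensions: P(i2, i1, i3) = T(i1, i2, i3)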

Preliminaries and goal. The Tensor Train decomposition.
The TT decomposition (Tyrtyshnikov, Oseledets et al.) is an extremely efficient numerical method to compress tensors.
- It extends low rank approximation to d-dimensional tensors.
- It overcomes the curse of dimensionality: for many examples, work and storage are O(d) or O(d^k) for small k.
- The decomposition can also be applied to vectors and matrices by reshaping them as higher dimensional tensors.

Preliminaries and goal. Goal: function compression and fast matrix algebra.
By tensorizing function samples, we can use the TT decomposition to evaluate, interpolate, and perform operations on that function efficiently. Applying this to matrices, we can obtain compact factorizations of A and A^{-1}. We also have fast algorithms to perform operations with these factorizations (e.g. applying them to vectors).


Unfolding matrices.
For a tensor T of dimension d, unfolding matrices can be defined as
T_k(i_1 i_2 ... i_k, i_{k+1} ... i_d) = T(i_1, i_2, ..., i_d), k = 1, ..., d.
That is, rows and columns result from merging the first k and the last d-k dimensions of T, respectively. Using Matlab notation,
T_k = reshape(T, prod(n(1:k)), prod(n(k+1:d)))
where n(i) is the size of the i-th dimension.
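A direct translation of this definition (a minimal sketch; the function name is just for illustration, and it assumes T is stored as an ordinary Matlab array so that size(T) returns the mode sizes n_1, ..., n_d):

% k-th unfolding matrix: merge the first k modes into rows, the rest into columns.
function Tk = unfolding(T, k)
    n  = size(T);
    Tk = reshape(T, prod(n(1:k)), prod(n(k+1:end)));
end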

Why are unfolding matrices T_k important?
As we will see, the TT decomposition is obtained from low rank decompositions of a tensor's unfolding matrices; that is, low rank decomposition for matrices, applied to each T_k.
- What do unfolding matrices of a tensorized vector mean? What about for tensorized matrices?
- Why can we expect them to be low rank?

Unfolding matrices for tensorized arrays. What does it mean to tensorize a vector?
Let us assume we have a vector f obtained by taking 8 samples of a function f(x), say f(x) = sin(x) on [0, 2π]. What does it mean to make it a tensor?

Unfolding matrices for tensorized arrays. Tensorized vector index.
[Figure: the domain Ω is split in two at each of three levels; the binary labels (0/1) at levels 1, 2, 3 give the indices i_1, i_2, i_3 of a sample x_i. The unfolding matrix is F_T(i_1 i_2, i_3) = f(x_i).]

Unfolding matrices for tensorized arrays. Unfolding matrices: vectors of function samples.
Let f : Ω → R. A hierarchical partition of Ω gives a tree structure T of depth d. Any x ∈ Ω can be encoded by indices {i_l}, l = 1, ..., d-1, recording which of the n_l tree branches it belongs to at level l. We then take n_d samples of f on each leaf (indexed by i_d). Indexing by these integer coordinates and tensorizing,
F_T(i_1, i_2, ..., i_d) = f(x_i), i = i_1 i_2 ... i_d.
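As a concrete check (an illustrative sketch, not the lecture's code), tensorize samples of f(x) = sin(x) and look at the numerical ranks of the unfolding matrices; by the angle-addition formula sin(a + b) = sin(a)cos(b) + cos(a)sin(b), every unfolding has rank at most 2:

% Sample sin(x) on 2^10 points, reshape into a 2 x 2 x ... x 2 (10-way) tensor,
% and report the numerical rank of each unfolding matrix F_k.
d = 10;  N = 2^d;
x = linspace(0, 2*pi, N);
F = reshape(sin(x), 2*ones(1, d));
for k = 1:d-1
    Fk = reshape(F, 2^k, 2^(d-k));
    fprintf('rank(F_%d) = %d\n', k, rank(Fk));   % at most 2 for every k
end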

Unfolding matrices for tensorized arrays Why are these unfolding matrices low rank?

Unfolding matrices for tensorized arrays. Model problem: integral equations for PDEs.
Many boundary value problems from classical physics, when cast as boundary or volume integral equations, take the form
A[σ](x) = a(x)σ(x) + ∫_Γ K(x, y)σ(y) ds(y) = f(x), x ∈ Γ,
where K(x, y) is a kernel function related to the fundamental solution of the PDE. It is typically singular near the diagonal (y = x) but otherwise smooth. We prefer Fredholm equations of the 2nd kind (identity + compact).

Unfolding matrices for tensorized arrays. Discretization.
Discretizing these integrals using, e.g., the Nyström method,
(Aσ)_i = a(x_i)σ_i + Σ_{j=1}^{N} K(x_i, y_j) σ_j ω_j = f(x_i),
results in a linear system Aσ = f where A is a dense N × N matrix. If K is singular, special quadratures are needed when sources y_j and targets x_i get close.
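For illustration only, here is a minimal Nyström-style sketch on the unit circle with trapezoidal weights and a smooth Gaussian stand-in kernel (an assumption made for the example; it is not one of the singular PDE kernels discussed above):

% Nystrom-style discretization of a(x)sigma(x) + int_Gamma K(x,y) sigma(y) ds(y) = f(x)
% on the unit circle, with a(x) = 1 and a smooth stand-in kernel.
N  = 200;
t  = 2*pi*(0:N-1)'/N;                       % equispaced nodes on the circle
xy = [cos(t), sin(t)];
w  = (2*pi/N) * ones(N, 1);                 % trapezoidal quadrature weights
K  = @(p, q) exp(-sum((p - q).^2, 2));      % smooth Gaussian kernel (assumption)
A  = eye(N);                                % identity part, a(x) = 1
for i = 1:N
    A(i, :) = A(i, :) + (K(repmat(xy(i,:), N, 1), xy) .* w)';
end
f     = ones(N, 1);
sigma = A \ f;                              % dense solve; fast algebra comes later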

Unfolding matrices for tensorized arrays. Matrices of kernel samples.
Matrix entries are kernel evaluations K(x_i, y_j) for N sources {y_j} and M targets {x_i}. If we partition the domain and range of K(x, y), we can consider a hierarchy of interactions. At every level, a matrix block is encoded by the integer coordinate pair (i_l, j_l), or equivalently, by a block coordinate b_l = i_l j_l:
A_T(i_1 j_1, i_2 j_2, ..., i_d j_d) = A(i_1 i_2 ... i_d, j_1 j_2 ... j_d) = K(x_i, y_j),
with merged indices i = i_1 i_2 ... i_d and j = j_1 j_2 ... j_d.

Unfolding matrices for tensorized arrays. Tensorized matrix index.
[Figure: A(i, j) is unfolded level by level into matrices A_1, A_2, A_3 (shown with sizes 16 x 16, 32 x 4, and 64 x 1). The unfolding matrix A_l collects all interactions between source and target nodes at a given level.]

Unfolding matrices for tensorized arrays Why are these unfolding matrices low rank?

The Tensor Train decomposition. What is the TT decomposition?
For a d-dimensional tensor A sampled at N = ∏_{i=1}^{d} n_i points indexed by (i_1, i_2, ..., i_d), this decomposition can be written as
A(i_1, i_2, ..., i_d) ≈ Σ_{α_1, ..., α_{d-1}} G_1(i_1, α_1) G_2(α_1, i_2, α_2) ... G_d(α_{d-1}, i_d).
Each G_k is known as a tensor core. The auxiliary indices α_k determine the number of terms in the decomposition and run from 1 to r_k; r_k is known as the k-th TT rank.
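Equivalently, each entry is a product of small matrices: fixing i_k turns G_k into an r_{k-1} x r_k matrix, and the sums over the α's become a chain of matrix products. A minimal sketch (the function name is illustrative), assuming the cores are stored as 3-way Matlab arrays G{k} of size r_{k-1} x n_k x r_k with r_0 = r_d = 1:

% Evaluate one entry A(idx(1), ..., idx(d)) of a tensor given in TT format.
function val = tt_entry(G, idx)
    v = 1;                                        % row vector of size 1 x r_0 = 1
    for k = 1:numel(G)
        [r1, ~, r2] = size(G{k});
        Gk = reshape(G{k}(:, idx(k), :), r1, r2); % core slice for index i_k
        v  = v * Gk;                              % accumulate a 1 x r_k row vector
    end
    val = v;                                      % scalar, since r_d = 1
end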

The Tensor Train decomposition. How to compute TT cores and ranks?
The k-th TT rank is the rank of the k-th unfolding matrix A_k, and a TT decomposition may be obtained by a series of low rank approximations. You can think of it as an extension of the SVD to tensors (it is not the only one, but it is one with some of the very same properties). By truncating the ranks, we also get a quasi-optimal analogue of the optimal low rank approximation.

The Tensor Train decomposition. How to obtain a TT decomposition.
If we compute our favorite low rank approximation of A_1, we obtain
A_1(i_1, i_2 ... i_d) ≈ Σ_{α_1} U(i_1, α_1) V(α_1, i_2 ... i_d).
The first core G_1(i_1, α_1) is a reshaping of U. We can then iterate this procedure on V, reshaped as V(α_1 i_2, i_3, ..., i_d), to obtain the remaining cores. More efficient algorithms use a series of low TT rank approximations, enriched with updates local to each core G_k (the amen_cross algorithm from the TT-Toolbox).
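A sketch of this SVD-based construction (often called TT-SVD; the assumptions here are that T is a full Matlab array, that ranks are chosen by a relative singular value cutoff tol, and that cores are returned as r_{k-1} x n_k x r_k arrays):

% Sequential SVD construction of TT cores, truncating at relative tolerance tol.
function G = tt_svd(T, tol)
    n = size(T);  d = numel(n);
    G = cell(1, d);
    C = reshape(T, n(1), []);                     % current unfolding, r_0 * n_1 rows
    r = 1;                                        % r_0 = 1
    for k = 1:d-1
        [U, S, V] = svd(C, 'econ');
        s  = diag(S);
        rk = max(1, sum(s > tol * s(1)));         % truncated TT rank r_k
        G{k} = reshape(U(:, 1:rk), r, n(k), rk);  % k-th core
        C = S(1:rk, 1:rk) * V(:, 1:rk)';          % remainder still to be decomposed
        C = reshape(C, rk * n(k+1), []);          % fold the next mode into the rows
        r = rk;
    end
    G{d} = reshape(C, r, n(d), 1);                % last core
end

Applied to the tensorized sin samples from the earlier sketch, tt_svd should recover them to the chosen tolerance with every TT rank equal to 2.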

The Tensor Train decomposition. TT for matrices.
[Figure: A, viewed as its first unfolding A_1 (16 x 16), is factored as M_1 = U_1 V_1 with U_1 (16 x r_1) giving the first core G_1 and V_1 (r_1 x 16); V_1 is reshaped into M_2 (4r_1 x 4) and factored as U_2 V_2, with U_2 (4r_1 x r_2) giving G_2 and V_2 (r_2 x 4); finally V_2 is reshaped into M_3 (4r_2 x 1) = U_3, giving the last core G_3.]

The Tensor Train decomposition. Why does it achieve better compression?
- The TT rank is low if there exists a small basis of interactions.
- For many examples from differential and integral equations, ranks are actually bounded or grow very slowly (like log N).
- Other examples have higher growth (Toeplitz matrices have ranks growing like N^{1/2}).
- Symmetries, and particularly translation or rotation invariance, reduce TT ranks significantly.

Model problem: integral equations. Problem formulation: TT matrix-vector apply.
Once we have an approximation of A or A^{-1}, we want to compute fast matrix-vector products. If x can be efficiently compressed using the TT decomposition, the TT cores of y = Ax can be computed as
Y_k(α_k β_k, i_k, α_{k+1} β_{k+1}) = Σ_{j_k} G_k(α_k, i_k, j_k, α_{k+1}) X_k(β_k, j_k, β_{k+1}).
If x is a dense, incompressible vector, a fast O(N log N) algorithm proceeds by contracting one dimension at a time (applying one core at a time).
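A direct sketch of that contraction (illustrative function name), assuming the TT-matrix cores are stored as 4-way arrays G{k} of size ra1 x m_k x n_k x ra2 and the TT-vector cores as 3-way arrays X{k} of size rx1 x n_k x rx2; the merged rank pairs are ordered with the β index fastest, which is what kron produces:

% Cores of y = A*x when both A and x are in TT format: contract the shared
% index j_k and merge the rank pairs (alpha, beta) via Kronecker products.
function Y = tt_matvec_cores(G, X)
    d = numel(G);
    Y = cell(1, d);
    for k = 1:d
        [ra1, m, n, ra2] = size(G{k});
        [rx1, ~,    rx2] = size(X{k});
        Yk = zeros(ra1 * rx1, m, ra2 * rx2);
        for i = 1:m
            Yi = zeros(ra1 * rx1, ra2 * rx2);
            for j = 1:n
                Gij = reshape(G{k}(:, i, j, :), ra1, ra2);
                Xj  = reshape(X{k}(:, j, :),    rx1, rx2);
                Yi  = Yi + kron(Gij, Xj);       % sum over the shared index j_k
            end
            Yk(:, i, :) = reshape(Yi, ra1 * rx1, 1, ra2 * rx2);
        end
        Y{k} = Yk;
    end
end

Note that the TT ranks of y are products of the ranks of A and x, so in practice this is followed by a TT rounding (re-compression) step, omitted here.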