Tensors: a geometric view Open lecture November 24, 2014 Simons Institute, Berkeley. Giorgio Ottaviani, Università di Firenze


Plan of the talk Introduction to Tensors A few relevant applications

Tensors as multidimensional matrices A (complex) a × b × c tensor is an element of the space C^a ⊗ C^b ⊗ C^c; it can be represented as a 3-dimensional matrix. Here is a tensor of format 2 × 2 × 3.

Slices of tensors Several slices of a 2 × 2 × 3 tensor.

Gaussian elimination Gaussian elimination consists in simplifying a matrix by adding to a row a multiple of another one. It corresponds to left multiplication by invertible matrices.

Gaussian elimination and canonical form Adding rows backwards, and then adding columns, we get a canonical form! This matrix of rank 5 is the sum of five rank one (or "decomposable") matrices.

The decomposable (rank one) tensors Here is a decomposable matrix, and here is a decomposable tensor (shown on the slides as outer products of vectors).
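A decomposable tensor can be sketched numerically as an outer product of three vectors; the vectors below are illustrative choices, not values from the lecture.

```python
import numpy as np

# A decomposable (rank-one) tensor is an outer product a ⊗ b ⊗ c of vectors.
a = np.array([1.0, 2.0])
b = np.array([1.0, -1.0])
c = np.array([3.0, 0.0, 1.0])

# T[i, j, k] = a[i] * b[j] * c[k]: a tensor of format 2 x 2 x 3.
T = np.einsum('i,j,k->ijk', a, b, c)

# Every slice of a rank-one tensor is a rank-one (or zero) matrix.
slice_ranks = [np.linalg.matrix_rank(T[:, :, k]) for k in range(T.shape[2])]
print(slice_ranks)  # [1, 0, 1] -- the middle slice vanishes since c[1] = 0
```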

Trying Gaussian elimination on a 3-dimensional tensor We can add a scalar multiple of a slice to another slice. How many zeroes may we assume, at most? Strassen showed in 1983 that one remains with at least 5 nonzero entries, and indeed with at least 5 (> 3) decomposable summands. Read the masters!

The six canonical forms of a 2 × 2 × 2 tensor General (rank 2); hyperdeterminant vanishes; support on one slice (the only one that is not symmetric!); rank 1. The 2 × 2 × 2 format is one of the few lucky cases where such a classification is possible.

Gaussian elimination and group action A dimension count shows we cannot expect finitely many canonical forms. The dimension of C^n ⊗ C^n ⊗ C^n is n^3, while the dimension of GL(n) × GL(n) × GL(n) is 3n^2, too small for n ≥ 3. The same argument works for a general C^{n_1} ⊗ ... ⊗ C^{n_d}, d ≥ 3, with a few small-dimensional exceptions.
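The dimension count above is a one-liner to verify:

```python
# The group GL(n)^3 acting on C^n ⊗ C^n ⊗ C^n has dimension 3n^2, while the
# tensor space has dimension n^3. For n >= 3 the group is too small to produce
# finitely many orbits (at n = 3 the raw dimensions tie, but scalar stabilizers
# make the effective orbit dimension strictly smaller).
for n in range(2, 7):
    tensor_dim = n ** 3
    group_dim = 3 * n ** 2
    print(n, tensor_dim, group_dim, tensor_dim >= group_dim)
```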

Matrices were created by God, Tensors by the Devil Paraphrasing Max Noether, we can joke that matrices were created by God, and tensors were created by the Devil.

Rank one matrix and its tangent space This is a rank one matrix, and this is its tangent space, by the Leibniz formula.

Picture of tangent space Denote by {a_1, ..., a_5} the row basis and by {b_1, ..., b_7} the column basis. The picture represents a_2 ⊗ b_2 ∈ A ⊗ B. Consider a curve a(t) ⊗ b(t) such that a(0) = a_2, b(0) = b_2. Taking the derivative at t = 0, we get a'(0) ⊗ b(0) + a(0) ⊗ b'(0), so all derivatives span A ⊗ b(0) + a(0) ⊗ B, and the two summands correspond to the following.

Picture of tangent space in three dimensions This is a rank one 3-dimensional tensor, and this is its tangent space.

Salvador Dalí hypercube Here are better pictures.

Rank two tangent space at a matrix This is a rank 2 matrix, and this is the tangent space at this matrix, thanks to Terracini's Lemma. The two summands meet in the two blue boxes.

Rank two tangent space at a tensor This is a rank 2 three-dimensional tensor, and this is its tangent space. The two summands do not meet in three dimensions: there are no blue boxes. This has nice consequences for tensor decomposition, as we will see in a while. Maybe the Devil was sleeping...

Kronecker-Weierstrass The general 2 × b × c tensors were classified in the 19th century by Kronecker and Weierstrass. The classification has two steps: (1) there are canonical forms for the two cases 2 × n × n and 2 × n × (n + 1); (2) the latter are building blocks for all the others.

The case 2 × n × n These are pencils of square matrices. The general pencil, up to the action of GL(n) × GL(n), has as its two slices the identity matrix and a diagonal matrix diag(λ_1, λ_2, λ_3, λ_4) (shown on the slide for n = 4).

The case 2 × n × (n + 1) The general 2 × n × (n + 1) pencil, up to the action of GL(n) × GL(n + 1), has as its two slices the n × (n + 1) matrices (I_n | 0) and (0 | I_n), the identity padded with a zero column on the right and on the left respectively (shown on the slide for n = 4).

Kronecker-Weierstrass Theorem Theorem (Kronecker-Weierstrass) The general 2 × b × c tensor, with b < c, decomposes as a direct sum of several tensors of format 2 × n × (n + 1) and 2 × (n + 1) × (n + 2), for some n. Let us see some examples: a 2 × 2 × 4 tensor, and a 2 × 2 × n tensor for n ≥ 4, padded with zeroes.

2 × b × c decompositions A 2 × 3 × 5 tensor; a 2 × 3 × 6 tensor; a 2 × 3 × n tensor for n ≥ 6, padded with zeroes.

Fibonacci tensors With three slices, Fibonacci tensors occur. Theorem (Kac, 1980) The general 3 × b × c tensor, with c ≥ (3b + √(5b² + 1))/2, decomposes as a direct sum of several tensors of format 3 × F_n × F_{n+1} and 3 × F_{n+1} × F_{n+2}, for some n, where F_n = 0, 1, 3, 8, 21, 55, ... runs over every other Fibonacci number.

Geometry of tensor decomposition A tensor decomposition of a tensor T is T = Σ_{i=1}^r D_i with decomposable D_i and minimal r (called the rank). The variety of decomposable tensors is the Segre variety. In many cases of interest (for d ≥ 3), the tensor decomposition is unique (up to order); this allows one to recover the D_i from T. From this point of view, tensors may have even more applications than matrices.
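A rank-2 decomposition can be sketched by summing two random decomposable tensors; the data below is illustrative, not from the lecture. One cheap necessary check on the rank is that every flattening of a rank-r tensor has matrix rank at most r.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two random decomposable summands D_1, D_2 in C^3 ⊗ C^3 ⊗ C^3.
a = rng.standard_normal((2, 3))
b = rng.standard_normal((2, 3))
c = rng.standard_normal((2, 3))
T = sum(np.einsum('i,j,k->ijk', a[s], b[s], c[s]) for s in range(2))

# Necessary condition: every flattening of a rank-2 tensor has rank <= 2.
flattening_rank = np.linalg.matrix_rank(T.reshape(3, 9))
print(flattening_rank)  # 2, generically
```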

The Comon conjecture Conjecture A symmetric tensor T has a minimal symmetric decomposition, that is, T = Σ_{i=1}^r D_i where the D_i are also symmetric.

Topic models Topic models are statistical models for discovering the abstract topics that occur in large archives of documents.

Probabilistic algorithms for topic models Christos Papadimitriou and others gave in 1998 a probabilistic algorithm to attack this problem.

Tensor decomposition for topic models A. Bhaskara, M. Charikar, and A. Vijayaraghavan proposed in 2013 to compute probabilities for the occurrence of triples of words, and then to use the tensor decomposition of a 3-tensor. Here the uniqueness of tensor decomposition is crucial.

The cocktail party problem Suppose we have p persons speaking in the room and p microphones spread over the room, and we record sound at N discrete time steps. Because of the superposition of sound waves, we can model this as Y = MX, where row i of X contains the sound produced by person i at each of the time steps, and row i of Y contains the sound measured at microphone i at each of the time steps. The police want to recover the individual speeches.
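A toy instance of the model Y = MX can be built directly; the signals and the mixing matrix below are made-up values. In the real problem M is unknown and must be recovered from Y alone (blind source separation); here we cheat and invert a known M just to show the model.

```python
import numpy as np

# p = 2 speakers, N = 4 time steps (illustrative data).
X = np.array([[1.0, 0.0, 1.0, 0.0],    # speech of person 1
              [0.0, 2.0, 0.0, 2.0]])   # speech of person 2
M = np.array([[1.0, 0.5],              # microphone i hears a blend of speakers
              [0.3, 1.0]])
Y = M @ X                               # recorded microphone signals

# With M known (NOT the blind setting), the speeches are recovered directly.
X_rec = np.linalg.solve(M, Y)
print(np.allclose(X_rec, X))  # True
```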

The cyclically broken camera problem Courtesy of Nick Vannieuwenhoven. Recovered vs. original images.

The stochastic matrix completion problem Problem Given some entries of a matrix, is it possible to add the missing entries so that the matrix has rank 1, its entries sum to one, and it is nonnegative?

Example For example, the partial matrix with observed diagonal entries 0.16, 0.09, 0.04, 0.01 has a unique completion:
0.16 0.12 0.08 0.04
0.12 0.09 0.06 0.03
0.08 0.06 0.04 0.02
0.04 0.03 0.02 0.01
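The completion on the slide is the outer product v v^T with v = (0.4, 0.3, 0.2, 0.1), which makes all three required properties easy to verify:

```python
import numpy as np

v = np.array([0.4, 0.3, 0.2, 0.1])
M = np.outer(v, v)                       # the completed matrix

print(np.linalg.matrix_rank(M))          # 1: rank one
print(round(M.sum(), 9))                 # 1.0: entries sum to one
print(np.diag(M))                        # [0.16 0.09 0.04 0.01]: observed diagonal
print((M >= 0).all())                    # True: nonnegative
```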

Diagonal stochastic partial matrices Theorem (Kubjas, Rosen) Let M be an n × n partial stochastic matrix, where n ≥ 2, with nonnegative observed entries a_1, ..., a_n along the diagonal. Then M is completable if and only if Σ_{i=1}^n √a_i ≤ 1. In the special case Σ_{i=1}^n √a_i = 1, the partial matrix M has a unique completion.
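The criterion (as reconstructed here: the sum of square roots of the diagonal must not exceed 1, matching the d = 2 case of the tensor theorem below) can be tested on the earlier example, where the square roots sum to exactly 1 and the completion is unique:

```python
import math

def completable(diag, tol=1e-12):
    """Rank-one stochastic completability test for a diagonal partial matrix
    (reconstructed Kubjas-Rosen criterion: sum of square roots <= 1)."""
    return sum(math.sqrt(a) for a in diag) <= 1 + tol

print(completable([0.16, 0.09, 0.04, 0.01]))  # True: sqrt-sum is exactly 1
print(completable([0.5, 0.5]))                # False: sqrt-sum is about 1.414
```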

Tensors Theorem (Kubjas, Rosen) Suppose we are given a partial probability tensor T ∈ (R^n)^{⊗d} with nonnegative observed entries a_i along the diagonal and all other entries unobserved. Then T is completable if and only if Σ_{i=1}^n a_i^{1/d} ≤ 1.

Hyperdeterminant Theorem (Cayley) A tensor A can be reduced to the form shown on the slide if and only if Det(A) = 0, where Det is the equation of an irreducible hypersurface, called the hyperdeterminant. When the triangle inequality n_j ≤ Σ_{i≠j} n_i is satisfied, the hyperdeterminant gives the dual variety of the Segre variety of decomposable tensors.

Robeva equations Orthogonal setting again. E. Robeva gives (quadratic) equations for the tensors that can be reduced to diagonal form up to orthogonal transformations.

The minimum distance problem Let X ⊂ A^n_R be an algebraic variety and let p ∈ A^n_R. We look for the points q ∈ X which minimize the Euclidean distance d(p, q). A necessary condition, assuming q is a smooth point of X, is that the tangent space T_q X be orthogonal to p − q; this is the condition for a critical point of the distance function d(p, ·).

The case of matrices of bounded rank There is a relevant case where the minimum distance problem has been solved. Consider the affine space of n × m matrices, and let X_k be the variety of matrices of rank ≤ k. X_1 is the cone over the Segre variety P^{n−1} × P^{m−1}, and X_k is the k-secant variety of X_1. The matrices in X_k which minimize the distance from A are called the best rank k approximations of A.

Critical points in any rank Lemma (SVD revisited) Let A = X D Y be the SVD of A, where the x_i are the columns of X, the y_j are the rows of Y, and D has the singular values σ_i on the diagonal. The critical points of the distance function d_A = d(A, ·) from A to the variety X_k of rank k matrices are given by Σ_{j ∈ {i_1, ..., i_k}} σ_j x_j y_j, for any subset of indices {i_1, ..., i_k}. The number of critical points for an A of rank r ≥ k is (r choose k); for a general m × n matrix, assuming m ≤ n, it is (m choose k).
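The lemma can be sketched numerically: enumerate all k-element sub-sums of the SVD of a random matrix (illustrative data), and check that the one built from the largest singular value is closest to A, i.e. the best rank-1 approximation (Eckart-Young).

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))            # a general 3 x 4 matrix
X, sigma, Y = np.linalg.svd(A, full_matrices=False)

k = 1
# Each critical point of d(A, .) on the rank-k variety is a sub-sum of k terms.
crit = [sum(sigma[j] * np.outer(X[:, j], Y[j, :]) for j in S)
        for S in combinations(range(len(sigma)), k)]
print(len(crit))                           # binomial(3, 1) = 3 critical points

dists = [np.linalg.norm(A - C) for C in crit]
print(dists.index(min(dists)))             # 0: the term with the largest sigma wins
```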

Rank one approximation for tensors L.-H. Lim gave in 2005 a variational construction of the critical rank one tensors, including the best rank one approximation. This includes singular tuples and eigentensors (defined independently by L. Qi). E. Robeva's work quoted before actually fits in this area.

Counting eigentensors and singular d-tuples of vectors The number of eigentensors in Sym^d C^n is ((d − 1)^n − 1)/(d − 2) (Cartwright-Sturmfels). The number of singular d-tuples in C^n ⊗ ... ⊗ C^n (d factors) is the coefficient of (t_1 ⋯ t_d)^{n−1} in the polynomial ∏_{i=1}^d (t̂_i^n − t_i^n)/(t̂_i − t_i), where t̂_i = Σ_{j≠i} t_j (Friedland-Ottaviani). These two computations have been performed by using, respectively, toric varieties and Chern classes. Both have been generalized with the concept of ED degree.
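The Cartwright-Sturmfels count is easy to evaluate for small cases (for d ≥ 3; the formula's limit at d = 2 recovers the n eigenvectors of a matrix):

```python
# Number of eigentensors of a general symmetric tensor in Sym^d(C^n),
# following the formula on the slide: ((d-1)^n - 1) / (d - 2), for d >= 3.
def eigentensor_count(d, n):
    return ((d - 1) ** n - 1) // (d - 2)

print(eigentensor_count(3, 2))  # 3: a general binary cubic has 3 eigenvectors
print(eigentensor_count(3, 3))  # 7
print(eigentensor_count(4, 3))  # 13
```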

The Euclidean Distance Degree The ED degree measures, more generally, the complexity of computing the distance from an object, which is in some sense a good measure of the complexity of the object itself. Joint work with J. Draisma, E. Horobet, B. Sturmfels, R. Thomas. Definition The Euclidean Distance (ED) degree of an affine variety X is the number of critical points of the Euclidean distance d_p from p to X, for general p. The ED degree of a projective variety X is the ED degree of its affine cone.

ED degree has a long history (Apollonius) A line has ED degree = 1. A circle has ED degree = 2. A parabola has ED degree = 3. A general conic has ED degree = 4.
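The ED degree 3 of the parabola can be seen concretely: for y = x², the critical points of the squared distance from a general point (a, b) are the roots of a cubic. The sample point below is an arbitrary choice.

```python
import numpy as np

# For the parabola y = x^2 and a general point (a, b), the critical equation is
#   d/dx [(x - a)^2 + (x^2 - b)^2] = 4x^3 + (2 - 4b)x - 2a = 0,
# a cubic: 3 critical points over C, matching ED degree 3 (Apollonius).
a, b = 0.7, 2.0
roots = np.roots([4.0, 0.0, 2.0 - 4.0 * b, -2.0 * a])
print(len(roots))  # 3
```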

The cardioid picture

Duality property The dual variety X^∨ of X ⊂ P(V) can be identified, thanks to the metric, with a variety in the same P(V). Theorem (Draisma-Horobet-O-Sturmfels-Thomas, bijection of critical points for X and X^∨) Let X^∨ be the dual variety of X ⊂ P(V) and let u ∈ P(V). If x ∈ X is a critical point for d(u, ·) on X, then u − x belongs to X^∨ and it is a critical point for d(u, ·) on X^∨. In particular, Corollary ED degree X = ED degree X^∨. Let u be a tensor and u* its best rank one approximation. Then the hyperdeterminant of u − u* is zero.

Thanks Thanks for your attention!