How to find good starting tensors for matrix multiplication

How to find good starting tensors for matrix multiplication
Markus Bläser, Saarland University

Matrix multiplication
Z = XY for n×n matrices Z = (z_{i,j}), X = (x_{i,j}), Y = (y_{i,j}), with
z_{i,j} = ∑_{k=1}^n x_{i,k} y_{k,j},  1 ≤ i, j ≤ n.
Entries are variables. Allowed operations: addition, multiplication, scalar multiplication.

Strassen's algorithm
[z_{11} z_{12}; z_{21} z_{22}] = [x_{11} x_{12}; x_{21} x_{22}] · [y_{11} y_{12}; y_{21} y_{22}].
p_1 = (x_{11} + x_{22})(y_{11} + y_{22}),
p_2 = (x_{21} + x_{22}) y_{11},
p_3 = x_{11}(y_{12} - y_{22}),
p_4 = x_{22}(-y_{11} + y_{21}),
p_5 = (x_{11} + x_{12}) y_{22},
p_6 = (-x_{11} + x_{21})(y_{11} + y_{12}),
p_7 = (x_{12} - x_{22})(y_{21} + y_{22}).
[z_{11} z_{12}; z_{21} z_{22}] = [p_1 + p_4 - p_5 + p_7   p_3 + p_5;   p_2 + p_4   p_1 + p_3 - p_2 + p_6].
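As a concrete check of the seven products above, here is a short sympy sketch (my own illustration, not part of the talk) that expands them and verifies that the four combinations reproduce the entries of the 2×2 matrix product.

    import sympy as sp

    x11, x12, x21, x22 = sp.symbols('x11 x12 x21 x22')
    y11, y12, y21, y22 = sp.symbols('y11 y12 y21 y22')

    p1 = (x11 + x22) * (y11 + y22)
    p2 = (x21 + x22) * y11
    p3 = x11 * (y12 - y22)
    p4 = x22 * (-y11 + y21)
    p5 = (x11 + x12) * y22
    p6 = (-x11 + x21) * (y11 + y12)
    p7 = (x12 - x22) * (y21 + y22)

    z11 = p1 + p4 - p5 + p7
    z12 = p3 + p5
    z21 = p2 + p4
    z22 = p1 + p3 - p2 + p6

    # each combination must equal the corresponding entry of XY
    assert sp.expand(z11 - (x11*y11 + x12*y21)) == 0
    assert sp.expand(z12 - (x11*y12 + x12*y22)) == 0
    assert sp.expand(z21 - (x21*y11 + x22*y21)) == 0
    assert sp.expand(z22 - (x21*y12 + x22*y22)) == 0
    print("Strassen's identities verified")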

Strassen's algorithm (2)
7 mults, 18 adds instead of 8 mults, 4 adds.
Observation: Strassen's algorithm works over any ring!
Recurse: view an n×n matrix as a 2×2 matrix of (n/2)×(n/2) blocks.
C(n) ≤ 7·C(n/2) + O(n²),  C(1) = 1.
Theorem (Strassen): We can multiply n×n matrices with O(n^{log_2 7}) = O(n^{2.81}) arithmetic operations.
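The recursion is easy to write down; the following numpy sketch (my own illustration, assuming n is a power of two and a fallback to the naive product below a cutoff) makes 7 recursive calls per level, which is where the O(n^{log_2 7}) count comes from.

    import numpy as np

    def strassen(X, Y, cutoff=64):
        n = X.shape[0]
        if n <= cutoff:
            return X @ Y                      # base case: naive multiplication
        m = n // 2
        X11, X12, X21, X22 = X[:m, :m], X[:m, m:], X[m:, :m], X[m:, m:]
        Y11, Y12, Y21, Y22 = Y[:m, :m], Y[:m, m:], Y[m:, :m], Y[m:, m:]
        p1 = strassen(X11 + X22, Y11 + Y22, cutoff)
        p2 = strassen(X21 + X22, Y11, cutoff)
        p3 = strassen(X11, Y12 - Y22, cutoff)
        p4 = strassen(X22, Y21 - Y11, cutoff)
        p5 = strassen(X11 + X12, Y22, cutoff)
        p6 = strassen(X21 - X11, Y11 + Y12, cutoff)
        p7 = strassen(X12 - X22, Y21 + Y22, cutoff)
        return np.block([[p1 + p4 - p5 + p7, p3 + p5],
                         [p2 + p4, p1 - p2 + p3 + p6]])

    A = np.random.rand(256, 256)
    B = np.random.rand(256, 256)
    assert np.allclose(strassen(A, B), A @ B)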

Tensor rank
In general: bilinear forms b_1(X, Y), …, b_n(X, Y) in variables X = {x_1, …, x_k} and Y = {y_1, …, y_m}. Write
∑_{j=1}^n b_j z_j = ∑_{h=1}^k ∑_{i=1}^m ∑_{j=1}^n t_{h,i,j} x_h y_i z_j.
t = (t_{h,i,j}) ∈ K^k ⊗ K^m ⊗ K^n is the tensor corresponding to b_1, …, b_n.

Tensor rank
Definition: u ⊗ v ⊗ w ∈ U ⊗ V ⊗ W is called a triad or rank-one tensor.
Definition (Rank): R(t) is the smallest r such that there are rank-one tensors t_1, …, t_r with t = t_1 + … + t_r.
Lemma: Let t ∈ U ⊗ V ⊗ W and t' ∈ U' ⊗ V' ⊗ W'. Then
R(t ⊕ t') ≤ R(t) + R(t'),
R(t ⊗ t') ≤ R(t) R(t').

Sums and products
Direct sum: t ⊕ t' ∈ (U ⊕ U') ⊗ (V ⊕ V') ⊗ (W ⊕ W'). (As coefficient arrays: t and t' sit in two diagonal blocks.)
Tensor product: t ⊗ t' ∈ (U ⊗ U') ⊗ (V ⊗ V') ⊗ (W ⊗ W'). (As coefficient arrays: a Kronecker-type product.)
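Both constructions are easy to write down on coefficient arrays; here is a numpy sketch (my own illustration, not from the talk). The index grouping in the product follows the ((h,h'),(i,i'),(j,j')) convention used on the next slide.

    import numpy as np

    def direct_sum(t, s):
        """Block-diagonal embedding of t and s."""
        k, m, n = t.shape
        K, M, N = s.shape
        out = np.zeros((k + K, m + M, n + N))
        out[:k, :m, :n] = t
        out[k:, m:, n:] = s
        return out

    def tensor_product(t, s):
        """(t x s)[(h,h'),(i,i'),(j,j')] = t[h,i,j] * s[h',i',j']."""
        k, m, n = t.shape
        K, M, N = s.shape
        return np.einsum('abc,xyz->axbycz', t, s).reshape(k * K, m * M, n * N)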

Matrix multiplication tensor
Example: 2×2 matrix multiplication ⟨2, 2, 2⟩, with variables x_{11}, …, x_{22}, y_{11}, …, y_{22}, z_{11}, …, z_{22}.
In general: t_{(h,h'),(i,i'),(j,j')} = δ_{h',i} δ_{i',j} δ_{j',h}.
Lemma:
R(⟨k, m, n⟩) = R(⟨n, k, m⟩) = … = R(⟨n, m, k⟩).
⟨k, m, n⟩ ⊗ ⟨k', m', n'⟩ = ⟨kk', mm', nn'⟩.
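To make the definition concrete, the following sketch (my own illustration) builds the coefficient array of ⟨k, m, n⟩ and checks that contracting its first two slots with flattened matrices X and Y returns the entries of XY (transposed, because of the δ_{j',h} convention).

    import numpy as np

    def matmult_tensor(k, m, n):
        # entry ((h,h'),(i,i'),(j,j')) is 1 iff h' = i, i' = j, j' = h
        t = np.zeros((k * m, m * n, n * k))
        for h in range(k):
            for i in range(m):
                for j in range(n):
                    t[h * m + i, i * n + j, j * k + h] = 1
        return t

    k, m, n = 2, 3, 4
    t = matmult_tensor(k, m, n)
    X = np.random.rand(k, m)
    Y = np.random.rand(m, n)
    z = np.einsum('abc,a,b->c', t, X.reshape(-1), Y.reshape(-1))
    assert np.allclose(z.reshape(n, k), (X @ Y).T)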

Strassen's algorithm and tensors
Observation: Tensor product = Recursion.
Strassen's algorithm: ⟨2, 2, 2⟩^{⊗s} = ⟨2^s, 2^s, 2^s⟩, so R(⟨2, 2, 2⟩^{⊗s}) ≤ 7^s.
Definition (Exponent of matrix multiplication): ω = inf{τ : R(⟨n, n, n⟩) = O(n^τ)}.
Strassen: ω ≤ log_2 7 ≈ 2.81.
Lemma: If R(⟨k, m, n⟩) ≤ r, then ω ≤ 3 log r / log(kmn).

What next?
Maybe we can multiply 2×2 matrices with 6 multiplications?
Theorem (Winograd): R(⟨2, 2, 2⟩) ≥ 7.
Open question (not so open anymore): Is there a small tensor ⟨n, n, n⟩, say n ≤ 10, which gives a better bound on the exponent than Strassen?
Smirnov: R(⟨3, 3, 6⟩) ≤ 40 ⟹ ω ≤ 2.79.
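The lemma on the previous slide turns any such rank bound into an exponent bound. A two-line helper (my own illustration) evaluates it for the two bounds mentioned so far; Smirnov's numbers give about 2.774, consistent with the ≤ 2.79 stated above.

    import math

    def omega_bound(r, k, m, n):
        # Lemma: R(<k,m,n>) <= r  implies  omega <= 3 log r / log(kmn)
        return 3 * math.log(r) / math.log(k * m * n)

    print(omega_bound(7, 2, 2, 2))    # Strassen: ~2.807
    print(omega_bound(40, 3, 3, 6))   # Smirnov:  ~2.774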

Border rank (example)
Polynomial multiplication mod X²: (a_0 + a_1 X)(b_0 + b_1 X) ≡ a_0 b_0 + (a_1 b_0 + a_0 b_1) X (mod X²); call the two coefficient forms f_0 and f_1, and let t be the corresponding tensor.
Observation: R(t) = 3.
However, t can be approximated by tensors of rank 2:
t(ε) = (1, ε) ⊗ (1, ε) ⊗ (0, ε^{-1}) + (1, 0) ⊗ (1, 0) ⊗ (1, -ε^{-1}) = t + O(ε).
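Here is a sympy check of this rank-2 approximation (my own illustration; the particular pair of triads is one standard choice consistent with the slide). The difference t(ε) - t consists of a single entry of order ε.

    import sympy as sp
    import numpy as np

    eps = sp.symbols('epsilon')

    def triad(u, v, w):
        return np.array([[[ui * vj * wk for wk in w] for vj in v] for ui in u])

    # target tensor: coefficients of f0 = a0*b0 and f1 = a1*b0 + a0*b1
    t = np.zeros((2, 2, 2), dtype=object)
    t[0, 0, 0] = 1
    t[1, 0, 1] = 1
    t[0, 1, 1] = 1

    t_eps = triad([1, eps], [1, eps], [0, 1/eps]) + triad([1, 0], [1, 0], [1, -1/eps])
    diff = (t_eps - t).reshape(-1)
    assert all(sp.limit(sp.sympify(d), eps, 0) == 0 for d in diff)
    print("t(eps) = t + O(eps), so the border rank of t is at most 2")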

Proof of observation: restrictions
Definition: Let A: U → U', B: V → V', C: W → W' be homomorphisms.
(A ⊗ B ⊗ C)(u ⊗ v ⊗ w) = A(u) ⊗ B(v) ⊗ C(w), and (A ⊗ B ⊗ C)t = ∑_{i=1}^r A(u_i) ⊗ B(v_i) ⊗ C(w_i) for t = ∑_{i=1}^r u_i ⊗ v_i ⊗ w_i.
t' ≤ t if there are A, B, C such that t' = (A ⊗ B ⊗ C)t ("restriction").
Lemma: If t' ≤ t, then R(t') ≤ R(t). R(t) ≤ r iff t ≤ ⟨r⟩ (⟨r⟩ = diagonal of size r).

Proof of observation
The two slices of t are [1 0; 0 0] and [0 1; 1 0].
Let t = ∑_{i=1}^r u_i ⊗ v_i ⊗ w_i; then lin{w_1, …, w_r} = K². Assume that w_r = (1, 1).
Let C be the projection along lin{w_r} onto lin{(0, 1)}.
Then (I ⊗ I ⊗ C)t is (essentially) a matrix of rank 2, but it is also a sum of at most r - 1 triads, so r ≥ 3.

Border rank
Definition: Let h ∈ ℕ, t ∈ K^{k×m×n}.
1. R_h(t) = min{r : ∃ u_ρ ∈ K[ε]^k, v_ρ ∈ K[ε]^m, w_ρ ∈ K[ε]^n : ∑_{ρ=1}^r u_ρ ⊗ v_ρ ⊗ w_ρ = ε^h t + O(ε^{h+1})}.
2. R̲(t) = min_h R_h(t). R̲(t) is called the border rank of t.
Bini, Capovani, Lotti, Romani: R̲(⟨2, 2, 3⟩) ≤ 10.
Lemma: If R̲(⟨k, m, n⟩) ≤ r, then ω ≤ 3 log r / log(kmn).
Corollary: ω ≤ 2.79.

Schönhage's τ-theorem
Schönhage: R̲(⟨k, 1, n⟩ ⊕ ⟨1, (k-1)(n-1), 1⟩) ≤ kn + 1.
Theorem (Schönhage's τ-theorem): If R̲(⊕_{i=1}^p ⟨k_i, m_i, n_i⟩) ≤ r with r > p, then ω ≤ 3τ, where τ is defined by ∑_{i=1}^p (k_i m_i n_i)^τ = r.
Corollary: ω ≤ 2.55.
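To see where the 2.55 comes from, here is a small bisection solver for τ (my own illustration). The instantiation k = n = 4 is my choice: Schönhage's example then reads ⟨4,1,4⟩ ⊕ ⟨1,9,1⟩ with border rank at most 4·4 + 1 = 17.

    import math

    def tau_theorem(blocks, r):
        # solve sum_i (k_i*m_i*n_i)^tau = r for tau by bisection, return 3*tau
        vols = [k * m * n for (k, m, n) in blocks]
        lo, hi = 0.0, 3.0
        for _ in range(200):
            mid = (lo + hi) / 2
            if sum(v ** mid for v in vols) > r:
                hi = mid
            else:
                lo = mid
        return 3 * lo

    print(tau_theorem([(4, 1, 4), (1, 9, 1)], 17))   # ~2.548, i.e. omega <= 2.55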

Strassen's tensor
Str = ∑_{i=1}^q (e_i ⊗ e_0 ⊗ e_i + e_0 ⊗ e_i ⊗ e_i)    (the two groups of triads form ⟨q, 1, 1⟩ and ⟨1, 1, q⟩)
    = (1/ε) ∑_{i=1}^q (e_0 + ε e_i) ⊗ (e_0 + ε e_i) ⊗ e_i - (1/ε) e_0 ⊗ e_0 ⊗ ∑_{i=1}^q e_i + O(ε).
Hence R̲(Str) ≤ q + 1.
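The approximation with q + 1 triads can be checked mechanically for small q; the sympy sketch below is my own illustration of exactly that identity.

    import sympy as sp
    import numpy as np

    def triad(u, v, w):
        return np.array([[[ui * vj * wk for wk in w] for vj in v] for ui in u])

    q = 3
    eps = sp.symbols('epsilon')
    e = [np.eye(q + 1, dtype=object)[i] for i in range(q + 1)]   # e[0], ..., e[q]

    Str = sum(triad(e[i], e[0], e[i]) + triad(e[0], e[i], e[i]) for i in range(1, q + 1))

    approx = sum(triad(e[0] + e[i] * eps, e[0] + e[i] * eps, e[i]) for i in range(1, q + 1))
    approx = (approx - triad(e[0], e[0], sum(e[1:]))) / eps      # q + 1 triads in total

    diff = (approx - Str).reshape(-1)
    assert all(sp.limit(sp.sympify(d), eps, 0) == 0 for d in diff)
    print("border rank of Str is at most q + 1 =", q + 1)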

Rank versus border rank
Theorem: R(Str) = 2q.
Proof idea: Let Str = ∑_{i=1}^r u_i ⊗ v_i ⊗ w_i. W.l.o.g. assume that u_r ∉ lin{e_0, …, e_{q-1}}.
Let A be the projection along u_r onto lin{e_0, …, e_{q-1}}, and let B be the projection along e_q onto lin{e_0, …, e_{q-1}}.
Then R((A ⊗ I ⊗ B)Str) ≤ R(Str) - 1, and (A ⊗ I ⊗ B)Str is like Str with one inner tensor now being ⟨q-1, 1, 1⟩.
Do this q times and kill q triads. We are left with a matrix of rank q, so R(Str) ≥ 2q.
Gap of almost 2 between rank and border rank (2q versus q + 1).

Laser method
Think of Strassen's tensor as having an outer and an inner structure: cut Str into (combinatorial) cuboids!
Inner tensors: ⟨q, 1, 1⟩ and ⟨1, 1, q⟩.
Outer structure: put a 1 in every cuboid that is nonzero; this gives ⟨1, 2, 1⟩.
(Str ⊗ π Str ⊗ π² Str)^{⊗s} has inner tensors ⟨x, y, z⟩ with xyz = q^{3s} and outer tensor ⟨2^s, 2^s, 2^s⟩ (here π cyclically permutes the three factors).

Degeneration
Definition:
1. Let t = ∑_{ρ=1}^r u_ρ ⊗ v_ρ ⊗ w_ρ ∈ K^{k×m×n}, A(ε) ∈ K[ε]^{k'×k}, B(ε) ∈ K[ε]^{m'×m}, and C(ε) ∈ K[ε]^{n'×n}. Define (A(ε) ⊗ B(ε) ⊗ C(ε))t = ∑_{ρ=1}^r A(ε)u_ρ ⊗ B(ε)v_ρ ⊗ C(ε)w_ρ.
2. t' is a degeneration of t ∈ K^{k×m×n} (t' ⊴ t) if there are A(ε), B(ε), C(ε), and q such that ε^q t' = (A(ε) ⊗ B(ε) ⊗ C(ε))t + O(ε^{q+1}).
Remark: R̲(t) ≤ r iff t ⊴ ⟨r⟩.
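A tiny degeneration in the sense of this definition (my own illustration): the rank-3 tensor W = e_1⊗e_0⊗e_0 + e_0⊗e_1⊗e_0 + e_0⊗e_0⊗e_1, which is the border-rank example from a few slides back up to relabeling, is a degeneration of the diagonal ⟨2⟩ with q = 1. The matrices A(ε), B(ε), C(ε) below are my own choice.

    import sympy as sp
    import numpy as np

    eps = sp.symbols('epsilon')

    def apply_maps(A, B, C, t):
        # (A x B x C)t, computed triad by triad on the standard basis
        out = np.zeros((A.shape[0], B.shape[0], C.shape[0]), dtype=object)
        k, m, n = t.shape
        for a in range(k):
            for b in range(m):
                for c in range(n):
                    if t[a, b, c] != 0:
                        cube = np.multiply.outer(np.multiply.outer(A[:, a], B[:, b]), C[:, c])
                        out = out + t[a, b, c] * cube
        return out

    diag2 = np.zeros((2, 2, 2), dtype=object)
    diag2[0, 0, 0] = diag2[1, 1, 1] = 1

    W = np.zeros((2, 2, 2), dtype=object)
    W[1, 0, 0] = W[0, 1, 0] = W[0, 0, 1] = 1

    A = np.array([[-1, 1], [0, eps]], dtype=object)   # columns: images of e0, e1
    B = C = np.array([[1, 1], [0, eps]], dtype=object)

    lhs = apply_maps(A, B, C, diag2)                  # = eps*W + O(eps^2)
    diff = (lhs / eps - W).reshape(-1)
    assert all(sp.limit(sp.sympify(d), eps, 0) == 0 for d in diff)
    print("W is a degeneration of <2>, hence R̲(W) <= 2 while R(W) = 3")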

Laser method (2)
A degeneration (A(ε), B(ε), C(ε)) is called monomial if all entries are monomials.
Lemma (Strassen): ⟨(3/4)n²⟩ ⊴ ⟨n, n, n⟩ by a monomial degeneration.
Inner tensors ⟨x, y, z⟩ with xyz = q^{3s}, outer tensor ⟨2^s, 2^s, 2^s⟩ ⟹ (3/4)·2^{2s} independent matrix products ⟨x, y, z⟩ with xyz = q^{3s}.
Now apply the τ-theorem!
Corollary (Strassen): ω ≤ 2.48.

Coppersmith-Winograd tensor
CW = ∑_{i=1}^q (e_0 ⊗ e_i ⊗ e_i + e_i ⊗ e_0 ⊗ e_i + e_i ⊗ e_i ⊗ e_0), and
ε⁵ CW = ε ∑_{i=1}^q (e_0 + ε² e_i) ⊗ (e_0 + ε² e_i) ⊗ (e_0 + ε² e_i)
        - (e_0 + ε³ ∑_{i=1}^q e_i) ⊗ (e_0 + ε³ ∑_{i=1}^q e_i) ⊗ (e_0 + ε³ ∑_{i=1}^q e_i)
        + (1 - qε) e_0 ⊗ e_0 ⊗ e_0 + O(ε⁶).
Hence R̲(CW) ≤ q + 2.
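The ε⁵ identity with its q + 2 triads can again be verified mechanically for small q; the sympy sketch below is my own illustration.

    import sympy as sp
    import numpy as np

    def triad(u, v, w):
        return np.array([[[ui * vj * wk for wk in w] for vj in v] for ui in u])

    q = 3
    eps = sp.symbols('epsilon')
    e = [np.eye(q + 1, dtype=object)[i] for i in range(q + 1)]
    s = sum(e[1:])

    CW = sum(triad(e[0], e[i], e[i]) + triad(e[i], e[0], e[i]) + triad(e[i], e[i], e[0])
             for i in range(1, q + 1))

    approx = sum(triad(e[0] + e[i] * eps**2, e[0] + e[i] * eps**2, e[0] + e[i] * eps**2)
                 for i in range(1, q + 1)) * eps
    approx = approx - triad(e[0] + s * eps**3, e[0] + s * eps**3, e[0] + s * eps**3)
    approx = approx + triad(e[0], e[0], e[0]) * (1 - q * eps)    # q + 2 triads in total

    diff = (approx / eps**5 - CW).reshape(-1)
    assert all(sp.limit(sp.sympify(d), eps, 0) == 0 for d in diff)
    print("border rank of CW is at most q + 2 =", q + 2)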

Coppersmith-Winograd tensor
Remark (last time, a doable open question): R(CW) = 2q + 1.

Laser method (3)
CW has inner structure ⟨q, 1, 1⟩, ⟨1, q, 1⟩, ⟨1, 1, q⟩ and outer structure supported on {(0,1,1), (1,0,1), (1,1,0)}.
There is a general method to degenerate large diagonals from arbitrary tensors; apply it to the outer tensor.
Corollary (Coppersmith & Winograd): ω ≤ 2.41.
Coppersmith & Winograd, Stothers, Vassilevska Williams, Le Gall: ω ≤ 2.37…

What did we learn so far?
How to multiply matrices of astronomic sizes fast!
If we want to multiply matrices of astronomic sizes even faster, we need tensors with border rank close to max{dim U, dim V, dim W} and with a rich structure, or completely new methods.

Cheap approaches that do not work (not yet?)
Main tools: known bounds on the rank R and border rank R̲ of small matrix multiplication tensors:
⟨2, 2, 2⟩:  R = 7,         R̲ = 7
⟨2, 2, 3⟩:  R = 11,        R̲ ∈ [9, 10]
⟨2, 2, 4⟩:  R = 14,        R̲ ∈ [12, 14]
⟨2, 3, 3⟩:  R ∈ [14, 15],  R̲ ∈ [10, 15]
⟨3, 3, 3⟩:  R ∈ [19, 23],  R̲ ∈ [15, 20]
Rank: substitution method (Pan), de Groote's twist of it.
Border rank: vanishing equations (Strassen, Lickteig, Landsberg & Ottaviani), in combination with the substitution method (Landsberg & Michalek, B & Lysikov).
Did not find any upper bounds.

Characterization problem
Definition: S_n(q) = {t ∈ K^n ⊗ K^n ⊗ K^n : R(t) ≤ q}, X_n(q) = {t ∈ K^n ⊗ K^n ⊗ K^n : R̲(t) ≤ q}.
These definitions are in complexity-theoretic terms. We need algebraic terms.
But: {t : t ⊴ ⟨q⟩} is not very useful. We need easy-to-check algebraic criteria.
Remark: all tensors considered are tight.

S_n(n)
Theorem: t ∈ S_n(n) iff t ≅ ⟨n⟩.
The multiplication in any finite-dimensional algebra A can be described by a set of bilinear forms → tensor t_A.
Example: A_ε = K[X]/(X^n - ε) ≅ K^n for ε ≠ 0; A_ε degenerates to K[X]/(X^n), and R(K[X]/(X^n)) = 2n - 1.
Theorem (Alder-Strassen): R(A) ≥ 2 dim A - (number of maximal two-sided ideals).
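For concreteness, here is how the structure tensor of the algebra A = K[X]/(X^n) looks in code (my own illustration): entry (i, j, k) is 1 iff X^i · X^j = X^k in A, i.e. i + j = k < n. Contracting with two coefficient vectors reproduces truncated polynomial multiplication, and the Alder-Strassen bound 2n - 1 (one maximal ideal) is exactly the rank stated above.

    import numpy as np

    def structure_tensor_truncated_poly(n):
        t = np.zeros((n, n, n))
        for i in range(n):
            for j in range(n):
                if i + j < n:
                    t[i, j, i + j] = 1
        return t

    n = 4
    t = structure_tensor_truncated_poly(n)
    a = np.random.rand(n)                   # coefficients of a polynomial mod X^n
    b = np.random.rand(n)
    ab = np.einsum('ijk,i,j->k', t, a, b)   # product in A
    assert np.allclose(ab, np.convolve(a, b)[:n])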

X_n(n)
Definition: Let t ∈ U ⊗ V ⊗ W. t is U-generic (V-generic, W-generic) if the U-slices (V-slices, W-slices) contain an invertible element.
Proposition (B & Lysikov): Let t be U- and V-generic. Then there is an algebra A with structural tensor t_A such that t_A ≅ t.

X_n(n)
Theorem (B & Lysikov): Let A and B be algebras with tensors t_A and t_B. Then t_A ⊴_{GL_n^3} t_B iff t_A ⊴_{GL_n} t_B.
Theorem (B & Lysikov): Let t be U- and V-generic. Then t ∈ X_n(n) iff there is an algebra A such that t_A ≅ t and t_A ⊴_{GL_n} ⟨n⟩.

Smoothable algebras
Definition: An algebra A of dimension n of the form K[X_1, …, X_m]/I for some ideal I is called smoothable if I is a degeneration of some ideal whose zero set consists of n distinct points.
Theorem (B & Lysikov): Let t be U- and V-generic. Then t ∈ X_n(n) iff there is a smoothable algebra A such that t_A ≅ t.

Examples
Cartwright et al.: All (commutative) algebras of dimension ≤ 7 are smoothable.
All algebras generated by two elements are smoothable.
All algebras with dim rad(A)²/rad(A)³ = 1 are smoothable.
All algebras defined by a monomial ideal are smoothable.
Str^+ has minimal border rank. Its structural tensor is isomorphic to K[X_1, …, X_q]/(X_i X_j : 1 ≤ i, j ≤ q).
CW^+ has minimal border rank. Its structural tensor is isomorphic to K[X_1, …, X_{q+1}]/(X_i X_j, X_i² - X_j², X_i³ : i ≠ j).

Comon's conjecture
Symmetric tensor: invariant under permutation of the dimensions.
Symmetric rank: use symmetric rank-one tensors.
Conjecture (Comon): For symmetric tensors, the rank equals the symmetric rank.
Proposition: The border rank Comon conjecture is true for 1-generic tensors of minimal border rank.

X_n(n + 1)
Theorem: t ∈ S_n(n + 1) \ S_n(n) iff t is isomorphic to the multiplication tensor of one of the algebras K[X]/(X²) × K^{n-2} or T_2 × K^{n-3}, where T_2 is the algebra of upper triangular 2×2 matrices.
Open question (doable): What about X_n(n + 1) (for 1-generic tensors)?

The asymptotic rank of CW
We know that R̲(CW_q) = q + 2.
For fast matrix multiplication, good upper bounds on R̲(CW_q^{⊗N}) are sufficient.
In particular, R̲(CW_2^{⊗N})^{1/N} → 3 implies ω = 2.
Theorem (B. & Lysikov): R̲(CW_q^{⊗N}) ≥ (q + 1)^N + 2^N.