Discovering Fast Matrix Multiplication Algorithms via Tensor Decomposition
1 Discovering Fast Matrix Multiplication Algorithms via Tensor Decomposition
Grey Ballard
SIAM Conference on Computational Science & Engineering, March 2017
2 Collaborators
This is joint work with...
Austin Benson - Stanford University
Christian Ikenmeyer - Max Planck Institute for Informatics
Tammy Kolda - Sandia National Laboratories
J.M. Landsberg - Texas A&M University
Kathryn Rouse - Wake Forest University
Nick Ryder - University of California, Berkeley
Ballard 4
3 Fast Matrix Multiplication
Definition: Fast matrix multiplication algorithms require o(n^3) arithmetic operations to multiply n x n matrices.
Examples:
Strassen [Str69]: O(n^(log2 7)) = O(n^2.81)
Coppersmith-Winograd [CW87]: O(n^2.376)
Le Gall [LG14]: O(n^2.3728639)
4 Strassen's Algorithm
Strassen's algorithm uses 7 multiplies for 2 x 2 multiplication:
[C11 C12; C21 C22] = [A11 A12; A21 A22] [B11 B12; B21 B22]

Classical Algorithm (8 multiplies):
M1 = A11*B11, M2 = A12*B21, M3 = A11*B12, M4 = A12*B22,
M5 = A21*B11, M6 = A22*B21, M7 = A21*B12, M8 = A22*B22
C11 = M1 + M2, C12 = M3 + M4, C21 = M5 + M6, C22 = M7 + M8

Strassen's Algorithm (7 multiplies):
M1 = (A11 + A22)*(B11 + B22)
M2 = (A21 + A22)*B11
M3 = A11*(B12 - B22)
M4 = A22*(B21 - B11)
M5 = (A11 + A12)*B22
M6 = (A21 - A11)*(B11 + B12)
M7 = (A12 - A22)*(B21 + B22)
C11 = M1 + M4 - M5 + M7
C12 = M3 + M5
C21 = M2 + M4
C22 = M1 - M2 + M3 + M6
5 Strassen's Algorithm
Strassen's algorithm uses 7 multiplies for 2 x 2 multiplication:
[C11 C12; C21 C22] = [A11 A12; A21 A22] [B11 B12; B21 B22]
For n x n matrices, we split into quadrants and use recursion.
Flop count recurrence:
F(n) = 7*F(n/2) + O(n^2), F(1) = 1  =>  F(n) = O(n^(log2 7))
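The recursion described above can be sketched directly. The following is an illustrative NumPy implementation (not from the talk), assuming n is a power of 2 and recursing all the way down to 1 x 1 blocks:

```python
import numpy as np

def strassen(A, B):
    """Multiply square matrices via Strassen's recursion.

    Assumes n is a power of 2; the recursion bottoms out at 1x1,
    matching the recurrence F(n) = 7 F(n/2) + O(n^2).
    """
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])
```

In practice one would stop the recursion at a larger base-case size and call a tuned classical kernel, since the O(n^2) additions dominate for small blocks.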
6 Recursion allows us to focus on base case
2 x 2 base case:
[a11 a12; a21 a22] [b11 b12; b21 b22] = [c11 c12; c21 c22]

multiplies | flop count
6          | O(n^2.58)
7          | O(n^2.81)
8          | O(n^3)
8 Recursion allows us to focus on base case
2 x 2 base case:
[a11 a12; a21 a22] [b11 b12; b21 b22] = [c11 c12; c21 c22]

multiplies | flop count
6          | O(n^2.58)
7          | O(n^2.81)
8          | O(n^3)

3 x 3 base case:
[a11 a12 a13; a21 a22 a23; a31 a32 a33] [b11 b12 b13; b21 b22 b23; b31 b32 b33] = [c11 c12 c13; c21 c22 c23; c31 c32 c33]

multiplies | flop count
19         | O(n^2.68)
21         | O(n^2.77)
23         | O(n^2.85)
27         | O(n^3)
9 Searching for a base case algorithm
Finding a base case algorithm corresponds to computing an exact CP decomposition of a particular 3D tensor:
T = sum_{r=1}^{R} u_r o v_r o w_r
*Note the = sign: we're not looking for an approximation
10 2 x 2 matrix multiplication as a tensor operation
A*B = C, i.e. [a11 a12; a21 a22] [b11 b12; b21 b22] = [c11 c12; c21 c22],
is equivalent to
T x1 [a11; a21; a12; a22] x2 [b11; b21; b12; b22] = [c11; c21; c12; c22]
where T is a 4 x 4 x 4 tensor with the following slices:
T1 = [1 0 0 0; 0 0 0 0; 0 1 0 0; 0 0 0 0]
T2 = [0 0 0 0; 1 0 0 0; 0 0 0 0; 0 1 0 0]
T3 = [0 0 1 0; 0 0 0 0; 0 0 0 1; 0 0 0 0]
T4 = [0 0 0 0; 0 0 1 0; 0 0 0 0; 0 0 0 1]
11 2 x 2 matrix multiplication as a tensor operation
A*B = C, i.e. [a11 a12; a21 a22] [b11 b12; b21 b22] = [c11 c12; c21 c22],
is equivalent to
T x1 [a11; a21; a12; a22] x2 [b11; b21; b12; b22] = [c11; c21; c12; c22]
For example, the second slice gives:
[a11 a21 a12 a22] T2 [b11; b21; b12; b22] = a21*b11 + a22*b21 = c21
12 General matrix multiplication tensor
⟨M, P, N⟩ means multiplying an M x P matrix A by a P x N matrix B to get an M x N matrix C.
The matrix multiplication tensor T is MP x PN x MN.
The number of 1s in T is MPN.
⟨M, P, N⟩, ⟨N, M, P⟩, ⟨P, N, M⟩, ⟨P, M, N⟩, ⟨M, N, P⟩, ⟨N, P, M⟩ are all equivalent.
13 Matrix multiplication using low-rank decomposition
Here's the matrix multiplication as a tensor operation again:
T x1 a x2 b = c
Here's our low-rank decomposition:
T = sum_{r=1}^{R} u_r o v_r o w_r
Here's an encoding of our matrix multiplication algorithm:
T x1 a x2 b = sum_{r=1}^{R} (u_r o v_r o w_r) x1 a x2 b = sum_{r=1}^{R} (a^T u_r)(b^T v_r) w_r = c
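The formula above is directly executable. Below is an illustrative sketch (not from the talk) that evaluates c = sum_r (a^T u_r)(b^T v_r) w_r using Strassen's factor matrices for the 2 x 2 case; the column-major vec ordering (A11, A21, A12, A22) is my assumption, and the slides' ordering may differ:

```python
import numpy as np

# Strassen's factor matrices for <2,2,2>; rows follow the column-major
# vec ordering (A11, A21, A12, A22), columns index the multiplies M1..M7.
U = np.array([[ 1,  0,  1,  0,  1, -1,  0],
              [ 0,  1,  0,  0,  0,  1,  0],
              [ 0,  0,  0,  0,  1,  0,  1],
              [ 1,  1,  0,  1,  0,  0, -1]])
V = np.array([[ 1,  1,  0, -1,  0,  1,  0],
              [ 0,  0,  0,  1,  0,  0,  1],
              [ 0,  0,  1,  0,  0,  1,  0],
              [ 1,  0, -1,  0,  1,  0,  1]])
W = np.array([[ 1,  0,  0,  1, -1,  0,  1],
              [ 0,  1,  0,  1,  0,  0,  0],
              [ 0,  0,  1,  0,  1,  0,  0],
              [ 1, -1,  1,  0,  0,  1,  0]])

def multiply_from_factors(A, B, U, V, W):
    """Compute C = A*B via c = sum_r (u_r^T a)(v_r^T b) w_r."""
    a = A.flatten(order="F")       # column-major vec(A)
    b = B.flatten(order="F")
    m = (U.T @ a) * (V.T @ b)      # the R scalar products M_r
    c = W @ m                      # linear combinations forming vec(C)
    return c.reshape(A.shape, order="F")
```

The same function works unchanged for any (U, V, W) triple that exactly decomposes the matrix multiplication tensor, which is the whole point of the tensor formulation.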
14 Connection between factor matrices and algorithm
Strassen's algorithm:
M1 = (A11 + A22)*(B11 + B22)
M2 = (A21 + A22)*B11
M3 = A11*(B12 - B22)
M4 = A22*(B21 - B11)
M5 = (A11 + A12)*B22
M6 = (A21 - A11)*(B11 + B12)
M7 = (A12 - A22)*(B21 + B22)
C11 = M1 + M4 - M5 + M7
C12 = M3 + M5
C21 = M2 + M4
C22 = M1 - M2 + M3 + M6
Strassen's factor matrices (columns indexed by M1, ..., M7; rows by the column-major vec orderings A11, A21, A12, A22 / B11, B21, B12, B22 / C11, C21, C12, C22):
U = [ 1  0  1  0  1 -1  0
      0  1  0  0  0  1  0
      0  0  0  0  1  0  1
      1  1  0  1  0  0 -1 ]
V = [ 1  1  0 -1  0  1  0
      0  0  0  1  0  0  1
      0  0  1  0  0  1  0
      1  0 -1  0  1  0  1 ]
W = [ 1  0  0  1 -1  0  1
      0  1  0  1  0  0  0
      0  0  1  0  1  0  0
      1 -1  1  0  0  1  0 ]
The U, V, W matrices encode the algorithm.
16 How do you solve it?
Problem: Find U, V, W such that T = sum_r u_r o v_r o w_r
the problem is NP-complete (for general T)
many combinatorial formulations of the problem
efficient numerical methods can compute low-rank approximations
typical approach is alternating least squares (ALS)
  pitfall: getting stuck at local minima with residual > 0
  pitfall: facing ill-conditioned linear least squares problems
  pitfall: numerical solution is good only to machine precision
we seek exact, discrete, and sparse solutions
17 Alternating least squares with regularization
Most successful scheme due to Smirnov [Smi13]. Repeat:
1. U = argmin_U || T_(U) - U (W kr V)^T ||_F^2 + lambda || U - Utilde ||_F^2
2. V = argmin_V || T_(V) - V (W kr U)^T ||_F^2 + lambda || V - Vtilde ||_F^2
3. W = argmin_W || T_(W) - W (V kr U)^T ||_F^2 + lambda || W - Wtilde ||_F^2
until convergence (kr denotes the Khatri-Rao product, T_(U) the corresponding unfolding of T).
The art of the optimization scheme is in tinkering with lambda, Utilde, Vtilde, Wtilde (each iteration).
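The updates above can be sketched as follows. This is an illustrative NumPy version, not Smirnov's implementation; as a simplification the targets Utilde, Vtilde, Wtilde are taken to be the previous iterates (a proximal choice), whereas the talk stresses that tinkering with them per iteration is the real art:

```python
import numpy as np

def khatri_rao(A, B):
    """Khatri-Rao product: column r is kron(A[:, r], B[:, r])."""
    return np.einsum("ir,jr->ijr", A, B).reshape(-1, A.shape[1])

def regularized_step(T_unf, K, lam, F_target):
    """Solve min_F ||T_unf - F K^T||_F^2 + lam ||F - F_target||_F^2."""
    G = K.T @ K + lam * np.eye(K.shape[1])
    return np.linalg.solve(G.T, (T_unf @ K + lam * F_target).T).T

def matmul_tensor(n):
    """<n,n,n> tensor: entry (a,b,c) is 1 when vec(A)[a]*vec(B)[b]
    contributes to vec(C)[c], with column-major vec ordering."""
    T = np.zeros((n * n, n * n, n * n))
    for i in range(n):
        for j in range(n):
            for l in range(n):
                T[i + n * j, j + n * l, i + n * l] = 1.0
    return T

def als(T, R, lam, iters, seed=0):
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((I, R))
    V = rng.standard_normal((J, R))
    W = rng.standard_normal((K, R))
    # Mode-wise unfoldings consistent with T_(U) = U (W kr V)^T, etc.
    T1 = T.transpose(0, 2, 1).reshape(I, K * J)
    T2 = T.transpose(1, 2, 0).reshape(J, K * I)
    T3 = T.transpose(2, 1, 0).reshape(K, J * I)
    for _ in range(iters):
        U = regularized_step(T1, khatri_rao(W, V), lam, U)  # target = previous U
        V = regularized_step(T2, khatri_rao(W, U), lam, V)
        W = regularized_step(T3, khatri_rao(V, U), lam, W)
    return U, V, W

def residual(T, U, V, W):
    return np.linalg.norm(T - np.einsum("ir,jr,kr->ijk", U, V, W))
```

With this proximal choice of target, each exact subproblem solve cannot increase the residual, so the fit is monotone; the pitfalls listed on the previous slide (local minima, ill-conditioning, machine-precision accuracy) are exactly what this sketch runs into on harder cases.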
18 Discovered algorithms (a subset)

Algorithm          Multiplies (fast)  Multiplies (classical)  Speedup per recursive step  Exponent
⟨2,2,3⟩ [BB15]     11                 12                      9%                          2.89
⟨2,2,5⟩ [BB15]     18                 20                      11%                         2.89
⟨2,2,2⟩ [Str69]    7                  8                       14%                         2.81
⟨2,2,4⟩ [BB15]     14                 16                      14%                         2.85
⟨3,3,3⟩ [BB15]     23                 27                      17%                         2.85
⟨2,3,3⟩ [BB15]     15                 18                      20%                         2.81
⟨2,3,4⟩ [BB15]     20                 24                      20%                         2.83
⟨2,4,4⟩ [BB15]     26                 32                      23%                         2.82
⟨3,3,4⟩ [BB15]     29                 36                      24%                         2.82
⟨3,4,4⟩ [Smi17]    38                 48                      26%                         2.82
⟨3,3,6⟩ [Smi13]    40                 54                      35%                         2.77
⟨2,2,3⟩* [BCRL79]  10                 12                      20%                         2.78
⟨3,3,3⟩* [Sch81]   21                 27                      29%                         2.77
20 Are these algorithms faster in practice?
[Figure: Effective GFLOPS vs. dimension n for square matrix multiplication [BB15], comparing MKL, STRASSEN, BINI, ⟨3,2,3⟩, ⟨3,3,3⟩, ⟨4,2,4⟩, and ⟨3,3,6⟩.]
21 Are these algorithms faster in practice?
[Figure from [HRMvdG17]: Effective GFLOPS (2mnk/time) vs. k for rectangular matrix multiplication with m = n = 14400, one level of recursion, one core, comparing one step of many ⟨m,p,n⟩ algorithms (⟨2,2,2⟩, ⟨2,3,2⟩, ⟨2,3,4⟩, ..., ⟨6,3,3⟩) against MKL and BLIS.]
22 How big can we go?
Current numerical techniques are hitting their limits:
tensor size grows like N^6 if M = P = N
number of variables grows faster than 3N^4 if M = P = N
Nothing new has been found for ⟨4,4,4⟩ (Strassen's ⟨2,2,2⟩ algorithm can be used twice).
Can we exploit properties particular to matrix multiplication?
25 Cyclic symmetry of square matrix multiplication
Let M be the matrix multiplication tensor for M = P = N.
M has cyclic symmetry: m_ijk = m_kij = m_jki
This means if (U, V, W) is a solution, then so are (W, U, V) and (V, W, U).
Is this property reflected in the low-rank decomposition?
sum_r u_r o v_r o w_r  vs.  sum_r w_r o u_r o v_r  vs.  sum_r v_r o w_r o u_r ?
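The symmetry is easy to check numerically. The sketch below (mine, not from the talk) builds the ⟨N,N,N⟩ tensor in the ordering where the third mode indexes the transpose of C; that ordering convention is my assumption, and it is what makes the cyclic symmetry hold exactly:

```python
import numpy as np

def matmul_tensor_cyclic(n):
    """<n,n,n> tensor with entry ((i,j), (j,l), (l,i)) = 1, linearizing
    matrix position (p, q) as p + n*q; mode 3 then indexes C transposed."""
    M = np.zeros((n * n, n * n, n * n), dtype=int)
    for i in range(n):
        for j in range(n):
            for l in range(n):
                M[i + n * j, j + n * l, l + n * i] = 1
    return M

M = matmul_tensor_cyclic(3)
# Rotating the three modes leaves the tensor unchanged: m_ijk = m_kij = m_jki.
cyclic = (np.array_equal(M, M.transpose(2, 0, 1))
          and np.array_equal(M, M.transpose(1, 2, 0)))
```

As a side check, M.sum() equals N^3 = MPN, the count of 1s from slide 12.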
27 Cyclic invariance of Strassen's algorithm
[Strassen's factor matrices U, V, W, as shown on slide 14.]
If you cyclically permute U, V, W, you get the same rank-one components in a different order.
28 Searching for cyclic-invariant solutions
U = [A B C D], V = [A C D B], W = [A D B C]
Each factor matrix is N^2 x R; block A has S columns and blocks B, C, D have T columns each, so
R = S + 3T
29 Cyclic-invariant decompositions
Decomposition with no constraints:
T = sum_{r=1}^{R} u_r o v_r o w_r
Decomposition with cyclic-invariant constraint:
T = sum_{s=1}^{S} a_s o a_s o a_s + sum_{t=1}^{T} (b_t o c_t o d_t + d_t o b_t o c_t + c_t o d_t o b_t)
Number of variables reduced by a factor of 3, but the expression is no longer multilinear (not linear in A).
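As a concrete check of this form (my own regrouping, not from the slides): Strassen's ⟨2,2,2⟩ solution is cyclic invariant with S = 1 and T = 2, so R = 1 + 3*2 = 7. Summing the symmetric component a o a o a plus the three rotations of the two (b, c, d) triples reproduces the matrix multiplication tensor in the C-transposed ordering used above:

```python
import numpy as np

def outer3(x, y, z):
    return np.einsum("i,j,k->ijk", x, y, z)

# Strassen's rank-7 decomposition regrouped into cyclic-invariant form,
# with 2x2 matrices vectorized column-major (position (p,q) -> p + 2q).
a = np.array([1, 0, 0, 1])                  # symmetric component (A11 + A22 pattern)
triples = [
    (np.array([0, 1, 0, 1]),                # b: A21 + A22 pattern
     np.array([1, 0, 0, 0]),                # c: B11 pattern
     np.array([0, 0, 1, -1])),              # d: output coefficients
    (np.array([0, 0, 0, 1]),
     np.array([-1, 1, 0, 0]),
     np.array([1, 0, 1, 0])),
]

T_hat = outer3(a, a, a)
for b, c, d in triples:
    T_hat = T_hat + outer3(b, c, d) + outer3(d, b, c) + outer3(c, d, b)

# Reference: <2,2,2> tensor with mode 3 indexing C transposed.
T_mm = np.zeros((4, 4, 4), dtype=int)
for i in range(2):
    for j in range(2):
        for l in range(2):
            T_mm[i + 2 * j, j + 2 * l, l + 2 * i] = 1
```

The search problem on this slide is exactly to find such a, b, c, d vectors for larger cases, with far fewer unknowns than an unconstrained rank-R search.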
30 New rank-23 cyclic-invariant solutions for ⟨3,3,3⟩
U = [A B C D], V = [A C D B], W = [A D B C]
Rank 23 is the best known exact rank for ⟨3,3,3⟩; many previous solutions exist, but none are cyclic invariant.
We computed cyclic-invariant solutions with S = 2, 5, ...
32 What about ⟨4,4,4⟩?
Performing two steps of Strassen's algorithm yields a rank-49 cyclic-invariant solution.
No exact decomposition of rank < 49 is known, cyclic invariant or otherwise.
No success yet in computing new cyclic-invariant solutions, but the truth is out there.
33 Summary
Discovering fast matrix multiplication algorithms corresponds to computing an exact CP decomposition.
Any new algorithm can be implemented efficiently.
Exploiting symmetry of matrix multiplication can reduce the size of the problem; we want to tackle ⟨4,4,4⟩.
Still exploring how to use more structure to scale up to larger dimensions.
34 Thank You!
Discovering Fast Matrix Multiplication Algorithms via Tensor Decomposition
Grey Ballard
36 More structure...
Transposition symmetry:
T' = T x1 P x2 P x3 P, where t'_ijk = t_jik and P is the vec-permutation (perfect shuffle) matrix.
This is derived from the fact that AB = C implies B^T A^T = C^T.
38 More structure...
Multiplication invariance: T is unchanged under mode-wise multiplication by Kronecker-product factors built from nonsingular matrices X, Y, Z (factors of the form Y^T (x) X in each mode).
This is derived from the fact that AB = C implies (XAY)(Y^{-1}BZ) = XCZ.
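One way to see this invariance in action (an illustrative sketch, not from the talk; transformed_multiply is a hypothetical helper name): any nonsingular X, Y, Z turn one fast algorithm into another, because C can be recovered from the product (XAY)(Y^{-1}BZ) = XCZ while still spending only 7 multiplies in the bilinear step:

```python
import numpy as np

def strassen2x2(A, B):
    """One step of Strassen's algorithm for 2x2 matrices (7 multiplies)."""
    M1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
    M2 = (A[1, 0] + A[1, 1]) * B[0, 0]
    M3 = A[0, 0] * (B[0, 1] - B[1, 1])
    M4 = A[1, 1] * (B[1, 0] - B[0, 0])
    M5 = (A[0, 0] + A[0, 1]) * B[1, 1]
    M6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
    M7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])
    return np.array([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

def transformed_multiply(A, B, X, Y, Z):
    """A 'new' 7-multiply algorithm obtained from Strassen's via nonsingular
    X, Y, Z: compute (XAY)(Y^-1 B Z) = X C Z, then undo X and Z."""
    P = strassen2x2(X @ A @ Y, np.linalg.solve(Y, B) @ Z)
    return np.linalg.solve(X, P) @ np.linalg.inv(Z)
```

Folding X, Y, Z into the factor matrices themselves gives an entire orbit of rank-7 decompositions equivalent to Strassen's, which is why search methods must contend with huge families of essentially identical solutions.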
39 Example algorithm: ⟨4, 2, 4⟩
Partition the matrices like this:
A = [A11 A12; A21 A22; A31 A32; A41 A42] (4 x 2 blocks),
B = [B11 B12 B13 B14; B21 B22 B23 B24] (2 x 4 blocks)
1. Take 26 linear combos of the Aij's according to U (68 adds)
2. Take 26 linear combos of the Bij's according to V (52 adds)
3. Perform 26 multiplies (recursively)
4. Take linear combos of the outputs to form the Cij's according to W (69 adds)
The classical algorithm performs 32 multiplies, yielding a possible speedup of 23% per step.
40 References I
[BB15] Austin R. Benson and Grey Ballard. A framework for practical parallel fast matrix multiplication. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, pages 42-53, New York, NY, USA, February 2015. ACM.
[BCRL79] D. Bini, M. Capovani, F. Romani, and G. Lotti. O(n^2.7799) complexity for n x n approximate matrix multiplication. Information Processing Letters, 8(5):234-235, 1979.
[CW87] D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. In Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, STOC '87, pages 1-6, New York, NY, USA, 1987. ACM.
[HRMvdG17] Jianyu Huang, Leslie Rice, Devin A. Matthews, and Robert van de Geijn. Generating families of practical fast matrix multiplication algorithms. In Proceedings of the 31st IEEE International Parallel and Distributed Processing Symposium, 2017. To appear.
[LG14] François Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 296-303, New York, NY, USA, 2014. ACM.
[Sch81] A. Schönhage. Partial and total matrix multiplication. SIAM Journal on Computing, 10(3):434-455, 1981.
41 References II
[Smi13] A. V. Smirnov. The bilinear complexity and practical algorithms for matrix multiplication. Computational Mathematics and Mathematical Physics, 53(12):1781-1795, 2013.
[Smi17] Alexey Smirnov. Several bilinear algorithms for matrix multiplication. Technical report, ResearchGate, 2017.
[Str69] V. Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13:354-356, 1969.
A. LINEAR ALGEBRA. CONVEX SETS 1. Matrices and vectors 1.1 Matrix operations 1.2 The rank of a matrix 2. Systems of linear equations 2.1 Basic solutions 3. Vector spaces 3.1 Linear dependence and independence
More informationFast Matrix Multiplication: Limitations of the Laser Method
Electronic Colloquium on Computational Complexity, Report No. 154 (2014) Fast Matrix Multiplication: Limitations of the Laser Method Andris Ambainis University of Latvia Riga, Latvia andris.ambainis@lu.lv
More informationEfficient algorithms for symmetric tensor contractions
Efficient algorithms for symmetric tensor contractions Edgar Solomonik 1 Department of EECS, UC Berkeley Oct 22, 2013 1 / 42 Edgar Solomonik Symmetric tensor contractions 1/ 42 Motivation The goal is to
More informationComputation of the mtx-vec product based on storage scheme on vector CPUs
BLAS: Basic Linear Algebra Subroutines BLAS: Basic Linear Algebra Subroutines BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix Computation of the mtx-vec product based on storage scheme on
More informationCS 580: Algorithm Design and Analysis
CS 580: Algorithm Design and Analysis Jeremiah Blocki Purdue University Spring 208 Announcement: Homework 3 due February 5 th at :59PM Final Exam (Tentative): Thursday, May 3 @ 8AM (PHYS 203) Recap: Divide
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 1. Basic Linear Algebra Linear Algebra Methods for Data Mining, Spring 2007, University of Helsinki Example
More informationExploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning
Exploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning Erin C. Carson, Nicholas Knight, James Demmel, Ming Gu U.C. Berkeley SIAM PP 12, Savannah, Georgia, USA, February
More informationOn the Exponent of the All Pairs Shortest Path Problem
On the Exponent of the All Pairs Shortest Path Problem Noga Alon Department of Mathematics Sackler Faculty of Exact Sciences Tel Aviv University Zvi Galil Department of Computer Science Sackler Faculty
More informationLecture 20 FUNDAMENTAL Theorem of Finitely Generated Abelian Groups (FTFGAG)
Lecture 20 FUNDAMENTAL Theorem of Finitely Generated Abelian Groups (FTFGAG) Warm up: 1. Let n 1500. Find all sequences n 1 n 2... n s 2 satisfying n i 1 and n 1 n s n (where s can vary from sequence to
More informationApproximation. Inderjit S. Dhillon Dept of Computer Science UT Austin. SAMSI Massive Datasets Opening Workshop Raleigh, North Carolina.
Using Quadratic Approximation Inderjit S. Dhillon Dept of Computer Science UT Austin SAMSI Massive Datasets Opening Workshop Raleigh, North Carolina Sept 12, 2012 Joint work with C. Hsieh, M. Sustik and
More informationTensors. Notes by Mateusz Michalek and Bernd Sturmfels for the lecture on June 5, 2018, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra
Tensors Notes by Mateusz Michalek and Bernd Sturmfels for the lecture on June 5, 2018, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra This lecture is divided into two parts. The first part,
More information12. Cholesky factorization
L. Vandenberghe ECE133A (Winter 2018) 12. Cholesky factorization positive definite matrices examples Cholesky factorization complex positive definite matrices kernel methods 12-1 Definitions a symmetric
More informationLinear Systems of n equations for n unknowns
Linear Systems of n equations for n unknowns In many application problems we want to find n unknowns, and we have n linear equations Example: Find x,x,x such that the following three equations hold: x
More informationUsing Hankel structured low-rank approximation for sparse signal recovery
Using Hankel structured low-rank approximation for sparse signal recovery Ivan Markovsky 1 and Pier Luigi Dragotti 2 Department ELEC Vrije Universiteit Brussel (VUB) Pleinlaan 2, Building K, B-1050 Brussels,
More informationNext topics: Solving systems of linear equations
Next topics: Solving systems of linear equations 1 Gaussian elimination (today) 2 Gaussian elimination with partial pivoting (Week 9) 3 The method of LU-decomposition (Week 10) 4 Iterative techniques:
More informationIntroduction to Algorithms
Lecture 1 Introduction to Algorithms 1.1 Overview The purpose of this lecture is to give a brief overview of the topic of Algorithms and the kind of thinking it involves: why we focus on the subjects that
More informationNumerical Analysis Lecture Notes
Numerical Analysis Lecture Notes Peter J Olver 3 Review of Matrix Algebra Vectors and matrices are essential for modern analysis of systems of equations algebrai, differential, functional, etc In this
More informationMAT 2037 LINEAR ALGEBRA I web:
MAT 237 LINEAR ALGEBRA I 2625 Dokuz Eylül University, Faculty of Science, Department of Mathematics web: Instructor: Engin Mermut http://kisideuedutr/enginmermut/ HOMEWORK 2 MATRIX ALGEBRA Textbook: Linear
More informationA Computation- and Communication-Optimal Parallel Direct 3-body Algorithm
A Computation- and Communication-Optimal Parallel Direct 3-body Algorithm Penporn Koanantakool and Katherine Yelick {penpornk, yelick}@cs.berkeley.edu Computer Science Division, University of California,
More informationVECTORS, TENSORS AND INDEX NOTATION
VECTORS, TENSORS AND INDEX NOTATION Enrico Nobile Dipartimento di Ingegneria e Architettura Università degli Studi di Trieste, 34127 TRIESTE March 5, 2018 Vectors & Tensors, E. Nobile March 5, 2018 1 /
More informationNumerical Linear Algebra Primer. Ryan Tibshirani Convex Optimization
Numerical Linear Algebra Primer Ryan Tibshirani Convex Optimization 10-725 Consider Last time: proximal Newton method min x g(x) + h(x) where g, h convex, g twice differentiable, and h simple. Proximal
More informationReview of Vectors and Matrices
A P P E N D I X D Review of Vectors and Matrices D. VECTORS D.. Definition of a Vector Let p, p, Á, p n be any n real numbers and P an ordered set of these real numbers that is, P = p, p, Á, p n Then P
More information