Strassen-like algorithms for symmetric tensor contractions

Size: px
Start display at page:

Download "Strassen-like algorithms for symmetric tensor contractions"

Transcription

1 Strassen-lie algorithms for symmetric tensor contractions Edgar Solomoni University of Illinois at Urbana-Champaign Scientfic and Statistical Computing Seminar University of Chicago April 13, / 27 Fast symmetric tensor contractions

2 Outline 1 Introduction 2 Applications of tensor symmetry 3 Exploiting symmetry in matrix products 4 Exploiting symmetry in tensor contractions 5 Numerical error analysis 6 Exploiting partial-symmetry in tensor contractions 7 Communication cost analysis 8 Summary and conclusion 2 / 27 Fast symmetric tensor contractions

3 Terminology A tensor T R n 1 n d has order d (i.e. d modes / indices dimensions n 1 -by- -by-n d (in this tal, usually each n i = n elements T i 1...i d = T i where i {1,..., n} d We say a tensor is symmetric if for any j, {1,..., n} T i 1...i j...i...i d = T i 1...i...i j...i d A tensor is antisymmetric (sew-symmetric if for any j, {1,..., n} T i 1...i j...i...i d = ( 1T i 1...i...i j...i d A tensor is partially-symmetric if such index interchanges are restricted to be within subsets of {1,..., n}, e.g. T ij l = T ji l = T ji l = T ij l 3 / 27 Fast symmetric tensor contractions

4 Tensor contractions We wor with contractions of tensors A of order s + v, and B of order v + t into C of order s + t, defined as C ij = A i B j {1,...,n} v requires O(s + t + v multiplications and additions }{{} ω assumes an index ordering, but does not lose generality wors with any symmetries of A and B is extensible to symmetries of C via symmetrization (sum all permutations of modes in C, denoted [C] ij generalizes simple matrix operations, e.g. (s, t, v = (1, 0, 1, }{{} (s, t, v = (1, 1, 0, }{{} (s, t, v = (1, 1, 1 }{{} matrix-vector product vector outer product matrix-matrix product 4 / 27 Fast symmetric tensor contractions

5 Applications of symmetric tensor contractions Symmetric and Hermitian matrix operations are part of the BLAS matrix-vector products: symv (symm, hemv, (hemm symmetrized outer product: syr2 (syr2, her2, (her2 these operations dominate symmetric/hermitian diagonalization Hanel matrices are order 2 log 2 (n partially-symmetric tensors H = where H 11, H 21, H 22 are also Hanel. [ H11 H T 21 H 21 H 22 In general, partially-symmetric tensors are nested symmetric tensors a nonsymmetric matrix is a vector of vectors T ij l = T ji l = T ji l = T ij l is a symmetric matrix of symmetric matrices ] 5 / 27 Fast symmetric tensor contractions

6 Applications of partially-symmetric tensor contractions High-accuracy methods in computational quantum chemistry solve the multi-electron Schrödinger equation H Ψ = E Ψ, where H is a linear operator, but Ψ is a function of all electrons use wavefunction ansatze lie Ψ Ψ ( = e T ( Ψ ( 1 where Ψ (0 is a determinant function and T ( is an order 2 tensor, acting as a multilinear excitation operator on the electrons are most commonly versions of coupled-cluster methods which use the above ansatze for {2, 3, 4} (CCSD, CCSDT, CCSDTQ solve iteratively for T (, where each iteration has cost O(n 2+2, dominated by contractions of partially antisymmetric tensors for example, a dominant contraction in CCSD ( = 2 is Z a i c = n n b=1 j=1 T ab ij V j b c where T ab ij = T ba ij = T ba ji = T ab ji. We ll show an algorithm that requires n 6 rather than 2n 6 operations. 6 / 27 Fast symmetric tensor contractions

7 Symmetric matrix times vector Let b be a vector of length n with elements in ϱ Let A be an n-by-n symmetric matrix with elements in ϱ We multiply matrix A by b, A ij = A ji c = A b n c i = A ij b j }{{} j=1 nonsymmetric This usual form has cost (ignoring low-order terms here and later T symv (ϱ, n = µ ϱ n 2 + ν ϱ n 2 where µ ϱ is the cost of multiplication and ν ϱ of addition 7 / 27 Fast symmetric tensor contractions

8 Fast symmetric matrix times vector We can perform symv using fewer element-wise multiplications, n n c i = A ij (b i + b j A ij b i }{{} j=1 j=1 symmetric }{{} requires only n mults. A ij (b i + b j is symmetric, requires ( n+1 2 multiplications ( nj=1 A ij b i may be computed with n multiplications The total cost of the new form is T symv(ϱ, n = µ ϱ 1 2 n2 + ν ϱ 5 2 n2 This formulation is cheaper when µ ϱ > 3ν ϱ Form symm the formulation is cheaper when µ ϱ > ν ϱ 8 / 27 Fast symmetric tensor contractions

9 Symmetric ran-2 update Consider a ran-2 outer product of vectors a and b of length n into symmetric matrix C C = a b T + b a T [ C ij = a b T] a i b j + a j b i ij }{{}}{{} nonsymmetric permutation Usually computed via the n 2 multiplications a i b j with the cost T syr2 (ϱ, n = µ ϱ n 2 + ν ϱ n 2. 9 / 27 Fast symmetric tensor contractions

10 Fast symmetric ran-2 update We may compute the ran-2 update via a symmetric intermediate quantity C ij = (a i + a j (b i + b j }{{} a i b i a j b j }{{} symmetric requires only n mults in total We can compute all (a i + a j (b i + b j in ( n+1 2 multiplications The total cost is then given to leading order by T syr2(ϱ, n = µ ϱ 1 2 n2 + ν ϱ 5 2 n2. T syr2 (ϱ, n < T syr2(ϱ, n when µ ϱ > 3ν ϱ T syr2k (ϱ, n, K < T syr2k(ϱ, n, K when µ ϱ > ν ϱ 10 / 27 Fast symmetric tensor contractions

11 Symmetrized product of symmetric matrices Given symmetric matrices A, B of dimension n on non-associative commutative ring ϱ, we see to compute the anticommutator of A and B C = A B + B A n n C ij = [A B] ij A i B j + A j B i }{{}}{{} =1 =1 nonsymmetric permutation The above equations require n 3 multiplications and n 3 adds for a total cost of T syrmm (ϱ, n = µ ϱ n 3 + ν ϱ n 3. Note that the symmetrized product defines a non-associative commutative ring (the Jordan ring over the set of symmetric matrices. 11 / 27 Fast symmetric tensor contractions

12 Fast symmetrized product of symmetric matrices We can combine the ideas from the fast routines for symv and syr C ij = (A ij + A i + A j (B ij + B i + B j }{{} A ij ( n+2 symmetric, requires 3 mults ( B ij + B i + B j } ( {{} n+1 requires 2 mults A i B i A j B j }{{} requires ( n+1 2 mults ( B ij A ij + A i + A j } ( {{} n+1 requires 2 mults The reformulation requires ( n 3 multiplications to leading order, T syrmm(ϱ, n = µ ϱ 1 6 n3 + ν ϱ 5 3 n3, which is faster than T syrmm when µ ϱ > (4/5ν ϱ. 12 / 27 Fast symmetric tensor contractions

13 Fast symmetrized product of symmetric matrices We can rewrite that algorithm in terms using symmetrization notation: C ij = [AB] ij = (A ij + A i + A j (B ij + B i + B j } {{} [A] ij [B] ij ( ( A ij B ij + B i + B j B ij A ij + A i + A j }{{} A ij [B] ij }{{} B ij [A] ij A i B i A j B j }{{} [A i 1 B] ij = [A] ij [B] ij A ij [B] ij B ij [A] ij [A 1 B] ij where A 1 B = A i B j 13 / 27 Fast symmetric tensor contractions

14 Comparison to Strassen s algorithm [n ω /(s!t!v!] / #multiplications (speed-up over classical direct evaluation alg Symmetry preserving alg. vs Strassen s alg. (s=t=v=ω/3 Strassen s algorithm Sym. preserving ω=6 Sym. preserving ω= e+06 n s /s! (matrix dimension 14 / 27 Fast symmetric tensor contractions

15 Fully symmetric tensor contractions We can now state the general symmetric tensor contraction algorithms, given A of order s + v, and B of order v + t into C of order s + t, defined as we define the (nonsymmetrized contraction as C = A v B where C ij = A i B j {1,...,n} v We can then define the symmetrized tensor contraction as C i = [A v B] i The usual method first computes A v B with cost ( ( ( n n n T syctr(ϱ, n, s, t, v = µ ϱ + ν ϱ s t v ( n s ( n t 15 / 27 Fast symmetric tensor contractions ( n v.

16 Fast fully-symmetric contraction algorithm The fast algorithm is defined as follows (using ω = s + t + v C i = [A] i [B] i {1,...,n} } v {{} ( n+ω 1 symmetric, requires ω v multiplications ( ( [A] ip [B] iq p+q=1 {1,...,n} v p q p {1,...,n} p q {1,...,n} }{{ q } requires O(n ω 1 multiplications min(s,t [A v+r B] i r=1 }{{} requires O(n ω 1 multiplications The leading order cost is T syctr(ϱ, n, s, t, v = µ ϱ ( n + ν ϱ ω ( n ω [ ( ω + t 16 / 27 Fast symmetric tensor contractions ( ω + s ( ] ω. v

17 Reduction in operation count of fast algorithm with respect to standard reduction factor Reduction in operation count for different entry types (s+t=ω entries are matrices (s+t=ω entries are complex (s+t=ω entries are real ω Reduction in operation count for different entry types (s+t+v=ω entries are matrices (s+t+v=ω entries are complex (s+t+v=ω entries are real (s, t, v values for left and right graph tabulated below ω ω Left graph (1, 0, 0 (1, 1, 0 (2, 1, 0 (2, 2, 0 (3, 2, 0 (3, 3, 0 Right graph (1, 0, 0 (1, 1, 0 (1, 1, 1 (2, 1, 1 (2, 2, 1 (2, 2, 2 17 / 27 Fast symmetric tensor contractions

18 Theoretical error bounds We express error bounds in terms of γ n = precision. nɛ 1 nɛ, where ɛ is the machine Let Ψ be the standard algorithm and Φ be the fast algorithm. The error bound for the standard algorithm arises from matrix multiplication ( ( n ω fl (Ψ(A, B C γ m A B where m =. v v The following error bound holds for the fast algorithm ( n fl (Φ(A, B C γ m A B where m = 3 v ( ω t ( ω s. 18 / 27 Fast symmetric tensor contractions

19 Stability of symmetry preserving algorithms 19 / 27 Fast symmetric tensor contractions

20 Nesting the fast algorithm For partially-(antisymmetric contractions we can nest the new algorithm over each group of symmetric modes reduction in mults can translate to reduction in the number of operations for Hanel matrices, yields sub O(n 2 algorithm, but not O(n or O(n log(n as reduction in mults is a factor of two only in leading order but for coupled-cluster contractions, significant reductions in cost can be achieved CCSD 1.3X on a typical system CCSDT 2.1X on a typical system CCSDTQ 5.7X on a typical system 20 / 27 Fast symmetric tensor contractions

21 Communication cost of the standard algorithm We consider communication bandwidth cost on a sequential machine with cache size M. The intermediate formed by the standard algorithm may be computed via matrix multiplication with communication cost, (( n ( n ( n s t W (n, s, t, v, M = Θ v + M ( n + s + v ( ( n n +. t + v s + t The cost of symmetrizing the resulting intermediate is low-order or the same. 21 / 27 Fast symmetric tensor contractions

22 Communication cost of the fast algorithm We can lower bound the cost of the fast algorithm using the Hölder-Brascamp-Lieb inequality. An algorithm that blocs Z symmetrically nearly attains the cost ( n [( ( ( ] W ω ω ω ω (n, s, t, v, M = O( M ω/(ω min(s,t,v + + t s v ( ( ( n n n s + v t + v s + t which is not far from the lower bound and attains it when s = t = v. 22 / 27 Fast symmetric tensor contractions

23 factor of reduction in communication volume Reduction in communication (W/W fast alg comm reduction forr s+t+v=ω fast alg comm reduction for s+t=ω ω 23 / 27 Fast symmetric tensor contractions

24 Further communication lower bounds We can use bilinear algorithm formulations to derive comm. lower bounds symmetry preserving tensor contraction algorithms have arbitrary order projections from order d set bilinear algorithms 1 provide a more general framewor a bilinear algorithm is defined by matrices F (A, F (B, F (C, c = F (C [(F (AT a (F (BT b] where is the Hadamard (pointwise product communication lower bounds derived based on matrix ran 2 1 Pan, Springer, S., Hoefler, Demmel, in preparation 24 / 27 Fast symmetric tensor contractions

25 Communication cost of symmetry preserving algorithms For contraction of order s + v tensor with order v + t tensor 3 symmetry preserving algorithm requires (s+v+t! s!v!t! fewer multiplies matrix-vector-lie algorithms (min(s, v, t = 0 vertical communication dominated by largest tensor horizontal communication asymptotically greater if only unique elements are stored and s v t matrix-matrix-lie algorithms (min(s, v, t > 0 vertical and horizontal communication costs asymptotically greater for symmetry preserving algorithm when s v t 3 S., Hoefler, Demmel; Technical Report, ETH Zurich, / 27 Fast symmetric tensor contractions

26 Summary of results The following table lists the leading order number of multiplications F required by the standard algorithm and F by the fast algorithm for various cases of symmetric tensor contractions, ω s t v F F applications n 2 n 2 /2 syr2, syr2, her2, her n 2 n 2 /2 symv, symm, hemv, hemm n 3 n 3 /6 symmetrized matmul ( s+t+v s t v n ω any symmetric tensor contraction ( n ( n ( n s t v High-level conclusions: The fast symmetric contraction algorithms provide interesting potential arithmetic cost improvements for complex BLAS routines and partially symmetric tensor contractions. However, the new algorithms require more communication per flop, incur more numerical error, and usually unable to exploit fused-multiply-add units or bloced matrix multiplication primitives. 26 / 27 Fast symmetric tensor contractions

27 Acnowledgements and references Collaborators on various parts: James Demmel Torsten Hoefler Devin Matthews S., Demmel; Technical Report, ETH Zurich, / 27 Fast symmetric tensor contractions

28 Bacup slides 28 / 27 Fast symmetric tensor contractions

29 The fast algorithm for computing C forms the following intermediates with ( n ω multiplications (where ω = s + t + v, ( Z i = V i = + W i = j χ(i ( j χ(i ( ( A j A j B l l χ(i ( 1 l χ(i ( A j 1 j χ(i ( ( A j j χ(i l χ(i C i = Z i V i ( W j j χ(i l χ(i B l B l B l 29 / 27 Fast symmetric tensor contractions

Strassen-like algorithms for symmetric tensor contractions

Strassen-like algorithms for symmetric tensor contractions Strassen-like algorithms for symmetric tensor contractions Edgar Solomonik Theory Seminar University of Illinois at Urbana-Champaign September 18, 2017 1 / 28 Fast symmetric tensor contractions Outline

More information

Contracting symmetric tensors using fewer multiplications

Contracting symmetric tensors using fewer multiplications Research Collection Journal Article Contracting symmetric tensors using fewer multiplications Authors: Solomonik, Edgar; Demmel, James W. Publication Date: 2016 Permanent Link: https://doi.org/10.3929/ethz-a-010345741

More information

Efficient algorithms for symmetric tensor contractions

Efficient algorithms for symmetric tensor contractions Efficient algorithms for symmetric tensor contractions Edgar Solomonik 1 Department of EECS, UC Berkeley Oct 22, 2013 1 / 42 Edgar Solomonik Symmetric tensor contractions 1/ 42 Motivation The goal is to

More information

Low Rank Bilinear Algorithms for Symmetric Tensor Contractions

Low Rank Bilinear Algorithms for Symmetric Tensor Contractions Low Rank Bilinear Algorithms for Symmetric Tensor Contractions Edgar Solomonik ETH Zurich SIAM PP, Paris, France April 14, 2016 1 / 14 Fast Algorithms for Symmetric Tensor Contractions Tensor contractions

More information

Algorithms as multilinear tensor equations

Algorithms as multilinear tensor equations Algorithms as multilinear tensor equations Edgar Solomonik Department of Computer Science ETH Zurich Technische Universität München 18.1.2016 Edgar Solomonik Algorithms as multilinear tensor equations

More information

Provably efficient algorithms for numerical tensor algebra

Provably efficient algorithms for numerical tensor algebra Provably efficient algorithms for numerical tensor algebra Edgar Solomonik Department of EECS, Computer Science Division, UC Berkeley Dissertation Defense Adviser: James Demmel Aug 27, 2014 Edgar Solomonik

More information

Cyclops Tensor Framework

Cyclops Tensor Framework Cyclops Tensor Framework Edgar Solomonik Department of EECS, Computer Science Division, UC Berkeley March 17, 2014 1 / 29 Edgar Solomonik Cyclops Tensor Framework 1/ 29 Definition of a tensor A rank r

More information

Cyclops Tensor Framework: reducing communication and eliminating load imbalance in massively parallel contractions

Cyclops Tensor Framework: reducing communication and eliminating load imbalance in massively parallel contractions Cyclops Tensor Framework: reducing communication and eliminating load imbalance in massively parallel contractions Edgar Solomonik 1, Devin Matthews 3, Jeff Hammond 4, James Demmel 1,2 1 Department of

More information

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication.

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication. CME342 Parallel Methods in Numerical Analysis Matrix Computation: Iterative Methods II Outline: CG & its parallelization. Sparse Matrix-vector Multiplication. 1 Basic iterative methods: Ax = b r = b Ax

More information

2.5D algorithms for distributed-memory computing

2.5D algorithms for distributed-memory computing ntroduction for distributed-memory computing C Berkeley July, 2012 1/ 62 ntroduction Outline ntroduction Strong scaling 2.5D factorization 2/ 62 ntroduction Strong scaling Solving science problems faster

More information

Quantum Computing Lecture 2. Review of Linear Algebra

Quantum Computing Lecture 2. Review of Linear Algebra Quantum Computing Lecture 2 Review of Linear Algebra Maris Ozols Linear algebra States of a quantum system form a vector space and their transformations are described by linear operators Vector spaces

More information

Towards an algebraic formalism for scalable numerical algorithms

Towards an algebraic formalism for scalable numerical algorithms Towards an algebraic formalism for scalable numerical algorithms Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign Householder Symposium XX June 22, 2017 Householder

More information

CS 598: Communication Cost Analysis of Algorithms Lecture 9: The Ideal Cache Model and the Discrete Fourier Transform

CS 598: Communication Cost Analysis of Algorithms Lecture 9: The Ideal Cache Model and the Discrete Fourier Transform CS 598: Communication Cost Analysis of Algorithms Lecture 9: The Ideal Cache Model and the Discrete Fourier Transform Edgar Solomonik University of Illinois at Urbana-Champaign September 21, 2016 Fast

More information

Practicality of Large Scale Fast Matrix Multiplication

Practicality of Large Scale Fast Matrix Multiplication Practicality of Large Scale Fast Matrix Multiplication Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz and Oded Schwartz UC Berkeley IWASEP June 5, 2012 Napa Valley, CA Research supported by

More information

Sparse BLAS-3 Reduction

Sparse BLAS-3 Reduction Sparse BLAS-3 Reduction to Banded Upper Triangular (Spar3Bnd) Gary Howell, HPC/OIT NC State University gary howell@ncsu.edu Sparse BLAS-3 Reduction p.1/27 Acknowledgements James Demmel, Gene Golub, Franc

More information

Attempts at relativistic QM

Attempts at relativistic QM Attempts at relativistic QM based on S-1 A proper description of particle physics should incorporate both quantum mechanics and special relativity. However historically combining quantum mechanics and

More information

Quantum Field Theory

Quantum Field Theory Quantum Field Theory PHYS-P 621 Radovan Dermisek, Indiana University Notes based on: M. Srednicki, Quantum Field Theory 1 Attempts at relativistic QM based on S-1 A proper description of particle physics

More information

5.1 Banded Storage. u = temperature. The five-point difference operator. uh (x, y + h) 2u h (x, y)+u h (x, y h) uh (x + h, y) 2u h (x, y)+u h (x h, y)

5.1 Banded Storage. u = temperature. The five-point difference operator. uh (x, y + h) 2u h (x, y)+u h (x, y h) uh (x + h, y) 2u h (x, y)+u h (x h, y) 5.1 Banded Storage u = temperature u= u h temperature at gridpoints u h = 1 u= Laplace s equation u= h u = u h = grid size u=1 The five-point difference operator 1 u h =1 uh (x + h, y) 2u h (x, y)+u h

More information

11 Block Designs. Linear Spaces. Designs. By convention, we shall

11 Block Designs. Linear Spaces. Designs. By convention, we shall 11 Block Designs Linear Spaces In this section we consider incidence structures I = (V, B, ). always let v = V and b = B. By convention, we shall Linear Space: We say that an incidence structure (V, B,

More information

Strassen s Algorithm for Tensor Contraction

Strassen s Algorithm for Tensor Contraction Strassen s Algorithm for Tensor Contraction Jianyu Huang, Devin A. Matthews, Robert A. van de Geijn The University of Texas at Austin September 14-15, 2017 Tensor Computation Workshop Flatiron Institute,

More information

MTH 464: Computational Linear Algebra

MTH 464: Computational Linear Algebra MTH 464: Computational Linear Algebra Lecture Outlines Exam 2 Material Prof. M. Beauregard Department of Mathematics & Statistics Stephen F. Austin State University February 6, 2018 Linear Algebra (MTH

More information

Matrices and Vectors

Matrices and Vectors Matrices and Vectors James K. Peterson Department of Biological Sciences and Department of Mathematical Sciences Clemson University November 11, 2013 Outline 1 Matrices and Vectors 2 Vector Details 3 Matrix

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 2 Systems of Linear Equations Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

Clifford Algebras and Spin Groups

Clifford Algebras and Spin Groups Clifford Algebras and Spin Groups Math G4344, Spring 2012 We ll now turn from the general theory to examine a specific class class of groups: the orthogonal groups. Recall that O(n, R) is the group of

More information

LU Factorization. Marco Chiarandini. DM559 Linear and Integer Programming. Department of Mathematics & Computer Science University of Southern Denmark

LU Factorization. Marco Chiarandini. DM559 Linear and Integer Programming. Department of Mathematics & Computer Science University of Southern Denmark DM559 Linear and Integer Programming LU Factorization Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark [Based on slides by Lieven Vandenberghe, UCLA] Outline

More information

Physics 202 Laboratory 5. Linear Algebra 1. Laboratory 5. Physics 202 Laboratory

Physics 202 Laboratory 5. Linear Algebra 1. Laboratory 5. Physics 202 Laboratory Physics 202 Laboratory 5 Linear Algebra Laboratory 5 Physics 202 Laboratory We close our whirlwind tour of numerical methods by advertising some elements of (numerical) linear algebra. There are three

More information

Using BLIS for tensor computations in Q-Chem

Using BLIS for tensor computations in Q-Chem Using BLIS for tensor computations in Q-Chem Evgeny Epifanovsky Q-Chem BLIS Retreat, September 19 20, 2016 Q-Chem is an integrated software suite for modeling the properties of molecular systems from first

More information

A communication-avoiding parallel algorithm for the symmetric eigenvalue problem

A communication-avoiding parallel algorithm for the symmetric eigenvalue problem A communication-avoiding parallel algorithm for the symmetric eigenvalue problem Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign Householder Symposium XX June

More information

Algebraic aspects of Hadamard matrices

Algebraic aspects of Hadamard matrices Algebraic aspects of Hadamard matrices Padraig Ó Catháin University of Queensland 22 February 2013 Overview Difference set Relative difference set Symmetric Design Hadamard matrix Overview 1 Hadamard matrices

More information

Algorithms as Multilinear Tensor Equations

Algorithms as Multilinear Tensor Equations Algorithms as Multilinear Tensor Equations Edgar Solomonik Department of Computer Science ETH Zurich University of Colorado, Boulder February 23, 2016 Edgar Solomonik Algorithms as Multilinear Tensor Equations

More information

How to find good starting tensors for matrix multiplication

How to find good starting tensors for matrix multiplication How to find good starting tensors for matrix multiplication Markus Bläser Saarland University Matrix multiplication z,... z,n..... z n,... z n,n = x,... x,n..... x n,... x n,n y,... y,n..... y n,... y

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 13 Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Michael T. Heath Parallel Numerical Algorithms

More information

A new product on 2 2 matrices

A new product on 2 2 matrices A new product on 2 2 matrices L Kramer, P Kramer, and V Man ko October 17, 2018 arxiv:1802.03048v1 [math-ph] 8 Feb 2018 Abstract We study a bilinear multiplication rule on 2 2 matrices which is intermediate

More information

L. Vandenberghe EE133A (Spring 2017) 3. Matrices. notation and terminology. matrix operations. linear and affine functions.

L. Vandenberghe EE133A (Spring 2017) 3. Matrices. notation and terminology. matrix operations. linear and affine functions. L Vandenberghe EE133A (Spring 2017) 3 Matrices notation and terminology matrix operations linear and affine functions complexity 3-1 Matrix a rectangular array of numbers, for example A = 0 1 23 01 13

More information

A communication-avoiding parallel algorithm for the symmetric eigenvalue problem

A communication-avoiding parallel algorithm for the symmetric eigenvalue problem A communication-avoiding parallel algorithm for the symmetric eigenvalue problem Edgar Solomonik 1, Grey Ballard 2, James Demmel 3, Torsten Hoefler 4 (1) Department of Computer Science, University of Illinois

More information

Lecture 2 Some Sources of Lie Algebras

Lecture 2 Some Sources of Lie Algebras 18.745 Introduction to Lie Algebras September 14, 2010 Lecture 2 Some Sources of Lie Algebras Prof. Victor Kac Scribe: Michael Donovan From Associative Algebras We saw in the previous lecture that we can

More information

Massachusetts Institute of Technology Physics Department

Massachusetts Institute of Technology Physics Department Massachusetts Institute of Technology Physics Department Physics 8.32 Fall 2006 Quantum Theory I October 9, 2006 Assignment 6 Due October 20, 2006 Announcements There will be a makeup lecture on Friday,

More information

Software implementation of correlated quantum chemistry methods. Exploiting advanced programming tools and new computer architectures

Software implementation of correlated quantum chemistry methods. Exploiting advanced programming tools and new computer architectures Software implementation of correlated quantum chemistry methods. Exploiting advanced programming tools and new computer architectures Evgeny Epifanovsky Q-Chem Septermber 29, 2015 Acknowledgments Many

More information

Kai Sun. University of Michigan, Ann Arbor. Collaborators: Krishna Kumar and Eduardo Fradkin (UIUC)

Kai Sun. University of Michigan, Ann Arbor. Collaborators: Krishna Kumar and Eduardo Fradkin (UIUC) Kai Sun University of Michigan, Ann Arbor Collaborators: Krishna Kumar and Eduardo Fradkin (UIUC) Outline How to construct a discretized Chern-Simons gauge theory A necessary and sufficient condition for

More information

Lecture 5. Hartree-Fock Theory. WS2010/11: Introduction to Nuclear and Particle Physics

Lecture 5. Hartree-Fock Theory. WS2010/11: Introduction to Nuclear and Particle Physics Lecture 5 Hartree-Fock Theory WS2010/11: Introduction to Nuclear and Particle Physics Particle-number representation: General formalism The simplest starting point for a many-body state is a system of

More information

Plan for the rest of the semester. ψ a

Plan for the rest of the semester. ψ a Plan for the rest of the semester ϕ ψ a ϕ(x) e iα(x) ϕ(x) 167 Representations of Lorentz Group based on S-33 We defined a unitary operator that implemented a Lorentz transformation on a scalar field: and

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 12 Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted for noncommercial,

More information

Chapter 12 Block LU Factorization

Chapter 12 Block LU Factorization Chapter 12 Block LU Factorization Block algorithms are advantageous for at least two important reasons. First, they work with blocks of data having b 2 elements, performing O(b 3 ) operations. The O(b)

More information

Ax = b. Systems of Linear Equations. Lecture Notes to Accompany. Given m n matrix A and m-vector b, find unknown n-vector x satisfying

Ax = b. Systems of Linear Equations. Lecture Notes to Accompany. Given m n matrix A and m-vector b, find unknown n-vector x satisfying Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T Heath Chapter Systems of Linear Equations Systems of Linear Equations Given m n matrix A and m-vector

More information

Exploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning

Exploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning Exploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning Erin C. Carson, Nicholas Knight, James Demmel, Ming Gu U.C. Berkeley SIAM PP 12, Savannah, Georgia, USA, February

More information

Communication Lower Bounds for Programs that Access Arrays

Communication Lower Bounds for Programs that Access Arrays Communication Lower Bounds for Programs that Access Arrays Nicholas Knight, Michael Christ, James Demmel, Thomas Scanlon, Katherine Yelick UC-Berkeley Scientific Computing and Matrix Computations Seminar

More information

Lecture 37: Principal Axes, Translations, and Eulerian Angles

Lecture 37: Principal Axes, Translations, and Eulerian Angles Lecture 37: Principal Axes, Translations, and Eulerian Angles When Can We Find Principal Axes? We can always write down the cubic equation that one must solve to determine the principal moments But if

More information

Quantum Field Theory

Quantum Field Theory Quantum Field Theory PHYS-P 621 Radovan Dermisek, Indiana University Notes based on: M. Srednicki, Quantum Field Theory 1 Attempts at relativistic QM based on S-1 A proper description of particle physics

More information

Quantum Field Theory

Quantum Field Theory Quantum Field Theory PHYS-P 621 Radovan Dermisek, Indiana University Notes based on: M. Srednicki, Quantum Field Theory 1 Attempts at relativistic QM based on S-1 A proper description of particle physics

More information

2 Computing complex square roots of a real matrix

2 Computing complex square roots of a real matrix On computing complex square roots of real matrices Zhongyun Liu a,, Yulin Zhang b, Jorge Santos c and Rui Ralha b a School of Math., Changsha University of Science & Technology, Hunan, 410076, China b

More information

CS 4424 Matrix multiplication

CS 4424 Matrix multiplication CS 4424 Matrix multiplication 1 Reminder: matrix multiplication Matrix-matrix product. Starting from a 1,1 a 1,n A =.. and B = a n,1 a n,n b 1,1 b 1,n.., b n,1 b n,n we get AB by multiplying A by all columns

More information

Trace Inequalities for a Block Hadamard Product

Trace Inequalities for a Block Hadamard Product Filomat 32:1 2018), 285 292 https://doiorg/102298/fil1801285p Published by Faculty of Sciences and Mathematics, University of Niš, Serbia Available at: http://wwwpmfniacrs/filomat Trace Inequalities for

More information

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 2. Systems of Linear Equations

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 2. Systems of Linear Equations Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Chapter 2 Systems of Linear Equations Copyright c 2001. Reproduction permitted only for noncommercial,

More information

Definition 2.3. We define addition and multiplication of matrices as follows.

Definition 2.3. We define addition and multiplication of matrices as follows. 14 Chapter 2 Matrices In this chapter, we review matrix algebra from Linear Algebra I, consider row and column operations on matrices, and define the rank of a matrix. Along the way prove that the row

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 19: Computing the SVD; Sparse Linear Systems Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical

More information

LU Factorization. LU factorization is the most common way of solving linear systems! Ax = b LUx = b

LU Factorization. LU factorization is the most common way of solving linear systems! Ax = b LUx = b AM 205: lecture 7 Last time: LU factorization Today s lecture: Cholesky factorization, timing, QR factorization Reminder: assignment 1 due at 5 PM on Friday September 22 LU Factorization LU factorization

More information

Parallel-in-time integrators for Hamiltonian systems

Parallel-in-time integrators for Hamiltonian systems Parallel-in-time integrators for Hamiltonian systems Claude Le Bris ENPC and INRIA Visiting Professor, The University of Chicago joint work with X. Dai (Paris 6 and Chinese Academy of Sciences), F. Legoll

More information

Communication avoiding parallel algorithms for dense matrix factorizations

Communication avoiding parallel algorithms for dense matrix factorizations Communication avoiding parallel dense matrix factorizations 1/ 44 Communication avoiding parallel algorithms for dense matrix factorizations Edgar Solomonik Department of EECS, UC Berkeley October 2013

More information

Discovering Fast Matrix Multiplication Algorithms via Tensor Decomposition

Discovering Fast Matrix Multiplication Algorithms via Tensor Decomposition Discovering Fast Matrix Multiplication Algorithms via Tensor Decomposition Grey Ballard SIAM onference on omputational Science & Engineering March, 207 ollaborators This is joint work with... Austin Benson

More information

Representations of Lorentz Group

Representations of Lorentz Group Representations of Lorentz Group based on S-33 We defined a unitary operator that implemented a Lorentz transformation on a scalar field: How do we find the smallest (irreducible) representations of the

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 6 Matrix Models Section 6.2 Low Rank Approximation Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Edgar

More information

Block Lanczos Tridiagonalization of Complex Symmetric Matrices

Block Lanczos Tridiagonalization of Complex Symmetric Matrices Block Lanczos Tridiagonalization of Complex Symmetric Matrices Sanzheng Qiao, Guohong Liu, Wei Xu Department of Computing and Software, McMaster University, Hamilton, Ontario L8S 4L7 ABSTRACT The classic

More information

Numerical Linear and Multilinear Algebra in Quantum Tensor Networks

Numerical Linear and Multilinear Algebra in Quantum Tensor Networks Numerical Linear and Multilinear Algebra in Quantum Tensor Networks Konrad Waldherr October 20, 2013 Joint work with Thomas Huckle QCCC 2013, Prien, October 20, 2013 1 Outline Numerical (Multi-) Linear

More information

Griffiths Chapter 1. Dan Wysocki. February 12, = A exp( λu 2 ) d u. e cx2 d x = 1 = λ = A = π. λ exp( λ(x a)2 ) λ exp( λ(x a)2 ) = πa2.

Griffiths Chapter 1. Dan Wysocki. February 12, = A exp( λu 2 ) d u. e cx2 d x = 1 = λ = A = π. λ exp( λ(x a)2 ) λ exp( λ(x a)2 ) = πa2. Griffiths Chapter 1 Dan Wysocki February 12, 215 Problem 1. Consider the gaussian distribution ρ) = A ep λ a) 2 ), where A, a, and λ are positive real constants. a. Use Equation 1.16 to determine A. 1

More information

The kernel structure of rectangular Hankel and Toeplitz matrices

The kernel structure of rectangular Hankel and Toeplitz matrices The ernel structure of rectangular Hanel and Toeplitz matrices Martin H. Gutnecht ETH Zurich, Seminar for Applied Mathematics ØØÔ»»ÛÛÛºÑ Ø º Ø Þº» Ñ 3rd International Conference on Structured Matrices

More information

Section Summary. Sequences. Recurrence Relations. Summations Special Integer Sequences (optional)

Section Summary. Sequences. Recurrence Relations. Summations Special Integer Sequences (optional) Section 2.4 Section Summary Sequences. o Examples: Geometric Progression, Arithmetic Progression Recurrence Relations o Example: Fibonacci Sequence Summations Special Integer Sequences (optional) Sequences

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Decompositions, numerical aspects Gerard Sleijpen and Martin van Gijzen September 27, 2017 1 Delft University of Technology Program Lecture 2 LU-decomposition Basic algorithm Cost

More information

Program Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects

Program Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects Numerical Linear Algebra Decompositions, numerical aspects Program Lecture 2 LU-decomposition Basic algorithm Cost Stability Pivoting Cholesky decomposition Sparse matrices and reorderings Gerard Sleijpen

More information

1 Matrices and matrix algebra

1 Matrices and matrix algebra 1 Matrices and matrix algebra 1.1 Examples of matrices A matrix is a rectangular array of numbers and/or variables. For instance 4 2 0 3 1 A = 5 1.2 0.7 x 3 π 3 4 6 27 is a matrix with 3 rows and 5 columns

More information

Review Questions REVIEW QUESTIONS 71

Review Questions REVIEW QUESTIONS 71 REVIEW QUESTIONS 71 MATLAB, is [42]. For a comprehensive treatment of error analysis and perturbation theory for linear systems and many other problems in linear algebra, see [126, 241]. An overview of

More information

DETERMINANTS, COFACTORS, AND INVERSES

DETERMINANTS, COFACTORS, AND INVERSES DEERMIS, COFCORS, D IVERSES.0 GEERL Determinants originally were encountered in the solution of simultaneous linear equations, so we will use that perspective here. general statement of this problem is:

More information

7.1 Creation and annihilation operators

7.1 Creation and annihilation operators Chapter 7 Second Quantization Creation and annihilation operators. Occupation number. Anticommutation relations. Normal product. Wick s theorem. One-body operator in second quantization. Hartree- Fock

More information

Remarks on Extremization Problems Related To Young s Inequality

Remarks on Extremization Problems Related To Young s Inequality Remarks on Extremization Problems Related To Young s Inequality Michael Christ University of California, Berkeley University of Wisconsin May 18, 2016 Part 1: Introduction Young s convolution inequality

More information

Math 671: Tensor Train decomposition methods II

Math 671: Tensor Train decomposition methods II Math 671: Tensor Train decomposition methods II Eduardo Corona 1 1 University of Michigan at Ann Arbor December 13, 2016 Table of Contents 1 What we ve talked about so far: 2 The Tensor Train decomposition

More information

Quantum entanglement and symmetry

Quantum entanglement and symmetry Journal of Physics: Conference Series Quantum entanglement and symmetry To cite this article: D Chrucisi and A Kossaowsi 2007 J. Phys.: Conf. Ser. 87 012008 View the article online for updates and enhancements.

More information

Chapter 2 The Group U(1) and its Representations

Chapter 2 The Group U(1) and its Representations Chapter 2 The Group U(1) and its Representations The simplest example of a Lie group is the group of rotations of the plane, with elements parametrized by a single number, the angle of rotation θ. It is

More information

Review of linear algebra

Review of linear algebra Review of linear algebra 1 Vectors and matrices We will just touch very briefly on certain aspects of linear algebra, most of which should be familiar. Recall that we deal with vectors, i.e. elements of

More information

THE DIRAC EQUATION (A REVIEW) We will try to find the relativistic wave equation for a particle.

THE DIRAC EQUATION (A REVIEW) We will try to find the relativistic wave equation for a particle. THE DIRAC EQUATION (A REVIEW) We will try to find the relativistic wave equation for a particle. First, we introduce four dimensional notation for a vector by writing x µ = (x, x 1, x 2, x 3 ) = (ct, x,

More information

Quantum Theory and Group Representations

Quantum Theory and Group Representations Quantum Theory and Group Representations Peter Woit Columbia University LaGuardia Community College, November 1, 2017 Queensborough Community College, November 15, 2017 Peter Woit (Columbia University)

More information

Maximum-weighted matching strategies and the application to symmetric indefinite systems

Maximum-weighted matching strategies and the application to symmetric indefinite systems Maximum-weighted matching strategies and the application to symmetric indefinite systems by Stefan Röllin, and Olaf Schenk 2 Technical Report CS-24-7 Department of Computer Science, University of Basel

More information

A PDE VERSION OF THE CONSTANT VARIATION METHOD AND THE SUBCRITICALITY OF SCHRÖDINGER SYSTEMS WITH ANTISYMMETRIC POTENTIALS.

A PDE VERSION OF THE CONSTANT VARIATION METHOD AND THE SUBCRITICALITY OF SCHRÖDINGER SYSTEMS WITH ANTISYMMETRIC POTENTIALS. A PDE VERSION OF THE CONSTANT VARIATION METHOD AND THE SUBCRITICALITY OF SCHRÖDINGER SYSTEMS WITH ANTISYMMETRIC POTENTIALS. Tristan Rivière Departement Mathematik ETH Zürich (e mail: riviere@math.ethz.ch)

More information

BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix-Vector-Product Analysis of Matrix-Matrix Product

BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix-Vector-Product Analysis of Matrix-Matrix Product Level-1 BLAS: SAXPY BLAS-Notation: S single precision (D for double, C for complex) A α scalar X vector P plus operation Y vector SAXPY: y = αx + y Vectorization of SAXPY (αx + y) by pipelining: page 8

More information

Random Fermionic Systems

Random Fermionic Systems Random Fermionic Systems Fabio Cunden Anna Maltsev Francesco Mezzadri University of Bristol December 9, 2016 Maltsev (University of Bristol) Random Fermionic Systems December 9, 2016 1 / 27 Background

More information

Pivoting. Reading: GV96 Section 3.4, Stew98 Chapter 3: 1.3

Pivoting. Reading: GV96 Section 3.4, Stew98 Chapter 3: 1.3 Pivoting Reading: GV96 Section 3.4, Stew98 Chapter 3: 1.3 In the previous discussions we have assumed that the LU factorization of A existed and the various versions could compute it in a stable manner.

More information

Isotropic harmonic oscillator

Isotropic harmonic oscillator Isotropic harmonic oscillator 1 Isotropic harmonic oscillator The hamiltonian of the isotropic harmonic oscillator is H = h m + 1 mω r (1) = [ h d m dρ + 1 ] m ω ρ, () ρ=x,y,z a sum of three one-dimensional

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra The two principal problems in linear algebra are: Linear system Given an n n matrix A and an n-vector b, determine x IR n such that A x = b Eigenvalue problem Given an n n matrix

More information

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize

More information

8-1: Backpropagation Prof. J.C. Kao, UCLA. Backpropagation. Chain rule for the derivatives Backpropagation graphs Examples

8-1: Backpropagation Prof. J.C. Kao, UCLA. Backpropagation. Chain rule for the derivatives Backpropagation graphs Examples 8-1: Backpropagation Prof. J.C. Kao, UCLA Backpropagation Chain rule for the derivatives Backpropagation graphs Examples 8-2: Backpropagation Prof. J.C. Kao, UCLA Motivation for backpropagation To do gradient

More information

Chapter 10. Quantum algorithms

Chapter 10. Quantum algorithms Chapter 10. Quantum algorithms Complex numbers: a quick review Definition: C = { a + b i : a, b R } where i = 1. Polar form of z = a + b i is z = re iθ, where r = z = a 2 + b 2 and θ = tan 1 y x Alternatively,

More information

Symmetries, Groups, and Conservation Laws

Symmetries, Groups, and Conservation Laws Chapter Symmetries, Groups, and Conservation Laws The dynamical properties and interactions of a system of particles and fields are derived from the principle of least action, where the action is a 4-dimensional

More information

Compute the Fourier transform on the first register to get x {0,1} n x 0.

Compute the Fourier transform on the first register to get x {0,1} n x 0. CS 94 Recursive Fourier Sampling, Simon s Algorithm /5/009 Spring 009 Lecture 3 1 Review Recall that we can write any classical circuit x f(x) as a reversible circuit R f. We can view R f as a unitary

More information

Linear Algebra and Eigenproblems

Linear Algebra and Eigenproblems Appendix A A Linear Algebra and Eigenproblems A working knowledge of linear algebra is key to understanding many of the issues raised in this work. In particular, many of the discussions of the details

More information

(1.1) In particular, ψ( q 1, m 1 ; ; q N, m N ) 2 is the probability to find the first particle

(1.1) In particular, ψ( q 1, m 1 ; ; q N, m N ) 2 is the probability to find the first particle Chapter 1 Identical particles 1.1 Distinguishable particles The Hilbert space of N has to be a subspace H = N n=1h n. Observables Ân of the n-th particle are self-adjoint operators of the form 1 1 1 1

More information

CS100: DISCRETE STRUCTURES. Lecture 3 Matrices Ch 3 Pages:

CS100: DISCRETE STRUCTURES. Lecture 3 Matrices Ch 3 Pages: CS100: DISCRETE STRUCTURES Lecture 3 Matrices Ch 3 Pages: 246-262 Matrices 2 Introduction DEFINITION 1: A matrix is a rectangular array of numbers. A matrix with m rows and n columns is called an m x n

More information

Powers of Tensors and Fast Matrix Multiplication

Powers of Tensors and Fast Matrix Multiplication Powers of Tensors and Fast Matrix Multiplication François Le Gall Department of Computer Science Graduate School of Information Science and Technology The University of Tokyo Simons Institute, 12 November

More information

Complexity of Matrix Multiplication and Bilinear Problems

Complexity of Matrix Multiplication and Bilinear Problems Complexity of Matrix Multiplication and Bilinear Problems François Le Gall Graduate School of Informatics Kyoto University ADFOCS17 - Lecture 3 24 August 2017 Overview of the Lectures Fundamental techniques

More information

Chapter 2. Divide-and-conquer. 2.1 Strassen s algorithm

Chapter 2. Divide-and-conquer. 2.1 Strassen s algorithm Chapter 2 Divide-and-conquer This chapter revisits the divide-and-conquer paradigms and explains how to solve recurrences, in particular, with the use of the master theorem. We first illustrate the concept

More information

arxiv: v2 [cond-mat.str-el] 28 Apr 2010

arxiv: v2 [cond-mat.str-el] 28 Apr 2010 Simulation of strongly correlated fermions in two spatial dimensions with fermionic Projected Entangled-Pair States arxiv:0912.0646v2 [cond-mat.str-el] 28 Apr 2010 Philippe Corboz, 1 Román Orús, 1 Bela

More information

ICS141: Discrete Mathematics for Computer Science I

ICS141: Discrete Mathematics for Computer Science I ICS4: Discrete Mathematics for Computer Science I Dept. Information & Computer Sci., Jan Stelovsky based on slides by Dr. Baek and Dr. Still Originals by Dr. M. P. Frank and Dr. J.L. Gross Provided by

More information

Numerical Methods I Non-Square and Sparse Linear Systems

Numerical Methods I Non-Square and Sparse Linear Systems Numerical Methods I Non-Square and Sparse Linear Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 25th, 2014 A. Donev (Courant

More information