Matrix Multiplication

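Matrix multiplication. Given two n-by-n matrices A and B, compute C = AB.

$$c_{ij} = \sum_{k=1}^{n} a_{ik} \, b_{kj}$$

$$
\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1n} \\
c_{21} & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nn}
\end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\times
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nn}
\end{bmatrix}
$$

Brute force. $\Theta(n^3)$ arithmetic operations.

Fundamental question. Can we improve upon brute force?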
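The brute-force method can be sketched directly from the definition of c_ij. This is a minimal illustration that assumes square matrices stored as Python lists of lists; the name `matmul_naive` is made up for this example.

```python
def matmul_naive(A, B):
    """Multiply two n-by-n matrices given as lists of lists."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # c_ij is the sum over k of a_ik * b_kj
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_naive(A, B))  # [[19, 22], [43, 50]]
```

The three nested loops each run n times, which is where the Θ(n^3) count comes from.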

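Matrix Multiplication: Warmup

Divide-and-conquer.
Divide: partition A and B into ½n-by-½n blocks.
Conquer: multiply 8 pairs of ½n-by-½n blocks recursively.
Combine: add appropriate products using 4 matrix additions.

$$
\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
=
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\times
\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
$$

$$
\begin{aligned}
C_{11} &= (A_{11} B_{11}) + (A_{12} B_{21}) \\
C_{12} &= (A_{11} B_{12}) + (A_{12} B_{22}) \\
C_{21} &= (A_{21} B_{11}) + (A_{22} B_{21}) \\
C_{22} &= (A_{21} B_{12}) + (A_{22} B_{22})
\end{aligned}
$$

$$
T(n) = \underbrace{8\,T(n/2)}_{\text{recursive calls}} + \underbrace{\Theta(n^2)}_{\text{add, form submatrices}}
\;\;\Rightarrow\;\; T(n) = \Theta(n^3)
$$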
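Why the warmup recurrence gives no improvement can be seen by unrolling it. The sketch below assumes n is a power of 2 and writes the Θ(n^2) term as c n^2 for some constant c; both are simplifying assumptions for the derivation.

```latex
% Level i of the recursion tree has 8^i subproblems of size n / 2^i.
\begin{aligned}
\text{work at level } i
  &= 8^i \cdot c \left(\frac{n}{2^i}\right)^{2} = c\, n^2\, 2^i, \\
\text{work in internal levels}
  &= \sum_{i=0}^{\log_2 n - 1} c\, n^2\, 2^i = c\, n^2 (n - 1), \\
\text{number of leaves}
  &= 8^{\log_2 n} = n^{\log_2 8} = n^3, \\
T(n) &= \Theta(n^3).
\end{aligned}
```

The leaves dominate: with 8 subproblems per level, the exponent is log_2 8 = 3, so reducing the number of recursive multiplications is the only way to lower it.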

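Matrix Multiplication: Key Idea

Key idea. Multiply 2-by-2 block matrices with only 7 multiplications.

$$
\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
=
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\times
\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
$$

$$
\begin{aligned}
P_1 &= A_{11} \, (B_{12} - B_{22}) \\
P_2 &= (A_{11} + A_{12}) \, B_{22} \\
P_3 &= (A_{21} + A_{22}) \, B_{11} \\
P_4 &= A_{22} \, (B_{21} - B_{11}) \\
P_5 &= (A_{11} + A_{22}) \, (B_{11} + B_{22}) \\
P_6 &= (A_{12} - A_{22}) \, (B_{21} + B_{22}) \\
P_7 &= (A_{11} - A_{21}) \, (B_{11} + B_{12})
\end{aligned}
$$

$$
\begin{aligned}
C_{11} &= P_5 + P_4 - P_2 + P_6 \\
C_{12} &= P_1 + P_2 \\
C_{21} &= P_3 + P_4 \\
C_{22} &= P_5 + P_1 - P_3 - P_7
\end{aligned}
$$

7 multiplications.
18 = 10 + 8 additions (or subtractions).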
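The seven-product identities are easy to get wrong by a sign, so a quick numerical sanity check is useful. The sketch below treats each block as a single integer and compares the result with the classical four formulas; the function names `strassen_2x2` and `classical_2x2` are hypothetical. A scalar check does not exercise non-commutativity, but every product above keeps the A factor on the left and the B factor on the right, which is what lets the same identities apply to matrix blocks.

```python
import random


def strassen_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    """Compute the 2-by-2 product using 7 multiplications."""
    p1 = a11 * (b12 - b22)
    p2 = (a11 + a12) * b22
    p3 = (a21 + a22) * b11
    p4 = a22 * (b21 - b11)
    p5 = (a11 + a22) * (b11 + b22)
    p6 = (a12 - a22) * (b21 + b22)
    p7 = (a11 - a21) * (b11 + b12)
    c11 = p5 + p4 - p2 + p6
    c12 = p1 + p2
    c21 = p3 + p4
    c22 = p5 + p1 - p3 - p7
    return c11, c12, c21, c22


def classical_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    """Compute the 2-by-2 product using the usual 8 multiplications."""
    return (a11 * b11 + a12 * b21, a11 * b12 + a12 * b22,
            a21 * b11 + a22 * b21, a21 * b12 + a22 * b22)


random.seed(0)
for _ in range(1000):
    entries = [random.randint(-9, 9) for _ in range(8)]
    assert strassen_2x2(*entries) == classical_2x2(*entries)
print("all 1000 random checks passed")
```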

Fast Matrix Multiplication

Fast matrix multiplication. (Strassen, 1969)
Divide: partition A and B into ½n-by-½n blocks.
Compute: 14 ½n-by-½n matrices via 10 matrix additions.
Conquer: multiply 7 pairs of ½n-by-½n matrices recursively.
Combine: 7 products into 4 terms using 8 matrix additions.

Analysis.
Assume n is a power of 2.
T(n) = # arithmetic operations.

$$
T(n) = \underbrace{7\,T(n/2)}_{\text{recursive calls}} + \underbrace{\Theta(n^2)}_{\text{add, subtract}}
\;\;\Rightarrow\;\; T(n) = \Theta(n^{\log_2 7}) = O(n^{2.81})
$$
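The four steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a tuned implementation: matrices are lists of lists, n is a power of 2, the recursion goes all the way down to 1-by-1 blocks, and the helper names `add`, `sub`, and `split` are invented for this example.

```python
def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Return the four h-by-h quadrants of a 2h-by-2h matrix."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])


def strassen(A, B):
    """Multiply n-by-n matrices (n a power of 2) with 7 recursive products."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    # Divide
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Compute the 14 operands (10 additions) and conquer (7 products)
    P1 = strassen(A11, sub(B12, B22))
    P2 = strassen(add(A11, A12), B22)
    P3 = strassen(add(A21, A22), B11)
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A22), add(B11, B22))
    P6 = strassen(sub(A12, A22), add(B21, B22))
    P7 = strassen(sub(A11, A21), add(B11, B12))
    # Combine (8 additions)
    C11 = add(sub(add(P5, P4), P2), P6)
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P5, P1), P3), P7)
    # Stitch the quadrants back into one matrix
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen(A, B))  # [[19, 22], [43, 50]]
```

Recursing down to single entries keeps the sketch short; a practical version would likely switch to the classical algorithm below some block size.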

Fast Matrix Multiplication in Practice

Implementation issues.
Sparsity.
Caching effects.
Numerical stability.
Odd matrix dimensions.
Crossover to classical algorithm around n = 128.

Common misperception: "Strassen is only a theoretical curiosity."
Advanced Computation Group at Apple Computer reports 8x speedup on G4 Velocity Engine when n ~ 2,500.
Range of instances where it's useful is a subject of controversy.

Remark. Can "Strassenize" Ax = b, determinant, eigenvalues, and other matrix ops.
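The slide lists odd matrix dimensions as an issue without saying how to handle them. One simple option, shown here as an assumption rather than as the slide's method, is to pad with zeros up to the next power of 2, multiply, and crop the result. The helper names `pad_to_power_of_two` and `crop` are hypothetical.

```python
def pad_to_power_of_two(M):
    """Embed an n-by-n matrix in the top-left corner of an m-by-m zero
    matrix, where m is the smallest power of 2 with m >= n."""
    n = len(M)
    m = 1
    while m < n:
        m *= 2
    padded_rows = [row + [0] * (m - n) for row in M]
    zero_rows = [[0] * m for _ in range(m - n)]
    return padded_rows + zero_rows


def crop(M, n):
    """Return the top-left n-by-n corner of M."""
    return [row[:n] for row in M[:n]]


M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(pad_to_power_of_two(M))
# [[1, 2, 3, 0], [4, 5, 6, 0], [7, 8, 9, 0], [0, 0, 0, 0]]
```

Zero padding leaves the top-left n-by-n corner of the product unchanged, so cropping recovers AB. It can nearly double the dimension in the worst case, which is one reason practical codes often treat odd sizes differently.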

Q. Multiply two 2-by-2 matrices with only 7 scalar multiplications?
A. Yes! [Strassen, 1969]
$\Theta(n^{\log_2 7}) = O(n^{2.81})$

Q. Multiply two 2-by-2 matrices with only 6 scalar multiplications?
A. Impossible. [Hopcroft and Kerr, 1971]

Q. Two 3-by-3 matrices with only 21 scalar multiplications?
A. Unknown. If possible, it would give:
$\Theta(n^{\log_3 21}) = O(n^{2.77})$

Q. Two 70-by-70 matrices with only 143,640 scalar multiplications?
A. Yes! [Pan, 1980]
$\Theta(n^{\log_{70} 143640}) = O(n^{2.80})$

Decimal wars.
December, 1979: $O(n^{2.521813})$.
January, 1980: $O(n^{2.521801})$.
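Each bound on this slide comes from the same rule: a scheme that multiplies k-by-k block matrices using m block multiplications yields an exponent of log_k m. A small sketch to recompute the exponents, where the function name `exponent` is made up for this example:

```python
import math


def exponent(block_size, multiplications):
    """Exponent w such that recursing on block_size-by-block_size blocks
    with the given number of block multiplications costs Theta(n^w)."""
    return math.log(multiplications) / math.log(block_size)


for k, m in [(2, 7), (3, 21), (70, 143640)]:
    print(f"{k}-by-{k} blocks, {m} multiplications: n^{exponent(k, m):.3f}")
```

Rounded to two decimals, the printed values should agree with the exponents 2.81, 2.77, and 2.80 quoted on the slide.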

Best known. $O(n^{2.376})$ [Coppersmith-Winograd, 1987]
Conjecture. $O(n^{2+\varepsilon})$ for any $\varepsilon > 0$.
Caveat. Theoretical improvements to Strassen are progressively less practical.