Generalizing of a High Performance Parallel Strassen Implementation on Distributed Memory MIMD Architectures


Duc Kien Nguyen (1), Ivan Lavallee (2), Marc Bui (2)

(1) CHArt - Ecole Pratique des Hautes Etudes & Université Paris 8, France. Kien.Duc-Nguyen@univ-paris8.fr
(2) LaISC - Ecole Pratique des Hautes Etudes, France. Ivan.Lavallee@ephe.sorbonne.fr, Marc.Bui@ephe.sorbonne.fr

Abstract: Strassen's algorithm to multiply two n × n matrices reduces the asymptotic operation count from O(n^3) for the traditional algorithm to O(n^{2.81}), so designing efficient parallelizations of this algorithm is essential. In this paper, we present a generalization of a parallel Strassen implementation that obtained very good performance on an Intel Paragon: about 20% faster for n ≈ 1000 and more than 100% faster for n ≈ 5000 in comparison with the parallel traditional algorithms (such as Fox and Cannon). Our method applies to every matrix multiplication algorithm on distributed memory computers that uses Strassen's algorithm at the system level, and thus gives us the freedom to search for better parallel implementations of Strassen's algorithm.

1 Introduction

Matrix multiplication (MM) is one of the most fundamental operations in linear algebra and serves as the main building block in many different algorithms, including the solution of systems of linear equations, matrix inversion, evaluation of the matrix determinant, and the transitive closure of a graph. In several cases the asymptotic complexity of these algorithms depends directly on the complexity of matrix multiplication, which motivates the study of possibilities to speed up matrix multiplication. Also, the inclusion of matrix multiplication in many benchmarks points at its role as a determining factor for the performance of high speed computations.

Strassen was the first to introduce an algorithm [Str69] better than the traditional one, requiring O(N^{log_2 7}) operations instead of O(N^3). Winograd's variant [Win71] of Strassen's algorithm has the same exponent but a slightly lower constant, as the number of additions/subtractions is reduced from 18 to 15. The record complexity, owed to Coppersmith and Winograd, is O(N^{2.376}), obtained by arithmetic aggregation [CW90]. However, only Winograd's algorithm and Strassen's algorithm offer better performance than the traditional algorithm for matrices of practical sizes [LPS92].

The full potential of these algorithms can be realized only on large matrices, which require large machines such as parallel computers. Thus, designing efficient parallel implementations for these algorithms becomes essential.

This research was started when a paper by Chung-Chiang Chou, Yuefan Deng, Gang Li, and Yuan Wang [CDLW95] on parallelizing Strassen's algorithm came to our attention. Their implementation already obtained a nice performance: in comparison with the parallel traditional algorithms (such as Fox and Cannon) on an Intel Paragon, it is about 20% faster for n ≈ 1000 and more than 100% faster for n ≈ 5000. The principle of this implementation is to parallelize Strassen's algorithm at the system level - i.e., to stop at recursion level r of the execution tree - while the products of sub-matrices are computed locally by the processors. The most significant point here is to determine the sub-matrices obtained after recursively applying Strassen's formula r times (these sub-matrices correspond to the nodes of level r in the execution tree of Strassen's algorithm), and then to build the result matrix from these sub-matrices (corresponding to backtracking the execution tree). This problem is simple to solve on a sequential machine, but much harder on a parallel machine. For a fixed value of r it can be done by hand, as in [CDLW95], [LD95], and [GSv96] (r = 1, 2), but a solution for the general case had not been found.

In this paper, we present a method to determine all the nodes at an arbitrary level r in the execution tree of Strassen's algorithm, and we derive the expression relating the result matrix to the sub-matrices at recursion level r; this expression allows us to compute the result matrix directly from the sub-matrix products calculated by parallel matrix multiplication algorithms at the bottom level. By combining this result with the matrix multiplication algorithms at the bottom level, we obtain a generalization of the high performance parallel Strassen implementation in [CDLW95]. It can be applied to every matrix multiplication algorithm on distributed memory computers that uses Strassen's algorithm at the system level; moreover, the running time of these algorithms decreases as the recursion level increases, so this general solution gives us the freedom to find better implementations (each corresponding to a particular value of the recursion level and a particular matrix multiplication algorithm at the bottom level).

2 Background

2.1 Strassen's Algorithm

We start by considering the formation of the matrix product Q = XY, where Q is m × n, X is m × k, and Y is k × n. We will assume that m, n, and k are all even integers. Partition

X = \begin{pmatrix} X_{00} & X_{01} \\ X_{10} & X_{11} \end{pmatrix}, Y = \begin{pmatrix} Y_{00} & Y_{01} \\ Y_{10} & Y_{11} \end{pmatrix}, Q = \begin{pmatrix} Q_{00} & Q_{01} \\ Q_{10} & Q_{11} \end{pmatrix},

where Q_{ij} is (m/2) × (n/2), X_{ij} is (m/2) × (k/2), and Y_{ij} is (k/2) × (n/2). It can be shown [Win71, GL89] that the following computations compute Q = XY:

M_0 = (X_{00} + X_{11})(Y_{00} + Y_{11})
M_1 = (X_{10} + X_{11}) Y_{00}
M_2 = X_{00} (Y_{01} - Y_{11})
M_3 = X_{11} (-Y_{00} + Y_{10})
M_4 = (X_{00} + X_{01}) Y_{11}
M_5 = (X_{10} - X_{00})(Y_{00} + Y_{01})
M_6 = (X_{01} - X_{11})(Y_{10} + Y_{11})

Q_{00} = M_0 + M_3 - M_4 + M_6
Q_{01} = M_2 + M_4
Q_{10} = M_1 + M_3
Q_{11} = M_0 + M_2 - M_1 + M_5     (1)

Strassen's algorithm performs the above computation recursively until one of the dimensions of the matrices is 1.
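To make formula (1) concrete, here is a minimal sequential sketch of one recursion step in Python with NumPy (a sketch of ours, for illustration; the name `strassen_step` is hypothetical, and even dimensions are assumed):

```python
import numpy as np

def strassen_step(X, Y):
    """One application of Strassen's formula (1); dimensions must be even."""
    m, k = X.shape
    _, n = Y.shape
    X00, X01 = X[:m//2, :k//2], X[:m//2, k//2:]
    X10, X11 = X[m//2:, :k//2], X[m//2:, k//2:]
    Y00, Y01 = Y[:k//2, :n//2], Y[:k//2, n//2:]
    Y10, Y11 = Y[k//2:, :n//2], Y[k//2:, n//2:]

    M0 = (X00 + X11) @ (Y00 + Y11)
    M1 = (X10 + X11) @ Y00
    M2 = X00 @ (Y01 - Y11)
    M3 = X11 @ (Y10 - Y00)
    M4 = (X00 + X01) @ Y11
    M5 = (X10 - X00) @ (Y00 + Y01)
    M6 = (X01 - X11) @ (Y10 + Y11)

    # Combine the seven products into the four quadrants of Q.
    return np.block([[M0 + M3 - M4 + M6, M2 + M4],
                     [M1 + M3,           M0 + M2 - M1 + M5]])
```

Calling `strassen_step` recursively on the seven products (instead of using `@`) yields the O(n^{log_2 7}) operation count; in practice the recursion is truncated and a classical multiplication is used below some threshold.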

2.2 A High Performance Parallel Strassen Implementation

In this section, we describe the principle of the high performance parallel Strassen implementation presented in [CDLW95], which is the foundation for our generalization. First, decompose the matrix X into 2 × 2 blocks of sub-matrices X_{ij}, where i, j = 0, 1. Second, decompose each of these four sub-matrices further into 2 × 2 blocks (i.e., 4 × 4 blocks in total) of sub-matrices x_{ij}, where i, j = 0, 1, 2, 3:

X = \begin{pmatrix} X_{00} & X_{01} \\ X_{10} & X_{11} \end{pmatrix} = \begin{pmatrix} x_{00} & x_{01} & x_{02} & x_{03} \\ x_{10} & x_{11} & x_{12} & x_{13} \\ x_{20} & x_{21} & x_{22} & x_{23} \\ x_{30} & x_{31} & x_{32} & x_{33} \end{pmatrix}

Similarly, perform the same decomposition on matrix Y:

Y = \begin{pmatrix} Y_{00} & Y_{01} \\ Y_{10} & Y_{11} \end{pmatrix} = \begin{pmatrix} y_{00} & y_{01} & y_{02} & y_{03} \\ y_{10} & y_{11} & y_{12} & y_{13} \\ y_{20} & y_{21} & y_{22} & y_{23} \\ y_{30} & y_{31} & y_{32} & y_{33} \end{pmatrix}

Then, applying Strassen's formula to the product of X and Y gives the following seven matrix multiplication expressions:

M_0 = (X_{00} + X_{11})(Y_{00} + Y_{11})
M_1 = (X_{10} + X_{11}) Y_{00}
M_2 = X_{00} (Y_{01} - Y_{11})
M_3 = X_{11} (-Y_{00} + Y_{10})
M_4 = (X_{00} + X_{01}) Y_{11}
M_5 = (-X_{00} + X_{10})(Y_{00} + Y_{01})
M_6 = (X_{01} - X_{11})(Y_{10} + Y_{11})

Next, apply Strassen's formula to these seven expressions to obtain 49 matrix multiplication expressions on the sub-matrices x and y. Taking M_0 as an example:

M_{00} = (x_{00} + x_{22} + x_{11} + x_{33})(y_{00} + y_{22} + y_{11} + y_{33})
M_{01} = (x_{10} + x_{32} + x_{11} + x_{33})(y_{00} + y_{22})
M_{02} = (x_{00} + x_{22})(y_{01} + y_{23} - y_{11} - y_{33})
M_{03} = (x_{11} + x_{33})(y_{10} + y_{32} - y_{00} - y_{22})
M_{04} = (x_{00} + x_{22} + x_{01} + x_{23})(y_{11} + y_{33})
M_{05} = (x_{10} + x_{32} - x_{00} - x_{22})(y_{00} + y_{22} + y_{01} + y_{23})
M_{06} = (x_{01} + x_{23} - x_{11} - x_{33})(y_{10} + y_{32} + y_{11} + y_{33})

Similarly, each of the remaining six expressions M_i, i = 1, 2, ..., 6, expands into its own group of seven matrix multiplications in terms of x and y:

M_{10} = (x_{20} + x_{22} + x_{31} + x_{33})(y_{00} + y_{11})
M_{11} = (x_{30} + x_{32} + x_{31} + x_{33}) y_{00}
M_{12} = (x_{20} + x_{22})(y_{01} - y_{11})
M_{13} = (x_{31} + x_{33})(y_{10} - y_{00})
M_{14} = (x_{20} + x_{22} + x_{21} + x_{23}) y_{11}
M_{15} = (x_{30} + x_{32} - x_{20} - x_{22})(y_{00} + y_{01})
M_{16} = (x_{21} + x_{23} - x_{31} - x_{33})(y_{10} + y_{11})

M_{20} = (x_{00} + x_{11})(y_{02} - y_{22} + y_{13} - y_{33})
M_{21} = (x_{10} + x_{11})(y_{02} - y_{22})
M_{22} = x_{00} (y_{03} - y_{23} - y_{13} + y_{33})
M_{23} = x_{11} (y_{12} - y_{32} - y_{02} + y_{22})
M_{24} = (x_{00} + x_{01})(y_{13} - y_{33})
M_{25} = (x_{10} - x_{00})(y_{02} - y_{22} + y_{03} - y_{23})
M_{26} = (x_{01} - x_{11})(y_{12} - y_{32} + y_{13} - y_{33})

M_{30} = (x_{22} + x_{33})(y_{20} - y_{00} + y_{31} - y_{11})
M_{31} = (x_{32} + x_{33})(y_{20} - y_{00})
M_{32} = x_{22} (y_{21} - y_{01} - y_{31} + y_{11})
M_{33} = x_{33} (y_{30} - y_{10} - y_{20} + y_{00})
M_{34} = (x_{22} + x_{23})(y_{31} - y_{11})
M_{35} = (x_{32} - x_{22})(y_{20} - y_{00} + y_{21} - y_{01})
M_{36} = (x_{23} - x_{33})(y_{30} - y_{10} + y_{31} - y_{11})

M_{40} = (x_{00} + x_{02} + x_{11} + x_{13})(y_{22} + y_{33})
M_{41} = (x_{10} + x_{12} + x_{11} + x_{13}) y_{22}
M_{42} = (x_{00} + x_{02})(y_{23} - y_{33})
M_{43} = (x_{11} + x_{13})(y_{32} - y_{22})
M_{44} = (x_{00} + x_{02} + x_{01} + x_{03}) y_{33}
M_{45} = (x_{10} + x_{12} - x_{00} - x_{02})(y_{22} + y_{23})
M_{46} = (x_{01} + x_{03} - x_{11} - x_{13})(y_{32} + y_{33})

M_{50} = (x_{20} - x_{00} + x_{31} - x_{11})(y_{00} + y_{02} + y_{11} + y_{13})
M_{51} = (x_{30} - x_{10} + x_{31} - x_{11})(y_{00} + y_{02})
M_{52} = (x_{20} - x_{00})(y_{01} + y_{03} - y_{11} - y_{13})
M_{53} = (x_{31} - x_{11})(y_{10} + y_{12} - y_{00} - y_{02})
M_{54} = (x_{20} - x_{00} + x_{21} - x_{01})(y_{11} + y_{13})
M_{55} = (x_{30} - x_{10} - x_{20} + x_{00})(y_{00} + y_{02} + y_{01} + y_{03})
M_{56} = (x_{21} - x_{01} - x_{31} + x_{11})(y_{10} + y_{12} + y_{11} + y_{13})

M_{60} = (x_{02} - x_{22} + x_{13} - x_{33})(y_{20} + y_{22} + y_{31} + y_{33})
M_{61} = (x_{12} - x_{32} + x_{13} - x_{33})(y_{20} + y_{22})
M_{62} = (x_{02} - x_{22})(y_{21} + y_{23} - y_{31} - y_{33})
M_{63} = (x_{13} - x_{33})(y_{30} + y_{32} - y_{20} - y_{22})
M_{64} = (x_{02} - x_{22} + x_{03} - x_{23})(y_{31} + y_{33})
M_{65} = (x_{12} - x_{32} - x_{02} + x_{22})(y_{20} + y_{22} + y_{21} + y_{23})
M_{66} = (x_{03} - x_{23} - x_{13} + x_{33})(y_{30} + y_{32} + y_{31} + y_{33})

After finishing these 49 matrix multiplications, we need to combine the resulting M_{ij}, i, j = 0, 1, ..., 6, to form the final product matrix

Q = \begin{pmatrix} Q_{00} & Q_{01} \\ Q_{10} & Q_{11} \end{pmatrix} = \begin{pmatrix} q_{00} & q_{01} & q_{02} & q_{03} \\ q_{10} & q_{11} & q_{12} & q_{13} \\ q_{20} & q_{21} & q_{22} & q_{23} \\ q_{30} & q_{31} & q_{32} & q_{33} \end{pmatrix}

First, define the sign variables

\delta_i = -1 if i = 4, +1 otherwise;    \gamma_i = -1 if i = 1, +1 otherwise,

and the index sets S_1 = {0, 3, 4, 6}, S_2 = {1, 3}, S_3 = {2, 4}, S_4 = {0, 1, 2, 5} (the indices occurring in the top-level combinations Q_{00}, Q_{10}, Q_{01}, and Q_{11} of (1), respectively). The 4 × 4 blocks of sub-matrices forming the product matrix Q can then be written as:

q_{00} = \sum_{i \in S_1} \delta_i (M_{i0} + M_{i3} - M_{i4} + M_{i6})
q_{01} = \sum_{i \in S_1} \delta_i (M_{i2} + M_{i4})
q_{02} = \sum_{i \in S_3} (M_{i0} + M_{i3} - M_{i4} + M_{i6})
q_{03} = \sum_{i \in S_3} (M_{i2} + M_{i4})
q_{10} = \sum_{i \in S_1} \delta_i (M_{i1} + M_{i3})
q_{11} = \sum_{i \in S_1} \delta_i (M_{i0} + M_{i2} - M_{i1} + M_{i5})
q_{12} = \sum_{i \in S_3} (M_{i1} + M_{i3})
q_{13} = \sum_{i \in S_3} (M_{i0} + M_{i2} - M_{i1} + M_{i5})

q_{20} = \sum_{i \in S_2} (M_{i0} + M_{i3} - M_{i4} + M_{i6})
q_{21} = \sum_{i \in S_2} (M_{i2} + M_{i4})
q_{22} = \sum_{i \in S_4} \gamma_i (M_{i0} + M_{i3} - M_{i4} + M_{i6})
q_{23} = \sum_{i \in S_4} \gamma_i (M_{i2} + M_{i4})
q_{30} = \sum_{i \in S_2} (M_{i1} + M_{i3})
q_{31} = \sum_{i \in S_2} (M_{i0} + M_{i2} - M_{i1} + M_{i5})
q_{32} = \sum_{i \in S_4} \gamma_i (M_{i1} + M_{i3})
q_{33} = \sum_{i \in S_4} \gamma_i (M_{i0} + M_{i2} - M_{i1} + M_{i5})

Figure 1: Principle of the Strassen parallelizing in [CDLW95].

As seen above, this is not very simple, even though there are only 49 matrix multiplications. It becomes far more complicated if we want to go further, when we have 343, 2401, or more matrix multiplications.
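The 49 expressions above need not be derived by hand: they follow mechanically from the coefficient pattern of formula (1). The following sketch (ours, for illustration; the helper `expand` is hypothetical) encodes the coefficients of (1) in two 7 × 2 × 2 sign tables and prints the expanded products; the term order may differ from the listing above, but the expressions are the same:

```python
# Coefficient tables read off from Strassen's formula (1):
# SX[l][i][j] is the coefficient of X_ij in the left factor of M_l,
# SY[l][i][j] the coefficient of Y_ij in the right factor.
SX = [[[1, 0], [0, 1]], [[0, 0], [1, 1]], [[1, 0], [0, 0]], [[0, 0], [0, 1]],
      [[1, 1], [0, 0]], [[-1, 0], [1, 0]], [[0, 1], [0, -1]]]
SY = [[[1, 0], [0, 1]], [[1, 0], [0, 0]], [[0, 1], [0, -1]], [[-1, 0], [1, 0]],
      [[0, 0], [0, 1]], [[1, 1], [0, 0]], [[0, 0], [1, 1]]]

def expand(S, l1, l2, sym):
    """Linear combination of level-2 blocks appearing in M_{l1,l2}."""
    terms = []
    for i1 in range(2):
        for j1 in range(2):
            for i2 in range(2):
                for j2 in range(2):
                    c = S[l1][i1][j1] * S[l2][i2][j2]
                    if c:
                        terms.append(("+" if c > 0 else "-") +
                                     f"{sym}{2*i1 + i2}{2*j1 + j2}")
    return "".join(terms).lstrip("+")

for l1 in range(7):
    for l2 in range(7):
        print(f"M{l1}{l2} = ({expand(SX, l1, l2, 'x')})({expand(SY, l1, l2, 'y')})")
```

This mechanical expansion is exactly what the next section formalizes through tensor products of the coefficient arrays.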

3 Generalizing of the Parallel Strassen Implementation

The principle of the method presented above is to parallelize Strassen's algorithm at the system level - i.e., to stop at recursion level r of the execution tree - while the products of sub-matrices are computed locally by the processors. The most important points are to determine the sub-matrices obtained after applying Strassen's formula r times, and to build the result matrix from the products of these sub-matrices. In the preceding works, solutions are given for fixed values of r (= 1, 2), but a solution for the general case had not been found. These are the problems with which we are confronted; our solution is presented in this section.

3.1 Recursion Removal in Fast Matrix Multiplication

We represent Strassen's formula (1) as

m_l = \left( \sum_{i,j=0,1} x_{ij} SX(l,i,j) \right) \left( \sum_{i,j=0,1} y_{ij} SY(l,i,j) \right),  l = 0 ... 6,
q_{ij} = \sum_{l=0}^{6} m_l SQ(l,i,j),  i,j = 0,1.     (2)

Each of the 7^k products obtained after k levels of recursion can be represented in the same way:

m_l = \left( \sum_{i,j=0}^{2^k - 1} x_{ij} SX_k(l,i,j) \right) \left( \sum_{i,j=0}^{2^k - 1} y_{ij} SY_k(l,i,j) \right),  l = 0 ... 7^k - 1,
q_{ij} = \sum_{l=0}^{7^k - 1} m_l SQ_k(l,i,j),  i,j = 0 ... 2^k - 1.     (3)

In fact, SX = SX_1, SY = SY_1, SQ = SQ_1. We now have to determine the arrays SX_k, SY_k, and SQ_k from SX_1, SY_1, and SQ_1. To this end, we extend the definition of the tensor product in [KHJS90] to arrays of arbitrary dimension, as follows.

Definition. Let A and B be arrays of the same dimension l and of sizes m_1 × m_2 × ... × m_l and n_1 × n_2 × ... × n_l, respectively. Their tensor product (TP) is the array of the same dimension and of size m_1 n_1 × m_2 n_2 × ... × m_l n_l obtained by replacing each element of A with the product of that element and B:

P = A \otimes B,  where P[i_1, i_2, ..., i_l] = A[k_1, k_2, ..., k_l] · B[h_1, h_2, ..., h_l] with i_j = k_j n_j + h_j, 1 ≤ j ≤ l.

Let P = \otimes_{i=1}^{n} A_i = (...((A_1 \otimes A_2) \otimes A_3) \otimes ... \otimes A_n), where each A_i is an array of dimension l and of size m_{i1} × m_{i2} × ... × m_{il}. The following theorem allows us to compute the elements of P directly.

Theorem. P[j_1, j_2, ..., j_l] = \prod_{i=1}^{n} A_i[h_{i1}, h_{i2}, ..., h_{il}],  where j_k = \sum_{s=1}^{n} h_{sk} \prod_{r=s+1}^{n} m_{rk}.     (4)

Proof. We prove the theorem by induction on n. For n = 1 the claim is trivial, and for n = 2 it holds by definition. Suppose it holds for n - 1; we show that it holds for n. We have

P_{n-1}[t_1, t_2, ..., t_l] = \prod_{i=1}^{n-1} A_i[h_{i1}, h_{i2}, ..., h_{il}],  where t_k = \sum_{s=1}^{n-1} h_{sk} \prod_{r=s+1}^{n-1} m_{rk}, 1 ≤ k ≤ l,

and P_n = P_{n-1} \otimes A_n. By definition,

P_n[j_1, j_2, ..., j_l] = P_{n-1}[p_1, p_2, ..., p_l] · A_n[h_{n1}, h_{n2}, ..., h_{nl}] = \prod_{i=1}^{n} A_i[h_{i1}, h_{i2}, ..., h_{il}],

where

j_k = p_k m_{nk} + h_{nk} = m_{nk} \sum_{s=1}^{n-1} h_{sk} \prod_{r=s+1}^{n-1} m_{rk} + h_{nk} = \sum_{s=1}^{n-1} h_{sk} \prod_{r=s+1}^{n} m_{rk} + h_{nk} = \sum_{s=1}^{n} h_{sk} \prod_{r=s+1}^{n} m_{rk}.

The theorem is proved.

In particular, if all the A_i have the same size m_1 × m_2 × ... × m_l, we have P[j_1, j_2, ..., j_l] = \prod_{i=1}^{n} A_i[h_{i1}, h_{i2}, ..., h_{il}] where j_k = \sum_{s=1}^{n} h_{sk} m_k^{n-s}.

Remark. j_k = \sum_{s=1}^{n} h_{sk} m_k^{n-s} is exactly the factorization of j_k in base m_k. Writing a = a_1 a_2 ... a_{n(b)} for the factorization of a in base b, we have: if P[j_1, j_2, ..., j_l] = \prod_{i=1}^{n} A_i[h_{i1}, h_{i2}, ..., h_{il}], then j_k = h_{1k} h_{2k} ... h_{nk(m_k)}.
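A minimal NumPy sketch of this generalized tensor product (our illustration; `tensor_product` is a hypothetical helper, not code from the paper). The final reshape realizes exactly the index relation i_j = k_j n_j + h_j of the definition, and the assertion checks the digit formula of Theorem (4):

```python
import numpy as np

def tensor_product(A, B):
    """Tensor product of two arrays of the same dimension l.

    P[i1, ..., il] = A[k1, ..., kl] * B[h1, ..., hl] with i_j = k_j*n_j + h_j.
    For l = 2 this coincides with np.kron.
    """
    l = A.ndim
    P = np.multiply.outer(A, B)                      # axes (k1..kl, h1..hl)
    P = P.transpose([ax for j in range(l) for ax in (j, l + j)])
    return P.reshape([A.shape[j] * B.shape[j] for j in range(l)])

# Sanity check of the index formula (4) for three equal-size factors
# of size 2 x 3 x 4: the digits h_sk of j_k are read in base m_k.
A = np.arange(2 * 3 * 4).reshape(2, 3, 4)
P = tensor_product(tensor_product(A, A), A)
assert (P[1*4 + 1*2 + 0, 2*9 + 0*3 + 1, 3*16 + 2*4 + 3]
        == A[1, 2, 3] * A[1, 0, 2] * A[0, 1, 3])
```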

Now we return to our algorithm. We have the following theorem.

Theorem. SX_k = \otimes^k SX,  SY_k = \otimes^k SY,  SQ_k = \otimes^k SQ.     (5)

Proof. We prove the theorem by induction on k. Clearly it is true for k = 1. Suppose it is true for k - 1. The algorithm's execution tree is balanced, with depth k and degree 7. Thanks to (3), at level k - 1 of the tree we have

M_l = \left( \sum_{0 ≤ i,j ≤ 2^{k-1} - 1} X_{k-1,ij} SX_{k-1}(l,i,j) \right) \left( \sum_{0 ≤ i,j ≤ 2^{k-1} - 1} Y_{k-1,ij} SY_{k-1}(l,i,j) \right),  0 ≤ l ≤ 7^{k-1} - 1.

Then, thanks to (2), at level k we have

M_l[l'] = \left( \sum_{0 ≤ i',j' ≤ 1} \sum_{0 ≤ i,j ≤ 2^{k-1} - 1} X_{k-1,ij}[i',j'] SX_{k-1}(l,i,j) SX(l',i',j') \right) \left( \sum_{0 ≤ i',j' ≤ 1} \sum_{0 ≤ i,j ≤ 2^{k-1} - 1} Y_{k-1,ij}[i',j'] SY_{k-1}(l,i,j) SY(l',i',j') \right),     (6)

for 0 ≤ l ≤ 7^{k-1} - 1 and 0 ≤ l' ≤ 6, where X_{k-1,ij}[i',j'] and Y_{k-1,ij}[i',j'] are the sub-matrices obtained by dividing X_{k-1,ij} and Y_{k-1,ij} into four quarters (i', j' indicate the quarter).

We write l, l' in base 7 and i, j, i', j' in base 2, and remark that X_{k-1,ij}[i',j'] = X_k[ii'_{(2)}, jj'_{(2)}]. Then (6) becomes

M[ll'_{(7)}] = \left( \sum_{0 ≤ ii'_{(2)}, jj'_{(2)} ≤ 2^k - 1} X_k[ii'_{(2)}, jj'_{(2)}] SX_{k-1}(l,i,j) SX(l',i',j') \right) \left( \sum_{0 ≤ ii'_{(2)}, jj'_{(2)} ≤ 2^k - 1} Y_k[ii'_{(2)}, jj'_{(2)}] SY_{k-1}(l,i,j) SY(l',i',j') \right),  0 ≤ ll'_{(7)} ≤ 7^k - 1.     (7)

In addition, we have directly from (3):

M[ll'_{(7)}] = \left( \sum_{0 ≤ ii'_{(2)}, jj'_{(2)} ≤ 2^k - 1} X_k[ii'_{(2)}, jj'_{(2)}] SX_k(ll'_{(7)}, ii'_{(2)}, jj'_{(2)}) \right) \left( \sum_{0 ≤ ii'_{(2)}, jj'_{(2)} ≤ 2^k - 1} Y_k[ii'_{(2)}, jj'_{(2)}] SY_k(ll'_{(7)}, ii'_{(2)}, jj'_{(2)}) \right),  0 ≤ ll'_{(7)} ≤ 7^k - 1.     (8)

Comparing (7) and (8), we obtain

SX_k(ll'_{(7)}, ii'_{(2)}, jj'_{(2)}) = SX_{k-1}(l,i,j) · SX(l',i',j')
SY_k(ll'_{(7)}, ii'_{(2)}, jj'_{(2)}) = SY_{k-1}(l,i,j) · SY(l',i',j')

By the definition of the tensor product, this means

SX_k = SX_{k-1} \otimes SX = \otimes^k SX
SY_k = SY_{k-1} \otimes SY = \otimes^k SY

and similarly SQ_k = SQ_{k-1} \otimes SQ = \otimes^k SQ. The theorem is proved.

Thanks to Theorem 3.1 and Remark 3.1, writing l = l_1 l_2 ... l_k in base 7 and i = i_1 i_2 ... i_k, j = j_1 j_2 ... j_k in base 2, we have

SX_k(l,i,j) = \prod_{r=1}^{k} SX(l_r, i_r, j_r)
SY_k(l,i,j) = \prod_{r=1}^{k} SY(l_r, i_r, j_r)
SQ_k(l,i,j) = \prod_{r=1}^{k} SQ(l_r, i_r, j_r)     (9)

Applying (9) in (3), we obtain all the leaf nodes m_l and all the elements of the result matrix.

3.2 Generalizing

Now we know how to parallelize Strassen's algorithm in the general case: first we stop at recursion level r; thanks to expressions (9) and (3), we have all the corresponding sub-matrices:

M_l = \left( \sum_{i,j=0}^{2^r - 1} X_{ij} \prod_{t=1}^{r} SX(l_t, i_t, j_t) \right) \left( \sum_{i,j=0}^{2^r - 1} Y_{ij} \prod_{t=1}^{r} SY(l_t, i_t, j_t) \right),  l = 0 ... 7^r - 1,     (10)

with

X_{ij} = \begin{pmatrix} x_{i·2^{k-r}, j·2^{k-r}} & ... & x_{i·2^{k-r}, j·2^{k-r}+2^{k-r}-1} \\ ... & ... & ... \\ x_{i·2^{k-r}+2^{k-r}-1, j·2^{k-r}} & ... & x_{i·2^{k-r}+2^{k-r}-1, j·2^{k-r}+2^{k-r}-1} \end{pmatrix}

Y_{ij} = \begin{pmatrix} y_{i·2^{k-r}, j·2^{k-r}} & ... & y_{i·2^{k-r}, j·2^{k-r}+2^{k-r}-1} \\ ... & ... & ... \\ y_{i·2^{k-r}+2^{k-r}-1, j·2^{k-r}} & ... & y_{i·2^{k-r}+2^{k-r}-1, j·2^{k-r}+2^{k-r}-1} \end{pmatrix}

for i, j = 0, ..., 2^r - 1.
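Formula (9) reduces every coefficient of SX_k, SY_k, and SQ_k to a product of k entries of the 7 × 2 × 2 base tables, indexed by the base-7 digits of l and the base-2 digits of i and j. A small sketch of such an evaluator (ours, for illustration; `S` is one of the coefficient tables of formula (2), written out as Python lists in the earlier listing):

```python
def S_k(S, k, l, i, j):
    """Evaluate S_k(l, i, j) = prod_{t=1}^{k} S(l_t, i_t, j_t), formula (9).

    S is a 7 x 2 x 2 coefficient table (SX, SY, or SQ); l is read as k
    digits in base 7, and i, j as k digits in base 2.  Digits are taken
    from the least significant end; the product is order-independent.
    """
    v = 1
    for _ in range(k):
        l, lt = divmod(l, 7)
        i, it = divmod(i, 2)
        j, jt = divmod(j, 2)
        v *= S[lt][it][jt]
    return v
```

With the SX and SY tables of the sketch in Section 2.2, `S_k(SX, 2, l, i, j)` reproduces the coefficients of the 49 expressions listed there.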

The product M_l of the sub-matrices \sum_{i,j=0}^{2^r-1} X_{ij} \prod_{t=1}^{r} SX(l_t, i_t, j_t) and \sum_{i,j=0}^{2^r-1} Y_{ij} \prod_{t=1}^{r} SY(l_t, i_t, j_t) is computed locally on each processor by a sequential matrix multiplication algorithm. Finally, thanks to (9) and (3), we obtain the sub-matrices of the result matrix directly, by matrix additions, instead of manually backtracking the recursion tree to compute the root as in [LD95], [CDLW95], and [GSv96]:

Q_{ij} = \sum_{l=0}^{7^r - 1} M_l SQ_r(l,i,j) = \sum_{l=0}^{7^r - 1} M_l \prod_{t=1}^{r} SQ(l_t, i_t, j_t).     (11)
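Putting (10) and (11) together, here is a sequential sketch of the whole scheme (ours, for illustration; names such as `strassen_level_r` are hypothetical). In the real implementation the loop over l runs in parallel, each of the 7^r independent products being assigned to a processor or group of processors:

```python
import numpy as np

# Coefficient tables of Strassen's formula (1)/(2): S?[l][i][j].
SX = [[[1, 0], [0, 1]], [[0, 0], [1, 1]], [[1, 0], [0, 0]], [[0, 0], [0, 1]],
      [[1, 1], [0, 0]], [[-1, 0], [1, 0]], [[0, 1], [0, -1]]]
SY = [[[1, 0], [0, 1]], [[1, 0], [0, 0]], [[0, 1], [0, -1]], [[-1, 0], [1, 0]],
      [[0, 0], [0, 1]], [[1, 1], [0, 0]], [[0, 0], [1, 1]]]
SQ = [[[1, 0], [0, 1]], [[0, 0], [1, -1]], [[0, 1], [0, 1]], [[1, 0], [1, 0]],
      [[-1, 1], [0, 0]], [[0, 0], [0, 1]], [[1, 0], [0, 0]]]

def S_k(S, k, l, i, j):
    """prod_t S(l_t, i_t, j_t) over the base-7 / base-2 digits, formula (9)."""
    v = 1
    for _ in range(k):
        l, lt = divmod(l, 7); i, it = divmod(i, 2); j, jt = divmod(j, 2)
        v *= S[lt][it][jt]
    return v

def strassen_level_r(X, Y, r):
    """Multiply per (10)/(11): 7**r independent products, then additions."""
    n = X.shape[0]
    b = n >> r                                  # sub-matrix size at level r
    Q = np.zeros((n, n))
    for l in range(7 ** r):                     # parallel loop in the scheme
        XL = np.zeros((b, b)); YL = np.zeros((b, b))
        for i in range(2 ** r):
            for j in range(2 ** r):
                cx = S_k(SX, r, l, i, j)
                cy = S_k(SY, r, l, i, j)
                if cx: XL += cx * X[i*b:(i+1)*b, j*b:(j+1)*b]
                if cy: YL += cy * Y[i*b:(i+1)*b, j*b:(j+1)*b]
        Ml = XL @ YL                            # local sequential product (10)
        for i in range(2 ** r):
            for j in range(2 ** r):
                cq = S_k(SQ, r, l, i, j)
                if cq: Q[i*b:(i+1)*b, j*b:(j+1)*b] += cq * Ml   # formula (11)
    return Q

# Sanity check against the classical product:
X = np.random.rand(16, 16); Y = np.random.rand(16, 16)
assert np.allclose(strassen_level_r(X, Y, 2), X @ Y)
```

Note how the combination step (11) is a flat sum of scaled M_l blocks; no backtracking of the recursion tree is needed, which is what makes the scheme work for an arbitrary level r.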

4 Conclusion

We have presented a general, scalable parallelization for all matrix multiplication algorithms on distributed memory computers that use Strassen's algorithm at the inter-processor level. The running time of these algorithms decreases as the recursion level increases; hence this general solution gives us the freedom to find better algorithms (each corresponding to a particular value of the recursion level and a particular matrix multiplication algorithm at the bottom level). From a different point of view, we have generalized Strassen's formula to the case where the matrices are divided into 2^k parts (the case k = 2 gives the original formulas), which opens a whole new direction for parallelizing Strassen's algorithm. In addition, we are applying these ideas to all the fast matrix multiplication algorithms.

References

[CDLW95] Chung-Chiang Chou, Yuefan Deng, Gang Li, and Yuan Wang. Parallelizing Strassen's method for matrix multiplication on distributed-memory MIMD architectures. Computers and Mathematics with Applications, 30(2):49-69, 1995.

[CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation, 9(3):251-280, 1990.

[GL89] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 2nd edition, 1989.

[GSv96] Brian Grayson, Ajay Shah, and Robert van de Geijn. A high performance parallel Strassen implementation. Parallel Processing Letters, 6(1):3-12, 1996.

[KHJS90] B. Kumar, Chua-Huang Huang, Rodney W. Johnson, and P. Sadayappan. A tensor product formulation of Strassen's matrix multiplication algorithm. Applied Mathematics Letters, 3(3):67-71, 1990.

[LD95] Qingshan Luo and John B. Drake. A scalable parallel Strassen's matrix multiplication algorithm for distributed memory computers. In Proceedings of the 1995 ACM Symposium on Applied Computing, Nashville, Tennessee, United States, 1995. ACM Press.

[LPS92] J. Laderman, V. Y. Pan, and H. X. Sha. On practical algorithms for accelerated matrix multiplication. Linear Algebra and Its Applications, 162:557-588, 1992.

[Str69] Volker Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13:354-356, 1969.

[Win71] Shmuel Winograd. On multiplication of 2 × 2 matrices. Linear Algebra and Its Applications, 4:381-388, 1971.
