TMA4205 Numerical Linear Algebra

The Poisson problem in $\mathbb{R}^2$: diagonalization methods

September 3, 2007

© Einar M. Rønquist
Department of Mathematical Sciences
NTNU, N-7491 Trondheim, Norway
All rights reserved
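1 A direct method based on diagonalization

We consider here the numerical solution of the Poisson problem based on finite differences. In particular, we focus on one method for solving the linear system of equations in one and two space dimensions. The method is based on diagonalization, and we first explain the basic approach in the context of the one-dimensional Poisson problem:

\[
-u_{xx} = f \quad \text{in } \Omega = (0, 1), \qquad u(0) = u(1) = 0.
\]

Assume that we use a uniform finite difference grid given by

\[
x_i = x_0 + ih, \qquad i = 0, 1, \dots, n.
\]

The corresponding system of algebraic equations can be written as

\[
\frac{1}{h^2}
\begin{pmatrix}
 2 & -1 &        &        &    \\
-1 &  2 & -1     &        &    \\
   & \ddots & \ddots & \ddots &    \\
   &        & -1     &  2     & -1 \\
   &        &        & -1     &  2
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n-2} \\ u_{n-1} \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_{n-2} \\ f_{n-1} \end{pmatrix},
\]

where $u_i$ is an approximation to $u(x_i) = u(ih)$, $i = 1, \dots, n-1$, $f_i = f(x_i)$, and $u_0 = u_n = 0$ due to the specified boundary conditions. Let us write this system as

\[
\frac{1}{h^2}\, T\, u = f,
\]

where

\[
T =
\begin{pmatrix}
 2 & -1 &        &    \\
-1 &  2 & \ddots &    \\
   & \ddots & \ddots & -1 \\
   &        & -1     &  2
\end{pmatrix},
\qquad
u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n-1} \end{pmatrix},
\qquad
f = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_{n-1} \end{pmatrix},
\]

and $h$ is the grid size or mesh size. Since $T$ is symmetric positive definite, it can be diagonalized. (Recall that a real, symmetric matrix is a normal matrix.)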
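1.1 Diagonalization of T

Diagonalization of $T$ means that we wish to find the eigenvalues $\lambda_j$ and the eigenvectors $q_j$ of $T$, where

\[
T q_j = \lambda_j q_j, \qquad j = 1, \dots, n-1,
\]
\[
\lambda_j > 0 \quad \text{(positive eigenvalues)}, \qquad
q_k^T q_j = \delta_{jk} \quad \text{(orthonormal eigenvectors)}.
\]

We collect all the eigenvectors $q_j$ into the (orthonormal) matrix $Q$,

\[
Q = [\, q_1, q_2, \dots, q_{n-1} \,].
\]

Then

\[
T Q = Q \Lambda,
\qquad \text{where} \qquad
\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_{n-1}) =
\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_{n-1} \end{pmatrix}.
\]

Since $Q^T Q = I \;\Rightarrow\; Q^T = Q^{-1}$, we have

\[
T = Q \Lambda Q^T \qquad (1)
\]

or

\[
Q^T T Q = \Lambda \quad \text{(diagonal)}.
\]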
The finite difference approximation can thus be computed as follows:

\[
\begin{aligned}
g &= h^2 f
  && g:\ O(n) \text{ operations} \\
T u = g \;\Rightarrow\; Q \Lambda Q^T u &= g \\
\Lambda \underbrace{Q^T u}_{\tilde u} &= \underbrace{Q^T g}_{\tilde g}
  && \tilde g:\ O(n^2) \text{ operations} \\
\Lambda \tilde u = \tilde g \;\Rightarrow\; \tilde u &= \Lambda^{-1} \tilde g
  && \tilde u:\ O(n) \text{ operations} \\
Q^T u = \tilde u \;\Rightarrow\; u &= Q \tilde u
  && u:\ O(n^2) \text{ operations}
\end{aligned}
\]

Note that the transformations $\tilde g = Q^T g$ and $u = Q \tilde u$ are matrix-vector products.

In summary, we can compute $u$ in

\[
O(n) + O(n^2) + O(n) + O(n^2) \sim O(n^2)
\]

floating-point operations. Hence, we can solve our finite difference system in $(n-1)$ unknowns in $O(n^2)$ operations. This is not competitive with a direct solution algorithm based upon LU-factorization (Gaussian elimination) of a tridiagonal matrix, which can be done in $O(n)$ operations (since the bandwidth is equal to one).

Let us also compare the memory requirement: $O(n^2)$ for the diagonalization approach (we need to store $Q$); $O(n)$ for a tridiagonal direct solver. Again, the diagonalization approach is not competitive.

So, why bother? The answer is that the diagonalization approach becomes more interesting in $\mathbb{R}^2$. In addition, it turns out that it is possible to use the Fast Fourier Transform (FFT) to lower the computational complexity.
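The four steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `solve_poisson_1d` is hypothetical, and the eigenpairs of $T$ are obtained numerically with `numpy.linalg.eigh` rather than from a closed-form expression.

```python
import numpy as np

def solve_poisson_1d(f, n):
    """Sketch: solve -u'' = f on (0,1), u(0)=u(1)=0, by diagonalizing T.

    f is a callable evaluated at the interior grid points; n is the number
    of grid intervals, so there are n-1 unknowns.
    """
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]        # interior points x_1..x_{n-1}

    # Tridiagonal matrix T with 2 on the diagonal and -1 next to it.
    T = 2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)

    # Assumption: compute T = Q diag(lam) Q^T numerically.
    lam, Q = np.linalg.eigh(T)

    g = h**2 * f(x)          # g = h^2 f               O(n)
    g_tilde = Q.T @ g        # g~ = Q^T g              O(n^2)
    u_tilde = g_tilde / lam  # u~ = Lambda^{-1} g~     O(n)
    u = Q @ u_tilde          # u = Q u~                O(n^2)
    return x, u

if __name__ == "__main__":
    # Test problem with known solution u(x) = sin(pi x).
    x, u = solve_poisson_1d(lambda x: np.pi**2 * np.sin(np.pi * x), 64)
    print("max error:", np.max(np.abs(u - np.sin(np.pi * x))))
```

The dense matrix `Q` is what makes both the operation count and the storage $O(n^2)$ here, in line with the comparison against a tridiagonal solver.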
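2 The Poisson problem in $\mathbb{R}^2$

The two-dimensional Poisson problem on the unit square is given by

\[
-\nabla^2 u = f \quad \text{in } \Omega = (0, 1) \times (0, 1), \qquad
u = 0 \quad \text{on } \partial\Omega, \qquad (2)
\]

where

\[
\nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}.
\]

[Figure 1: A uniform finite difference grid with spacing $h$ in both directions, extending from $(x_0, y_0)$ to $(x_n, y_n)$.]

Again, using the notation $u_{i,j} \approx u(x_i, y_j) = u(ih, jh)$ and $f_{i,j} = f(x_i, y_j)$, and discretizing (2) using the 5-point stencil (see Figure 1), the discrete equations read:

\[
-\frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{h^2}
-\frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{h^2}
= f_{i,j}, \qquad 1 \le i, j \le n-1. \qquad (3)
\]

2.1 Diagonalization

Let

\[
U =
\begin{pmatrix}
u_{1,1}   & \cdots & u_{1,n-1} \\
\vdots    &        & \vdots    \\
u_{n-1,1} & \cdots & u_{n-1,n-1}
\end{pmatrix}
\]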
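and

\[
T =
\begin{pmatrix}
 2 & -1     &        & 0  \\
-1 &  2     & \ddots &    \\
   & \ddots & \ddots & -1 \\
 0 &        & -1     &  2
\end{pmatrix}.
\]

Then,

\[
(T U)_{ij} =
\begin{cases}
2u_{i,j} - u_{i+1,j}, & i = 1, \\
-u_{i-1,j} + 2u_{i,j} - u_{i+1,j}, & 2 \le i \le n-2, \\
-u_{i-1,j} + 2u_{i,j}, & i = n-1,
\end{cases}
\]

and thus,

\[
\frac{1}{h^2} (T U)_{ij} \approx -\left( \frac{\partial^2 u}{\partial x^2} \right)_{i,j}. \qquad (4)
\]

Similarly,

\[
\frac{1}{h^2} (U T)_{ij} \approx -\left( \frac{\partial^2 u}{\partial y^2} \right)_{i,j}. \qquad (5)
\]

Our finite difference system (3) can thus be expressed as

\[
\frac{1}{h^2} (T U + U T)_{ij} = f_{i,j} \qquad \text{for } 1 \le i, j \le n-1,
\]

or

\[
T U + U T = G, \qquad (6)
\]

where

\[
G = h^2
\begin{pmatrix}
f_{1,1}   & \cdots & f_{1,n-1} \\
\vdots    &        & \vdots    \\
f_{n-1,1} & \cdots & f_{n-1,n-1}
\end{pmatrix}.
\]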
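As a quick sanity check of the matrix form, the following sketch compares $TU + UT$ with a direct application of the 5-point stencil (multiplied by $h^2$) to an arbitrary array of interior values. The helper names `second_difference_matrix` and `five_point_stencil` are hypothetical, chosen for this illustration only.

```python
import numpy as np

def second_difference_matrix(m):
    """m x m tridiagonal matrix with 2 on the diagonal and -1 beside it."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def five_point_stencil(U):
    """Apply 4 u_ij - u_(i+1)j - u_(i-1)j - u_i(j+1) - u_i(j-1).

    U holds interior values only; the homogeneous boundary condition is
    imposed by padding with one layer of zeros.
    """
    P = np.pad(U, 1, mode="constant")
    return (4.0 * P[1:-1, 1:-1]
            - P[2:, 1:-1] - P[:-2, 1:-1]     # neighbours in the i-direction
            - P[1:-1, 2:] - P[1:-1, :-2])    # neighbours in the j-direction

if __name__ == "__main__":
    n = 8
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n - 1, n - 1))
    T = second_difference_matrix(n - 1)
    # T @ U differences along the first index, U @ T along the second.
    print(np.allclose(T @ U + U @ T, five_point_stencil(U)))
```

Multiplying by $T$ from the left acts on the first index of $U$ and multiplying from the right acts on the second, which is why the global system matrix never has to be formed.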
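Combining (1) and (6) we get

\[
Q \Lambda Q^T U + U Q \Lambda Q^T = G. \qquad (7)
\]

Multiplying (7) from the right with $Q$ and from the left with $Q^T$, and using the fact that $Q^T Q = I$, we get:

\[
\Lambda \underbrace{Q^T U Q}_{\tilde U} + \underbrace{Q^T U Q}_{\tilde U} \Lambda
= \underbrace{Q^T G Q}_{\tilde G}.
\]

Hence, (6) may be solved in three steps:

Step 1): Compute

\[
\tilde G = Q^T G Q \qquad \text{(matrix-matrix products)}.
\]

Step 2): Solve

\[
\Lambda \tilde U + \tilde U \Lambda = \tilde G,
\]

or

\[
\begin{aligned}
\lambda_i \tilde u_{i,j} + \tilde u_{i,j} \lambda_j &= \tilde g_{i,j}, && 1 \le i, j \le n-1, \\
\Rightarrow\quad (\lambda_i + \lambda_j)\, \tilde u_{i,j} &= \tilde g_{i,j}, && 1 \le i, j \le n-1, \\
\Rightarrow\quad \tilde u_{i,j} &= \frac{\tilde g_{i,j}}{\lambda_i + \lambda_j}, && 1 \le i, j \le n-1.
\end{aligned}
\]

Step 3): Compute

\[
U = Q \tilde U Q^T \qquad \text{(matrix-matrix products)}.
\]

Here,

\[
U,\ \tilde U,\ \tilde G,\ Q,\ Q^T \in \mathbb{R}^{(n-1) \times (n-1)}.
\]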
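The three steps translate almost line by line into NumPy. The sketch below is an illustration, not a tuned implementation: the name `solve_poisson_2d` is hypothetical, the right-hand side is assumed to be given as an array `F` with `F[i-1, j-1]` $= f(x_i, y_j)$, and $Q$ and $\Lambda$ are computed numerically with `numpy.linalg.eigh`.

```python
import numpy as np

def solve_poisson_2d(F, n):
    """Sketch: solve T U + U T = h^2 F by diagonalization of T.

    F has shape (n-1, n-1) and holds f at the interior grid points.
    """
    h = 1.0 / n
    T = 2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)
    lam, Q = np.linalg.eigh(T)          # assumption: numerical eigenpairs

    G = h**2 * F
    G_tilde = Q.T @ G @ Q                                # Step 1
    U_tilde = G_tilde / (lam[:, None] + lam[None, :])    # Step 2
    return Q @ U_tilde @ Q.T                             # Step 3

if __name__ == "__main__":
    n = 32
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]
    X, Y = np.meshgrid(x, x, indexing="ij")
    exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
    U = solve_poisson_2d(2.0 * np.pi**2 * exact, n)
    print("max error:", np.max(np.abs(U - exact)))
```

Step 2 uses broadcasting to form the array of sums $\lambda_i + \lambda_j$, so the division is a single elementwise operation on $(n-1)^2$ entries.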
2.1.1 Computational cost

The number of degrees-of-freedom (or unknowns), $N$, is

\[
N = (n-1)^2 \sim O(n^2).
\]

Step 1)

\[
\tilde G = \underbrace{\overbrace{Q^T G}^{O(n^3)}\, Q}_{O(n^3)}
\qquad O(n^3) \text{ operations}
\]

Step 2)

\[
\tilde u_{i,j} = \frac{\tilde g_{i,j}}{\lambda_i + \lambda_j}
\qquad O(n^2) \text{ operations}
\]

Step 3)

\[
U = \underbrace{\overbrace{Q \tilde U}^{O(n^3)}\, Q^T}_{O(n^3)}
\qquad O(n^3) \text{ operations}
\]

In summary, we can compute the discrete solution, $U$, in $O(n^3) = O(N^{3/2})$ operations.

Note: this method is an example of a direct method.

2.2 Comparison with other direct methods

Method            Operations ($N_{op}$)      Memory requirement ($M$)
Diagonalization   $O(N^{3/2}) = O(n^3)$      $O(N) = O(n^2)$
Banded LU         $O(N b^2) = O(n^4)$        $O(N b) = O(n^3)$
Full LU           $O(N^3) = O(n^6)$          $O(N^2) = O(n^4)$

Table 1: Computational cost and memory requirement for direct methods. For the banded solver, we have used a bandwidth $b \sim O(n)$.

We conclude that the diagonalization method is much more attractive in $\mathbb{R}^2$ than in $\mathbb{R}^1$. The number of floating-point operations per degree-of-freedom is $O(n)$, while the memory requirement is close to optimal (i.e., scalable).
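2.3 The matrices Q and Λ

The computational cost associated with the diagonalization approach tacitly assumes that we know the eigenvector matrix $Q$ and the corresponding eigenvalues. Let us therefore derive explicit expressions for these.

To this end, consider first the continuous eigenvalue problem

\[
-u_{xx} = \lambda u \quad \text{in } \Omega = (0, 1), \qquad u(0) = u(1) = 0,
\]

with solutions

\[
\bar u_j(x) = \sin(j\pi x), \qquad \bar\lambda_j = j^2 \pi^2, \qquad j = 1, 2, \dots, \infty.
\]

Consider now the discrete eigenvalue problem

\[
T q_j = \lambda_j q_j.
\]

Try eigenvector solutions $\tilde q_j$ which correspond to the continuous eigenfunctions $\bar u_j(x)$ sampled at the grid points $x_i$, $i = 1, \dots, n-1$, i.e.,

\[
(\tilde q_j)_i = \bar u_j(x_i) = \sin(j\pi x_i) = \sin(j\pi (ih)) = \sin\!\left( \frac{ij\pi}{n} \right)
\qquad \left( h = \frac{1}{n} \right).
\]

Operating on $\tilde q_j$ with $T$ gives

\[
(T \tilde q_j)_i =
\underbrace{2\left( 1 - \cos\!\left( \frac{j\pi}{n} \right) \right)}_{\lambda_j}\,
\underbrace{\sin\!\left( \frac{ij\pi}{n} \right)}_{(\tilde q_j)_i}.
\]

Hence, our try was successful: operating on $\tilde q_j$ with $T$ gives a multiple of $\tilde q_j$.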
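For completeness, the step behind this result can be written out using the identity $\sin(a - b) + \sin(a + b) = 2 \sin a \cos b$ with $a = ij\pi/n$ and $b = j\pi/n$. The tilde notation for the trial vector is the one suggested by the text above.

\[
\begin{aligned}
(T \tilde q_j)_i
&= -\sin\!\left( \frac{(i-1)j\pi}{n} \right) + 2 \sin\!\left( \frac{ij\pi}{n} \right) - \sin\!\left( \frac{(i+1)j\pi}{n} \right) \\
&= 2 \sin\!\left( \frac{ij\pi}{n} \right) - 2 \sin\!\left( \frac{ij\pi}{n} \right) \cos\!\left( \frac{j\pi}{n} \right) \\
&= 2 \left( 1 - \cos\!\left( \frac{j\pi}{n} \right) \right) \sin\!\left( \frac{ij\pi}{n} \right).
\end{aligned}
\]

The first and last rows of $T$ fit the same formula, because the missing neighbours correspond to $i = 0$ and $i = n$, where $\sin(0) = \sin(j\pi) = 0$.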
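In order to proceed, set $q_j = \alpha\, \tilde q_j$, and choose $\alpha$ such that $q_j$ is normalized:

\[
q_j^T q_j = 1
\quad\Rightarrow\quad
(q_j)_i = \sqrt{\frac{2}{n}}\, \sin\!\left( \frac{ij\pi}{n} \right), \qquad 1 \le i, j \le n-1,
\]
\[
\lambda_j = 2\left( 1 - \cos\!\left( \frac{j\pi}{n} \right) \right).
\]

For $j \ll n$, we observe that

\[
\lambda_j \approx 2\left( 1 - \left( 1 - \frac{j^2 \pi^2}{2 n^2} + \cdots \right) \right) \approx \frac{j^2 \pi^2}{n^2}.
\]

Since $h = 1/n$, we have

\[
\lambda_j \approx h^2 j^2 \pi^2 = h^2\, \bar\lambda_j \qquad \text{for } j \ll n.
\]

Since the approximation of the one-dimensional Laplace operator on our finite difference grid is equal to $\frac{1}{h^2} T$ (and not $T$), this is the same as saying that the first, lowest eigenvalues (and eigenvectors) for the continuous case are well approximated by our finite difference formulation.

Note that in this case

\[
Q_{ij} = (q_j)_i = \sqrt{\frac{2}{n}}\, \sin\!\left( \frac{ij\pi}{n} \right), \qquad 1 \le i, j \le n-1,
\]

and that indeed $Q^T = Q$, i.e., $Q$ is symmetric as well as orthonormal.

From the comparison of computational cost shown earlier (see Table 1), the diagonalization approach to solving the discrete Poisson problem appears promising.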
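The explicit expressions are easy to check numerically. The sketch below builds $Q$ and the eigenvalues from the formulas and verifies orthonormality, symmetry, and the diagonalization of $T$; the function name `sine_eigensystem` is hypothetical.

```python
import numpy as np

def sine_eigensystem(n):
    """Return (lam, Q) from the closed-form expressions.

    lam[j-1]    = 2 (1 - cos(j pi / n))
    Q[i-1, j-1] = sqrt(2/n) sin(i j pi / n),   1 <= i, j <= n-1
    """
    idx = np.arange(1, n)
    lam = 2.0 * (1.0 - np.cos(idx * np.pi / n))
    Q = np.sqrt(2.0 / n) * np.sin(np.outer(idx, idx) * np.pi / n)
    return lam, Q

if __name__ == "__main__":
    n = 16
    lam, Q = sine_eigensystem(n)
    T = 2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)
    print(np.allclose(Q.T @ Q, np.eye(n - 1)))     # orthonormal columns
    print(np.allclose(Q, Q.T))                     # symmetric
    print(np.allclose(Q.T @ T @ Q, np.diag(lam)))  # Q^T T Q = Lambda
    # Lowest scaled discrete eigenvalue next to the continuous value pi^2.
    print(lam[0] * n**2, np.pi**2)
```

With these formulas no eigenvalue computation is needed at run time: $Q$ and $\Lambda$ can be tabulated once for a given $n$.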
Questions:

1. Is the memory requirement optimal?
2. Can the matrix-matrix multiplications be done fast?
3. Can we do better?
4. Can the diagonalization method be extended to three space dimensions?
5. Can the diagonalization method be used on general domains?

The answer to the first question is yes: we need to store $O(n^2)$ floating-point numbers for $O(n^2)$ unknowns. Note that we store the unknowns as a multidimensional array instead of as a long vector. Also note that we never form the global system matrix in the two-dimensional case.

The answer to the second question is yes. Matrix-matrix multiplication is one of the fastest floating-point tasks on modern microprocessors. The best performance is normally achieved using the appropriate BLAS (Basic Linear Algebra Subroutines) library function, since this library is optimized for each particular microprocessor. Matrix-matrix multiplication is one of the operations which comes closest to maximum theoretical performance (i.e., the maximum number of floating-point operations per second). The reason for this is that the number of operations per memory reference is high: $O(n^3)$ floating-point operations and $O(n^2)$ memory references. Note that bringing data to and from memory is often the bottleneck when it comes to performance.

The answer to the third question is yes. The matrix-vector multiplication $w = Q v$ (or $w = Q^T v$) can alternatively be done via the Discrete Sine Transform (DST). This is, of course, related to the fact that the columns of $Q$ represent the continuous eigenfunctions $\bar u_j(x) = \sin(j\pi x)$, $j = 1, \dots, n-1$, sampled at the internal grid points. The DST is in turn related to the Discrete Fourier Transform (DFT), which can be performed most efficiently using the Fast Fourier Transform (FFT). Hence, instead of obtaining $w = Q v$ via matrix-vector multiplication in $O(n^2)$ operations, we can alternatively obtain $w$ from $v$ via FFT in $O(n \log n)$ operations. The cost of solving the Poisson problem in two space dimensions (i.e., $O(n^2)$ unknowns) can therefore be reduced from $O(n^3)$ operations to $O(n^2 \log n)$ operations. This is close to an optimal solver: $O(\log n)$ operations per unknown
and $O(1)$ storage requirement per unknown.

The answer to the fourth question is yes, assuming that the domain is an undeformed box and we use a simple, structured grid. This is also related to the fifth question: the fast solution method presented here is only applicable on simple domains. However, the approach is also attractive as a preconditioner for problems which do not fall into this category (more on this later).

The solver presented here is an example of a class of solvers called tensor-product solvers.
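To make the FFT route concrete, here is a minimal sketch in which multiplication by $Q$ is replaced by a sine transform computed with `numpy.fft.fft` on an odd extension of length $2n$. The names `dst1` and `solve_poisson_2d_fft` are hypothetical, and the odd-extension trick is just one simple way to obtain the DST from a complex FFT; a dedicated DST routine from an FFT library would normally be preferred.

```python
import numpy as np

def dst1(V):
    """Apply Q, with Q_ij = sqrt(2/n) sin(i j pi / n), along axis 0 of V.

    V has n-1 rows. The sine sums are read off from the imaginary part of
    the FFT of the odd extension [0, v, 0, -reversed(v)] of length 2n.
    """
    V = np.asarray(V, dtype=float)
    n = V.shape[0] + 1
    Z = np.zeros((2 * n,) + V.shape[1:])
    Z[1:n] = V
    Z[n + 1:] = -V[::-1]
    S = -np.fft.fft(Z, axis=0).imag[1:n] / 2.0   # S_k = sum_l v_l sin(k l pi / n)
    return np.sqrt(2.0 / n) * S

def solve_poisson_2d_fft(F, n):
    """Sketch: the three-step solver with Q applied via dst1 (Q^T = Q)."""
    h = 1.0 / n
    lam = 2.0 * (1.0 - np.cos(np.arange(1, n) * np.pi / n))
    G = h**2 * F
    G_tilde = dst1(dst1(G).T).T                        # Q G Q = Q^T G Q
    U_tilde = G_tilde / (lam[:, None] + lam[None, :])
    return dst1(dst1(U_tilde).T).T                     # Q U~ Q = Q U~ Q^T

if __name__ == "__main__":
    n = 64
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]
    X, Y = np.meshgrid(x, x, indexing="ij")
    exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
    U = solve_poisson_2d_fft(2.0 * np.pi**2 * exact, n)
    print("max error:", np.max(np.abs(U - exact)))
```

Each call to `dst1` on an $(n-1) \times (n-1)$ array performs $n-1$ transforms of length $2n$, which gives the $O(n^2 \log n)$ total cost discussed above, and the matrix $Q$ is never stored.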