Splitting of Expanded Tridiagonal Matrices $\tilde A = B - R$ for Which $\rho(B^{-1}R) = 0$

Seongjai Kim*

Abstract. The article addresses a regular splitting of tridiagonal matrices. The given tridiagonal matrix $A$ is first expanded to an equivalent matrix $\tilde A$ and then split as $\tilde A = B - R$, for which $B$ is block-diagonal and every eigenvalue of $B^{-1}R$ is zero, i.e., $\rho(B^{-1}R) = 0$. The optimal splitting technique is applicable to various algorithms that incorporate one-dimensional solves or their approximations. Examples can be found in the parallelization of alternating direction iterative (ADI) methods and efficient parameter choices for domain decomposition (DD) methods for elliptic and parabolic problems. Numerical results solving the Helmholtz wave equation in two dimensions are presented to demonstrate the usefulness and efficiency of the splitting technique.

Key words. Matrix expansion, regular splitting, ADI, domain decomposition method, Helmholtz wave equation.

AMS subject classifications. 65F15, 65F10, 65N55.

1. Introduction

Many modern computational algorithms incorporate tridiagonal matrix solves of the form

$$A u = b, \tag{1.1}$$

where $A \in \mathbb{R}^{n \times n}$ and $u, b \in \mathbb{R}^n$. Examples in numerical PDE are the alternating direction iterative (ADI) method [6, 7, 17] and parameter choices for multigrid (MG) and domain decomposition (DD) methods [10, 20].

* Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506-0027, USA. Email: skim@ms.uky.edu

This article studies a regular splitting of tridiagonal matrices. The given tridiagonal matrix $A$ is first expanded to an equivalent matrix $\tilde A$. Then, we consider a regular splitting for $\tilde A$ of the form

$$\tilde A = B - R, \tag{1.2}$$

where $B$ is block-diagonal and every eigenvalue of $B^{-1}R$ is zero, i.e.,

$$\rho(B^{-1}R) = 0. \tag{1.3}$$

Such an optimal splitting technique has not been studied for general tridiagonal matrices in the literature of numerical analysis; as a matter of fact, its usefulness is not yet recognized. The main object of the article is to introduce the splitting technique (1.2)-(1.3) and to address its usefulness and efficiency.

The tridiagonal systems can become expensive to solve in some applications. For example, the ADI method fits poorly with parallel computers of distributed memory, since it requires tridiagonal solves in all directions. The parallelization efficiency deteriorates when the tridiagonal systems are to be solved across the computer memory boundaries. We may try to solve the tridiagonal systems incompletely by an iterative algorithm (as an inner loop). Such an idea was tested to be efficient even if 2-3 red-black (point) Gauss-Seidel iterations were performed for an approximation of the tridiagonal solve [4, 5, 15].

Some computational algorithms require efficient choices of algorithm parameters, as in domain decomposition (DD) methods and multigrid (MG) methods. As an application of (1.2)-(1.3), we will present a strategy of parameter choices for a nonoverlapping DD method applied to the Helmholtz problem, which is complex-valued, indefinite, and non-Hermitian.

The article is organized as follows. In the next section, we present the numerical strategy of finding the optimal splitting of expanded matrices of tridiagonal matrices. Section 3 includes applications of the optimal splitting to the parallelization of ADI methods and to parameter choices for a DD method for the Helmholtz problem and the heat equation. Numerical results are presented to show the efficiency and usefulness of the optimal splitting.

2. Optimal splitting of tridiagonal matrices

2.1. Matrix expansion and splitting

We first introduce the notion of equivalent matrix expansion of tridiagonal matrices. As an example, we consider the following algebraic system

$$
Au \equiv
\begin{bmatrix}
a_{11} & a_{12} & & & \\
a_{21} & a_{22} & a_{23} & & \\
& a_{32} & a_{33} & a_{34} & \\
& & a_{43} & a_{44} & a_{45} \\
& & & a_{54} & a_{55}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \end{bmatrix}
\equiv b. \tag{2.1}
$$

An expansion of the system (2.1) having two blocks can be written as

$$
\tilde A \tilde u \equiv
\begin{bmatrix}
a_{11} & a_{12} & & & & \\
a_{21} & a_{22} & a_{23} & & & \\
& a_{32} & a_{33}+\varphi_1 & -\varphi_1 & a_{34} & \\
& a_{32} & -\varphi_1 & a_{33}+\varphi_1 & a_{34} & \\
& & & a_{43} & a_{44} & a_{45} \\
& & & & a_{54} & a_{55}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \tilde u_3 \\ u_4 \\ u_5 \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_3 \\ b_4 \\ b_5 \end{bmatrix}
\equiv \tilde b. \tag{2.2}
$$

When (2.2) admits a unique solution, the corresponding components of the solution $\tilde u$ are clearly the same as those of $u$. (Indeed, subtracting the fourth equation of (2.2) from the third gives $(a_{33} + 2\varphi_1)(u_3 - \tilde u_3) = 0$, so $u_3 = \tilde u_3$ whenever $a_{33} + 2\varphi_1 \ne 0$.) In this case, the matrix expansion $\tilde A$ in (2.2) is said to be equivalent to $A$.

Now, consider a general tridiagonal matrix $A \in \mathbb{R}^{n \times n}$ and its related algebraic system (1.1). Then, we can expand $A$ into $\tilde A$ of $M$ blocks, $M \ge 1$. We assume that each block has the same dimension $m$ for simplicity; $n = (m-1)M + 1$. Define a tridiagonal matrix of dimension $m \times m$ by

$$
T_m(\alpha; \varphi, \psi) =
\begin{bmatrix}
\alpha_{11}+\varphi & \alpha_{12} & & \\
\alpha_{21} & \alpha_{22} & \ddots & \\
& \ddots & \ddots & \alpha_{m-1,m} \\
& & \alpha_{m,m-1} & \alpha_{mm}+\psi
\end{bmatrix}.
$$

Then, the splitting of $\tilde A$ can be formulated as follows:

$$\tilde A = B - R, \tag{2.3}$$

where $B = \mathrm{diag}(D_1, \cdots, D_M)$ and

$$
R =
\begin{bmatrix}
0 & F_1 & & \\
E_2 & 0 & \ddots & \\
& \ddots & \ddots & F_{M-1} \\
& & E_M & 0
\end{bmatrix}.
$$

Here the matrices are defined by

$$
D_1 = T_m(\alpha_1; 0, \varphi_{1,2}), \qquad
D_j = T_m(\alpha_j; \varphi_{j,j-1}, \varphi_{j,j+1}), \quad j = 2, \cdots, M-1, \qquad
D_M = T_m(\alpha_M; \varphi_{M,M-1}, 0),
$$

where the $\alpha_j$, $j = 1, \cdots, M$, are obtained from $A$, correspondingly, and

$$
E_j =
\begin{bmatrix}
0 & \cdots & 0 & \tilde a_{j,W} & \varphi_{j,j-1} \\
0 & \cdots & 0 & 0 & 0 \\
\vdots & & & & \vdots \\
0 & \cdots & 0 & 0 & 0
\end{bmatrix},
\qquad
F_j =
\begin{bmatrix}
0 & 0 & 0 & \cdots & 0 \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & 0 \\
\varphi_{j,j+1} & \tilde a_{j,E} & 0 & \cdots & 0
\end{bmatrix},
$$

for appropriate $\tilde a_{j,W}$ and $\tilde a_{j,E}$.

Remark. With the splitting of the expanded matrix, we can solve (1.1) by applying the iteration

$$\tilde u^n = B^{-1}(R \tilde u^{n-1} + \tilde b), \qquad n = 1, 2, \cdots, \tag{2.4}$$

for a given $\tilde u^0$. We know that the iteration converges if and only if $\rho(B^{-1}R) < 1$.

Remark. If $A$ is an M-matrix, the equivalence between $A$ and $\tilde A$ can be proved as suggested by Tang [20], which is closely related to the existence of $D_j^{-1}$, $j = 1, \cdots, M-1$. For an expansion of tridiagonal matrices of constant diagonals for which the parameters $\varphi_{i,j}$ are constant, i.e., $\varphi = \varphi_{j,j+1} = \varphi_{j+1,j}$ for $j = 1, \cdots, M-1$, Tang has utilized the analytic formula for $D_j^{-1}$ to find the single parameter $\varphi$ that minimizes the spectral radius of the matrix $B^{-1}R$.
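
Although the paper gives no code, the bookkeeping of (2.2)-(2.3) is easy to get wrong, so a small illustration may help. The following sketch is ours alone (names such as expand_and_split are invented here, and dense arrays stand in for a proper tridiagonal storage): it expands a tridiagonal $A$ into $\tilde A$ with $M$ equal blocks of size $m \ge 3$ and returns the splitting $\tilde A = B - R$, with the parameters $\varphi_{j,j+1} = \varphi_{j+1,j}$ supplied as a list.

```python
import numpy as np

def expand_and_split(A, b, M, m, phis):
    """Expand (A, b) into (Atil, btil) as in (2.2), with M blocks of
    size m >= 3 and n = (m-1)*M + 1, and split Atil = B - R as in (2.3).
    phis[j] is the parameter phi_{j,j+1} = phi_{j+1,j} at interface j."""
    n = A.shape[0]
    assert n == (m - 1) * M + 1 and len(phis) == M - 1
    ne = m * M
    Atil = np.zeros((ne, ne), dtype=A.dtype)
    btil = np.zeros(ne, dtype=b.dtype)
    for j in range(M):
        for l in range(m):
            r, i = j * m + l, j * (m - 1) + l    # expanded / original rows
            btil[r] = b[i]
            Atil[r, r] = A[i, i]
            if l == 0 and j > 0:
                # duplicated interface row, coupled back into block j-1
                Atil[r, r] += phis[j - 1]
                Atil[r, (j - 1) * m + m - 1] = -phis[j - 1]
                Atil[r, (j - 1) * m + m - 2] = A[i, i - 1]
                Atil[r, r + 1] = A[i, i + 1]
            elif l == m - 1 and j < M - 1:
                # interface row of block j, coupled into block j+1
                Atil[r, r] += phis[j]
                Atil[r, (j + 1) * m] = -phis[j]
                Atil[r, (j + 1) * m + 1] = A[i, i + 1]
                Atil[r, r - 1] = A[i, i - 1]
            else:                                 # ordinary row
                if i > 0:
                    Atil[r, r - 1] = A[i, i - 1]
                if i < n - 1:
                    Atil[r, r + 1] = A[i, i + 1]
    B = np.zeros_like(Atil)
    for j in range(M):
        B[j*m:(j+1)*m, j*m:(j+1)*m] = Atil[j*m:(j+1)*m, j*m:(j+1)*m]
    return Atil, btil, B, B - Atil                # R = B - Atil
```

One step of the iteration (2.4) is then `ut = np.linalg.solve(B, R @ ut + btil)`; the next subsection constructs the particular parameters for which this iteration terminates in finitely many sweeps.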

2.2. The optimal parameters $\{\varphi_{j,j+1}\}$

Recall that the main goal of the section is to find $\{\varphi_{j,j+1}\}$ such that the spectral radius of $G (:= B^{-1}R)$ is zero. It should be noticed that $R$ has only $4(M-1)$ nonzero elements, and therefore the matrix $G$ has only $4(M-1)$ nonzero columns, which are related to the first or last columns of the matrices $D_j^{-1}$, $j = 1, \cdots, M$. Let $t^{(j)}_{\ell,k} = (D_j^{-1})_{\ell,k}$, the $(\ell,k)$-element of $D_j^{-1}$.

For a simple expression, we first consider the case of two blocks: $M = 2$. Then, removing zero columns and the corresponding rows from the matrix $G$, we have

$$\sigma(G) = \{0\} \cup \sigma(G_0),$$

where $\sigma(G)$ is the spectrum of the matrix $G$ and

$$
G_0 =
\begin{bmatrix}
0 & 0 & \varphi_{1,2}\, t^{(1)}_{m-1,m} & \tilde a_{1,E}\, t^{(1)}_{m-1,m} \\
0 & 0 & \varphi_{1,2}\, t^{(1)}_{m,m} & \tilde a_{1,E}\, t^{(1)}_{m,m} \\
\tilde a_{2,W}\, t^{(2)}_{1,1} & \varphi_{2,1}\, t^{(2)}_{1,1} & 0 & 0 \\
\tilde a_{2,W}\, t^{(2)}_{2,1} & \varphi_{2,1}\, t^{(2)}_{2,1} & 0 & 0
\end{bmatrix}.
$$

Let

$$
P =
\begin{bmatrix}
1 & -\varphi_{2,1}/\tilde a_{2,W} & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & -\varphi_{1,2}/\tilde a_{1,E} & 1
\end{bmatrix}.
$$

Then a simple calculation yields

$$
P^{-1} G_0 P =
\begin{bmatrix}
0 & 0 & 0 & g_{12} \\
0 & 0 & 0 & \tilde a_{1,E}\, t^{(1)}_{m,m} \\
\tilde a_{2,W}\, t^{(2)}_{1,1} & 0 & 0 & 0 \\
g_{21} & 0 & 0 & 0
\end{bmatrix},
$$

where

$$
g_{12} = \tilde a_{1,E}\, t^{(1)}_{m-1,m} + \varphi_{2,1} \frac{\tilde a_{1,E}}{\tilde a_{2,W}}\, t^{(1)}_{m,m}, \qquad
g_{21} = \tilde a_{2,W}\, t^{(2)}_{2,1} + \varphi_{1,2} \frac{\tilde a_{2,W}}{\tilde a_{1,E}}\, t^{(2)}_{1,1}. \tag{2.5}
$$

Again, removing zero columns and the corresponding rows from $P^{-1} G_0 P$, we obtain

$$\sigma(G) = \{0\} \cup \sigma(G_0) = \{0\} \cup \sigma(G''),$$

where

$$G'' = \begin{bmatrix} 0 & g_{12} \\ g_{21} & 0 \end{bmatrix}.$$

Since the eigenvalues of $G''$ are $\lambda = \pm\sqrt{g_{12}\, g_{21}}$, it is now easy to see that $\sigma(G) = \{0\}$ if

$$g_{12} = 0 \quad \text{or} \quad g_{21} = 0.$$

Now let us generalize the above argument. For example, consider the case $M = 4$. Then, it is not difficult to derive the following:

$$\sigma(G) = \{0\} \cup \sigma(G''), \tag{2.6}$$

where

$$
G'' =
\begin{bmatrix}
0 & g_{12} & & & & \\
g_{21} & 0 & 0 & g_{24} & & \\
g_{31} & 0 & 0 & g_{34} & & \\
& & g_{43} & 0 & 0 & g_{46} \\
& & g_{53} & 0 & 0 & g_{56} \\
& & & & g_{65} & 0
\end{bmatrix},
$$

and where, for $j = 1, \cdots, M-1$,

$$
g_{2j-1,2j} = \tilde a_{j,E}\, t^{(j)}_{m-1,m} + \varphi_{j+1,j} \frac{\tilde a_{j,E}}{\tilde a_{j+1,W}}\, t^{(j)}_{m,m}, \tag{2.7}
$$

$$
g_{2j,2j-1} = \tilde a_{j+1,W}\, t^{(j+1)}_{2,1} + \varphi_{j,j+1} \frac{\tilde a_{j+1,W}}{\tilde a_{j,E}}\, t^{(j+1)}_{1,1}. \tag{2.8}
$$

What we want to do is to find $\varphi_{j,j+1}$ and $\varphi_{j+1,j}$ such that

$$g_{2j-1,2j} = 0 \quad \text{or} \quad g_{2j,2j-1} = 0, \qquad j = 1, \cdots, M-1.$$

In this case, we can see from (2.6)-(2.8) that $\sigma(G) = \{0\}$. Set

$$g_{2j-1,2j} = 0, \qquad j = 1, \cdots, M-1.$$

Then, from (2.7), we have

$$\varphi_{j+1,j} = -\tilde a_{j+1,W}\, \frac{t^{(j)}_{m-1,m}}{t^{(j)}_{m,m}}, \qquad j = 1, \cdots, M-1. \tag{2.9}$$

The problem is thus reduced to the problem of finding the terms on the right side of (2.9). How can we find them? Since we do not assume any good features of $A$ except nonsingularity, it is impossible to utilize the explicit formulas for the inverse of symmetric tridiagonal matrices of constant diagonals; see [8, 16] and references therein.

A computer-aided method can be designed as follows. Set

$$\varphi_{j,j+1} = \varphi_{j+1,j}, \qquad j = 1, \cdots, M-1, \tag{2.10}$$

and assume $\varphi_{j,j-1}$ is known. (The above setting is not a requirement, but a choice.) Let $X_j$ be the last column of $D_j^{-1}$, i.e.,

$$D_j X_j = \hat I_m, \qquad X_j = (t^{(j)}_{1,m}, t^{(j)}_{2,m}, \cdots, t^{(j)}_{m,m})^T, \tag{2.11}$$

where $\hat I_m = (0, \cdots, 0, 1)^T$. Now, we can find $X_j$ explicitly by inverting the tridiagonal matrix $D_j$. In fact, $D_j$ need not be inverted completely: let $D_j$ be factorized into $L_j U_j$, where the diagonal elements of $L_j$ are all one. It can be easily verified that $L_j U_j X_j = U_j X_j = \hat I_m$, and the $(m-1)$-th equation of the linear system $U_j X_j = \hat I_m$ is

$$U_{j;m-1,m-1}\, t^{(j)}_{m-1,m} + U_{j;m-1,m}\, t^{(j)}_{m,m} = 0, \tag{2.12}$$

where $U_{j;\ell,k}$ is the $(\ell,k)$-th element of $U_j$. It follows from (2.9) and (2.12) that

$$\varphi_{j+1,j} = \tilde a_{j+1,W}\, \frac{U_{j;m-1,m}}{U_{j;m-1,m-1}}, \qquad j = 1, \cdots, M-1. \tag{2.13}$$

We have assumed that $\varphi_{j,j-1}$ is known in order to find $\varphi_{j+1,j}$. Note that one can begin the procedure from $D_1$ with $\varphi_{1,0} = 0$ to find $\varphi_{1,2} = \varphi_{2,1}$, and then find $\varphi_{j,j+1} = \varphi_{j+1,j}$, $j = 2, \cdots, M-1$, recursively. (Note also that the ratio in (2.13) involves only the first $m-1$ rows of $D_j$, so it can be computed before the yet-unknown parameter $\varphi_{j,j+1}$ is added to the $(m,m)$-entry of $D_j$.)

We have shown that $\sigma(G) = \{0\}$ when the $\varphi_{j+1,j}$ are obtained recursively by using (2.13) and setting $\varphi_{j,j+1} = \varphi_{j+1,j}$ (a choice). The procedure works whenever the LU-factorizations of the $D_j$ exist, with or without pivoting, for $j = 1, \cdots, M-1$. That is, we can find the parameters $\varphi_{j+1,j}$ if all $D_j$ are nonsingular.
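
The recursion (2.10)-(2.13) is only a few lines of code. The sketch below is again ours, not the paper's: it factors each $D_j$ without pivoting, reads off the ratio in (2.13), and then checks, on a discrete 1D Laplacian and with the expand_and_split sketch of Section 2.1, that the iteration (2.4) reproduces the direct solution after a fixed finite number of sweeps, as $\rho(B^{-1}R) = 0$ predicts.

```python
import numpy as np

def optimal_phis(A, M, m):
    """Recursion (2.10)-(2.13): returns [phi_{2,1}, ..., phi_{M,M-1}],
    with phi_{j,j+1} = phi_{j+1,j} by the choice (2.10).  A should hold
    float or complex entries."""
    phis, phi_prev = [], 0.0                 # phi_{1,0} = 0 starts it off
    for j in range(M - 1):
        s = j * (m - 1)                      # block j begins at row s
        D = A[s:s+m, s:s+m].copy()
        D[0, 0] += phi_prev                  # D_j = T_m(alpha_j; phi_{j,j-1}, *)
        U = D.copy()                         # elimination without pivoting;
        for k in range(m - 2):               # rows 0..m-2 of U never involve
            U[k+1:, k+1:] -= np.outer(U[k+1:, k] / U[k, k], U[k, k+1:])
        aW = -A[s + m - 1, s + m - 2]        # \tilde a_{j+1,W}
        phi_prev = aW * U[m-2, m-1] / U[m-2, m-2]     # eq. (2.13)
        phis.append(phi_prev)
    return phis

# check on a 1D Laplacian: with these phis, (2.4) becomes a direct method
M, m = 4, 9
n = (m - 1) * M + 1
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.random.default_rng(0).standard_normal(n)
phis = optimal_phis(A, M, m)
Atil, btil, B, R = expand_and_split(A, b, M, m, phis)
ut = np.zeros_like(btil)
for _ in range(2 * M + 2):                   # a few more sweeps than the
    ut = np.linalg.solve(B, R @ ut + btil)   # nilpotency index of B^{-1}R
idx = [j*m + l for j in range(M) for l in range(m - 1)] + [M*m - 1]
print(np.abs(ut[idx] - np.linalg.solve(A, b)).max())  # ~1e-13: exact up to roundoff
```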

3. Applications

3.1. Parallelization of the ADI method

In this section we discuss the classical ADI method and its parallelization, which has motivated the main result presented in the previous section.

Let $\Omega = (a_x, b_x) \times (a_y, b_y)$ be a domain in $\mathbb{R}^2$ with its boundary $\Gamma = \partial\Omega$. Consider the problem

$$
\begin{aligned}
\text{(a)} \quad & -\nabla \cdot (a \nabla u) = f, && x \in \Omega, \\
\text{(b)} \quad & a\, u_\nu + \beta u = g, && x \in \Gamma,
\end{aligned} \tag{3.1}
$$

where $a = a(x) > 0$ is the diffusion coefficient, the subscript $\nu$ denotes the outer unit normal from the domain, $\beta = \beta(x) \ge 0$ is a given function, and $f$ and $g$ are the sources. Assume that the problem admits a unique solution.

Let the problem (3.1) be discretized by, e.g., the central finite difference (FD) method on a uniform mesh of $(N_x - 1) \times (N_y - 1)$ cells; $h_x = (b_x - a_x)/(N_x - 1)$ and $h_y = (b_y - a_y)/(N_y - 1)$. Then, its linear system reads

$$A u = (H + V) u = b, \tag{3.2}$$

where $A \in \mathbb{R}^{N \times N}$, with $N = N_x N_y$. Here the matrices $H$ and $V$ are the discretizations of the differential operators in the horizontal and vertical directions, respectively. Each of them is assumed to be reducible to the direct sum of irreducible Stieltjes matrices [21]. A Stieltjes matrix is by definition a real symmetric and positive definite matrix with nonpositive off-diagonal entries. (We have assumed that $A$ is symmetric. For the central FD method applied to the problem with the Neumann or mixed boundary condition, one can symmetrize the algebraic system by multiplying the rows of the matrix and the source vector that correspond to nodal points on the boundary by certain numbers. In this case, the matrix turns out to be the same as what one can obtain from the bi-linear finite element method with the trapezoidal quadrature rule.)

To introduce an ADI method, we rewrite (3.2) as a pair of matrix equations

$$(rI + H) u = (rI - V) u + b, \qquad (rI + V) u = (rI - H) u + b, \tag{3.3}$$

for some positive number $r$. The ADI iteration is then defined as [6, 7, 17]

$$
\begin{aligned}
\text{(a)} \quad & (r_{m+1} I + H)\, u^{m+1/2} = (r_{m+1} I - V)\, u^m + b, \\
\text{(b)} \quad & (r_{m+1} I + V)\, u^{m+1} = (r_{m+1} I - H)\, u^{m+1/2} + b, \qquad m \ge 0,
\end{aligned} \tag{3.4}
$$

where $u^0$ is an initial approximation of the solution of (3.2), and the $r_m$'s are positive constants called acceleration parameters, which are to be chosen so as to make the convergence of the iteration rapid. For a given ordering of grid points (e.g., row-wise ordering), the matrix $r_{m+1} I + H$ is tridiagonal. One can find a suitable permutation of the rows and columns which converts $r_{m+1} I + V$ to a tridiagonal matrix; the systems (3.4) are easily solved by using, e.g., Gaussian elimination.
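
As a point of reference, one double sweep of (3.4) may be sketched as follows. This is our code, not the paper's; SciPy's sparse direct solver stands in for the tridiagonal line solves, which in practice would use the Thomas algorithm or the approximate line solves of Section 2.

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def adi_double_sweep(H, V, b, u, r):
    """One ADI step (3.4) with acceleration parameter r > 0."""
    I = sp.identity(H.shape[0], format="csc")
    # (3.4a): horizontal sweep, tridiagonal for row-wise ordering
    u_half = spla.spsolve(r * I + H, (r * I - V) @ u + b)
    # (3.4b): vertical sweep, tridiagonal after a permutation
    return spla.spsolve(r * I + V, (r * I - H) @ u_half + b)
```

In the cyclic-parameter variants discussed next, $r$ changes from sweep to sweep.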

ADI methods can be accelerated with multiple parameters of cycle length $n_c$. There is a well-developed theory for accelerating the convergence of ADI by the choice of different $r$'s used cyclically. Among others, the optimal parameters suggested by Wachspress [22] for the cases $n_c = 2^t$, $t > 0$, are easy to find and efficient. See also [21, Ch. 7] and [23] for different choices of acceleration parameters and analysis of the asymptotic spectral radius of (3.4).

Figure 1: Numerical grid distribution for parallel computers. The thick line segments indicate computer memory boundaries.

On parallel computers, tridiagonal solves across the computer memory boundaries often slow down the ADI method (3.4) unacceptably. Let the data be distributed as in Figure 1 for a parallel computer. The vertical sweep (3.4b) can be solved efficiently after simple communications of data between adjacent processors. The horizontal sweep (3.4a) requires parallel tridiagonal solves, which is presumably expensive. We suggest the iteration of the form (2.4) for solving the tridiagonal systems in (3.4a) by a nonoverlapping DD technique [2, 10, 14]. The matrix expansion presented in Section 2 has been motivated by the DD method, which is discussed below.

Let $\omega_{y_0}$ be a horizontal grid line passing through the grid point $(a_x, y_0)$ and $\Gamma_{y_0} = \Gamma \cap \partial\omega_{y_0}$. Let $\Delta_{a,h} u^h$ and $\partial_{c,\nu} u^h$ be the centered (second-order) finite difference approximations of $(a u_x)_x$ and $a u_\nu$, respectively. Then, the problem (3.4a) restricted to $\omega_{y_0}$ can be rewritten in the following form:

$$
\begin{aligned}
\text{(a)} \quad & -\Delta_{a,h} u^h + q u^h = w, && x \in \omega_{y_0}, \\
\text{(b)} \quad & \partial_{c,\nu} u^h + \beta u^h = g, && x \in \Gamma_{y_0},
\end{aligned} \tag{3.5}
$$

where the iteration indices are dropped to simplify the presentation and $w$ is the finite difference approximation of $(a u^m_y)_y + r_{m+1} u^m + f$ (so that $q = r_{m+1}$).

Let $M$ be the number of subdomains/processors; decompose $\omega_{y_0}$ into $M$ nonoverlapping subintervals $\{\omega_{y_0,j} : j = 1, \cdots, M\}$ ordered from left to right, and partition the subintervals equally into $(m-1)$ elements each. (The total number of grid points in the x-direction is $N_x = M(m-1) + 1$.)

Figure 2: The subintervals $\omega_{y_0,j}$ and $\omega_{y_0,k}$ and their interface $\Gamma_{y_0,jk}$. The interface point is denoted by $O$ and its adjacent points by $W \in \omega_{y_0,j}$ and $E \in \omega_{y_0,k}$.

Let

$$\Gamma_{y_0,j} = \Gamma_{y_0} \cap \partial\omega_{y_0,j}, \qquad
\Gamma_{y_0,jk} = \partial\omega_{y_0,j} \cap \partial\omega_{y_0,k}, \quad k = j+1 \text{ or } j-1.$$

The restriction of $u^h$ onto $\omega_{y_0,j}$ is denoted by $u^h_j$. Now, we can define the DD method for (3.5) over $\{\omega_{y_0,j}\}$: find $\{u^h_j\}$ satisfying

$$
\begin{aligned}
\text{(a)} \quad & -\Delta_{a,h} u^h_j + q u^h_j = w, && x \in \omega_{y_0,j}, \\
\text{(b)} \quad & \partial_{c,\nu} u^h_j + \beta u^h_j = g, && x \in \Gamma_{y_0,j}, \\
\text{(c)} \quad & u^h_j = u^h_k, \quad \partial_{f,k} u^h_j = -\partial_{b,k} u^h_k, && x \in \Gamma_{y_0,jk},
\end{aligned} \tag{3.6}
$$

where $\partial_{f,k} u^h_j$ and $\partial_{b,k} u^h_k$ are respectively the forward and backward differences for $a\, \partial u_j/\partial\nu_j$ and $a\, \partial u_k/\partial\nu_k$ on $\Gamma_{y_0,jk}$. For the case in Figure 2, they are defined by

$$
\partial_{f,k} u^h_j(O) = \tilde a_{j,E}\, \frac{u^h_j(E) - u^h_j(O)}{h_x}, \qquad
\partial_{b,k} u^h_k(O) = \tilde a_{j,E}\, \frac{u^h_k(O) - u^h_k(E)}{h_x},
$$

where $h_x$ is the grid size and $\tilde a_{j,E}$ is an average value of $a$ on the interval $[O, E]$, e.g.,

$$\tilde a_{j,E} = \frac{a(O) + a(E)}{2}.$$

(One may try to use the harmonic average instead of the arithmetic average.) The outer average $\tilde a_{j,W}$ of $a$ at the left end point of $\omega_{y_0,j}$ can be defined in the same way. For example, $\tilde a_{k,W} = (a(O) + a(W))/2$ in Figure 2.

Remark. For DD methods for the differential problem, both the solution $\{u_j\}$ and the conormal derivative $\{a\, \partial u_j/\partial\nu_j\}$ should be continuous on the interfaces. On the other hand, for the discrete problem, $\{u_j\}$ and $\{a\, \partial u_j/\partial\nu_j\}$ cannot be simultaneously continuous unless the solution is linear over the whole domain, a totally uninteresting case. Finite difference/element methods impose the continuity of the solution and allow the conormal derivatives to be discontinuous on the interfaces of elements. To implement this situation elegantly, we have introduced the forward-backward matching for the conormal derivatives; see the second equation of (3.6c). The equations (3.6c) can be equivalently written as follows (called the Robin interface boundary condition):

$$\partial_{f,k} u^h_j + \gamma u^h_j = -\partial_{b,k} u^h_k + \gamma u^h_k, \qquad |j - k| = 1, \tag{3.7}$$

where $\gamma$ is a positive function on the interfaces $\{\Gamma_{y_0,jk}\}$.

Remark. It is easy to check that (3.5) and (3.6) are equivalent, i.e., the solution $u^h$ of (3.5) restricted onto $\omega_{y_0,j}$ is the same as $u^h_j$, the solution of (3.6) on $\omega_{y_0,j}$. We will leave it to interested readers.

The basic idea of a DD iterative method is to localize the computations to smaller subdomain problems. It is feasible to localize to each $\omega_{y_0,j}$ by evaluating the quantities in (3.6a)-(3.6b) and (3.7) related to each $\omega_{y_0,j}$ at the new iterate level and those related to the neighboring subdomains $\omega_{y_0,k}$ at the old level. The iterative algorithm can be defined as follows: choose an initial guess $\{u^{h,0}_j : j = 1, \cdots, M\}$, then recursively build the sequences $\{u^{h,n}_j : j = 1, \cdots, M\}$, $n \ge 1$, by solving

$$
\begin{aligned}
\text{(a)} \quad & -\Delta_{a,h} u^{h,n}_j + q u^{h,n}_j = w, && x \in \omega_{y_0,j}, \\
\text{(b)} \quad & \partial_{c,\nu} u^{h,n}_j + \beta u^{h,n}_j = g, && x \in \Gamma_{y_0,j}, \\
\text{(c)} \quad & \partial_{f,k} u^{h,n}_j + \gamma u^{h,n}_j = -\partial_{b,k} u^{h,n-1}_k + \gamma u^{h,n-1}_k, && x \in \Gamma_{y_0,jk}.
\end{aligned} \tag{3.8}
$$

Its algebraic system can be formulated as in (2.4), where $B$ and $R$ are defined in (2.3) and

$$\varphi_{j+1,j} = \gamma_{j+1,j}\, h_x - \tilde a_{j+1,W}, \qquad
\varphi_{j,j+1} = \gamma_{j,j+1}\, h_x - \tilde a_{j,E}. \tag{3.9}$$

One can find the optimal parameters $\varphi_{j+1,j}$ using the procedure in Section 2 and then $\varphi_{j,j+1} = \varphi_{j+1,j} + \tilde a_{j+1,W} - \tilde a_{j,E}$ from (3.9). Here we have set $\gamma_{j,j+1} = \gamma_{j+1,j}$ to satisfy the continuity of the solution on the interfaces.
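
Translating the optimal parameters of Section 2.2 into the interface parameters of (3.7) is pure bookkeeping via (3.9). A minimal sketch, with our naming and under the choice $\gamma_{j,j+1} = \gamma_{j+1,j}$ made above:

```python
def robin_gammas(phis, a_W, hx):
    """(3.9) with gamma_{j,j+1} = gamma_{j+1,j}:
    gamma_j = (phi_{j+1,j} + \tilde a_{j+1,W}) / h_x at interface j.
    phis : optimal phi_{j+1,j} from the Section-2 recursion
    a_W  : outer averages \tilde a_{j+1,W}, one per interface."""
    return [(phi + aw) / hx for phi, aw in zip(phis, a_W)]
```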

For a test of (3.8) incorporating the optimal splitting technique, we choose $\omega_{y_0} = (0,1)$, replace (3.8b) by the Dirichlet boundary condition, and set

$$u(x) = \sin(2\pi x), \qquad a(x) = 1 + \sin(2\pi x)/2, \tag{3.10}$$

where $w$ is given correspondingly. The unit interval is partitioned into 58 elements ($h = 1/58$). Following the argument in [22] for $n_c = 8$, we have found the eight cyclic acceleration parameters for the model problem ($a(x) \equiv 1$) as

3.1e+0; 8.98e-1; .13e-1; 5.01e-; 1.18e-; .e-3; .55e-; 1.8e-.

For these parameters applied to the model problem in 2D, a cycle of eight ADI iterations would reduce the error by a factor on the order of 1e-3. As suggested in [22], these parameters can be efficiently incorporated in (3.4) with a certain scaling. Note that the average of the diffusion coefficient in (3.10) is one. When the above parameters are used for the algorithm (3.8) with (3.10), the algorithm would converge fast for the acceleration parameters except the two or three smallest ones. In Table 1, we present the relative error

$$E^n_\infty = \|u - u^{h,n}\|_\infty / \|u\|_\infty$$

for the first five iterations of (3.8), for $r_4 =$ 5.01e- and $r_8 =$ 1.8e-.

```
           r_4 = 5.01e-              r_8 = 1.8e-
  n    M=8     M=16    M=32      M=8      M=16    M=32
  1    .5e-1   .e-1    5.e-1     .e-1     8.e-1   9.0e-1
  2    3.9e-   5.1e-   1.e-      1.8e-1   .e-1    .e-1
  3    1.e-    9.0e-   .1e-      .9e-     .8e-    1.e-1
  4    1.e-    1.e-    .3e-5     .1e-     .0e-    1.0e-1
  5    1.e-    1.e-    9.e-      .8e-3    .8e-    .5e-
```

Table 1: The error $E^n_\infty$ for the first five iterations of the algorithm (3.8).

Set $u^{h,0} \equiv 0$; $M$ denotes the number of subdomains, i.e., the number of processors for a parallel computation. The discretization error seems 1.e-. As one can see from the table, when 16 or fewer processors are utilized, 2-3 iterations of (3.8) are almost the same as the direct tridiagonal solve for the large parameters, while 2-3 iterations improve the accuracy of the solution by one digit for the smallest acceleration parameter. Note that the inner loop of nested iterative algorithms can be solved incompletely without significantly deteriorating the overall convergence rate. More systematic treatments of ADI and their parallel implementation will appear elsewhere [1]. In practice, the iterative line solves (3.8) can be implemented with the red-black ordering of subproblems, which speeds up the convergence further.

3.2. Parameter choice for domain decomposition methods

As another application of the splitting discussed in Section 2, we consider parameter choices for a nonoverlapping DD method for the Helmholtz wave equation.

The Helmholtz problem. Let $\Omega = (a_x, b_x) \times (a_y, b_y)$ and $\Gamma = \partial\Omega$. Consider the following complex-valued indefinite problem

$$-\Delta u - K^2 u = S, \quad x \in \Omega; \qquad u_\nu + i\beta u = 0, \quad x \in \Gamma, \tag{3.11}$$

where $S$ is the wave source and $i$ is the imaginary unit. The coefficients $K$ and $\beta$ satisfy

$$K^2 = p^2 - i q^2, \qquad 0 < p_0 \le p(x) \le p_1 < \infty, \quad 0 \le q_0 \le q(x) \le q_1 < \infty,$$

$$\beta = \beta_r - i \beta_i, \qquad \beta_r(x) > 0, \quad \beta_i(x) \ge 0,$$

for some positive constants $p_0$, $p_1$, $q_0$, and $q_1$, and are sufficiently regular that the existence and uniqueness of a solution of (3.11) lying in $H^1(\Omega)$ are assured for reasonable $S$. The coefficient $\beta$ is properly chosen such that the second equation of (3.11) represents a first-order absorbing boundary condition (ABC) that allows normally incident waves to pass out of $\Omega$ transparently.

The problem (3.11) models the propagation of time-harmonic waves such as electromagnetic waves, seismic waves, and underwater acoustics. In most applications, the wave number $p$ is given as

$$p = \frac{\omega}{v} = \frac{2\pi f}{v},$$

where $f$ is the frequency, $\omega (= 2\pi f)$ is the angular frequency, and $v$ is the wave velocity in the medium under consideration.

The wave problem (3.11) is difficult to solve, in particular, when

$$0 \le q \ll p. \tag{3.12}$$

In addition to having a complex-valued solution, it is neither Hermitian symmetric nor coercive; as a consequence, most standard iterative methods either fail to converge or converge very slowly. In acoustic wave applications, for example, we often need to simulate waves of 20-50 wavelengths. (The wavelength is $2\pi/p$ by definition.) It is known that the second-order FD scheme requires at least 6 to 8 grid points per wavelength for a stability reason [10, 19].

Domain decomposition method. In solving (3.11), it is often the case that the coarse grid problem of DD or MG methods is still large, and the coarse grid problem should be solved without introducing another coarser grid correction due to stability reasons. Here we introduce a nonoverlapping DD method not incorporating coarser meshes. The DD method can be utilized as either the main solver or the coarse grid solver of an MG method.

Let the domain be partitioned into $(N_x - 1) \times (N_y - 1)$ cells; $h_x = (b_x - a_x)/(N_x - 1)$, $h_y = (b_y - a_y)/(N_y - 1)$, and $h = \max(h_x, h_y)$. Let $u^h$ be the solution of the FD approximation of (3.11):

$$-\Delta_h u^h - K^2 u^h = S, \quad x \in \Omega; \qquad \partial_{c,\nu} u^h + i\beta u^h = 0, \quad x \in \Gamma, \tag{3.13}$$

where $\Delta_h u^h$ is the centered 5-point difference approximation of $\Delta u$ and $\partial_{c,\nu} u^h$ denotes the centered difference for $u_\nu$ on the boundary $\Gamma$. Let $\{\Omega_j : j = 1, \cdots, M\}$ be the subdomains of $\Omega$:

$$\overline\Omega = \bigcup_{j=1}^{M} \overline\Omega_j, \qquad \Omega_j \cap \Omega_k = \emptyset, \quad j \ne k.$$

Assume that the $\Omega_j$ are also rectangular regions consisting of a group of cells. Let

$$\Gamma_j = \Gamma \cap \partial\Omega_j, \qquad \Gamma_{jk} = \Gamma_{kj} = \partial\Omega_j \cap \partial\Omega_k.$$

Let $u_j$ be the restriction of $u$ on $\Omega_j$. Then, following the idea discussed in Section 3.1, we can formulate a nonoverlapping DD method for (3.13) as follows: choose an initial guess $\{u^{h,0}_j\}$, then find $\{u^{h,n}_j\}$, $n \ge 1$, satisfying

$$
\begin{aligned}
& -\Delta_h u^{h,n}_j - K^2 u^{h,n}_j = S, && x \in \Omega_j, \\
& \partial_{c,\nu} u^{h,n}_j + i\beta u^{h,n}_j = 0, && x \in \Gamma_j, \\
& \partial_{f,k} u^{h,n}_j + i\gamma u^{h,n}_j = -\partial_{b,k} u^{h,n-1}_k + i\gamma u^{h,n-1}_k, && x \in \Gamma_{jk},
\end{aligned} \tag{3.14}
$$

where $\gamma = \gamma_r - i\gamma_i$, $\gamma_r > 0$, $\gamma_i \ge 0$, is the acceleration parameter.

Despres [2, 3], indebted to Lions [14], analyzed a DD algorithm for (3.11) on a differential, rather than discrete, level. In his implementation of the algorithm incorporating the mixed finite element method of index zero [18], he chose $K$ for the acceleration parameter, i.e., $\gamma = K$. The choice has been motivated by the ABC; when $K$ is real and constant, $\gamma = K$ satisfies the first-order ABC. The convergence of algorithm (3.14) was analyzed by the author [10], without introducing the coarse grid correction. Various acceleration techniques are incorporated for (3.14) to converge with a linear dependence on the subdomain size $H$, but independently of the mesh size $h$ and the wave number $K$; see [9, 10, 11, 13].

The splitting technique in Section 2 is applied to finding the acceleration parameter $\gamma$. When the rectangular domain is decomposed into $M_x \times M_y$ subdomains, for example, the one-dimensional problem restricted to a horizontal (resp. vertical) grid line can be viewed as a DD method of $M_x$ (resp. $M_y$) subdomains. Applying the optimal splitting technique, we can find $\gamma$ on each of the horizontal and vertical grid lines for which the spectral radius of the iteration matrix for the 1D-restricted problem becomes zero. Such a parameter is called the alternating direction optimal parameter (ADOP). Note that the computation of ADOP requires only $O(N)$ operations, where $N$ is the number of all grid points, which is negligible compared to the total cost.
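
In code, ADOP amounts to the Section-2.2 recursion applied line by line in complex arithmetic. The sketch below is our reading rather than the paper's implementation: it assumes the caller supplies the three (complex) diagonals of the 1D-restricted operator on a grid line, reuses optimal_phis from Section 2.2, which works unchanged for complex data, and converts the resulting $\varphi$'s to $\gamma$'s via (3.9).

```python
import numpy as np

def adop_for_line(diag, lower, upper, Mx, m, hx):
    """Our sketch: interface gamma's for one grid line, given the complex
    tridiagonal 1D-restricted operator (diagonals of length n, n-1, n-1),
    decomposed into Mx subdomains of m-1 cells each."""
    n = len(diag)
    assert n == (m - 1) * Mx + 1
    A = np.diag(diag) + np.diag(lower, -1) + np.diag(upper, 1)
    phis = optimal_phis(A, Mx, m)            # recursion (2.13), complex
    a_W = [-A[j*(m-1) + m - 1, j*(m-1) + m - 2] for j in range(Mx - 1)]
    return [(phi + aw) / hx for phi, aw in zip(phis, a_W)]   # via (3.9)
```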

Numerical experiments. Let $\Omega = (0,1)^2$. The wave speed $v(x,y)$ is chosen as

$$v_1(x,y) = 1; \qquad v_2(x,y) = (2 + \sin(2\pi x))(2 - \sin(3\pi y)). \tag{3.15}$$

We set the frequency $f = 10$, so that $p = 2\pi f/v = \omega/v \approx 62.8/v$; $q$ is chosen such that the quality factor $Q (:= p^2/q^2)$ is 100. (The average value of $q$ for $v = v_2$ is 1..) The source $S$ is selected such that

$$u(x,y) = \frac{\Phi(x)\,\Phi(y)}{\omega^2}, \qquad \Phi(x) = e^{i\omega(x-1)} + e^{-i\omega x} - 2, \tag{3.16}$$

is the true solution of (3.11). The domain is partitioned into $(N_x - 1) \times (N_x - 1)$ uniform cells of edge size $h = 1/(N_x - 1)$. The cells are grouped into 16 strip-type subdomains of the same size.

```
               v = v_1              v = v_2
  1/h      n    r^n_oo    CPU    n    r^n_oo    CPU
  128           1.5e-1    5.90        8.1e-     8.3
  256           8.1e-     39.    0    5.95e-3   .9
```

Table 2: Numerical results for (3.14) with ADOP. The domain is decomposed into 16 strip-type subdomains and the solution incorporates 10 wavelengths.

Each subproblem is solved directly by Gaussian elimination. The error is estimated by the relative maximum norm $r^n_\infty$, and the iteration is stopped when $s^n_\infty$ falls below a fixed tolerance, where

$$r^n_\infty = \frac{\|u^{h,n} - u\|_{L^\infty(\Omega)}}{\|u\|_{L^\infty(\Omega)}}, \qquad
s^n_\infty = \frac{\|u^{h,n} - u^{h,n-1}\|_{L^\infty(\Omega)}}{\|u^{h,n}\|_{L^\infty(\Omega)}},$$

and $u^{h,n}$ is the approximate solution of the $n$-th iteration. Zero initial values are assumed: $u^{h,0} \equiv 0$.

In Table 2, we present numerical results for (3.14) incorporating ADOP. The integer $n$ denotes the number of DD iterations, and CPU is the user time (in seconds) on a Gateway Solo, a laptop with 128M memory and a Linux operating system. As one can see from the table, the convergence is quite successful. When the parameter is chosen as in Despres [2], the algorithm (3.14) was still running at iteration 1000 for $v = v_1$ and diverged for $v = v_2$. ADOP has been successfully tested also with box-type subdomains. The parameter choice ADOP works at least 15 times better than any other constant parameters, for all numerical tests.

3.3. DD methods for the heat equation

When the heat equation is discretized with the backward Euler (BE) or the Crank-Nicolson (CN) method, the problem to be solved at the $m$-th time level is as follows:

$$
\begin{aligned}
\text{(a)} \quad & -\Delta u^m + \frac{p}{\Delta t}\, u^m = F^m, && x \in \Omega, \\
\text{(b)} \quad & u^m_\nu + \beta u^m = g^m, && x \in \Gamma,
\end{aligned} \tag{3.17}
$$

where $\Delta t$ is the timestep size and $p = 1$ (BE) or $p = 2$ (CN). The problem (3.17) is not difficult to solve; its algebraic system is strongly diagonally dominant, and most iterative algorithms can find a good initial guess ($u^{m-1}$ or $2u^{m-1} - u^{m-2}$). The problem is much better conditioned than the indefinite non-Hermitian problem (3.11).

When the DD method incorporating ADOP is applied to (3.17), it can be easily expected to converge fast. The DD method turns out to be superior to PCG-ILU (the conjugate gradient method preconditioned by incomplete LU-factorization). Furthermore, it can be parallelized with a maximum efficiency, since it requires local communications only. Details and comparisons with various other algorithms will appear elsewhere [12].
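
For completeness, here is a minimal sketch (ours) of marching (3.17) in time, with a placeholder solver argument standing in for the DD/ADOP solver and using the extrapolated initial guess mentioned above.

```python
import numpy as np

def march_heat(A_h, u0, F, dt, nsteps, p=2, solver=None):
    """March (3.17): p = 1 gives backward Euler, p = 2 Crank-Nicolson.
    A_h    : discrete negative Laplacian (n x n), boundary rows included
    F(m)   : right-hand side at the m-th time level
    solver : e.g. a DD/ADOP solver taking (S, rhs, guess); defaults to a
             direct solve, in which case the guess is unused."""
    n = A_h.shape[0]
    S = A_h + (p / dt) * np.eye(n)            # system matrix of (3.17a)
    u_prev2 = u_prev = u0.copy()
    for m in range(1, nsteps + 1):
        guess = 2.0 * u_prev - u_prev2        # extrapolated initial guess
        rhs = F(m)
        u = solver(S, rhs, guess) if solver else np.linalg.solve(S, rhs)
        u_prev2, u_prev = u_prev, u
    return u_prev
```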

Acknowledgment

I sincerely thank Dr. E. Wachspress for his helpful comments and interest.

References

[1] Y. Cha and S. Kim, Parallelization of ADI methods for elliptic problems. In preparation, 2000.

[2] B. Despres, Domain decomposition method and the Helmholtz problem, in Mathematical and Numerical Aspects of Wave Propagation Phenomena, G. Cohen, L. Halpern, and P. Joly, eds., SIAM, Philadelphia, 1991, pp. 44-52.

[3] ——, Domain decomposition method and the Helmholtz problem (part II), in Mathematical and Numerical Aspects of Wave Propagation, R. Kleinman, ed., SIAM, Philadelphia, 1993, pp. 197-206.

[4] C. Douglas, S. Malhotra, and M. Schultz, Parallel multigrid with ADI-like smoothers in two dimensions. Preprint, 1998.

[5] ——, Transpose free alternating direction smoothers for serial and parallel multigrid methods. Preprint, 1998.

[6] J. Douglas, Jr., On the numerical integration of $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = \partial u/\partial t$ by implicit methods, J. Soc. Indust. Appl. Math., 3 (1955), pp. 42-65.

[7] J. Douglas, Jr. and D. Peaceman, Numerical solution of two-dimensional heat flow problems, American Institute of Chemical Engineering Journal, 1 (1955), pp. 505-512.

[8] S. Kim, A parallelizable iterative procedure for the Helmholtz problem, Appl. Numer. Math., 14 (1994), pp. 435-449.

[9] ——, Parallel multidomain iterative algorithms for the Helmholtz wave equation, Appl. Numer. Math., 17 (1995), pp. 411-429.

[10] ——, Domain decomposition iterative procedures for solving scalar waves in the frequency domain, Numer. Math., 79 (1998), pp. 231-259.

[11] ——, On the use of rational iterations and domain decomposition methods for solving the Helmholtz problem, Numer. Math., 79 (1998), pp. 529-552.

[12] ——, Numerical methods for parabolic problems. In preparation, 2000.

[13] S. Kim and M. Lee, Artificial damping techniques for scalar waves in the frequency domain, Computers Math. Applic., 31, No. 8 (1996), pp. 1-12.

[14] P. Lions, On the Schwarz alternating method III: a variant for nonoverlapping subdomains, in Domain Decomposition Methods for Partial Differential Equations, T. Chan, R. Glowinski, J. Periaux, and O. Widlund, eds., SIAM, Philadelphia, PA, 1990, pp. 202-223.

[15] S. Malhotra, C. Douglas, and M. Schultz, Parameter choices for ADI-like methods on parallel computers, Comp. Appl. Math., 17 (1998), pp. 1-3.

[16] G. Meurant, A review on the inverse of symmetric tridiagonal and block tridiagonal matrices, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 707-728.

[17] D. Peaceman and H. Rachford, The numerical solution of parabolic and elliptic differential equations, J. Soc. Indust. Appl. Math., 3 (1955), pp. 28-41.

[18] P.-A. Raviart and J. M. Thomas, A mixed finite element method for second order elliptic problems, in Mathematical Aspects of the Finite Element Method, I. Galligani and E. Magenes, eds., vol. 606 of Lecture Notes in Mathematics, Springer-Verlag, Berlin, New York, 1977, pp. 292-315.

[19] V. Shaidurov and E. Ogorodnikov, Some numerical methods of solving Helmholtz wave equation, in Mathematical and Numerical Aspects of Wave Propagation Phenomena, G. Cohen, L. Halpern, and P. Joly, eds., SIAM, Philadelphia, 1991, pp. 73-79.

[20] W. Tang, Generalized Schwarz splittings, SIAM J. Sci. Stat. Comput., 13 (1992), pp. 573-595.

[21] R. Varga, Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1962.

[22] E. Wachspress, Optimum alternating-direction-implicit iteration parameters for a model problem, J. Soc. Indust. Appl. Math., 10 (1962), pp. 339-350.

[23] ——, The ADI Model Problem, self-published book, 980 Montego Ct., Windsor, CA, 1995.