Domain decomposition schemes with high-order accuracy and unconditional stability

Wenrui Hao (Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556, USA; whao@nd.edu)
Shaohong Zhu (School of Mathematics Science and LPMC, Nankai University, Tianjin, 300071, China; shhzhu@nankai.edu.cn)

March 7, 0

Abstract

Parallel finite difference schemes with high-order accuracy and unconditional stability for solving parabolic equations are presented. The schemes are based on the domain decomposition method: interface values between subdomains are computed by an explicit scheme, and interior values are computed by an implicit scheme. The numerical stability and the error estimate are derived in the $H^1$ norm in the one-dimensional case. Numerical results in both one and two dimensions, examining the stability, accuracy, and parallelism of the procedure, are also presented.

Keywords: Domain decomposition, Finite difference, Parabolic equation, High-order accuracy, Unconditional stability.

1 Introduction

Domain decomposition is a powerful tool for devising parallel methods to solve time-dependent partial differential equations. There is a rich literature on domain decomposition finite difference methods for solving parabolic equations on parallel computers. For non-overlapping domain decomposition methods, the explicit nature of the calculation at the subdomain interfaces makes some domain decomposition schemes only conditionally stable, so they suffer from temporal step-size restrictions (see [1]-[5]). Since schemes with both unconditional stability and high-order accuracy are desired in applications, many investigators have turned to improving the stability of the domain decomposition method. For example, the corrected explicit-implicit domain decomposition algorithms were presented in [6] and [7]. By adding a correction step to the explicit-implicit domain decomposition methods, which updates the interface solutions at each time level, the corrected methods were proved to

have unconditional stability. Domain decomposition schemes with unconditional stability that need no correction step were presented in [8] and [9]. All of these unconditionally stable methods reach second-order accuracy at most. The purpose of this paper is to present a domain decomposition finite difference procedure with third-order accuracy and unconditional stability. We first consider the following Dirichlet boundary problem

$$\frac{\partial U}{\partial t} - \frac{\partial^2 U}{\partial x^2} = 0, \quad x \in (0,1),\ t \in (0,T], \tag{1.1}$$
$$U(0,t) = U(1,t) = 0, \quad t \in (0,T], \tag{1.2}$$
$$U(x,0) = U_0(x), \quad 0 \le x \le 1, \tag{1.3}$$

where the initial function $U_0(x)$ satisfies the boundary condition, i.e., $U_0(0) = U_0(1) = 0$. Then we extend the method to the two-dimensional problem.

We introduce two new finite difference schemes for solving (1.1)-(1.3) in Section 2, and sketch the domain decomposition procedure together with the numerical stability and convergence results in Section 3. In Section 4, the proofs of the unconditional stability and the error estimate are given. Numerical examples and an examination of the algorithm are provided in Section 5. In Sections 6 and 7, we extend the method to the two-dimensional problem and test some examples.

2 Two new finite difference schemes

Taking the usual $h$, $\tau$ mesh in $x$ and $t$, and denoting the approximate value of $U(x_j, t^n) \equiv U_j^n$ by $u_j^n$, where $x_j = jh$ and $t^n = n\tau$, we define the following operators:

$$\Delta_+ u_j^n = u_{j+1}^n - u_j^n, \qquad \Delta_- u_j^n = u_j^n - u_{j-1}^n, \qquad \Delta_\tau u_j^n = \frac{1}{\tau}\left(u_j^n - u_j^{n-1}\right).$$

It is well known that the following Taylor expansion, which underlies the fully implicit finite difference scheme, is valid:

$$\Delta_\tau U_j^{n+1} - \frac{\Delta_+\Delta_- U_j^{n+1}}{h^2} = -\left(\frac{\tau}{2} + \frac{h^2}{12}\right) U_{tt}(x_j, t^{n+1}) + O(\tau^2 + h^4). \tag{2.4}$$

Noticing that

$$U_{tt}(x_j, t^{n+1}) = \frac{U_j^{n+1} - 2U_j^n + U_j^{n-1}}{\tau^2} + O(\tau), \tag{2.5}$$

and substituting $U_{tt}(x_j, t^{n+1})$ into (2.4), we obtain

$$\Delta_\tau U_j^{n+1} - \frac{\Delta_+\Delta_- U_j^{n+1}}{h^2} + \left(\frac{\tau}{2} + \frac{h^2}{12}\right) \frac{U_j^{n+1} - 2U_j^n + U_j^{n-1}}{\tau^2} = O(\tau^2 + h^4 + \tau h^2). \tag{2.6}$$
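The order claimed in (2.6) can be checked numerically. The sketch below is an illustration rather than part of the original derivation: it assumes (2.6) reads $\Delta_\tau U_j^{n+1} - \Delta_+\Delta_- U_j^{n+1}/h^2 + (\tau/2 + h^2/12)(U_j^{n+1} - 2U_j^n + U_j^{n-1})/\tau^2$, evaluates this residual on the exact solution $U = e^{-\pi^2 t}\sin(\pi x)$ with $\tau = r h^2$, and compares two mesh sizes. The function name `residual_2_6` and the sample values of `h`, `r`, and `t_new` are hypothetical choices made for the illustration.

```python
import numpy as np

def residual_2_6(h, r=1.0, t_new=0.1):
    """Max residual of the three-level formula (2.6) on U = exp(-pi^2 t) sin(pi x)."""
    tau = r * h * h                                   # mesh ratio r = tau / h^2
    x = np.arange(1, int(round(1.0 / h))) * h         # interior nodes x_j = j h
    U = lambda t, xs: np.exp(-np.pi**2 * t) * np.sin(np.pi * xs)
    u_new = U(t_new, x)                               # level n+1
    u_mid = U(t_new - tau, x)                         # level n
    u_old = U(t_new - 2 * tau, x)                     # level n-1
    d_t = (u_new - u_mid) / tau                       # backward time difference
    d_xx = (U(t_new, x + h) - 2 * u_new + U(t_new, x - h)) / h**2
    d_tt = (u_new - 2 * u_mid + u_old) / tau**2       # second time difference
    res = d_t - d_xx + (tau / 2 + h**2 / 12) * d_tt
    return np.max(np.abs(res))

coarse = residual_2_6(0.02)
fine = residual_2_6(0.01)
ratio = coarse / fine      # close to 2**4 if the residual is O(h^4) for fixed r
print(coarse, fine, ratio)
```

With $r$ fixed, halving $h$ should reduce the residual by a factor of about 16, which is what fourth-order behaviour in $h$ predicts.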

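Replacing $j$ by $j+1$ and $j-1$ in (2.5), we have

$$U_{j+1}^{n+1} = 2U_{j+1}^{n} - U_{j+1}^{n-1} + \tau^2 U_{tt}(x_{j+1}, t^{n+1}) + O(\tau^3)$$

and

$$U_{j-1}^{n+1} = 2U_{j-1}^{n} - U_{j-1}^{n-1} + \tau^2 U_{tt}(x_{j-1}, t^{n+1}) + O(\tau^3).$$

Substituting $U_{j\pm1}^{n+1}$ into (2.6), we get

$$\Delta_\tau U_j^{n+1} - \frac{2U_{j+1}^{n} - U_{j+1}^{n-1} - 2U_j^{n+1} + 2U_{j-1}^{n} - U_{j-1}^{n-1}}{h^2} + \left(\frac{\tau}{2} + \frac{h^2}{12}\right)\frac{U_j^{n+1} - 2U_j^n + U_j^{n-1}}{\tau^2} = \frac{\tau^2}{h^2}\left(U_{tt}(x_{j+1}, t^{n+1}) + U_{tt}(x_{j-1}, t^{n+1})\right) + O\!\left(\tau^2 + h^4 + \tau h^2 + \frac{\tau^3}{h^2}\right). \tag{2.7}$$

By omitting the high-order terms, (2.6) and (2.7) yield two new finite difference schemes for (1.1):

$$\Delta_\tau u_j^{n+1} - \frac{\Delta_+\Delta_- u_j^{n+1}}{h^2} + \left(\frac{\tau}{2} + \frac{h^2}{12}\right)\frac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{\tau^2} = 0, \tag{2.8}$$

$$\Delta_\tau u_j^{n+1} - \frac{\Delta_+\Delta_- u_j^{n+1}}{h^2} + \left(\frac{\tau}{2} + \frac{h^2}{12}\right)\frac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{\tau^2} - r\left(\Delta_\tau u_{j+1}^{n} - \Delta_\tau u_{j+1}^{n+1}\right) - r\left(\Delta_\tau u_{j-1}^{n} - \Delta_\tau u_{j-1}^{n+1}\right) = 0, \tag{2.9}$$

where $r = \tau/h^2$. In (2.9) the terms $u_{j\pm1}^{n+1}$ cancel, so $u_j^{n+1}$ is obtained explicitly from the two previous time levels. From the derivation of (2.8) and (2.9), we know that their truncation errors are $O(\tau^2 + h^4 + \tau h^2)$ and $O\!\left(\frac{\tau^2}{h^2} + \tau^2 + h^4 + \tau h^2 + \frac{\tau^3}{h^2}\right)$ respectively. For any fixed positive real number $r$, the truncation errors become $O(h^4)$ and $O(h^2)$.

3 Domain decomposition procedure and main results

Suppose $Jh = 1$ and $N\tau = T$. For simplicity, we consider a domain decomposition of $(0,1)$ into only two subdomains, $(0,\bar{x})$ and $(\bar{x},1)$, where $\bar{x} = x_k$ for some integer $k$ $(1 < k < J-1)$. We use the explicit scheme (2.9) to compute the interface value $u_k^{n+1}$ and the implicit scheme (2.8) to compute the other values $u_j^{n+1}$ $(j \ne k)$. The system can be written as

$$L(u_j^{n+1}) = 0, \quad 1 \le n \le N-1,\ 0 < j < J; \qquad u_j^{n+1} = 0, \quad j = 0, J, \tag{3.10}$$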

where the linear operator $L$ is

$$L(u_j^{n+1}) = \begin{cases} \Delta_\tau u_j^{n+1} - \dfrac{\Delta_+\Delta_- u_j^{n+1}}{h^2} + \left(\dfrac{\tau}{2} + \dfrac{h^2}{12}\right)\dfrac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{\tau^2}, & j \ne k, \\[2ex] \Delta_\tau u_j^{n+1} - \dfrac{\Delta_+\Delta_- u_j^{n+1}}{h^2} + \left(\dfrac{\tau}{2} + \dfrac{h^2}{12}\right)\dfrac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{\tau^2} - r\left(\Delta_\tau u_{j+1}^{n} - \Delta_\tau u_{j+1}^{n+1}\right) - r\left(\Delta_\tau u_{j-1}^{n} - \Delta_\tau u_{j-1}^{n+1}\right), & j = k. \end{cases} \tag{3.11}$$

The resulting system of equations decouples into two disjoint sets of equations corresponding to the subdomains. These systems can be solved in parallel.

We are now in a position to state the two main theorems of this paper, which are proved in the forthcoming sections. For the discrete function $u^n = \{u_j^n \mid j = 0, 1, \ldots, J,\ u_0^n = u_J^n = 0\}$, define

$$\|u^n\|^2 = \sum_{j=1}^{J-1} (u_j^n)^2 h, \qquad \|\Delta_+ u^n\|^2 = \sum_{j=0}^{J-1} (\Delta_+ u_j^n)^2 h.$$

We have the following theorems.

Theorem 3.1 (Stability) For any given $r > 0$, the finite difference solutions of the parallel scheme (3.10)-(3.11) satisfy max +u n n =0 r + r + ( +u + u u 0 ).

Theorem 3.2 (Convergence) Let $e_j^n = U(x_j, t^n) - u_j^n$. For any given $r > 0$, the finite difference solutions of the parallel scheme (3.10)-(3.11) satisfy

$$\max_n \|\Delta_+ e^n\| \le C\left(\|\Delta_+ e^1\| + \|e^1 - e^0\| + h^4\right),$$

where $C$ is a positive constant independent of $h$ and $\tau$.

Since the finite difference scheme (3.10) has three time levels, taking $u_j^0 = U_0(jh)$ (so that $e^0 = 0$) is not enough: we also need another method to compute $u_j^1$. In order to match the high-order accuracy, we can use either fourth-order explicit schemes such as the impact scheme or the high-order parallel iterative method [10] to compute $u^1$, i.e., such that $e^1 \le O(h^4)$. Then according to Theorem 3.2, scheme (3.10) reaches third-order accuracy.

4 Proof of stability and convergence

We first state three auxiliary lemmas. The stability and convergence results are then derived.
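The two-subdomain procedure (3.10)-(3.11) of Section 3 can be sketched in code as follows. This is a minimal serial illustration under stated assumptions, not the authors' implementation: the scheme is multiplied by $\tau$ so that it reads $(1 + 2r + a)u_j^{n+1} - r u_{j+1}^{n+1} - r u_{j-1}^{n+1} = (1 + 2a)u_j^n - a u_j^{n-1}$ with $a = 1/2 + 1/(12r)$; at the interface the neighbours at the new level are replaced by $2u_{k\pm1}^n - u_{k\pm1}^{n-1}$; the starting level $u^1$ is taken from the exact solution $e^{-\pi^2 t}\sin(\pi x)$ purely for the test; and the interface is placed at $k = J/2$. All function names (`thomas`, `implicit_block`, `dd_step`, `run`) are hypothetical.

```python
import numpy as np

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are ignored)."""
    n = len(rhs)
    c, d = np.empty(n), np.empty(n)
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = np.empty(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def implicit_block(u_n, u_old, left, right, r, a):
    """Implicit scheme (2.8), times tau, on one subdomain with Dirichlet data."""
    m = len(u_n)
    rhs = (1 + 2 * a) * u_n - a * u_old
    rhs[0] += r * left            # known new-level value left of the block
    rhs[-1] += r * right          # known new-level value right of the block
    off = np.full(m, -r)
    return thomas(off, np.full(m, 1 + 2 * r + a), off, rhs)

def dd_step(u_n, u_old, k, r):
    """One time step: explicit interface value, then two independent solves."""
    a = 0.5 + 1.0 / (12.0 * r)
    u_new = np.zeros_like(u_n)
    # Explicit scheme (2.9): new-level neighbours replaced by 2u^n - u^(n-1).
    extrap = (2 * u_n[k - 1] - u_old[k - 1]) + (2 * u_n[k + 1] - u_old[k + 1])
    u_new[k] = ((1 + 2 * a) * u_n[k] - a * u_old[k] + r * extrap) / (1 + 2 * r + a)
    # The two subdomain systems are decoupled and could run in parallel.
    u_new[1:k] = implicit_block(u_n[1:k], u_old[1:k], 0.0, u_new[k], r, a)
    u_new[k + 1:-1] = implicit_block(u_n[k + 1:-1], u_old[k + 1:-1], u_new[k], 0.0, r, a)
    return u_new

def run(J, r, T):
    """Return (max error at the final time, largest |u| seen over all levels)."""
    h = 1.0 / J
    tau = r * h * h
    x = np.linspace(0.0, 1.0, J + 1)
    exact = lambda t: np.exp(-np.pi**2 * t) * np.sin(np.pi * x)
    u_old, u_n = exact(0.0), exact(tau)   # ASSUMPTION: exact values for u^1
    u_old[[0, -1]] = 0.0
    u_n[[0, -1]] = 0.0
    steps = int(round(T / tau))
    peak = max(np.abs(u_old).max(), np.abs(u_n).max())
    for _ in range(steps - 1):
        u_old, u_n = u_n, dd_step(u_n, u_old, J // 2, r)
        peak = max(peak, np.abs(u_n).max())
    err = np.abs(u_n - exact(steps * tau)).max()
    return err, peak

err, peak = run(40, 1.0, 0.1)
print(err, peak)
```

Running `run` for two mesh sizes at fixed $r$ gives a rough check of the convergence order, and running it with a large $r$ shows whether the computed solution stays bounded, in the spirit of Theorems 3.1 and 3.2.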

5 Lemma 4. (Discrete Poincare Inequality) For the discrete function u n = {u n = 0,,..., J, un 0 = u n J = 0}, there exists u n + u n h. Lemma 4. (Discrete Green Theorem) If u and v are discrete functions on the set {x = 0,,, J}, then we have J u + v = v u u 0 v + u J v J. = = Lemma 4.3 For any given r > 0, f L (0, T; L (0, )), the finite difference solutions of system L(u ) = f with the Dirichlet boundary condition satisfy ( max +u n + u + n r + ) u r + u 0 + Th max r n fn. Lemma 4. and Lemma 4. are proved in []. Before proving the stability and convergence, we will give the proof of Lemma 4.3. 4. Proof of Lemma 4.3 Denoting w = u boundary condition as w r + u u n, we can rewrite L(u ) = f with the Dirichlet ( + w r + u + + ) ( r + r (w ) (w w n ) r(w+ n w + ) r(wn w ) = f u = 0, = 0, J. w n ) = f τ,, τ, =, Multiplying the above equations by w h, =,,, J and summing them up respect to, we have = (w rw ) h r = ( w + u h + + ) r (w+ n w + )h rw (w n w From Lemma 4., we have = )h = = (w w n )w h f w τh. (4.)
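The summation-by-parts identity of Lemma 4.2, which is applied in the next step, can be checked numerically. The sketch below assumes the identity reads $\sum_{j=1}^{J-1} u_j \Delta_+\Delta_- v_j = -\sum_{j=0}^{J-1} \Delta_+ v_j\, \Delta_+ u_j - u_0 \Delta_+ v_0 + u_J \Delta_- v_J$; the random test vectors and the size `J = 12` are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
J = 12
u = rng.standard_normal(J + 1)     # u_0, ..., u_J (no boundary condition imposed)
v = rng.standard_normal(J + 1)

def fwd(w):
    """Forward difference Delta_+ w_j for j = 0, ..., J-1."""
    return w[1:] - w[:-1]

# Second difference Delta_+ Delta_- v_j for j = 1, ..., J-1.
second = v[2:] - 2 * v[1:-1] + v[:-2]

lhs = np.sum(u[1:-1] * second)
rhs = (-np.sum(fwd(v) * fwd(u))
       - u[0] * fwd(v)[0]              # - u_0 * Delta_+ v_0
       + u[-1] * (v[-1] - v[-2]))      # + u_J * Delta_- v_J
print(lhs, rhs)
```

When $u_0 = u_J = 0$, as for the discrete functions used here, the two boundary terms vanish and only the sum of products of forward differences remains.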

6 J w + u = u w = = = ( + u ) + ( + u n ) ( + u + u n ). =0 Then (4.) becomes w + r [ =0 ( + + )[ r ( + u = (w +rhw (w + + w ) = rhw =0 (w+ n + wn ) + ) h =0 =0 ] ( + u n ) h + r ) h (w n ) h + = w f = ( (w ) + (w+ n rh ) + (w τh ) + (w n ) = ) + = =0 ( + u (w ] w n ) h (w ) + (f + u n ) h τ) rh(w ) + rh (wn + ) + (w n ) + w + τ f. (4.3) Noting that = r r ( + u + u n ) h + rhw (w + + w =0 =0,, ( + w we can simplify (4.3) as follows w + r ( + 4 + 4r [ (w + +rh ) + (w ) [ (w ) + h + rh ) + (w [ + u + u n ] + r ) ) ( w w n + w w n ) ) =0,, ] + rh(w ), (wn + ) + (w n ] ) τ f ( + w ) h h

7 Since w 0, we have =0,, ] ( r [ + u + u n + [ (w + +rh ) + (w ) Summing up respect to n, we get Thus ( r +u + 4 + r +u + ( 4 + 4r + u + u + Lemma 4.3 is proved. 4. Proof of stability ( + w ) h 0, and w w n 0, then 4 + 4r (wn + ) + (w n ) ) ( w w n ) ] τ f ) w + rh (w + ) + (w ) 4r ) w + rh (w + ) + (w ) + Tτ max n fn. ( r + ) u r + u 0 + Th max r n fn. Let f = 0 in Lemma 4.3, thus Theorem 3. is proved. 4.3 Proof of convergence According to the parallel scheme (3.0)-(3.), the errors e ( n < N) follows the below equations: τ e + e ( τ h + + h τ e + e ( τ h + + h ) e ) e e n + en τ = G,, e n + en τ r( τ e n + τe + ) r( τe n τe ) = Φ h + G, =, e = 0, = 0, J, where Φ = r U t (x, t ), G O(h 4 ). In order to get the error estimates for e, we assume that e = p + q, where p and q are the solutions of the following problems respectively.

8 problem I τ p + p ( τ h + + h τ p + p ( τ h + + h ) p ) p p n + pn τ = G,, p n + pn τ r( τ p n + τp + ) r( τp n τp ) = G, =, (4.4) p = 0, = 0, J, p 0 = e0, p = e,. problem II τ q + q ( τ h + + h τ q + q ( τ h + + h ) q ) q q n + qn τ = 0,, q n + qn τ r( τ q n + τ q + ) r( τq n τ q ) = Φ h, =, q = 0, = 0, J, q 0 = q = 0,. From lemma 4.3, we can obtain the estimate for p, i.e., ( + p n + p + r + ) p r + p 0 + Th max r n Gn ( r + ) r + ( + e + e e 0 ) + O(h 0 ). (4.5) For estimating q, we first consider q ( n N )which satisfies the following equations Then the formula of q q = + q h = 0,, + q h = Φ h, q 0 = q J = 0. is 0, = 0, J, J Φ h 4,, J + J J Φ h 4, < < J. i=

9 Hence + q O(h 4 ) and q q n O(τh 3 ). Define q = q q, then we have τ q + q ( τ h + + h τ q + q ( τ h + + h ) q ) q q n + qn τ = R,, q n + qn τ r( τ q n + τ q + ) r( τ q n τ q ) = R, =, q = 0, = 0, J, q 0 = q 0, q = q,, where R = ( τ τ q + + h ( τ τ q + + h r( τ q n τ q ) q ) q ), =, q n + qn τ,, q n + q n τ r( τ q n + τ q + ) and R O(h 3 ). From Lemma 4.3, we can obtain the estimate of + q. ( + q + q + r + ) q r + q 0 + Th max r n Rn O(h 8 ). Thus, Combining with (4.5), we get + q + q + + q O(h 4 ). + e + p + + q C( + e + e e 0 + h 4 ), where C is a positive constant independent of h and τ. Theorem 3. is proved. 5 Numerical experiments In this section, some numerical results are presented to show the stability, accuracy, and parallelism of the scheme described above, and the computational costs are also presented. All the experiments are run on a cluster consisting of a manager that uses one core of a Xeon 540 processor and up to 3 computing nodes, each containing two Xeon 540 processors running 64-bit Linux, i.e., each node consists of 8 processing cores.

We consider the problem defined in equations (1.1)-(1.3) with $U_0(x) = \sin(\pi x)$. The exact solution is $U(x,t) = e^{-\pi^2 t}\sin(\pi x)$.

First, we verify the stability of the scheme by taking the step size $h = 10^{-3}$ and $r = 1, 10, 100, 1000$ with two subdomains. Fig. 1 shows that the norm of $u^n$ does not blow up even when $r$ is large, which illustrates the unconditional stability of the scheme.

Figure 1: The infinity norm of $u^n$ vs. $t$ (four panels: $r = 1$, $r = 10$, $r = 100$, $r = 1000$; horizontal axis: time $t$).

Second, we examine the numerical errors in the solutions. Table 1 shows that the errors for each case are roughly of the same order of magnitude, and the errors appear to be $O(h^3)$ in each case.

Third, we test the speed-up of the scheme. Here we take $h = 10^{-5}$, $\tau = 10^{-6}$, i.e., $r = 10^4$, and $T = 0.5$, and list the computing time and the speed-up in Table 2, which shows that the scheme parallelizes well.

6 Extension to two dimensional case

In this section, $U(x,y,t)$ is a solution of the following Dirichlet boundary problem on $\Omega = (0,1) \times (0,1)$,

$$\frac{\partial U}{\partial t} - \Delta U = 0, \quad (x,y) \in \Omega,\ t \in (0,T], \tag{6.16}$$
$$U(x,y,t) = 0, \quad (x,y) \in \partial\Omega,\ t \in (0,T], \tag{6.17}$$
$$U(x,y,0) = U_0(x,y), \quad (x,y) \in \Omega, \tag{6.18}$$

where the initial function $U_0(x,y)$ satisfies the boundary condition, and $\Delta U = \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2}$. We extend the domain decomposition method stated in Section 3 to the above problem.

Table 1: Numerical errors $\|u^n - U^n\|/h^3$ for different grid points

r = 10, T =
J       2 processors    4 processors    8 processors
1000    .068            4.05448         8.07385
2000    .0675           4.05336         8.0776
4000    .0636           4.04938         8.0688
8000    .9499           3.93590         7.9563

r = 100, T =
J       2 processors    4 processors    8 processors
1000    .06703e+0       4.04766e+0      8.0437e+0
2000    .0679e+0        4.0599e+0       8.06955e+0
4000    .06690e+0       4.05307e+0      8.076e+0
8000    .0680e+0        4.05434e+0      8.07356e+0

r = 1000, T =
J       2 processors    4 processors    8 processors
1000    .0883e+04       4.0444e+04      8.03459e+04
2000    .03740e+04      3.9388e+04      8.03843e+04
4000    .06360e+04      4.03957e+04     8.0579e+04
8000    .06634e+04      4.0537e+04      8.06558e+04

Take $\Omega_1 = \{(x,y) \in \Omega : x < \bar{x}\}$ and $\Omega_2 = \{(x,y) \in \Omega : x > \bar{x}\}$. Let $x_i = ih$ as in Section 3, and let $y_j = jh$. Suppose that there exists an integer $k$ such that $\bar{x} = x_k$ (see Fig. 2). In analogy with Section 3, we call the points $(x_i, y_j, t^n)$ boundary points if $(x_i, y_j) \in \partial\Omega$, and interface points if $i = k$. Otherwise, we call them interior points. The values $u_{i,j}^n$ approximate $U(x_i, y_j, t^n) \equiv U_{i,j}^n$. We denote

$$\Delta_+\Delta_- u_{i,j} = u_{i+1,j} - 2u_{i,j} + u_{i-1,j}, \qquad \delta_+\delta_- u_{i,j} = u_{i,j+1} - 2u_{i,j} + u_{i,j-1},$$

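and

$$\Delta_\tau u_{i,j}^n = \frac{1}{\tau}\left(u_{i,j}^n - u_{i,j}^{n-1}\right).$$

Table 2: Comparison of time for different processors

Processors n    Time t_n (seconds)    Speed-up t_1/t_n
1               486.66                -
2               450.8                 .98
4               50.9                  3.89
8               6.74                  7.8
16              35.05                 4.96
32              65.85                 9.3
64              00.45                 48.40

Figure 2: Domain decomposition for the two dimensional case.

Then

$$\frac{\Delta_+\Delta_- U_{i,j}}{h^2} = \frac{\partial^2 U}{\partial x^2}(x_i, y_j, t) + \frac{h^2}{12}\frac{\partial^4 U}{\partial x^4}(x_i, y_j, t) + O(h^4),$$
$$\frac{\delta_+\delta_- U_{i,j}}{h^2} = \frac{\partial^2 U}{\partial y^2}(x_i, y_j, t) + \frac{h^2}{12}\frac{\partial^4 U}{\partial y^4}(x_i, y_j, t) + O(h^4),$$
$$\frac{\delta_+\delta_-\Delta_+\Delta_- U_{i,j}}{h^4} = \frac{\partial^4 U}{\partial x^2 \partial y^2}(x_i, y_j, t) + O(h^2).$$

Thus we have

$$\frac{\Delta_+\Delta_- U_{i,j} + \delta_+\delta_- U_{i,j}}{h^2} + \frac{1}{6h^2}\delta_+\delta_-\Delta_+\Delta_- U_{i,j} = \Delta U(x_i, y_j, t) + \frac{h^2}{12}\left(\frac{\partial^4 U}{\partial x^4} + 2\frac{\partial^4 U}{\partial x^2 \partial y^2} + \frac{\partial^4 U}{\partial y^4}\right)(x_i, y_j, t) + O(h^4) = \Delta U(x_i, y_j, t) + \frac{h^2}{12} U_{tt}(x_i, y_j, t) + O(h^4). \tag{6.19}$$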
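The expansion (6.19) can be checked numerically on a smooth test function. The sketch below assumes (6.19) reads as the nine-point combination $(\Delta_+\Delta_- U + \delta_+\delta_- U)/h^2 + \delta_+\delta_-\Delta_+\Delta_- U/(6h^2) = \Delta U + (h^2/12)\Delta^2 U + O(h^4)$, and uses $U = \sin(\pi x)\sin(\pi y)$, for which $\Delta U = -2\pi^2 U$ and $\Delta^2 U = 4\pi^4 U$. The function name `nine_point_defect` and the sample point `(0.3, 0.4)` are hypothetical choices.

```python
import numpy as np

def nine_point_defect(h, x0=0.3, y0=0.4):
    """|discrete nine-point operator - (Lap U + h^2/12 Lap^2 U)| at (x0, y0)."""
    U = lambda x, y: np.sin(np.pi * x) * np.sin(np.pi * y)
    # Second difference in x (Delta_+ Delta_-) evaluated at an arbitrary point.
    dxx = lambda x, y: U(x + h, y) - 2 * U(x, y) + U(x - h, y)
    # Second difference in y (delta_+ delta_-) at the sample point.
    dyy = U(x0, y0 + h) - 2 * U(x0, y0) + U(x0, y0 - h)
    # Mixed fourth difference: delta_+ delta_- applied to Delta_+ Delta_- U.
    mixed = dxx(x0, y0 + h) - 2 * dxx(x0, y0) + dxx(x0, y0 - h)
    discrete = (dxx(x0, y0) + dyy) / h**2 + mixed / (6 * h**2)
    lap = -2 * np.pi**2 * U(x0, y0)
    bilap = 4 * np.pi**4 * U(x0, y0)
    return abs(discrete - (lap + h**2 / 12 * bilap))

d_coarse = nine_point_defect(0.05)
d_fine = nine_point_defect(0.025)
order_ratio = d_coarse / d_fine    # close to 2**4 if the remainder is O(h^4)
print(d_coarse, d_fine, order_ratio)
```

Halving $h$ should shrink the defect by a factor of about 16, in line with the $O(h^4)$ remainder.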

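Moreover,

$$\Delta_\tau U_{i,j}^{n+1} = U_t(x_i, y_j, t^{n+1}) - \frac{\tau}{2} U_{tt}(x_i, y_j, t^{n+1}) + O(\tau^2). \tag{6.20}$$

Subtracting (6.19), taken at $t = t^{n+1}$, from (6.20) and using $U_t = \Delta U$, we obtain

$$\Delta_\tau U_{i,j}^{n+1} - \frac{\Delta_+\Delta_- U_{i,j}^{n+1} + \delta_+\delta_- U_{i,j}^{n+1}}{h^2} - \frac{1}{6h^2}\delta_+\delta_-\Delta_+\Delta_- U_{i,j}^{n+1} = -\left(\frac{\tau}{2} + \frac{h^2}{12}\right) U_{tt}(x_i, y_j, t^{n+1}) + O(\tau^2 + h^4).$$

Therefore, the fourth-order finite difference scheme for (6.16) is

$$\Delta_\tau u_{i,j}^{n+1} - \frac{\Delta_+\Delta_- u_{i,j}^{n+1} + \delta_+\delta_- u_{i,j}^{n+1}}{h^2} - \frac{1}{6h^2}\delta_+\delta_-\Delta_+\Delta_- u_{i,j}^{n+1} + \left(\frac{\tau}{2} + \frac{h^2}{12}\right)\frac{u_{i,j}^{n+1} - 2u_{i,j}^{n} + u_{i,j}^{n-1}}{\tau^2} = 0, \tag{6.21}$$

which is a nine-point finite difference scheme with three time levels (see Fig. 3).

Figure 3: Nine-point scheme.

Denote

$$\diamond u_{i,j} = u_{i-1,j-1} + u_{i-1,j+1} + u_{i+1,j-1} + u_{i+1,j+1} - 4u_{i,j} = 2\Delta_+\Delta_- u_{i,j} + 2\delta_+\delta_- u_{i,j} + \delta_+\delta_-\Delta_+\Delta_- u_{i,j}.$$

Then (6.21) is equivalent to

$$\Delta_\tau u_{i,j}^{n+1} - \frac{\frac{1}{2}\diamond u_{i,j}^{n+1} + 2\Delta_+\Delta_- u_{i,j}^{n+1} + 2\delta_+\delta_- u_{i,j}^{n+1}}{3h^2} + \left(\frac{\tau}{2} + \frac{h^2}{12}\right)\frac{u_{i,j}^{n+1} - 2u_{i,j}^{n} + u_{i,j}^{n-1}}{\tau^2} = 0,$$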

which can be used for computing $u_{i,j}^{n+1}$ when $i \ne k$. When $i = k$, the values $u_{i\pm1,j}^{n+1}$ in (6.21) are replaced by $2u_{i\pm1,j}^{n} - u_{i\pm1,j}^{n-1}$. In this case, we need to solve a tridiagonal system with the Thomas algorithm to compute $u_{k,j}^{n+1}$. It is straightforward to extend the two-subdomain results to many subdomains in the $x$ direction by cutting the whole domain into vertical strips. Moreover, the domain decomposition scheme can also be extended to the three dimensional case by dividing the $x$ axis into many subdomains.

7 Two dimensional numerical experiments

We consider the problem defined in equations (6.16)-(6.18) with $U_0(x,y) = \sin(\pi x)\sin(\pi y)$. The exact solution is $U(x,y,t) = e^{-2\pi^2 t}\sin(\pi x)\sin(\pi y)$. The computing time and the speed-up of the scheme (6.19)-(6.20) are shown in Table 3 with $h = 10^{-3}$, τ = 0 and T = .

Table 3: Comparison of time for two dimensional case

Processors n    Time t_n      Speed-up t_1/t_n
1               0h3m9s        -
2               6h0m9s        .7
4               h5m4s         3.6
8               h3m54s        7.4
16              4m47s         4.53
32              m3s           9.06
64              m58s          47.9

7.1 Lotka-Volterra system

The competitive Lotka-Volterra equations are a simple model of the population dynamics of species competing for some common resource. Given two populations, $u$ and $v$, with logistic dynamics, the Lotka-Volterra formulation adds an additional term to account for the species interactions. Thus the competitive Lotka-Volterra equations are:

$$u_t = \Delta u + u(1 - v), \quad (x,y) \in \Omega,$$
$$v_t = \Delta v - v(1 - u), \quad (x,y) \in \Omega,$$
$$v = u = 0, \quad (x,y) \in \partial\Omega.$$

We choose random values for $u(x,y)$ and $v(x,y)$ as initial conditions, shown in Fig. 4. We take h = 0.00, τ = 0.0, T = and show the computing time in Table 4. Fig. 5 shows the numerical solution for T = .

Figure 4: Initial condition for the Lotka-Volterra system (two panels: $u(x,y,t)$ and $v(x,y,t)$ at time $t = 0$; axes $x$ and $y$).

Table 4: Comparison of time for Lotka-Volterra system

Processors n    Time t_n      Speed-up t_1/t_n
1               h5m3s         -
2               h45m8s        .86
4               6hm36s        3.5
8               h58m57s       7.37
16              h33m37s       4.6
32              45m8s         8.85
64              8ms           46.5
128             4m33s         90.
256             7m8s          75.6

Conclusion

We presented domain decomposition finite difference schemes with unconditional stability and third-order accuracy for parabolic systems. The error estimate and the stability of the numerical solutions have been derived for the one dimensional case. The scheme is easy to parallelize and is extended to

two and three dimensional cases. The numerical results demonstrate the good performance of the parallel scheme, namely, unconditional stability, third-order accuracy, and a high degree of parallelism.

Figure 5: Numerical solutions for T = (two panels: $u(x,y,t)$ and $v(x,y,t)$; axes $x$ and $y$).

References

[1] C.N. Dawson, Q. Du, and T.F. Dupont, A finite difference domain decomposition algorithm for numerical solution of the heat equation, Math. Comp., Vol. 57, 63-71 (1991).
[2] G. Yuan, S. Zhu and L. Shen, Domain decomposition algorithm based on the group explicit formula for the heat equation, Int. J. Comput. Math, Vol. 82, 1295-1306 (2005).
[3] S. Zhu, Z. Yu and J. Zhao, A high-order parallel finite difference algorithm, Applied Math. and Comp., Vol. 183, 365-372 (2006).
[4] C.N. Dawson and T.F. Dupont, Explicit/implicit conservative domain decomposition procedures for parabolic problems based on block-centered finite differences, SIAM J. Numer. Anal., Vol. 31, 1045-1061 (1994).

[5] G. Yuan and L. Shen, Stability and convergence of the explicit-implicit conservative domain decomposition procedure for parabolic problems, Computers and Math. with Applications, Vol. 47, 793-801 (2004).
[6] H. Shi and H. Liao, Unconditional stability of corrected explicit-implicit domain decomposition algorithms for parallel approximation of heat equations, SIAM J. Numer. Anal., Vol. 44, 1584-1611 (2006).
[7] H. Liao, H. Shi and Z. Sun, Corrected explicit-implicit domain decomposition algorithms for two-dimensional semilinear parabolic equations, Science in China Series A, Vol. 52, 2362-2388 (2009).
[8] Z. Sheng, G. Yuan and X. Hang, Unconditional stability of parallel difference schemes with second order accuracy for parabolic equation, Applied Math. and Comp., Vol. 184, 1015-1031 (2007).
[9] S. Zhu, Conservative domain decomposition procedure with unconditional stability and second-order accuracy, Applied Math. and Comp., Vol. 216, 3275-3282 (2010).
[10] W. Hao and S. Zhu, Parallel iterative methods for parabolic equations, Int. J. Comput. Math, Vol. 86, 431-440 (2009).
[11] Y. Zhou, Applications of discrete functional analysis to the finite difference method, International Academic Publishers (1991).