arxiv: v1 [math.na] 7 Jan PDF Free Download

Efficient p-multigrid Methods for Isogeometric Analysis R. Tielen a,, M. Möller a, D. Göddeke b, C. Vuik a arxiv:9.685v [math.na] 7 Jan 29 a Delft Institute of Applied Mathematics, Delft University of Technology, Van Mourik Broekmanweg 6, 2628XE, Delft b Institute for Applied Analysis and Numerical Simulation, University of Stuttgart Allmandring 5b, 7569, Stuttgart Abstract Over the years, Isogeometric Analysis has shown to be a successful alternative to the Finite Element Method (FEM). However, solving the resulting linear systems of equations efficiently remains a challenging task. In this paper, we consider a p-multigrid method, in which each level of the hierarchy is associated with a different approximation order p instead of mesh width h. Since the use of classical smoothers (i.e. Gauss-Seidel) results in a p-multigrid method with deteriorating performance for higher values of p, the use of an ILUT smoother is investigated. Numerical results and a spectral analysis indicate that the resulting p-multigrid method exhibits convergence rates independent of h and p. Keywords:. Introduction Isogeometric Analysis, Multigrid methods, p-multigrid Isogeometric Analysis (IgA) [] has become widely accepted over the years as an alternative to the Finite Element Method (FEM). The use of B-spline basis functions or Non-Uniform Rational B-splines (NURBS) allows for a highly accurate representation of complex geometries and establishes the link between computer-aided design (CAD) and computer-aided engineering (CAE) tools. Furthermore, the C p continuity of the basis functions offers a higher accuracy per degree of freedom compared to standard FEM [2]. IgA has been applied with success in a wide range of engineering fields, such as structural mechanics [3], fluid dynamics [4] and shape optimization [5]. r.p.w.m.tielen@tudelft.nl Preprint submitted to Computer Methods in Mechanics and Engineering January 8, 29

Solving the resulting linear systems efficiently is, however, still a challenging task. The condition numbers of the mass and stiffness matrices increase exponentially with the approximation order p, making the use of (standard) iterative solvers inefficient. On the other hand, the use of direct solvers is not straightforward due to the increasing stencil of the basis functions and increasing bandwith of matrices for higher values of p. Furthermore, direct solvers may not be practical for large problem sizes due to memory constraints, which is a common problem in high-order methods in general. Recently, various solution techniques have been developed for discretizations arising in Isogeometric Analysis. For example, preconditioners have been developed based on fast solvers for the Sylvester equation [3] and overlapping Schwarz methods [4]. As an alternative, geometric multigrid methods have been investigated, as they are considered among the most efficient solvers in Finite Element Methods for elliptic problems. However, the use of standard smoothers like (damped) Jacobi or Gauss-Seidel leads to convergence rates which deteriorate for increasing values of p [5]. It has been noted in [6] that very small eigenvalues associated with high-frequency eigenvectors cause this behaviour. The use of non-classical smoothers, such as a boundary-corrected mass-richardson smoother, has proven to solve this problem [7, 8]. p-multigrid methods can be adopted as an alternative solution strategy. In contrast to h-multigrid methods, a hierarchy is constructed where each level represents a different approximation order. Throughout this paper, the coarse grid correction is obtained at level p =. Here, B-spline functions coincide with linear Lagrange basis functions, enabling the use of well known solution techniques for standard Lagrangian Finite Elements. p-multigrid methods have mostly been used for solving linear systems arising within the Discontinuous Galerkin method [6, 7, 8, 9], where a hierarchy was constructed until level p =. However, some research has been performed for continuous Galerkin methods [] as well, where the coarse grid correction was obtained at level p =. Recently, the authors applied a p-multigrid method, using a Gauss-Seidel smoother, in the context of IgA []. As with h-multigrid methods, a dependence of the convergence rate on p was reported. In this paper, a p-multigrid method is presented making use of an Incomplete LU factorization based on a dual threshold strategy (ILUT) [2] for smoothing. The spectral properties of the resulting p-multigrid method are analyzed. Furthermore, numerical results are presented for Poisson s equation on a non-trivial geometry and the convection-diffusion-reaction (CDR) 2

equation on the unit square. The use of ILUT as a smoother significantly improves the performance of the p-multigrid method, leading to convergence rates independent of h and p. This paper is organised as follows. In Section 2 the model problem and spatial discretisation are considered. Section 3 presents the p-multigrid method in detail, together with the proposed ILUT smoother. A spectral analysis is performed and discussed in Section 4. In Section 5, numerical results for the considered benchmark are presented. Finally, conclusions are drawn in Section 6. 2. Model problem To assess the quality of the p-multigrid method, the convection-diffusionreaction (CDR) equation is considered as a model problem: (D u) + (vu) + Ru = f, on Ω, () where D denotes the diffusion tensor, v the velocity field and R a source term. Here, Ω R 2 is a connected domain, f L 2 (Ω) and u = on the boundary Ω. Let V = H (Ω) denote the space of functions in the Sobolev space H (Ω) that vanish on Ω. The variational form of () is then obtained by multiplication with an arbitrary test function v V and application of integration by parts: Find u V such that (D u, v) + (v u + Ru, v) = (f, v) v V. (2) The physical domain Ω is then parameterized by a geometry function F : Ω Ω, F(ξ) = x, (3) which describes an invertible mapping from the parameter domain Ω R 2 to the physical domain Ω R 2. The tensor product of one-dimensional B- spline basis functions of order p are then used for spatial discretization. Onedimensional B-spline basis functions are defined on the parameter domain and are uniquely determined by their underlying knot vector Ξ = {ξ, ξ 2,..., ξ N+p, ξ N+p+ }, (4) 3

consisting of a sequence of non-decreasing knots ξ i R. Here, N denotes the number of basis functions of order p defined by this knot vector. Throughout this paper, B-spline basis functions are considered based on an open uniform knot vector with knot span size h, implying that the first and last knots are repeated p + times. As a consequence, the basis functions considered are C p continuous and interpolating only at the two end points. Constant B-spline basis functions are defined as follows [9]: { if ξ i ξ < ξ i+, φ i, (ξ) = (5) otherwise. Higher-order B-spline basis functions of order p are then defined recursively for p > : φ i,p (ξ) = ξ ξ i φ i,p (ξ) + ξ i+p+ ξ φ i+,p (ξ). (6) ξ i+p ξ i ξ i+p+ ξ i+ The resulting B-spline basis functions φ i,p are non-zero on the interval [ξ i, ξ i+p+ ), implying a compact support that increases with p. The spline space V h,p is then formulated adopting the tensor product of these basis functions: Φ I, p (ξ) := φ i,p (ξ)φ j,q (ξ), I = (i, j), p = (p, q). (7) Using the geometry mapping F as pull-back operator, V h,p can be written as follows: } V h,p := span {Φ I, p F. (8) i=,...,n dof Here, N dof denotes the number of tensor-product basis functions, implying N dof = N 2. The Galerkin formulation of (2) becomes: Find u h,p V h,p such that (D u h,p, v h,p ) + (v u h,p + Ru h,p, v h,p ) = (f, v h,p ) v h,p V h,p. (9) The discretized problem can be written as a linear system A h,p u h,p = f h,p, () where A h,p denotes the system matrix resulting from this discretization with B-spline basis functions of order p and mesh width h. For a more detailed 4

description of the spatial discretization in Isogeometric Analysis, the authors refer to []. Throughout this paper two benchmarks are considered: Benchmark. Let Ω be the quarter annulus with an inner and outer radius of and 2, respectively. The coefficients are chosen as follows: [ ] [ ] D =, v =, R =. () Furthermore, the right-hand side is chosen such that the exact solution u is given by: u(x, y) = (x 2 + y 2 )(x 2 + y 2 4)xy 2. Benchmark 2. Here, the unit square is adopted as domain, i.e. Ω = [, ] 2, and the coefficients are chosen as follows: [ ] [ ].2.7.4 D =, v =, R =.3. (2).4.9.2 The right-hand side is chosen such that the exact solution u is given by: u(x, y) = sin(πx)sin(πy). The resulting system matrix from the first benchmark will be referred to as Ā h,p, while the system matrix associated with the second benchmark is denoted by Â h,p. Please note, that both Ā h,p and Â h,p are symmetric positive definite matrices. 3. p-multigrid method Multigrid methods aim to solve equations resulting from the discretization of partial differential equations by using a hierarchy of discretizations. At each level of the hierarchy a smoother is applied, whereas on the coarsest level a correction is determined. To solve Equation () using a p-multigrid method, a sequence of spaces V h,,..., V h,p is obtained by applying refinement in p. Note that, since basis functions with maximal continuity are considered, the spaces are not nested. A single step of the two-grid correction scheme for the p-multigrid method consists of the following steps []: 5

. Starting from an initial guess u () h,p, apply a fixed number ν of presmoothing steps: u (,m) h,p = u (,m ) h,p + S h,p ( f h,p A h,p u (,m ) h,p ), m =,..., ν, (3) where S is a smoothing operator applied to the high-order problem. 2. Determine the residual at level p and project it onto the space V h,p using the restriction operator Ip p : ( ) r h,p = Ip p f h,p A h,p u (,ν ) h,p. (4) 3. Solve the residual equation at level p to determine the coarse grid error: A h,p e h,p = r h,p. (5) 4. Project the error e h,p onto the space V h,p using the prolongation operator I p p and update u (,ν ) h,p : u (,ν ) h,p := u (,ν ) h,p + I p p (e h,p ). (6) 5. Apply ν 2 postsmoothing steps to obtain u (,ν +ν 2 ) h,p =: u () h,p. In literature, step (2)-(4) are referred to as coarse grid correction. Recursive application of this scheme on Equation (5) until level p = is reached, results in a V-cycle. However, as shown in Figure, different cycle types can be applied, for example a W-cycle. k = 3 k = 2 k = Restrict Restrict Prolongate Prolongate Figure : Description of a V-cycle and W-cycle. 6

Prolongation and restriction To transfer both coarse grid corrections and residuals between different levels of the hierarchy, prolongation and restriction operators are defined. The prolongation and restriction operator adopted in this paper are based on an L 2 projection and have been used extensively in literature [2, 2, 22]. At each level k, where 2 k p, the coarse grid correction is prolongated to level k by projection onto the space V h,k. The prolongation operator Ik k : V h,k V h,k is given by I k k (v k ) = (M k ) P k k v k, (7) where the mass matrix M k and transfer matrix P k k are defined, respectively, as follows: (M k ) (i,j) := φ i,k φ j,k dω, (P k k ) (i,j) := φ i,k φ j,k dω (8) Ω The residuals are restricted from level k to k by projection onto the space V h,k. The restriction operator I k k : V h,k V h,k is defined by I k k (v k ) = (M k ) P k k v k. (9) To prevent the explicit solution of a linear system of equations for each projection step, the consistent mass matrix M in both transfer operators is replaced by its lumped counterpart M L by applying row-sum lumping: Smoother N dof M L (i,i) = M (i,j). (2) j= Within multigrid methods, a basic iterative method is typically used as a smoother. However, in IgA the performance of classical smoothers such as (damped) Jacobi and Gauss-Seidel decreases significantly for higher values of p. Therefore, an Incomplete LU factorization is adopted with a dual threshold strategy (ILUT) [2] to approximate the operator A h,p : A h,p L h,p U h,p. (2) The ILUT factorization is determined completely by a tolerance τ and fillfactor m. Two dropping rules are applied during factorization: Ω 7

. All elements smaller (in absolute value) than the dropping tolerance are dropped. The dropping tolerance is obtained by multiplying the tolerance τ with the average magnitude of all elements in the currect row. 2. Apart from the diagonal element, only the M largest elements are kept in each row. Here, M is determined by multiplying the fillfactor m with the average number of non-zeros in each row of the original operator A h,p. An efficient implementation of ILUT is available in the Eigen library [23] based on [24]. Once the factorization is obtained, a single smoothing step is applied as follows: e (n) h,p = (L h,p U h,p ) (f h,p A h,p u (n) h,p ), = U h,p L h,p (f h,p A h,p u (n) h,p ), u (n+) h,p = u (n) h,p + e(n) h,p. Throughout this paper, the fillfactor m = is used and the dropping tolerance equals τ = 2. Hence, the number of non-zero elements of L h,p +U h,p is similar to the number of non-zero elements of A h,p. Figure 2 shows the sparsity pattern of the stiffness matrix Ā h,p and L h,p +Ū h,p for p = 3. Since a permutation is performed during the ILUT factorization, sparsity patterns differ significantly. However, the number of non-zero entries is comparable. Coarse grid operator At each level of the hierarchy, the operator A h,p is needed to apply smoothing or solve the residual equation. The operators at the coarser levels can be obtained by rediscretizing the bilinear form in (2) with low-order basis functions. Alternatively, a Galerkin projection can be adopted: A G h,p = I p p A h,p I p p. (22) For both approaches, the condition number of the coarse grid operator for Poisson s equation is presented in Table for different values of h and p. The condition number when using the Galerkin projection is significantly higher compared to the condition number of the rediscretized operator. Results for the second benchmark show the same behaviour. Therefore, the coarse grid operators in this paper are obtained by using the latter approach. 8

2 2 4 4 6 6 8 8 2 5 nz = 6273 2 2 4 6 8 2 nz = 6273 Figure 2: Sparsity pattern of the stiffness matrix Ā h,3 (left) and L h,3 + Ū h,3 (right). Table : Condition number of the coarse grid operator for Poisson s equation obtained with the Galerkin projection (G) and rediscretization (RD). p = 2 κ(ā G h, ) κ(ārd h, ) p = 3 κ(āg h,2 ) κ(ārd h,2 ) h = 2 4 6. 7 9.78 2 h = 2 4 7. 9.56 3 h = 2 5 4.79 9 4.9 3 h = 2 5 6.5 6.7 3 h = 2 6 2.94.76 4 h = 2 6 4.99 2.84 4 h = 2 7 5.48 7.28 4 h = 2 7 7.58 2.8 5 4. Spectral Analysis In this section, the performance of the proposed p-multigrid method is analyzed in different ways. First, a spectral analysis is performed to investigate the interplay between the coarse grid correction and the smoother. Then, the spectral radius of the iteration matrix is determined to obtain the asymptotic convergence factors of the p-multigrid method. Throughout this section, both benchmarks presented in Section 2 are considered for the analysis. Reduction factors To investigate the effect of a single smoothing step or coarse grid correction, a spectral analysis [25] is carried out for different values of p. For this analysis, we consider u = with homogeneous Dirichlet boundary conditions and, hence, u = as its exact solution. Let us define the error reduction factors as follows: 9

r S (u h,p) = S(u h,p ) 2 u h,p, r CGC (u h,p) = CGC(u h,p ) 2 2 u h,p, (23) 2 where S( ) and CGC( ) denote a single smoothing step and a coarse grid correction, respectively. As an initial guess, the generalized eigenvectors v i are chosen which satisfy A h,p v i = λ i M h,p v i, i =,..., N 2. (24) Here, M h,p denotes the consistent mass matrix as defined in (8) and v i the generalized eigenvectors. The error reduction factors for the first benchmark for different smoothers are shown in Figure 3 for h = 2 5 and different values of p. In the left column, the reduction factors are shown obtained with Gauss- Seidel as a smoother, while the plots in the right column are obtained based on an ILUT factorization. In general, the coarse grid correction reduces the eigenvectors corresponding to the low-frequency modes, while the smoother reduces the eigenvectors associated with high-frequency modes. However, for increasing values of p, the reduction factors of the Gauss-Seidel smoother increase for the high-frequency modes, implying that the smoother becomes less effective. On the other hand, the use of ILUT as a smoother leads to decreasing reduction factors for all modes when the value of p is increased. Figure 4 shows the error reduction factors obtained for the second benchmark, showing similar, but less oscillatory, behaviour. These results indicate that the use of ILUT as a smoother could improve the convergence properties of the p-multigrid method compared to the use of a classical smoother. Iteration Matrix For any multigrid method, the asymptotic convergence rate is determined by the spectral radius of the iteration matrix. To obtain this matrix explicitly, consider, again, u = with homogeneous Dirichtlet boundary conditions. By applying a single iteration of the p-multigrid method using the unit vector e i h,p as initial guess, one obtains the ith column of the iteration matrix [26]. Figure 5 shows the spectra for the first benchmark for h = 2 5 and different values of p obtained with both Gauss-Seidel and ILUT as a smoother. For reference, the unit circle has been added in all plots. The spectral radius of the iteration matrix, defined as the maximum eigenvalue in absolute value,

Reduction Reduction Reduction Reduction Reduction Reduction.2 CGC Smoother Gauss-Seidel (p=2).2 CGC Smoother ILUT (p=2).8.8.6.6.4.4.2.2 2.2 CGC Smoother Eigenvalue Gauss-Seidel (p=3) 2 Eigenvalue.2 CGC Smoother ILUT (p=3).8.8.6.6.4.4.2.2 2.2 CGC Smoother Eigenvalue Gauss-Seidel (p=4) 2 Eigenvalue.2 CGC Smoother ILUT (p=4).8.8.6.6.4.4.2.2 2 Eigenvalue 2 Eigenvalue Figure 3: Error reduction in (v j ) for the first benchmark with p = 2, 3, 4 and refinement level 5 obtained with Gauss-Seidel (left) and ILUT (right).

Reduction Reduction Reduction Reduction Reduction Reduction.2 CGC Smoother Gauss-Seidel (p=2).2 CGC Smoother ILUT (p=2).8.8.6.6.4.4.2.2 2 Eigenvalue.2 CGC Smoother Gauss-Seidel (p=3) 2.2 CGC Smoother Eigenvalue ILUT (p=3).8.8.6.6.4.4.2.2 2 Eigenvalue.2 CGC Smoother Gauss-Seidel (p=4) 2.2 CGC Smoother Eigenvalue ILUT (p=4).8.8.6.6.4.4.2.2 2 Eigenvalue 2 Eigenvalue Figure 4: Error reduction in (v j ) for the second benchmark with p = 2, 3, 4 and refinement level 5 obtained with Gauss-Seidel (left) and ILUT (right). 2

is then given by the eigenvalue that is the furthest away from the origin. Clearly, the spectral radius significantly increases for higher values of p when adopting Gauss-Seidel as a smoother. The use of ILUT as a smoother results in spectra clustered around the origin, implying fast convergence of the resulting p-multigrid method. The spectra of the iteration matrices for the second benchmark are presented in Figure 6. Although the eigenvalues are more clustered compared to the first benchmark, the same behaviour can be observed. The spectral radia for both benchmarks, where ν = ν 2 =, are presented in Table 2 and Table 3, respectively. For Gauss-Seidel, the spectral radius of the iteration matrix is independent of the mesh width h, but depends strongly on the approximation order p. For one configuration, the p-multigrid method diverges, indicated by a spectral radius larger then. The use of ILUT leads to a spectral radius which is significantly lower for all values of h and p. Although ILUT exhibits a small h-dependence, the spectral radius remains low for all values of h. As a consequence, the p- multigrid method is expected to show both h- and p-independence when ILUT is adopted as a smoother. Table 2: Spectral radius ρ for the first benchmark using Gauss-Seidel and ILUT for different values of h and p. p = 2 GS ILUT p = 3 GS ILUT p = 4 GS ILUT h = 2 4.635.4 h = 2 4.849.4 h = 2 4.963.3 h = 2 5.63.39 h = 2 5.845.2 h = 2 5.96.29 h = 2 6.647.57 h = 2 6.844.24 h = 2 6.4.23 Table 3: Spectral radius ρ for the second benchmark using Gauss-Seidel and ILUT for different values of h and p. p = 2 GS ILUT p = 3 GS ILUT p = 4 GS ILUT h = 2 4.352.43 h = 2 4.74.2 h = 2 4.96.3 h = 2 5.352.37 h = 2 5.699.4 h = 2 5.93.2 h = 2 6.36.42 h = 2 6.699.22 h = 2 6.94.7 3

Imaginary axis Imaginary axis Imaginary axis Imaginary axis Imaginary axis Imaginary axis.8.6.4.2 -.2 -.4 -.6 -.8 -.8.6.4.2 -.2 -.4 -.6 -.8 -.8.6.4.2 -.2 -.4 -.6 -.8 - Gauss-Seidel (p=2) - -.5.5 Real axis Gauss-Seidel (p=3) - -.5.5 Real axis Gauss-Seidel (p=4) - -.5.5 Real axis.8.6.4.2 -.2 -.4 -.6 -.8 -.8.6.4.2 -.2 -.4 -.6 -.8 -.8.6.4.2 -.2 -.4 -.6 -.8 - ILUT (p=2) - -.5.5 Real axis ILUT (p=3) - -.5.5 Real axis ILUT (p=4) - -.5.5 Real axis Figure 5: Spectra of the iteration matrix for the first benchmark obtained with Gauss- Seidel (left) and ILUT (right) as smoother. 4

Imaginary axis Imaginary axis Imaginary axis Imaginary axis Imaginary axis Imaginary axis Gauss-Seidel (p=2) ILUT (p=2).5.5 -.5 -.5 -.5 -.5 -.5 -.5 - - -.5.5 Real axis Gauss-Seidel (p=3) - -.5.5 Real axis Gauss-Seidel (p=4) - -.5.5 Real axis -.8.6.4.2 -.2 -.4 -.6 -.8 -.8.6.4.2 -.2 -.4 -.6 -.8 - - -.5.5 Real axis ILUT (p=3) - -.5.5 Real axis ILUT (p=4) - -.5.5 Real axis Figure 6: Spectra of the iteration matrix for the second benchmark obtained with Gauss- Seidel (left) and ILUT (right) as smoother. 5

5. Numerical Results In this Section, p-multigrid is both applied as a stand-alone solver and as a preconditioner within a Conjugate Gradient method. Results are compared using different smoothers (i.e. Gauss-Seidel and ILUT) and cycle types. Furthermore, the influence of the coarse grid correction and coarsening strategy on the convergence is investigated. Finally, (normalized) CPU times are presented for p-multigrid using different smoothers. For all numerical experiments, the initial guess u () h,p is chosen randomly, where each entry is sampled from a uniform distribution on the interval [, ] using the same seed. The stopping criterion is based on a relative reduction of the initial residual, where a tolerance of ɛ = 8 is adopted. Hence, the method is considered converged if r (i) h,p < ɛ, (25) r () h,p where r (i) h,p denotes the residual after iteration i. At level p =, the solution of the residual equation is obtained by means of a CG solver with a relatively high tolerance (ɛ = 4 ). Boundary conditions are imposed by using Nitsche s method [27]. Furthermore, the same number of presmoothing and postsmoothing steps are applied for all numerical experiments (ν = ν = ν 2 ) in this section. Table 4 shows the number of V-cycles needed for convergence for the first benchmark using different smoothers. As in the previous sections, both Gauss-Seidel and ILUT are considered as a smoother, with ν = 2. As expected, the number of V-cycles needed with Gauss-Seidel is independent of h, but strongly depends on p. The use of ILUT as a smoother leads to a p-multigrid which exhibits both independence of h and p. Furthermore, the number of iterations needed for convergence is significantly lower. For ν = 4, the number of iterations are shown in Table 5. When applying Gauss-Seidel, doubling the number of smoothing steps at each level halves the number of iterations needed with p-multigrid. For ILUT, this behaviour is less noticeable, as the number of iterations is already relatively low. The h-independence and p-dependence when adopting Gauss-Seidel as a smoother is further illustrated in Figure 7. Here, the normalised residual is shown for different values of h and p. Clearly, the performance of the p-multigrid deteriorates for higher values of p. 6

Table 4: Number of V-cycles needed for convergence for the first benchmark with p- multigrid, ν = 2. p = 2 ILUT GS p = 3 ILUT GS p = 4 ILUT GS h = 2 4 2 4 h = 2 4 2 29 h = 2 4 9 h = 2 5 3 4 h = 2 5 2 3 h = 2 5 2 89 h = 2 6 3 5 h = 2 6 2 3 h = 2 6 2 88 h = 2 7 3 5 h = 2 7 2 3 h = 2 7 2 86 Table 5: Number of V-cycles needed for convergence for the first benchmark with p- multigrid, ν = 4. p = 2 ILUT GS p = 3 ILUT GS p = 4 ILUT GS h = 2 4 2 7 h = 2 4 5 h = 2 4 45 h = 2 5 2 7 h = 2 5 2 6 h = 2 5 45 h = 2 6 2 8 h = 2 6 2 6 h = 2 6 2 44 h = 2 7 2 8 h = 2 7 2 6 h = 2 7 2 43 Gauss-Seidel (p=3) Gauss-Seidel (h=2 5 ) Normalised Residual 2 4 6 h = 2 4 h = 2 5 h = 2 6 Normalised Residual 2 4 6 p = 2 p = 3 p = 4 8 8 5 5 2 25 3 V-cycles 2 4 6 8 V-cycles Figure 7: Normalised residual for different values of p and h respectively. For the second benchmark, the number of V-cycles needed for convergence are shown in Table 6. The use of Gauss-Seidel leads to a non-converging multigrid method for all numerical experiments. On the other hand, the number of iterations needed when using ILUT as a smoother increases only slightly compared to the first benchmark. 7

Table 6: Number of iterations needed for convergence for the second benchmark with p-multigrid, ν = 2. p = 2 ILUT GS p = 3 ILUT GS p = 4 ILUT GS h = 2 4 3 h = 2 4 2 h = 2 4 2 h = 2 5 3 h = 2 5 3 h = 2 5 2 h = 2 6 3 h = 2 6 3 h = 2 6 3 h = 2 7 3 h = 2 7 3 h = 2 7 2 As an alternative, the p-multigrid method can be applied as a preconditioner within a Conjugate Gradient method. In the preconditioning phase of each iteration, a single V-cycle is applied. The number of iterations needed for convergence using both type of smoothers for the first benchmark can be found in Table 7. Since (forward) Gauss-Seidel, which has been adopted in the previous experiments, is not symmetric, symmetric Gauss-Seidel (SGS) is used as a smoother instead. The number of iterations needed when applying symmetric Gauss-Seidel is significantly lower compared to the number of p-multigrid V-cycles, especially for higher values of p. However, a dependence of the iteration numbers on p is still present. When adopting ILUT as a smoother, the number of iterations is identical to the number of p-multigrid V-cycles for all configurations. Hence, h- and p-independence is also observed when applying p-multigrid as a preconditioner and ILUT as smoother. Table 7: Number of iterations needed for convergence for the first benchmark with PCG, ν = 2. p = 2 ILUT SGS p = 3 ILUT SGS p = 4 ILUT SGS h = 2 4 2 h = 2 4 2 5 h = 2 4 26 h = 2 5 3 h = 2 5 2 6 h = 2 5 2 3 h = 2 6 3 h = 2 6 2 6 h = 2 6 2 29 h = 2 7 3 h = 2 7 2 6 h = 2 7 2 29 Similar results are obtained for the second benchmark, as shown in Table 8. Adopting ILUT as a smoother leads to h- and p-independence of the resulting preconditioned CG method, while p-dependence is observed when using symmetric Gauss-Seidel as a smoother. As discussed in Section 3, different multigrid schemes can be applied to solve the resulting linear system of equations. Results obtained when adopting a 8

Table 8: Number of iterations needed for convergence for the second benchmark with PCG, ν = 2. p = 2 ILUT SGS p = 3 ILUT SGS p = 4 ILUT SGS h = 2 4 3 8 h = 2 4 2 h = 2 4 2 22 h = 2 5 3 7 h = 2 5 2 h = 2 5 2 2 h = 2 6 3 7 h = 2 6 3 h = 2 6 2 2 h = 2 7 4 7 h = 2 7 3 h = 2 7 2 2 W-cycle instead of a V-cycle are presented for both benchmarks in Table 9 and, respectively. Here, the approximation order is fixed to p = 3. Note that for all configurations, the number of iterations needed for convergence is identical. However, since the computational costs per cycle is lower for V-cycles, they are considered throughout the rest of this paper. Table 9: Number of iterations needed for different cycle types for the first benchmark. ILUT GS p = 3 V(2,2) W(2,2) p = 3 V(2,2) W(2,2) h = 2 4 2 2 h = 2 4 29 29 h = 2 5 2 2 h = 2 5 3 3 h = 2 6 2 2 h = 2 6 3 3 h = 2 7 2 2 h = 2 7 3 3 Table : Number of iterations needed for different cycle types for the second benchmark. ILUT GS p = 3 V(2,2) W(2,2) p = 3 V(2,2) W(2,2) h = 2 4 2 2 h = 2 4 h = 2 5 3 3 h = 2 5 h = 2 6 3 3 h = 2 6 h = 2 7 3 3 h = 2 7 ILUT as a preconditioner To investigate the quality of the proposed p-multigrid method in more detail, results are compared using ILUT as a preconditioner within a Conjugate Gradient method. The number of iterations with ILUT as a preconditioner for the first benchmark are shown in Table. As expected, the number of 9

CG iterations roughly doubles when the meshwidth h is halved. The number of iterations needed with p-multigrid (with ILUT as a smoother) as a preconditioner is independent of h. Results obtained for the second benchmark are shown in Table 2. As with the first benchmark, the number of CG iterations approximately doubles when h is halved in case ILUT is applied as a preconditioner. Table : Number of iterations needed for convergence for the first benchmark using p-multigrid (pmg) or ILUT as a preconditioner for a CG method, ν = 2. p = 2 pmg ILUT p = 3 pmg ILUT p = 4 pmg ILUT h = 2 4 2 3 h = 2 4 2 2 h = 2 4 h = 2 5 3 7 h = 2 5 2 4 h = 2 5 2 3 h = 2 6 3 2 h = 2 6 2 6 h = 2 6 2 5 h = 2 7 3 24 h = 2 7 2 4 h = 2 7 2 9 Table 2: Number of iterations needed for convergence for the second benchmark using p-multigrid (pmg) or ILUT as a preconditioner for a CG method, ν = 2. p = 2 pmg ILUT p = 3 pmg ILUT p = 4 pmg ILUT h = 2 4 3 5 h = 2 4 2 3 h = 2 4 2 2 h = 2 5 3 8 h = 2 5 2 5 h = 2 5 2 4 h = 2 6 3 6 h = 2 6 3 9 h = 2 6 2 7 h = 2 7 4 28 h = 2 7 3 7 h = 2 7 2 2 Comparison h- and hp-multigrid To assess the quality of the coarse grid correction in more detail, results are compared with h- and hp-multigrid methods. Compared to the proposed p-multigrid method, only the way in which the hierarchy is constructed differs. For the h-multigrid method, coarsening in h is applied, while for hpmultigrid coarsening in h and p is applied simultaneously. Hence, all other components (i.e. prolongation and restriction) are identical for all multigrid methods. Results obtained for both benchmarks with the different coarsening strategies for different values of h and p are shown in Table 3 and Table 4. For all numerical experiments, the number of iterations needed with h- and hp-multigrid are comparable. Note that, for the first benchmark, the number of iterations decreases for increasing values of p and depends on the mesh width h. 2

Table 3: Comparison of p-multigrid with h- and hp-multigrid for the first benchmark, ν = 2. p = 2 p h hp p = 3 p h hp p = 4 p h hp h = 2 4 2 4 4 h = 2 4 2 2 2 h = 2 4 h = 2 5 3 h = 2 5 2 5 5 h = 2 5 2 3 3 h = 2 6 3 2 2 h = 2 6 2 3 3 h = 2 6 2 8 8 h = 2 7 3 27 27 h = 2 7 2 24 28 h = 2 7 2 6 8 Table 4: Comparison of p-multigrid with h- and hp-multigrid for the second benchmark, ν = 2. p = 2 p h hp p = 3 p h hp p = 4 p h hp h = 2 4 3 6 6 h = 2 4 2 3 3 h = 2 4 2 2 2 h = 2 5 3 6 6 h = 2 5 3 8 9 h = 2 5 2 4 4 h = 2 6 3 28 28 h = 2 6 3 2 24 h = 2 6 3 2 3 h = 2 7 3 33 33 h = 2 7 3 5 6 h = 2 7 2 3 38 For both benchmarks and all configurations, the number of iterations needed for convergence with p-multigrid is significantly lower compared to both h- multigrid and hp-multigrid. The obtained results indicate that coarsening in p is much more effective compared to the alternative coarsening strategies. CPU times Besides iteration numbers, computational times have been determined when adopting both Gauss-Seidel and ILUT as a smoother within p-multigrid. A serial implementation is considered on a Intel(R) Xeon(R) E5-2687W CPU (3.GHz). Table 5 presents the CPU times for different values of h and p. For all configurations, the use of ILUT as a smoother significantly reduces the CPU time needed before convergence has been reached. Table 5: Computational times (in seconds) for convergence with p-multigrid for the first benchmark, ν = 2. p = 2 ILUT GS p = 3 ILUT GS p = 4 ILUT GS h = 2 4.5.7 h = 2 4.7 2.4 h = 2 4.8 4. h = 2 5.6 2.5 h = 2 5 2.2 9.2 h = 2 5 3.6 56.5 h = 2 6 6..8 h = 2 6 9.3 47.4 h = 2 6 6. 27.3 h = 2 7 32.7 85.8 h = 2 7 48.5 339. h = 2 7 85.5 666.2 2

By combining Table 4 and 5, an estimate for the difference in computational time of applying a single smoothing step can be derived. A reduction of the iterations with a factor of γ when applying ILUT as a smoother results in a reduction of the computational time of approximately γ. Hence, applying a 2 smoothing step with ILUT is approximately twice as expensive compared to Gauss-Seidel. For completeness, the normalized CPU times obtained for the second benchmark are presented in Table 6, showing the same behaviour when using ILUT as a smoother. Table 6: Computational times (in seconds) for convergence with p-multigrid for the second benchmark, ν = 2. p = 2 ILUT GS p = 3 ILUT GS p = 4 ILUT GS h = 2 4.5 h = 2 4.6 h = 2 4.9 h = 2 5.3 h = 2 5 2.2 h = 2 5 3. h = 2 6 5.5 h = 2 6 9.4 h = 2 6 7. h = 2 7 28.9 h = 2 7 53.4 h = 2 7 68.8 Since the number of iterations needed with the CG method using ILUT as a preconditioner is relatively low (see Table and 2), CPU times have also been determined using both symmetric Gauss-Seidel and ILUT as a preconditioner. Results for both benchmarks are presented in Table 7 and Table 8. For most of the configurations, the use of ILUT as a preconditioner leads to similar CPU times compared to the use of symmetric Gauss-Seidel. The CPU time needed with ILUT as a preconditioner increases approximately with a factor of 8 when the meshwidth is halved. This increase is expected, as the number of degrees of freedom increases with a factor of 4 and twice as many iterations are needed for convergence. As a consequence, CG preconditioned with ILUT becomes less competitive for smaller values of h compared to the use of p-multigrid (with ILUT as a smoother). Table 7: Computational times (in seconds) for convergence with PCG for the first benchmark. p = 2 ILUT SGS p = 3 ILUT SGS p = 4 ILUT SGS h = 2 4.2.4 h = 2 4.2.6 h = 2 4.2. h = 2 5.8 2. h = 2 5.6.9 h = 2 5.6 3.9 h = 2 6 2.4 4.3 h = 2 6 9.5.9 h = 2 6 2.8 5.2 h = 2 7 3.9 3. h = 2 7. 87.6 h = 2 7. 83.7 22

Table 8: Computational times (in seconds) for convergence with PCG for the second benchmark. p = 2 ILUT SGS p = 3 ILUT SGS p = 4 ILUT SGS h = 2 4.3.3 h = 2 4.2.4 h = 2 4.2.9 h = 2 5.9 2.7 h = 2 5.5 2. h = 2 5.7 3. h = 2 6 3.3 2.5 h = 2 6.9 4. h = 2 6 3.8 3.9 h = 2 7 95.8 72.6 h = 2 7 88.2.8 h = 2 7 5.7 9.7 6. Conclusions In this paper, we presented a p-multigrid method that uses a smoother based on an ILUT factorization. In contrast to classical smoothers (i.e. Gauss-Seidel), the reduction factors of the general eigenvectors associated with high-frequency modes do not increase for higher values of p. This results in asymptotic convergence factors which are independent of both the mesh width h and approximation order p. Numerical results, obtained for the Poisson equation in 2D on a quarter annulus and the convection-diffusion-reaction equation on the unit square, have been presented when using p-multigrid as a stand-alone solver or as a preconditioner within a CG method. Furthermore, different cycle types have been considered (i.e. V- and W-cycles). For all configurations, the number of iterations needed when using ILUT as a smoother are significantly lower compared to the use of Gauss-Seidel. The quality of the coarsening strategy adopted in p-multigrid has been compared with the one in both h- and hpmultigrid methods. Results indicate that the coarse grid correction with p-multigrid is more effective compared to the other coarsening strategies. Finally, CPU times have been presented, showing a significant improvement when adopting ILUT as a smoother. Future research will focus on the application of p-multigrid methods for higher-order partial differential equations (i.e. biharmonic equation), where the use of basis functions with high continuity is necessary. Furthermore, using the proposed p multigrid method on multipatches will be investigated. Finally, the multigrid method can be applied within a HPC framework. Here, the ILUT smoother can easily be parallelized using colouring, as frequently incorporated in multiplicative Schwarz methods [28]. 23

References [] T.J.R. Hughes, J.A. Cottrell and Y.Bazilevs. Isogeometric analysis: CAD, f inite elements, NURBS, exact geometry and mesh ref inement. Computer Methods in Applied Mechanics and Engineering, 94(39-4): 435-495, 25 [2] T.J.R. Hughes, A. Reali and G. Sangalli, Duality and unified analysis of discrete approximations in structural dynamics and wave propagation: Comparison of p-method finite elements with k-method NURBS. Computer Methods in Applied Mechanics and Engineering, 97(49-5): 44-424, 28 [3] J.A. Cottrell, A. Reali, Y.Bazilevs and T.J.R. Hughes. Isogeometric analysis of structural vibrations. Computer Methods in Applied Mechanics and Engineering, 95(4-43): 5257-5296, 26 [4] Y. Bazilevs, V.M. Calo, Y. Zhang and T.J.R. Hughes. Isogeometric Fluid-structure Interaction Analysis with Applications to Arterial Blood Flow. Computational Mechanics, 38(4-5): 3-322, 26 [5] W.A. Wall, M.A. Frenzel and C. Cyron. Isogeometric structural shape optimization. Computer Methods in Applied Mechanics and Engineering, 97(33-4): 2976-2988, 28 [6] K.J. Fidkowski, T.A. Oliver, J. LU and D.L. Darmofal. p-multigrid solution of high-order discontinuous Galerkin discretizations of the compressible Navier-Stokes equations. Journal of Computational Physics, 27(): 92-3, 25 [7] H. Luo, J.D. Baum and R. Löhner. A p-multigrid discontinuous Galerkin method for the Euler equations on unstructured grids. Journal of Computational Physics, 2(2): 767-783, 26 [8] H. Luo, J. D. Baum, and R. Löhner. Fast p-multigrid Discontinuous Galerkin Method for Compressible Flows at All Speeds, AIAA Journal, 46(3): 635-652, 28 [9] P. van Slingerland and C. Vuik. Fast linear solver for diffusion problems with applications to pressure computation in layered domains. Computational Geosciences, 8(3-4): 343-356, 24 24

[] B. Helenbrook, D. Mavriplis and H. Atkins. Analysis of p-multigrid for Continuous and Discontinuous Finite Element Discretizations, 6 th AIAA Computational Fluid Dynamics Conference, Fluid Dynamics and Co-located Conferences [] R. Tielen, M. Möller and C. Vuik. Efficient multigrid based solvers for Isogeometric Analysis.. Proceedings of the 6 th European Conference on Computational Mechanics and the 7 th European Conference on Computational Fluid Dynamics, Glasgow, UK. [2] Y. Saad, ILUT: A dual threshold incomplete LU factorization, Numerical Linear Algebra with Applications, (4): 387-42, 994. [3] G. Sangalli and M. Tani. Isogeometric preconditioners based on fast solvers for the Sylvester equation. SIAM Journal on Scientific Computing, 38(6): 3644-367, 26 [4] L. Beirao da Veiga, D. Cho. L.F. Pavarino and S.Scacchi. Overlapping Schwarz methods for isogeometric analysis. SIAM Journal on Numerical Analysis, 5(3): 394-46, 22 [5] K.P.S. Gahalaut, J.K. Kraus and S.K. Tomar. Multigrid methods for isogeometric discretizations. Computer Methods in Applied Mechanics and Engineering, 253: 43-425, 23 [6] M. Donatelli, C. Garoni, C. Manni, S. Capizzano and H. Speleers. Symbol-based multigrid methods for Galerkin B-spline isogeometric analysis. SIAM Journal on Numerical Analysis, 55(): 3-62, 27 [7] C. Hofreither, S. Takacs and W. Zulehner. A robust multigrid method for Isogeometric Analysis in two dimensions using boundary correction. Computer methods in Applied Mechanics and Engineering, 36: 22-42, 27 [8] A. de la Riva, C. Rodrigo and F. Gaspar. An efficient multigrid solver for isogeometric analysis. arxiv:86.5848v [9] C. De Boor. A practical guide to splines. st edn. Springer-verlag, New York, 978 25

[2] S.C. Brenner and L.R. Scott, The mathematical theory of finite element methods, Texts Applied Mathematics, 5, Springer, New York, 994 [2] W.L. Briggs, V. E. Henson and S.F. McCormick, A Multigrid Tutorial 2nd edition, SIAM, Philadelphia, 2. [22] R.S. Sampath and G. Biros, A parallel geometric multigrid method for f inite elements on octree meshes., SIAM Journal on Scientific Computing, 32(3): 36-392, 2 [23] G. Guennebaud, J. Benoît et al., Eigen v3, http://eigen.tuxfamily.org, 2 [24] Y. Saad. SPARSKIT: a basic tool kit for sparse matrix computations. 994 [25] C. Hofreither and W. Zulehner, Spectral Analysis of Geometric Multigrid Methods for Isogeometric Analysis., Numerical Methods and Applications, 8962: 23-29, 25 [26] U. Trottenberg, C. Oosterlee and A. Schüller. Multigrid, Academic Press, 2 [27] J. Nitsche. Uber ein Variations zur Lösung von Dirichlet-Problemen bei Verwendung von Teilräumen die keinen Randbedingungen unterworfen sind. Abhandlungen aus dem mathematischen Seminar der Universität Hamburg, 36(): 9-5, 97 [28] E. Chow and A. Patel, Fine-Grained Parallel Incomplete LU Factorization. SIAM Journal on Scientific Computing, 37(2): 69-93, 25 26

arxiv: v1 [math.na] 7 Jan 2019