Transactions on Modelling and Simulation vol 19, 1998 WIT Press, www.witpress.com, ISSN 1743-355X


Cost estimation of the panel clustering method applied to 3-D elastostatics

Ken Hayami* & Stefan A. Sauter†

* Department of Mathematical Engineering and Information Physics, Graduate School of Engineering, University of Tokyo, 113-8656 Tokyo, Japan, EMail: hayami@simplex.t.u-tokyo.ac.jp
† Lehrstuhl für Praktische Mathematik, Mathematisches Seminar Bereich II, Christian-Albrechts-Universität zu Kiel, D-24098 Kiel, Germany, EMail: sas@numerik.uni-kiel.de

Abstract

We present an efficient algorithm for the panel clustering method applied to the three-dimensional elastostatic problem, which reduces the memory and computational work of the boundary element method. To keep the polynomial expansions needed for the clustering simple, the fundamental solution for the displacement components is expressed as a linear combination of partial derivatives of the distance r between the observation point and the field point. Comparing the estimates of the computational work shows that this approach is far more efficient than the standard expression.

1 Introduction

Although the Boundary Element Method (BEM) enjoys the advantage of a boundary-only discretization, its dense matrix formulation causes a serious computational difficulty for large-scale three-dimensional problems: the conventional approach requires $O(N^2)$ memory and $O(N^3)$ computational work, where $N$ is the number of unknowns. The situation is even worse for the three-dimensional elastostatic problem, where the number of unknowns is three times that of the potential problem, since the unknown at each node is a vector instead of a scalar.

Boundary Element Research In Europe

Hackbusch and Nowak [1] proposed the panel clustering method to overcome this difficulty. The main idea is to approximate the far field using polynomial expansions around the centre of a cluster of panels (boundary elements), thus reducing the $O(N^2)$ dense matrix-vector multiplication to an $O(N(\log N)^{?})$ sparse matrix-vector multiplication (exponent illegible in the source) in each iteration of the iterative linear solver.

The authors previously proposed a formulation of the panel clustering method for the three-dimensional elastostatic problem [2]. In this paper we present an efficient algorithm for calculating the expansion coefficients for the panel clustering, which is the most time- and memory-consuming part of the method, and we estimate its computational costs.

2 The boundary element formulation of 3-D elastostatics

The boundary integral equation for the three-dimensional (linear, isotropic) elastostatic problem is given by

$$c_{lk}(x)\,u_k(x) + \int_{\Gamma} p^*_{lk}(x,y)\,u_k(y)\,d\Gamma(y) = \int_{\Gamma} u^*_{lk}(x,y)\,p_k(y)\,d\Gamma(y), \qquad (l = 1,2,3), \tag{1}$$

where we have used Einstein's convention for the summation over the repeated index $k = 1,2,3$, $\Gamma$ is the boundary of the domain under consideration, $u_k$, $p_k$ are the displacement and traction components, respectively, $c_{lk}(x) = \tfrac{1}{2}\delta_{lk}$ when $\Gamma$ is smooth at $x$, and the body force term has been neglected. $u^*_{lk}(x,y)$ is the fundamental displacement, which is usually given by

$$u^*_{lk}(x,y) = \frac{1}{16\pi\mu(1-\nu)\,r}\left\{(3-4\nu)\,\delta_{lk} + r_{,l}\,r_{,k}\right\}, \tag{2}$$

where $\mu$ is the shear modulus, $\nu$ is Poisson's ratio, $r := |y - x|$, $r_{,k} := \partial r/\partial y_k$, $x = (x_1,x_2,x_3)^T$ and $y = (y_1,y_2,y_3)^T$. However, it will prove useful [2] to use the alternative expression

$$u^*_{lk}(x,y) = \frac{1}{8\pi\mu}\left\{\delta_{lk}\,r_{,mm} - \frac{1}{2(1-\nu)}\,r_{,lk}\right\}. \tag{3}$$
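The agreement between the standard expression (2) and the alternative expression (3) can be checked numerically. The following is a minimal sketch, assuming the usual closed forms $r_{,k} = (y_k - x_k)/r$ and $r_{,lk} = (\delta_{lk} - r_{,l} r_{,k})/r$ (so that $r_{,mm} = 2/r$); the function names are illustrative and not taken from the paper.

```python
import math


def _distance_and_gradient(x, y):
    """Return r = |y - x| and the gradient r_,k = dr/dy_k."""
    d = [y[i] - x[i] for i in range(3)]
    r = math.sqrt(sum(c * c for c in d))
    return r, [c / r for c in d]


def kelvin_standard(x, y, mu, nu):
    """Standard form: [(3 - 4 nu) delta_lk + r_,l r_,k] / (16 pi mu (1 - nu) r)."""
    r, g = _distance_and_gradient(x, y)
    c0 = 1.0 / (16.0 * math.pi * mu * (1.0 - nu) * r)
    return [[c0 * ((3.0 - 4.0 * nu) * float(l == k) + g[l] * g[k])
             for k in range(3)] for l in range(3)]


def kelvin_from_r_derivatives(x, y, mu, nu):
    """Alternative form: [delta_lk r_,mm - r_,lk / (2 (1 - nu))] / (8 pi mu)."""
    r, g = _distance_and_gradient(x, y)
    # Hessian of r with respect to y: r_,lk = (delta_lk - r_,l r_,k) / r
    hess = [[(float(l == k) - g[l] * g[k]) / r for k in range(3)]
            for l in range(3)]
    laplacian = sum(hess[m][m] for m in range(3))  # r_,mm, equal to 2 / r
    c0 = 1.0 / (8.0 * math.pi * mu)
    return [[c0 * (float(l == k) * laplacian - hess[l][k] / (2.0 * (1.0 - nu)))
             for k in range(3)] for l in range(3)]
```

Both functions return a symmetric 3 x 3 nested list, and they should agree to rounding error for any pair of distinct points. The second form only involves second derivatives of $r$, which is the property the expansion algorithm exploits.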

The fundamental traction component is given by

$$p^*_{lk}(x,y) = \sigma^*_{lkj}\,n_j = \lambda\,u^*_{lm,m}\,n_k + \mu\left(u^*_{lk,j} + u^*_{lj,k}\right)n_j, \tag{4}$$

where $n_j$ is the component of the unit outward normal vector at $y \in \Gamma$ and $\lambda$ is the Lamé constant.

Next, the boundary $\Gamma$ is discretized into boundary elements or panels $\pi_\alpha$ $(\alpha = 1,\dots,n)$. Although the following discussion is also valid for higher order elements, assume constant elements, and let $x^\alpha$ be the point representing $\pi_\alpha$, to obtain

$$c_{lk}(x^\alpha)\,u_k^\alpha + \sum_{\beta=1}^{n} \int_{\pi_\beta} p^*_{lk}(x^\alpha,y)\,d\Gamma(y)\;u_k^\beta = \sum_{\beta=1}^{n} \int_{\pi_\beta} u^*_{lk}(x^\alpha,y)\,d\Gamma(y)\;p_k^\beta, \qquad (\alpha = 1,\dots,n;\ l = 1,2,3), \tag{5}$$

where $u_k^\alpha = u_k(x^\alpha)$ etc. Given the boundary data, equation (5) is a system of linear equations for the unknown boundary displacement and traction components $u_k^\alpha$ and $p_k^\alpha$, where the matrix is dense and nonsymmetric. Hence, if it is solved using LU-decomposition, the computational work is $O(N^3)$ and the memory $O(N^2)$, where $N = 3n$ is the number of unknowns.

Alternatively, one could use iterative solvers for nonsymmetric matrices, such as the GMRES method. Since the system of equations arising from boundary integral equations is usually well conditioned, the method should converge within $M < N$ iterations, with the use of a suitable preconditioner when necessary. In this method, the dominant part of the computation is the dense matrix-vector multiplication corresponding to equation (5) for the unknown boundary data or iteration vector, which costs $O(N^2)$ for each iteration. Hence, the amount of computational work is reduced from $O(N^3)$ to $O(MN^2)$, but the memory required is still $O(N^2)$, and it is this memory bottleneck that hinders the solution of large-scale problems using the boundary element method.

3 The panel clustering method

The matrix-vector product for the iterative solution of equation (5) is dense because the observation point $x^\alpha$ on the element $\pi_\alpha$ is related to all the elements $\pi_\beta$ on the boundary through the kernels $p^*_{lk}(x^\alpha,y)$ and $u^*_{lk}(x^\alpha,y)$.
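To make the dense-matrix bottleneck of Section 2 concrete, the sketch below tabulates storage and leading-order operation counts for $N = 3n$ unknowns. It assumes 8-byte real entries, the textbook leading term $2N^3/3$ for LU factorization, and $2N^2$ operations per dense matrix-vector product (orthogonalization work in GMRES is ignored); these constants are illustrative assumptions, not figures from the paper.

```python
def dense_bem_cost(n_elements, gmres_iterations, bytes_per_entry=8):
    """Rough cost model for a dense elastostatic BEM system with N = 3n unknowns."""
    n_unknowns = 3 * n_elements
    return {
        "unknowns": n_unknowns,
        # an N x N dense matrix, stored in full
        "memory_bytes": n_unknowns ** 2 * bytes_per_entry,
        # leading term of LU factorization, O(N^3)
        "lu_flops": 2 * n_unknowns ** 3 // 3,
        # M dense matrix-vector products, O(M N^2)
        "gmres_matvec_flops": gmres_iterations * 2 * n_unknowns ** 2,
    }
```

For example, `dense_bem_cost(10_000, 50)` reports 7.2 GB of matrix storage for 10 000 constant elements, which illustrates why the memory, rather than the operation count, limits the problem size once an iterative solver is used.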

The panel clustering method [1] was proposed to reduce the memory and computational work required for computing such matrix-vector products. The method uses polynomial expansions to approximate the integral kernels for clusters of elements that are sufficiently far from the observation point.

Let

$$\int_{\Gamma} k(x,y)\,u(y)\,d\Gamma(y) \tag{6}$$

represent the integrals in equation (1). In the far field (i.e. when $|x - y|$ is sufficiently large), $k(x,y)$ is approximated by a piecewise polynomial $k_m(x,y)$ with respect to $y$, which may be based on the Taylor expansion around a centre $y_\tau \in \mathbb{R}^3$, expressed as global polynomials of $y$:

$$k_m(x,y) = \sum_{\iota \in I_m} \kappa_\iota(x; y_\tau)\,\Phi_\iota(y), \tag{7}$$

where $I_m$ is an index set of size $\# I_m \le C_I m^3$. The integer $m$ is the order of the expansion. The functions $\Phi_\iota(y)$ must be independent of $x$ and the centre $y_\tau$. Furthermore, integrals of $\Phi_\iota(y)$ over single boundary elements $\pi_\beta$ must be easily computable. Usually, $\Phi_\iota(y)$ is a (piecewise) polynomial over each element $\pi_\beta$.

The error of the expansion (7) depends on the order $m$ and on the distance $|y - y_\tau|$ from the centre, i.e.

$$|k(x,y) - k_m(x,y)| \le C_1 (C_2\,\eta)^m\,|k(x,y)| \quad \text{for all } |y - y_\tau| \le \eta\,|x - y_\tau|,$$

where $0 < \eta < 1$.

Next, "clusters" $\tau$, which are unions of several boundary elements $\pi_\beta$, are introduced. For each observation point $x$, the boundary $\Gamma$ can be represented by a union of a certain number of elements and clusters:

$$\Gamma = \pi_1 \cup \pi_2 \cup \dots \cup \pi_p \cup \tau_1 \cup \tau_2 \cup \dots \cup \tau_c \qquad (\pi_i: \text{elements},\ \tau_j: \text{clusters}). \tag{8}$$

Since the clusters $\tau_j$ are unions of many elements, the sum $p + c$ can be considerably smaller ($O(\log n)$) than the number of elements $n$. The integral of equation (6) can now be expressed as

$$\int_{\Gamma} k(x,y)\,u(y)\,d\Gamma(y) = \sum_{i=1}^{p} \int_{\pi_i} k(x,y)\,u(y)\,d\Gamma(y) + \sum_{j=1}^{c} \int_{\tau_j} k(x,y)\,u(y)\,d\Gamma(y). \tag{9}$$
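The geometric decay of the expansion error in $\eta$ can be seen in a one-dimensional model. The sketch below is an illustration only: it replaces the elastostatic kernels by the hypothetical model kernel $k(x,y) = 1/(x-y)$ on a line, whose Taylor expansion in $y$ about a centre $c$ is $\sum_{n<m} (y-c)^n/(x-c)^{n+1}$. For this kernel the relative truncation error equals $\eta^m$ with $\eta = |y-c|/|x-c|$, so the bound holds with $C_1 = C_2 = 1$.

```python
import math


def kernel(x, y):
    """One-dimensional model kernel (an assumption for illustration)."""
    return 1.0 / (x - y)


def taylor_kernel(x, y, centre, m):
    """Taylor expansion of kernel(x, .) about `centre`, truncated to orders n < m."""
    z = (y - centre) / (x - centre)
    return sum(z ** n for n in range(m)) / (x - centre)


def relative_error(x, y, centre, m):
    """Relative truncation error |k - k_m| / |k|."""
    exact = kernel(x, y)
    return abs(exact - taylor_kernel(x, y, centre, m)) / abs(exact)
```

With `x = 4.0`, `y = 1.0` and `centre = 0.0` the ratio is $\eta = 0.25$, and each additional order reduces the relative error by that factor. This is the reason an order $m = O(\log n)$ suffices to match a discretization error that decays algebraically in $n$.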

The first term, corresponding to the "near field" $\pi_1 \cup \pi_2 \cup \dots \cup \pi_p$, will be evaluated directly. The clusters $\tau_j$ correspond to the "far field", where the expansion (7) around a centre $y = y_{\tau_j}$ of $\tau_j$ can be exploited, i.e. we can approximate the integral over $\tau_j$ by replacing $k$ by $k_m$ to obtain

$$\int_{\tau_j} k(x,y)\,u(y)\,d\Gamma(y) \approx \int_{\tau_j} k_m(x,y)\,u(y)\,d\Gamma(y) = \sum_{\iota \in I_m} \kappa_\iota(x; y_{\tau_j})\,J^{\iota}_{\tau_j}(u).$$

Since the quantities

$$J^{\iota}_{\tau_j}(u) = \int_{\tau_j} \Phi_\iota(y)\,u(y)\,d\Gamma(y), \qquad (j = 1,\dots,c)$$

are independent of $x$, they will be computed in the first phase for all indices $\iota$ and clusters $\tau$. Then the evaluation of equation (9) can be performed for all element nodes $x^\alpha$, $1 \le \alpha \le n$, by

$$\int_{\tau_j} k_m(x^\alpha,y)\,u(y)\,d\Gamma(y) = \sum_{\iota \in I_m} \kappa_\iota(x^\alpha; y_{\tau_j})\,J^{\iota}_{\tau_j}(u),$$

which has $\# I_m = O(m^3)$ terms independent of the size of the cluster $\tau_j$, and $J^{\iota}_{\tau_j}(u)$ can be shared among different $x^\alpha$. By taking a hierarchy of clusters, the number of all possible clusters (consisting of more than one element) for all $x^\alpha$, $1 \le \alpha \le n$, can be kept under the number of elements $n$ [1].

The error of approximating the integral $\int_{\tau_j} k(x,y)\,u(y)\,d\Gamma(y)$ can be controlled by keeping the size of the clusters sufficiently small compared to the distance from $x$. A partition of $\Gamma$ by equation (8) is called "admissible" with respect to $x$ when the clusters are sufficiently small in this sense. The admissible covering with the smallest number of members is used.

4 Computation of the expansion coefficients

The computation of the expansion coefficients $\kappa_\iota(x; y_\tau)$ in (7) is the most time-consuming part of the panel clustering method. Let $k_m(x,y)$ be the Taylor expansion of $k(x,y)$ around $y_\tau$ to the $m$-th order, with coefficients $\kappa_\nu(x, y_\tau)$ indexed by multi-indices $\nu \in \mathbb{N}_0^3$, where $\mathbb{N}_0 := \{0\} \cup \mathbb{N}$. [The displayed expansion and the remaining notation are illegible in the source.]
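The two-phase evaluation of Section 3 (cluster moments first, then one short sum per observation point) can be sketched in the same one-dimensional model. Everything here is an illustrative assumption: the kernel is $1/(x-y)$, the integrals over panels are replaced by weighted point sources, and the moments are taken about the cluster centre for brevity, whereas the paper requires basis functions $\Phi_\iota$ that do not depend on the centre.

```python
def cluster_moments(ys, us, centre, m):
    """Phase 1: moments J_n = sum_j (y_j - centre)^n u_j, independent of x."""
    return [sum((y - centre) ** n * u for y, u in zip(ys, us)) for n in range(m)]


def far_field(x, centre, moments):
    """Phase 2: sum_n kappa_n(x; centre) J_n with kappa_n = (x - centre)^-(n + 1)."""
    return sum(j_n / (x - centre) ** (n + 1) for n, j_n in enumerate(moments))


def direct(x, ys, us):
    """Reference value: direct summation over every source in the cluster."""
    return sum(u / (x - y) for y, u in zip(ys, us))
```

The moments are computed once per cluster and reused for every observation point, so each far-field evaluation costs a number of operations proportional to the number of expansion terms rather than to the number of sources in the cluster.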

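Sauter [3] proposed an efficient algorithm for calculating $\kappa_\nu(x, y_\tau)$ when $k(x,y) = \|x - y\|^{-\gamma}$, which is given in the following. Let $[x]$ denote the integer part of $x \in \mathbb{R}$. [The remaining notation and the formulas of the individual steps are illegible in the source; only the step structure and index ranges survive.]

Algorithm 1
a) Compute $s_l(y_\tau)$ for $0 \le l \le m$ and the quantities $u$ (indices garbled in the source) for $0 \le \nu_i \le m$, $1 \le i \le 3$.
b) Compute, for all $\nu_3, l \in \mathbb{N}_0$ with $\nu_3 + l < m$: [formula illegible in source]
c) Compute, for $\nu_2, \nu_3, l \in \mathbb{N}_0$ with $\nu_2 + \nu_3 + l < m$: [formula illegible in source]
d) Compute $D^{\nu}$ for all $|\nu| < m$: [formula illegible in source]
e) Compute the coefficients $\kappa_\nu(x, y_\tau)$ using the following recursion:
   1. For $\nu_1 = m-1, \dots, 0$, compute [formula illegible in source], where $e_i$ is the unit vector such that $(e_i)_j = \delta_{ij}$.
   2. Compute, for all $\nu \in \mathbb{N}_0^3$, $|\nu| < m$: [formula illegible in source]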

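The total computational work for computing $\kappa_\nu(x, y_\tau)$, $|\nu| < m$, for fixed $x, y_\tau$ is

$$W_m = \tfrac{1}{12}\left(5m^4 + 25m^3 + 43m^2 + 71m\right)\mathrm{eo} + 1\,\mathrm{neo}, \tag{10}$$

where eo stands for an elementary operation such as $\pm, \times, \div$, and neo stands for a non-elementary operation (namely the square root). The memory required for $\kappa_\nu(x, y_\tau)$, $|\nu| < m$, is

$$N_m = \#\{\nu : |\nu| < m\} = \tfrac{1}{6}\left(m^3 + 3m^2 + 2m\right).$$

Next, we will make use of the expression (3) and the preceding algorithm to obtain polynomial expansions for the three-dimensional elastostatic kernels. First, consider approximating the fundamental displacement $u^*_{lk}(x,y)$ for a cluster $\tau$ of elements by the polynomial expansion obtained from the $m$-th order Taylor expansion around $y = y_\tau$, with coefficients $\kappa^{lk}_\nu(x, y_\tau)$, $|\nu| < m$. Then, the integral over the cluster $\tau$ is approximated by

$$\int_{\tau} u^*_{lk}(x,y)\,p_k(y)\,d\Gamma(y) \approx \sum_{|\nu| < m} \kappa^{lk}_\nu(x, y_\tau) \int_{\tau} y^{\nu}\,p_k(y)\,d\Gamma(y) \approx \sum_{|\nu| < m} \kappa^{lk}_\nu(x, y_\tau) \sum_{\beta=1}^{p} J^{\nu}_{\beta}\,p_k^{\beta}, \tag{11}$$

where the last approximation is valid for constant elements, with $p_k(y) \approx p_k^\beta$, $1 \le k \le 3$, for $y \in \pi_\beta$, and the far-field coefficients are given by

$$J^{\nu}_{\beta} := \int_{\pi_\beta} y^{\nu}\,d\Gamma(y). \tag{12}$$

Let $f_p$ denote the $p$-th order (Taylor) expansion of $f$. Then, we have

Lemma 1 [Statement illegible in the source.] Here the underline denotes that Einstein's convention for repeated indices does not apply. The coefficients $\kappa_\nu$ can be obtained by Algorithm 1 with $\gamma = -1$.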
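The storage count $N_m$ is the number of multi-indices $\nu \in \mathbb{N}_0^3$ with $|\nu| < m$. A brute-force enumeration confirms the closed form $(m^3 + 3m^2 + 2m)/6$; the function names below are illustrative.

```python
from itertools import product


def count_multi_indices(m):
    """Count nu = (nu_1, nu_2, nu_3) in N_0^3 with nu_1 + nu_2 + nu_3 < m."""
    return sum(1 for nu in product(range(m), repeat=3) if sum(nu) < m)


def n_m(m):
    """Closed form (m^3 + 3 m^2 + 2 m) / 6 = m (m + 1) (m + 2) / 6."""
    return (m ** 3 + 3 * m ** 2 + 2 * m) // 6
```

The closed form is the binomial coefficient $\binom{m+2}{3}$, so the integer division is exact. The cubic growth in $m$ is what turns $m = O(\log n)$ into the $\log^3 n$ factor per cluster in the memory estimates.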

Thus, (3) gives for $l \ne k$,

$$\kappa^{lk}_{\nu}(x, y_\tau) = -\frac{1}{16\pi\mu(1-\nu)}\,(\nu_l + 1)(\nu_k + 1)\,\kappa_{\nu + e_l + e_k}(x, y_\tau), \tag{13}$$

where $\nu$ in the prefactor $1 - \nu$ denotes Poisson's ratio and elsewhere the multi-index, and for $l = k$,

[Equation (14) and the definition of its coefficients are illegible in the source.]

Note that it is only necessary to compute and store $\kappa^{lk}_\nu$ for $1 \le l \le k \le 3$, due to symmetry, so that the memory required for $|\nu| < m$ is $6N_m = m^3 + 3m^2 + 2m$.

As for the fundamental traction $p^*_{lk}$, it is a function not only of spatial derivatives of $r$ but also of the unit outward normal $n_j(y)$ to the boundary $\Gamma$, which is generally only piecewise polynomial (cf. (4)), so that it is not possible to obtain a global expansion for $p^*_{lk}(x,y)$. Instead, we will expand parts of $p^*_{lk}$ by global polynomials, as follows. From (4), we have

[Equation (15) is illegible in the source.]

where the far-field coefficients are given by the products $J^{\nu}_{\beta}\,n_j^{\beta}$, and constant elements, $n_j(y) \approx n_j^\beta$, $u_k(y) \approx u_k^\beta$ for $y \in \pi_\beta$, are used. (Higher order elements can be treated similarly.) The expansion coefficients for the derivatives of $u^*_{lk}$ can be computed using

Lemma 2 [Statement illegible in the source.]
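The index shift in (13) is the usual rule for differentiating a power series term by term: if $f(z) = \sum_\nu \kappa_\nu z^\nu$, then for $l \ne k$ the coefficient of $z^\nu$ in $\partial_l \partial_k f$ is $(\nu_l+1)(\nu_k+1)\,\kappa_{\nu+e_l+e_k}$. The sketch below implements that rule for coefficients stored in a dictionary. The $l = k$ branch, with factor $(\nu_k+1)(\nu_k+2)$ and shift $2e_k$, follows from the same calculus and is an assumption about the form of (14), which cannot be read in the source. Axes are 0-based in the code.

```python
from itertools import product


def shift(nu, axis, amount):
    """Return the multi-index nu + amount * e_axis as a tuple."""
    out = list(nu)
    out[axis] += amount
    return tuple(out)


def second_derivative_coefficients(kappa, l, k, m):
    """Coefficients, for |nu| < m, of d^2 f / dz_l dz_k given those of f.

    `kappa` maps multi-index tuples to coefficients; missing entries count as
    zero, so the input must contain all orders |nu| < m + 2 to be exact.
    """
    result = {}
    for nu in product(range(m), repeat=3):
        if sum(nu) >= m:
            continue
        if l != k:
            factor = (nu[l] + 1) * (nu[k] + 1)
            source = shift(shift(nu, l, 1), k, 1)
        else:
            factor = (nu[k] + 1) * (nu[k] + 2)
            source = shift(nu, k, 2)
        result[nu] = factor * kappa.get(source, 0.0)
    return result
```

For the single monomial $5 z_1^2 z_2 z_3^3$, stored as `{(2, 1, 3): 5.0}`, the mixed derivative in the first two variables yields $10\,z_1 z_3^3$ and the second derivative in the third variable yields $30\,z_1^2 z_2 z_3$. Each differentiation consumes one order of the input expansion, which is why coefficients of $r$ are needed to a higher order than the order $m$ of the kernels built from them.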

5 Estimates of computational costs

First, the computational work to obtain the expansion coefficients will be estimated. According to (10), the work to obtain $\kappa_\nu$ for $|\nu| < m + 3$ is

$$W_{m+3} = \left(\tfrac{5}{12}m^4 + \tfrac{85}{12}m^3 + \tfrac{269}{6}m^2 + \tfrac{386}{3}m + 140\right)\mathrm{eo} + 1\,\mathrm{neo}.$$

Next, $\kappa^{lk}_\nu$ for $1 \le l \le k \le 3$ and $|\nu| < m + 1$ is obtained using (13) and (14) with work

$$14\left(m^3 + 6m^2 + 11m + 6\right)\mathrm{eo}.$$

Finally, the expansion coefficients of $\lambda u^*_{lm,m}$, $1 \le l \le 3$, and of $\mu(u^*_{lk,j} + u^*_{lj,k})$, $1 \le l \le 3$, $1 \le k \le j \le 3$, for $|\nu| < m$ are computed using Lemma 2 with work

$$\left(\tfrac{27}{2}m^3 + \tfrac{81}{2}m^2 + 27m\right)\mathrm{eo}.$$

Hence, the total work to compute the necessary expansion coefficients is

$$W = \left(\tfrac{5}{12}m^4 + \tfrac{415}{12}m^3 + \tfrac{508}{3}m^2 + \tfrac{929}{3}m + 224\right)\mathrm{eo} + 1\,\mathrm{neo}.$$

The memory required to store the expansion coefficients for $|\nu| < m$ is $6N_m$ for $u^*_{lk}$, $1 \le l \le k \le 3$; $3N_m$ for $\lambda u^*_{lm,m}$, $1 \le l \le 3$; and $18N_m$ for $\mu(u^*_{lk,j} + u^*_{lj,k})$, $1 \le l \le 3$, $1 \le j \le k \le 3$. This gives a total of $27N_m$, where $N_m = \#\{\nu : |\nu| < m\} = \tfrac{1}{6}(m^3 + 3m^2 + 2m)$.

On the other hand, if we had used the standard expression (2) for $u^*_{lk}$, the work estimate would have been a polynomial of degree six in $m$ [coefficients illegible in the source] plus 1 neo, which is greater than the $W$ above for $m > 2$. The standard expression requires $O(m^6)$ work because it involves multiplication of polynomials with $O(m^3)$ terms each, such as $(r_{,l})_m$ and $(r_{,k})_m$, which requires $O(m^6)$ work if done in the straightforward way. The required memory using the standard expression is the same as above.

Since the number of admissible clusters per observation point is $O(\log n)$ and the order of expansion $m$ necessary for an approximation consistent with the discretization error is $O(\log n)$, where $n$ is the number of boundary elements (observation points) [1], the total work necessary for the computation of the expansion coefficients is $O(n \log n \cdot m^4) = O(n \log^5 n)$. The memory required is $O(n \log n \cdot m^3) = O(n \log^4 n)$.
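The total work is the sum of three contributions, and the arithmetic can be checked exactly with rational numbers. The sketch below assumes the elementary-operation counts read as $W_m = (5m^4 + 25m^3 + 43m^2 + 71m)/12$ for the coefficients of $r$, $14(m^3 + 6m^2 + 11m + 6)$ for the displacement coefficients, and $\tfrac{27}{2}m^3 + \tfrac{81}{2}m^2 + 27m$ for the traction-related coefficients; the single non-elementary operation is left out, and the function names are illustrative.

```python
from fractions import Fraction


def w_m(m):
    """Elementary operations for the coefficients of r with |nu| < m."""
    return Fraction(5 * m ** 4 + 25 * m ** 3 + 43 * m ** 2 + 71 * m, 12)


def w_displacement(m):
    """Elementary operations for the displacement coefficients, |nu| < m + 1."""
    return 14 * (m ** 3 + 6 * m ** 2 + 11 * m + 6)


def w_traction(m):
    """Elementary operations for the traction-related coefficients, |nu| < m."""
    return Fraction(27 * m ** 3 + 81 * m ** 2 + 54 * m, 2)


def w_total(m):
    """Stated total: 5/12 m^4 + 415/12 m^3 + 508/3 m^2 + 929/3 m + 224."""
    return (Fraction(5, 12) * m ** 4 + Fraction(415, 12) * m ** 3
            + Fraction(508, 3) * m ** 2 + Fraction(929, 3) * m + 224)
```

Evaluating `w_m(m + 3) + w_displacement(m) + w_traction(m)` reproduces `w_total(m)` for every order, for instance 738 elementary operations at $m = 1$. The leading term $\tfrac{5}{12}m^4$ comes entirely from the coefficients of $r$, which is the source of the $O(m^4)$ work per cluster.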

Next, the work for the computation of the far-field coefficients of (12) etc. is $O(n\,m^{?}) = O(n \log^{?} n)$ [1] and the memory is $O(n\,m^{?}) = O(n \log^{?} n)$ (exponents illegible in the source).

As for the work for each iteration, assuming that the number of elements in a cluster is $p$, the work per iteration for computing $\int_{\tau} u^*_{lk}(x,y)\,p_k(y)\,d\Gamma(y)$ using (11) is $3\{(6p + 3)N_m - 1\}$ eo and that for computing $\int_{\tau} p^*_{lk}(x,y)\,u_k(y)\,d\Gamma(y)$ using (15) is $3\{(24p + 10)N_m - 1\}$ eo, so that the work for a matrix-vector multiplication is $O(n\,m^{?}) = O(n \log^{?} n)$ (exponents illegible in the source), and the work and memory for other parts of the iteration, such as in GMRES, is $O(n)$.

Hence, the dominant part of the whole algorithm is the computation of the expansion coefficients, which consumes $O(n \log^5 n)$ work and $O(n \log^4 n)$ memory.

6 Conclusions

We presented an algorithm for applying the panel clustering method to the three-dimensional elastostatic problem and estimated its computational complexity. The computation of the expansion coefficients was shown to be the dominant part of the algorithm. Expressing the fundamental displacement as a linear combination (3) of the second order spatial derivatives of the distance between the observation point and the field point substantially reduces the computational costs.

References

[1] Hackbusch, W. and Nowak, Z.P., On the fast matrix multiplication in the boundary element method by panel clustering, Numerische Mathematik, Vol. 54, pp. 463-491, 1989.
[2] Hayami, K. and Sauter, S., Application of the panel clustering method to the three-dimensional elastostatic problem, Boundary Elements XIX, Computational Mechanics Publications, pp. 625-634, 1997.
[3] Sauter, S.A., Der Aufwand der Panel-Clustering-Methode für Integralgleichungen, Bericht Nr. 9115, Institut für Informatik und Praktische Mathematik, Christian-Albrechts-Universität Kiel, 1991.