Max-Planck-Institut für Mathematik in den Naturwissenschaften Leipzig

H²-matrix approximation of integral operators by interpolation

by Wolfgang Hackbusch and Steffen Börm

Preprint no.: 04 200

H²-Matrix Approximation of Integral Operators by Interpolation

Wolfgang Hackbusch and Steffen Börm
Max-Planck-Institut für Mathematik in den Naturwissenschaften, Inselstr. 22-26, D-04103 Leipzig, Germany

Abstract

Typical panel clustering methods for the fast evaluation of integral operators are based on the Taylor expansion of the kernel function and therefore usually require the user to implement the evaluation of the derivatives of this function up to an arbitrary degree. We propose an alternative approach that replaces the Taylor expansion by simple polynomial interpolation. By applying the interpolation idea to the approximating polynomials on different levels of the cluster tree, the matrix-vector multiplication can be performed in only $O(np^d)$ operations for a polynomial order $p$ and an $n$-dimensional trial space. The main advantage of our method, compared to other methods, is its simplicity: only pointwise evaluations of the kernel and of simple polynomials have to be implemented.

1 Introduction

We consider an integral equation on a submanifold $\Gamma$ of $\mathbb{R}^d$ for $d \in \mathbb{N}$, namely:

Find $u$ (in a suitable space) satisfying
$$\lambda u(x) + \int_\Gamma k(x,y)\, u(y)\, dy = f(x) \quad \text{for all } x \in \Gamma \tag{1.1}$$
for a right-hand side $f$, a parameter $\lambda \in \mathbb{R}$ and a kernel function
$$k : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$$
that has the asymptotic smoothness (cf. (5.1)) typical for BEM kernels.

We use a Galerkin approach for the discretisation of equation (1.1), i.e., we choose a family $(\varphi_i)_{i \in I}$ of suitable basis functions, set $V_h := \operatorname{span}\{\varphi_i : i \in I\}$ and get the discrete problem:

Find $u_h \in V_h$ satisfying
$$\lambda \int_\Gamma u_h(x)\, v_h(x)\, dx + \int_\Gamma \int_\Gamma k(x,y)\, u_h(y)\, v_h(x)\, dy\, dx = \int_\Gamma f(x)\, v_h(x)\, dx \tag{1.2}$$
for all $v_h \in V_h$.

We introduce matrices $M, K \in \mathbb{R}^{I \times I}$ and vectors $\hat u, \hat f \in \mathbb{R}^I$ by setting
$$M_{ij} = \int_\Gamma \varphi_j(x)\, \varphi_i(x)\, dx, \qquad K_{ij} = \int_\Gamma \int_\Gamma k(x,y)\, \varphi_j(y)\, \varphi_i(x)\, dy\, dx, \qquad \hat f_j = \int_\Gamma f(x)\, \varphi_j(x)\, dx, \qquad u_h = \sum_{i \in I} \hat u_i\, \varphi_i$$
for all $i, j \in I$, which allows us to rewrite (1.2) in the form
$$(\lambda M + K)\, \hat u = \hat f. \tag{1.3}$$
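To make the baseline cost concrete, the following Python sketch (our illustration; the circle geometry, one-point quadrature and the crude diagonal treatment are assumptions, not taken from the paper) assembles the dense Galerkin matrix $K$ for piecewise constant basis functions on the unit circle with the two-dimensional single layer kernel $k(x,y) = -\log\|x-y\|/(2\pi)$. Both the assembly and every matrix-vector product cost $O(n^2)$, which is the work the method developed below reduces.

```python
import numpy as np

def dense_galerkin_matrix(n):
    """Dense Galerkin matrix K_ij ~ w^2 k(m_i, m_j) for piecewise constant
    basis functions on a uniform partition of the unit circle, using one
    quadrature point per element; the singular diagonal is only modelled
    crudely, since this sketch is about cost, not accuracy."""
    t = 2.0 * np.pi * (np.arange(n) + 0.5) / n       # element midpoints (angles)
    m = np.stack([np.cos(t), np.sin(t)], axis=1)     # midpoints on the circle
    w = 2.0 * np.pi / n                              # element length = weight
    d = np.linalg.norm(m[:, None, :] - m[None, :, :], axis=2)
    np.fill_diagonal(d, w / 4.0)                     # avoid log(0) on the diagonal
    return -w * w * np.log(d) / (2.0 * np.pi)        # single layer kernel

K = dense_galerkin_matrix(512)    # O(n^2) storage and assembly time
y = K @ np.ones(512)              # O(n^2) work per matrix-vector product
```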

In typical applications, the basis functions $(\varphi_i)_{i \in I}$ have local support, so the matrix $M$ is sparse, i.e., for $n := |I|$, the construction of the matrix and its multiplication by a given vector can be accomplished in $O(n)$ operations. Since the support of the integral kernel $k(\cdot,\cdot)$ is not local, the matrix $K$ is not sparse, so a straightforward computation of its entries requires at least $O(n^2)$ operations, and multiplying a vector by this matrix requires the same amount of work.

There are different approaches for reducing this complexity: If the domain is very regular, the resulting matrix has a Toeplitz-type structure and can therefore be evaluated efficiently by fast Fourier transformations. If the domain is smooth, it is possible to use wavelet compression, and this approach can be extended to piecewise smooth domains [1].

In this article, we consider a variant of the panel clustering method [8] and of the mosaic-skeleton matrix approach [10], using the framework of H²-matrices introduced in [7]. The idea of our variant is to replace $K$ by a suitable approximation and thereby to reduce the complexity to $O(np^d)$, where $p$ is a parameter corresponding to the "quality" of the approximation and $d$, as above, is the dimension of the space in which the manifold is embedded.

We will approximate the kernel function $k(\cdot,\cdot)$ by an interpolant. Since $k(\cdot,\cdot)$ will usually not be smooth on the entire domain, we do not use a global interpolant, but choose subsets where $k(\cdot,\cdot)$ is smooth and interpolate on each of these subsets. If we organise the subsets in a hierarchical way, we end up with a multilevel block matrix where each block has rank at most $p^d$. The structure is that of an H-matrix as described in [4, 5]. In this form, the approximating matrix requires only $O(np^d \log n)$ units of storage; constructing it and multiplying it by vectors requires $O(np^d \log n)$ operations.

If we exploit the fact that the kernel is locally approximated by polynomials of fixed degree that depend only on the clusters involved, we find that the structure resembles that of the H²-matrices described in [7]. Using an algorithm similar to the fast Fourier transformation, but not as limited in scope, we can reduce the storage requirements and the number of operations to $O(np^d)$.

Since only quadrature and polynomial interpolation are needed in our algorithm, its implementation is very simple compared to wavelet compression techniques or panel clustering approaches based on Taylor expansion.

In typical applications, the order $p$ will be chosen proportional to $\log n$, so the complexity will be $O(n \log^d n)$, i.e., not optimal. In [2], the use of anisotropic interpolation on domains adapted to the boundary is suggested in order to reduce this complexity. We propose a different approach, namely a variable-order expansion along the lines of [9]. This is only a very minor modification of the original constant-order algorithm and allows us to attain optimal complexity.

2 Cluster Tree, Block Partitioning and Types of H-Matrices

In order to keep the selection of the subsets mentioned above algorithmically manageable, we base them on a hierarchical splitting of the index set $I$ corresponding to the basis functions: Let $T(I)$ be a binary tree; we identify $T(I)$ with the set of its nodes. $T(I)$ is called a binary cluster tree if it satisfies the following conditions:

1. $T(I) \subseteq \mathcal{P}(I)$, i.e., each node of $T(I)$ is a subset of the index set $I$.

2. $I$ is the root of $T(I)$.

3. If $\tau \in T(I)$ is a leaf, then $|\tau| \le C p^d$, i.e., the leaves consist of a relatively small number of indices.

4. If $\tau \in T(I)$ is not a leaf, then it has exactly two sons and is their disjoint union.

For each $\tau \in T(I)$, we denote the set of its sons by $S(\tau) \subseteq T(I)$. The support of a cluster $\tau \in T(I)$ is given by the union of the supports of the basis functions corresponding to its elements, i.e.,
$$\Omega_\tau := \bigcup_{i \in \tau} \Omega_i, \qquad \text{where } \Omega_i := \operatorname{supp} \varphi_i \text{ for all } i \in I.$$
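As an illustration of conditions 1-4, here is a minimal Python sketch (our own, with hypothetical names `Cluster` and `build_cluster_tree`) that builds a binary cluster tree by recursive geometric bisection of bounding boxes; the parameter `leaf_size` plays the role of the bound $C p^d$ in condition 3.

```python
import numpy as np

class Cluster:
    """A node of the cluster tree: an index set, its bounding box, its sons."""
    def __init__(self, indices, points):
        self.indices = indices                     # subset of the index set I
        self.box_lo = points[indices].min(axis=0)  # axis-parallel box B_tau
        self.box_hi = points[indices].max(axis=0)
        self.sons = []                             # S(tau); empty for leaves

def build_cluster_tree(indices, points, leaf_size):
    tau = Cluster(indices, points)
    if len(indices) > leaf_size:                   # condition 3: small leaves
        widths = tau.box_hi - tau.box_lo
        axis = int(np.argmax(widths))              # split along the longest edge
        mid = 0.5 * (tau.box_lo[axis] + tau.box_hi[axis])
        left = indices[points[indices, axis] <= mid]
        right = indices[points[indices, axis] > mid]
        if len(left) > 0 and len(right) > 0:       # condition 4: two disjoint sons
            tau.sons = [build_cluster_tree(left, points, leaf_size),
                        build_cluster_tree(right, points, leaf_size)]
    return tau

# Example: cluster the midpoints of a panel discretisation of the unit circle.
t = 2 * np.pi * (np.arange(1024) + 0.5) / 1024
pts = np.stack([np.cos(t), np.sin(t)], axis=1)
root = build_cluster_tree(np.arange(1024), pts, leaf_size=16)
```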

Next, we need an admissibility condition that allows us to select pairs $(\tau, \sigma) \in T(I) \times T(I)$ such that the kernel $k(\cdot,\cdot)$ is smooth enough on the domain associated with $\tau \times \sigma$. We are going to define our interpolants on axis-parallel boxes $B_\tau$ containing the set $\Omega_\tau$, so we need the kernel to be smooth on $B_\tau \times B_\sigma$ if we want a good approximation. This can be expressed in quantitative terms by the inequality
$$\max\{\operatorname{diam}(B_\tau), \operatorname{diam}(B_\sigma)\} \le 2\eta \operatorname{dist}(B_\tau, B_\sigma), \tag{2.1}$$
where $\eta \in\, ]0,1[$ is some parameter controlling the trade-off between the number of admissible blocks, i.e., the algorithmic complexity, and the speed of convergence, i.e., the quality of the approximation. The condition (2.1) is especially suited for kernel functions with the property (5.1).

Figure 1 contains a cluster splitting of the unit circle. For some clusters, the corresponding boxes $B_\tau$ are highlighted in grey.

Figure 1: Dyadic clustering of the unit circle.

The index set $I \times I$ corresponding to the matrix $K \in \mathbb{R}^{I \times I}$ is partitioned into blocks by the algorithm shown in Table 1:

procedure BuildPartitioning(τ, σ, var P);
begin
  if (τ, σ) meets condition (2.1)
  then P := P ∪ {τ × σ}
  else if S(τ) ≠ ∅ and S(σ) ≠ ∅
  then for τ' ∈ S(τ), σ' ∈ S(σ) do BuildPartitioning(τ', σ', P)
  else P := P ∪ {τ × σ}
end

Table 1: BuildPartitioning algorithm

Calling this procedure with τ = σ = I and P = ∅ creates a block partitioning of $I \times I$ consisting of admissible blocks and non-admissible blocks corresponding to leaf clusters of $T(I)$. An example of an admissible partitioning corresponding to the splitting outlined in Figure 1 can be found in Figure 2.

The complexity of algorithms for the creation of suitable cluster trees and block partitionings has been analysed in detail in [3]: For typical quasi-uniform grids, a "good" cluster tree can be created in $O(n \log n)$ operations, and the computation of the block partitioning can be accomplished in $O(n)$ operations.
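A Python counterpart of Table 1 might look as follows; this is a sketch building on the `Cluster` class of the previous example, and the distance computation for axis-parallel boxes is our addition.

```python
import numpy as np

def admissible(tau, sigma, eta):
    """Condition (2.1): max{diam B_tau, diam B_sigma} <= 2 eta dist(B_tau, B_sigma)."""
    diam_t = np.linalg.norm(tau.box_hi - tau.box_lo)
    diam_s = np.linalg.norm(sigma.box_hi - sigma.box_lo)
    # distance of axis-parallel boxes: componentwise gap, clipped at zero
    gap = np.maximum(0.0, np.maximum(tau.box_lo - sigma.box_hi,
                                     sigma.box_lo - tau.box_hi))
    return max(diam_t, diam_s) <= 2.0 * eta * np.linalg.norm(gap)

def build_partitioning(tau, sigma, eta, P):
    """Python version of Table 1: split I x I into admissible blocks and
    small non-admissible blocks of leaf clusters."""
    if admissible(tau, sigma, eta):
        P.append((tau, sigma, True))        # admissible (far-field) block
    elif tau.sons and sigma.sons:
        for t in tau.sons:
            for s in sigma.sons:
                build_partitioning(t, s, eta, P)
    else:
        P.append((tau, sigma, False))       # non-admissible leaf block

P = []
build_partitioning(root, root, eta=0.8, P=P)   # root from the previous sketch
print(len(P), "blocks")
```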

Figure 2: Block partitioning for the unit circle with dyadic clustering.

Based on the cluster tree and the block partitioning, we define the following three types of data-sparse matrices:

Definition 2.1 (H-matrices) Let $A \in \mathbb{R}^{I \times I}$ be a matrix and let $P$ be a block partitioning of $I \times I$ consisting of admissible blocks and leaf blocks.

1. Let $k \in \mathbb{N}$. $A$ is called an H-matrix of rank $k$ if
$$\operatorname{rank}(A|_b) \le k \tag{2.2}$$
holds for each $b \in P$.

2. A family $V = (V^\tau)_{\tau \in T(I)}$ is called a cluster basis if there is a $k_\tau \in \mathbb{N}$ for each $\tau \in T(I)$ such that $V^\tau \in \mathbb{R}^{\tau \times k_\tau}$. Let $V_c, V_r$ be cluster bases. $A$ is called a uniform H-matrix with respect to $V_c$ and $V_r$ if for each $b = \tau \times \sigma \in P$ there is a matrix $S^b$ satisfying
$$A|_b = V_{c,\tau}\, S^b\, V_{r,\sigma}^T. \tag{2.3}$$
In this context, $V_c$ is called the column basis of $A$ and $V_r$ is called the row basis of $A$.

3. A cluster basis $V = (V^\tau)_{\tau \in T(I)}$ is called nested if there are transfer matrices $B^{\tau',\tau} \in \mathbb{R}^{k_{\tau'} \times k_\tau}$ for all $\tau \in T(I)$ with $S(\tau) \ne \emptyset$ and $\tau' \in S(\tau)$ satisfying
$$V^\tau|_{\tau' \times \{1,\ldots,k_\tau\}} = V^{\tau'} B^{\tau',\tau}. \tag{2.4}$$
$A$ is called an H²-matrix with respect to $V_c$ and $V_r$ if it is a uniform H-matrix with respect to these cluster bases and if both bases are nested.

3 H² Construction

3.1 Approximation of the Kernel

For each cluster $\tau \in T(I)$, we fix a family $(x^\tau_\nu)_{\nu \in I_\tau}$ of interpolation points (e.g., tensor products of the zeros of Chebyshev polynomials) in $B_\tau$ and the family $(p^\tau_\nu)_{\nu \in I_\tau}$ of corresponding Lagrange polynomials. On a given admissible block $\tau \times \sigma \in P$, we approximate the kernel function $k(\cdot,\cdot)$ by its interpolant
$$\tilde k^{\tau,\sigma}(x,y) := \sum_{\nu \in I_\tau} \sum_{\mu \in I_\sigma} k(x^\tau_\nu, x^\sigma_\mu)\, p^\tau_\nu(x)\, p^\sigma_\mu(y). \tag{3.1}$$
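The following sketch illustrates (3.1) in the simplest case $d = 1$, where the boxes are intervals and tensor products are trivial; `cheb_points` and `lagrange_eval` are our helper names. Evaluating the interpolant as a product of three matrices makes the low-rank structure of the block explicit (the rank is at most the number of interpolation points).

```python
import numpy as np

def cheb_points(a, b, p):
    """p+1 Chebyshev points, transformed from [-1,1] to the interval [a,b]."""
    i = np.arange(p + 1)
    xi = np.cos((2 * i + 1) / (2 * p + 2) * np.pi)
    return 0.5 * ((1 + xi) * b + (1 - xi) * a)

def lagrange_eval(nodes, x):
    """Matrix L with L[m, nu] = p_nu(x[m]), where p_nu are the Lagrange
    polynomials associated with the interpolation nodes."""
    L = np.ones((len(x), len(nodes)))
    for nu, xnu in enumerate(nodes):
        for k, xk in enumerate(nodes):
            if k != nu:
                L[:, nu] *= (x - xk) / (xnu - xk)
    return L

kernel = lambda x, y: 1.0 / np.abs(x - y)     # asymptotically smooth for x != y
p = 6
xt = cheb_points(0.0, 1.0, p)                 # interpolation points in B_tau
xs = cheb_points(3.0, 4.0, p)                 # interpolation points in B_sigma
S = kernel(xt[:, None], xs[None, :])          # S_{nu mu} = k(x_nu, x_mu)

# Evaluate the interpolant (3.1) on a test grid and compare with k itself.
x = np.linspace(0.0, 1.0, 50)
y = np.linspace(3.0, 4.0, 50)
kt = lagrange_eval(xt, x) @ S @ lagrange_eval(xs, y).T
print("max error:", np.max(np.abs(kt - kernel(x[:, None], y[None, :]))))
```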

3.2 Approximation of the Discrete Operator

The discrete Galerkin operator is given by
$$K_{ij} := \int_{\Omega_i} \int_{\Omega_j} k(x,y)\, \varphi_j(y)\, \varphi_i(x)\, dy\, dx$$
for finite element basis functions $\varphi_i, \varphi_j$ having their supports in $\Omega_i, \Omega_j$.

For $i \in \tau$ and $j \in \sigma$, we can replace $k(\cdot,\cdot)$ by $\tilde k(\cdot,\cdot)$ and find
$$\tilde K_{ij} := \int_{\Omega_i} \int_{\Omega_j} \tilde k(x,y)\, \varphi_i(x)\, \varphi_j(y)\, dy\, dx = \sum_{\nu \in I_\tau} \sum_{\mu \in I_\sigma} k(x^\tau_\nu, x^\sigma_\mu) \int_{\Omega_i} p^\tau_\nu(x)\, \varphi_i(x)\, dx \int_{\Omega_j} p^\sigma_\mu(y)\, \varphi_j(y)\, dy.$$
Introducing matrices $V^\tau \in \mathbb{R}^{\tau \times I_\tau}$, $V^\sigma \in \mathbb{R}^{\sigma \times I_\sigma}$ and $S^{\tau,\sigma} \in \mathbb{R}^{I_\tau \times I_\sigma}$ with
$$V^\tau_{i\nu} := \int_{\Omega_i} p^\tau_\nu(x)\, \varphi_i(x)\, dx, \qquad V^\sigma_{j\mu} := \int_{\Omega_j} p^\sigma_\mu(y)\, \varphi_j(y)\, dy, \tag{3.2}$$
$$S^{\tau,\sigma}_{\nu\mu} := k(x^\tau_\nu, x^\sigma_\mu), \tag{3.3}$$
we get
$$\tilde K_{ij} = \left(V^\tau S^{\tau,\sigma} (V^\sigma)^T\right)_{ij}, \tag{3.4}$$
i.e., a representation as a uniform H-matrix.

3.3 Nested Bases

Let $Q_p$ be the space of polynomials spanned by tensor products of polynomials of degree up to $p$. The interpolation operator is a projection onto this space, so if we use $Q_p$ as the space of interpolation polynomials on all clusters, we can express polynomials corresponding to father clusters in terms of polynomials corresponding to son clusters without loss, i.e., for a cluster $\tau \in T(I)$ with a son $\tau' \in T(I)$, we find
$$p^\tau_\nu(x) = \sum_{\nu' \in I_{\tau'}} p^\tau_\nu(x^{\tau'}_{\nu'})\, p^{\tau'}_{\nu'}(x). \tag{3.5}$$
By introducing a matrix $B^{\tau',\tau} \in \mathbb{R}^{I_{\tau'} \times I_\tau}$ by setting
$$B^{\tau',\tau}_{\nu'\nu} := p^\tau_\nu(x^{\tau'}_{\nu'}), \tag{3.6}$$
we get
$$V^\tau_{i\nu} = \int_{\Omega_i} p^\tau_\nu(x)\, \varphi_i(x)\, dx = \sum_{\nu' \in I_{\tau'}} p^\tau_\nu(x^{\tau'}_{\nu'}) \int_{\Omega_i} p^{\tau'}_{\nu'}(x)\, \varphi_i(x)\, dx = \sum_{\nu' \in I_{\tau'}} B^{\tau',\tau}_{\nu'\nu}\, V^{\tau'}_{i\nu'} = \left(V^{\tau'} B^{\tau',\tau}\right)_{i\nu} \tag{3.7}$$
for all $i \in \tau'$, so the bases are nested in the way required for H²-matrices.
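Equations (3.5)-(3.7) can be checked numerically. The sketch below (reusing the helpers from the previous example, again with $d = 1$) computes the transfer matrices $B^{\tau',\tau}$ of (3.6) and verifies that the father's Lagrange polynomials are reproduced exactly on each son, since all clusters use the same polynomial space $Q_p$.

```python
import numpy as np
# Reuses cheb_points and lagrange_eval from the previous sketch.

p = 6
x_tau = cheb_points(0.0, 1.0, p)              # father cluster box B_tau
for a, b in [(0.0, 0.5), (0.5, 1.0)]:         # boxes of the two sons tau'
    x_son = cheb_points(a, b, p)
    # Transfer matrix (3.6): B[nu', nu] = p_nu^tau(x_{nu'}^{tau'}).
    B = lagrange_eval(x_tau, x_son)
    # Check (3.5): since p_nu^tau lies in Q_p, interpolating it with the
    # son's basis reproduces it exactly on the son's box.
    x = np.linspace(a, b, 200)
    direct = lagrange_eval(x_tau, x)          # p_nu^tau(x) evaluated directly
    via_son = lagrange_eval(x_son, x) @ B     # via the son's Lagrange basis
    print("max deviation:", np.max(np.abs(direct - via_son)))  # ~ machine eps
```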

4 Algorithms and Complexity

We require the block partitioning $P$ to be sparse in the sense of [3], i.e., that there exists a constant $C_{sp}$ satisfying
$$\max_{\tau \in T(I)} |\{\sigma \in T(I) : \tau \times \sigma \in P\}| \le C_{sp}, \tag{4.1}$$
$$\max_{\sigma \in T(I)} |\{\tau \in T(I) : \tau \times \sigma \in P\}| \le C_{sp}. \tag{4.2}$$
For standard situations with quasi-uniform meshes, this estimate has been established in [3]. We further assume that there is a constant $C_L$ satisfying
$$S(\tau) = \emptyset \;\Rightarrow\; |I_\tau| \le |\tau| \le C_L\, |I_\tau| \tag{4.3}$$
for all $\tau \in T(I)$, i.e., the size of a leaf cluster is comparable to the number of polynomials used in the interpolation on this cluster. Note that, due to the requirement of Subsection 3.3, we have $|I_\tau| = p^d$ for all clusters $\tau \in T(I)$.

4.1 Matrix-Vector Multiplication

We split the matrix-vector multiplication $y := \tilde K x$ into three steps: First, the products of the input vector with the matrices $(V^\sigma)^T$ are computed by recursively applying equation (3.7). Then, the resulting coefficients are multiplied by the matrices $S^{\tau,\sigma}$. In the last step, the result is transformed back from the coefficients of the cluster bases into the standard basis.

4.1.1 Forward Transformation

In this first step, we want to compute $x^\sigma := (V^\sigma)^T x|_\sigma$ for all $\sigma \in T(I)$. If $\sigma$ is not a leaf, i.e., if $S(\sigma) \ne \emptyset$, we have
$$(V^\sigma)^T x|_\sigma = \sum_{\sigma' \in S(\sigma)} (B^{\sigma',\sigma})^T (V^{\sigma'})^T x|_{\sigma'} = \sum_{\sigma' \in S(\sigma)} (B^{\sigma',\sigma})^T x^{\sigma'},$$
so the algorithm shown in Table 2 can be used to compute all coefficients.

procedure ForwardTransformation(σ);
begin
  if S(σ) = ∅
  then x^σ := (V^σ)^T x|_σ
  else begin
    x^σ := 0;
    for σ' ∈ S(σ) do begin
      ForwardTransformation(σ');
      x^σ := x^σ + (B^{σ',σ})^T x^{σ'}
    end
  end
end

Table 2: ForwardTransformation algorithm

Now we analyse the complexity of this algorithm. For each $\sigma \in T(I)$, we have $|I_\sigma| = p^d$ and find that $O(p^{2d})$ operations are required for the computation of $x^\sigma$. Due to the lower bound in condition (4.3), there are no more than $n/p^d$ leaves in $T(I)$, so there are fewer than $2n/p^d$ nodes. Therefore, the full forward transformation takes $O(np^d)$ operations to complete.

4.1.2 Multiplication

In the second step, we want to compute
$$y^\tau := \sum_{\sigma \in P_\tau} S^{\tau,\sigma} x^\sigma$$
for all $\tau \in T(I)$, where
$$P_\tau := \{\sigma \in T(I) : \tau \times \sigma \in P\}.$$
Due to the sparsity condition (4.1), we have $|P_\tau| \le C_{sp}$, so the computation of $y^\tau$ requires $O(C_{sp}\, p^{2d})$ operations. The cluster tree has at most $2n/p^d$ nodes, so the second step takes $O(np^d)$ operations.

4.1.3 Backward Transformation

In the last step, we transform the result from the cluster bases back into the standard basis. The procedure shown in Table 3 is the adjoint of the forward transformation. By the same arguments as in the case of the forward transformation, we find that only $O(np^d)$ operations are required for the backward transformation, so the complete matrix-vector multiplication can be completed in $O(np^d)$ operations.

procedure BackwardTransformation(τ);
begin
  if S(τ) = ∅
  then y|_τ := V^τ y^τ
  else
    for τ' ∈ S(τ) do begin
      y^{τ'} := y^{τ'} + B^{τ',τ} y^τ;
      BackwardTransformation(τ')
    end
end

Table 3: BackwardTransformation algorithm

4.2 Discretisation

4.2.1 Basis Transformation Matrices

We first consider the computation of the matrices $B^{\tau',\tau}$. We are using tensor product interpolation, so the polynomials $p^\tau_\nu$ and the corresponding interpolation points $x^\tau_\nu$ can be written in the form
$$p^\tau_\nu(x_1,\ldots,x_d) = p^{\tau,1}_{\nu_1}(x_1) \cdots p^{\tau,d}_{\nu_d}(x_d), \qquad x^\tau_\nu = \left(x^{\tau,1}_{\nu_1}, \ldots, x^{\tau,d}_{\nu_d}\right)$$
for one-dimensional Lagrange polynomials $(p^{\tau,i}_j)$ and corresponding interpolation points $(x^{\tau,i}_j)$ on the real line. Equation (3.6) takes the form
$$B^{\tau',\tau}_{\nu'\nu} = p^\tau_\nu(x^{\tau'}_{\nu'}) = p^{\tau,1}_{\nu_1}(x^{\tau',1}_{\nu'_1}) \cdots p^{\tau,d}_{\nu_d}(x^{\tau',d}_{\nu'_d}),$$
so it has the tensor product structure
$$B^{\tau',\tau} = B^{\tau',\tau,1} \otimes \cdots \otimes B^{\tau',\tau,d} \qquad \text{with} \qquad B^{\tau',\tau,i}_{\nu'_i \nu_i} = p^{\tau,i}_{\nu_i}(x^{\tau',i}_{\nu'_i}).$$
The computation of one of these latter matrices requires $O(p^3)$ operations; the computation of all of them requires $O(dp^3)$. This means that, since there are $n/p^d$ leaves and each leaf requires $O(dp^3)$ units of storage, the matrices $B^{\tau',\tau}$ can be stored in the form of tensor products in $O(dnp^{3-d})$ units of storage. Of course, we can also construct the full matrix $B^{\tau',\tau}$ from the tensor product form. This requires $O(dp^3 + p^{2d})$ operations per matrix and leads to a total of $O(dnp^{3-d} + np^d)$ operations.

4.2.2 Leaf Basis Matrices

The computation of the matrices $V^\tau$ for leaves $\tau \in T(I)$ corresponds to the computation of the integrals
$$V^\tau_{i\nu} = \int_{\Omega_i} p^\tau_\nu(x)\, \varphi_i(x)\, dx,$$
where $\varphi_i$ is some finite element basis function, $p^\tau_\nu$ is a tensor product polynomial in $Q_p$, $i \in \tau$ and $\nu \in I_\tau$. The computation of each of the $O(p^{2d})$ entries using Gauss quadrature of order $q$ requires $O(q^d)$ operations, so all leaf basis matrices can be computed using $O(np^d q^d)$ operations and stored in $O(np^d)$ units of memory.

4.2.3 Admissible Blocks

According to (3.3), the coefficient matrices $S^{\tau,\sigma}$ corresponding to admissible blocks are defined by
$$S^{\tau,\sigma}_{\nu\mu} = k(x^\tau_\nu, x^\sigma_\mu),$$
so building one of these matrices requires $O(p^{2d})$ operations. By the same arguments as before, we get a total of $O(np^d)$ operations for the process of building the coefficients for all admissible blocks.

4.2.4 Non-admissible Blocks

To compute the matrices $S^{\tau,\sigma}$ for non-admissible blocks, we have to approximate the integral
$$S^{\tau,\sigma}_{ij} = \int_{\Omega_i} \int_{\Omega_j} k(x,y)\, \varphi_j(y)\, \varphi_i(x)\, dy\, dx.$$
As before, we use quadrature of order $q$ and find that we need $O(np^d q^d)$ operations and $O(np^d)$ units of memory.
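Combining Tables 2 and 3 with the multiplication step of Subsection 4.1.2 gives the complete matrix-vector product. The following Python sketch is our illustration of the three phases; the containers `V`, `B`, `S` and `Knear` (dictionaries keyed by cluster identity, holding leaf basis matrices, transfer matrices, coupling matrices and dense near-field blocks) are assumed to have been set up as described in Subsections 4.2.1-4.2.4, and `P` is the partitioning from the earlier sketch.

```python
import numpy as np

def forward(sigma, x, xhat, V, B):
    """Table 2: compute xhat[sigma] = (V^sigma)^T x|_sigma for all clusters."""
    if not sigma.sons:
        xhat[id(sigma)] = V[id(sigma)].T @ x[sigma.indices]
    else:
        acc = 0.0
        for son in sigma.sons:
            forward(son, x, xhat, V, B)
            acc = acc + B[(id(son), id(sigma))].T @ xhat[id(son)]
        xhat[id(sigma)] = acc

def backward(tau, y, yhat, V, B):
    """Table 3: push the coefficients y^tau down the tree and back into y."""
    if not tau.sons:
        y[tau.indices] += V[id(tau)] @ yhat[id(tau)]
    else:
        for son in tau.sons:
            yhat[id(son)] += B[(id(son), id(tau))] @ yhat[id(tau)]
            backward(son, y, yhat, V, B)

def h2_matvec(x, root, P, V, B, S, Knear):
    """y := K~ x in three phases; P contains triples (tau, sigma, admissible)."""
    xhat = {}
    forward(root, x, xhat, V, B)                       # phase 1: O(n p^d)
    yhat = {key: np.zeros_like(v) for key, v in xhat.items()}
    y = np.zeros_like(x)
    for tau, sigma, adm in P:                          # phase 2: O(n p^d)
        if adm:                                        # far field: coupling matrix
            yhat[id(tau)] += S[(id(tau), id(sigma))] @ xhat[id(sigma)]
        else:                                          # near field: dense block
            y[tau.indices] += Knear[(id(tau), id(sigma))] @ x[sigma.indices]
    backward(root, y, yhat, V, B)                      # phase 3: O(n p^d)
    return y
```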

5 Error Analysis

We assume that the kernel $k(\cdot,\cdot)$ is asymptotically smooth, i.e., that there are constants $C_{as}(n,m)$ satisfying
$$\left|\partial_x^\alpha \partial_y^\beta k(x,y)\right| \le C_{as}(|\alpha|,|\beta|)\, \|x-y\|^{-|\alpha|-|\beta|}\, |k(x,y)| \tag{5.1}$$
for all multi-indices $\alpha, \beta \in \mathbb{N}_0^d$ and all $x, y \in \mathbb{R}^d$ with $x \ne y$.

Lemma 5.1 (Kernel approximation) Let $b = \tau \times \sigma \in P$ be an admissible block and assume that (5.1) holds for $k(\cdot,\cdot)$. Then the kernel $\tilde k^{\tau,\sigma}(\cdot,\cdot)$ from (3.1) satisfies the estimate
$$\|k - \tilde k^{\tau,\sigma}\|_{\infty, B_\tau \times B_\sigma} \le C_{apx}(p)\, \eta^{p+1}\, \|k\|_{\infty, B_\tau \times B_\sigma} \tag{5.2}$$
with the constant
$$C_{apx}(p) := \frac{\sqrt{2}\, d\, \left(C_{as}(0,p+1) + C_{as}(p+1,0)\right) \left(C \log(p+1)\right)^{2d}}{(p+1)!\; 2^{p/2}}. \tag{5.3}$$
The behaviour of this constant will be discussed in Remark 5.2.

Proof. The interpolated kernel $\tilde k(\cdot,\cdot)$ can be written in the form
$$\tilde k = I^p_{B_\tau \times B_\sigma}\, k$$
using the tensor product interpolation operator. Applying the error estimate of Lemma A.1 (with the box $B_\tau \times B_\sigma \subseteq \mathbb{R}^{2d}$), we find
$$\|k - \tilde k\|_{\infty,B_\tau\times B_\sigma} = \|k - I^p_{B_\tau\times B_\sigma} k\|_{\infty,B_\tau\times B_\sigma} \le \frac{4^{-p}}{2\,(p+1)!}\, (C\log(p+1))^{2d}\, \operatorname{diam}(B_\tau\times B_\sigma)^{p+1} \sum_{j=1}^{2d} \|\partial_j^{p+1} k\|_{\infty,B_\tau\times B_\sigma}.$$
By using the admissibility condition (2.1), we find
$$\operatorname{diam}(B_\tau \times B_\sigma) = \max\{\|(x_1,y_1)-(x_2,y_2)\|_2 : (x_1,y_1), (x_2,y_2) \in B_\tau \times B_\sigma\} = \max\left\{\sqrt{\|x_1-x_2\|_2^2 + \|y_1-y_2\|_2^2} : x_1,x_2 \in B_\tau,\ y_1,y_2 \in B_\sigma\right\}$$
$$\le \sqrt{\operatorname{diam}(B_\tau)^2 + \operatorname{diam}(B_\sigma)^2} \le \sqrt{2}\, \max\{\operatorname{diam}(B_\tau),\operatorname{diam}(B_\sigma)\} \le 2\sqrt{2}\,\eta\, \operatorname{dist}(B_\tau,B_\sigma),$$
and therefore
$$\|k - \tilde k\|_{\infty,B_\tau\times B_\sigma} \le \sqrt{2}\, \frac{(C\log(p+1))^{2d}}{(p+1)!\; 2^{p/2}}\, \eta^{p+1}\, \operatorname{dist}(B_\tau,B_\sigma)^{p+1} \sum_{j=1}^{2d} \|\partial_j^{p+1} k\|_{\infty,B_\tau\times B_\sigma}.$$
The partial derivatives can be estimated by using the asymptotic smoothness (5.1) as follows: for $j \le d$, we have

$$\left|\partial_j^{p+1} k(x,y)\right| \le C_{as}(p+1,0)\, \|x-y\|^{-(p+1)}\, |k(x,y)| \le C_{as}(p+1,0)\, \operatorname{dist}(B_\tau,B_\sigma)^{-(p+1)}\, |k(x,y)|,$$
and for $j > d$, we have
$$\left|\partial_j^{p+1} k(x,y)\right| \le C_{as}(0,p+1)\, \operatorname{dist}(B_\tau,B_\sigma)^{-(p+1)}\, |k(x,y)|,$$
so we can use
$$\sum_{j=1}^{2d} \|\partial_j^{p+1} k\|_{\infty,B_\tau\times B_\sigma} \le d\,\left(C_{as}(p+1,0) + C_{as}(0,p+1)\right) \operatorname{dist}(B_\tau,B_\sigma)^{-(p+1)}\, \|k\|_{\infty,B_\tau\times B_\sigma}$$
to get the desired result
$$\|k - \tilde k\|_{\infty,B_\tau\times B_\sigma} \le C_{apx}(p)\, \eta^{p+1}\, \|k\|_{\infty,B_\tau\times B_\sigma}.$$

Remark 5.2 (Convergence) For typical kernels, an estimate of the type
$$C_{as}(n,m) \le c\, c_0^{n+m}\, (n+m)!$$
holds, leading to
$$C_{apx}(p) \le \sqrt{2}\, (C\log(p+1))^{2d}\, \frac{2\,c\,d\, c_0^{p+1}}{2^{p/2}}.$$
If $c_0 \eta < 1$, the approximation $\tilde k(\cdot,\cdot)$ will converge to $k(\cdot,\cdot)$ faster than $(c_0 \eta)^{p+1}$.

If the continuous operator is $W$-elliptic for some Hilbert space $W$ with $W_h := \operatorname{span}\{\varphi_i : i \in I\} \subseteq W$, then we can proceed as in [6, Lemma 3] to prove the estimate
$$\|u - \tilde u_h\|_W \le c \inf_{v_h \in W_h} \|u - v_h\|_W + C_{apx}(p)\, \eta^{p+1}\, \|\tilde K\|_{W_h \to W_h'}\, \|u\|_W.$$

6 Extensions

Our construction can be extended to other integral operators and other discretisation techniques by simple modification of the definition of $V^\tau$:

6.1 Collocation

If we want to discretise by the collocation method, we have an additional family $(\zeta_i)_{i \in I}$ of collocation points and the discrete operator is given by
$$K_{ij} := \int_{\Omega_j} k(\zeta_i, y)\, \varphi_j(y)\, dy.$$
Replacing once more $k(\cdot,\cdot)$ by $\tilde k(\cdot,\cdot)$, we find
$$\tilde K_{ij} := \int_{\Omega_j} \tilde k(\zeta_i, y)\, \varphi_j(y)\, dy = \sum_{\nu \in I_\tau} \sum_{\mu \in I_\sigma} k(x^\tau_\nu, x^\sigma_\mu)\, p^\tau_\nu(\zeta_i) \int_{\Omega_j} p^\sigma_\mu(y)\, \varphi_j(y)\, dy,$$

so we can define $V^{\tau,\mathrm{coll}} \in \mathbb{R}^{\tau \times I_\tau}$ by setting
$$V^{\tau,\mathrm{coll}}_{i\nu} := p^\tau_\nu(\zeta_i)$$
and get
$$\tilde K_{ij} = \left(V^{\tau,\mathrm{coll}}\, S^{\tau,\sigma}\, (V^\sigma)^T\right)_{ij},$$
i.e., the same result as in (3.4) with only $V^\tau$ replaced by $V^{\tau,\mathrm{coll}}$. As before, $V^{\tau,\mathrm{coll}}$ can be expressed in terms of the matrices $V^{\tau',\mathrm{coll}}$ corresponding to the sons $\tau'$ of $\tau$.

6.2 Double Layer Potential

If we want to discretise the double layer potential (using the Galerkin approach again), we have to compute
$$K_{ij} := \int_{\Omega_i} \int_{\Omega_j} \frac{\partial}{\partial n_y} k(x,y)\, \varphi_j(y)\, \varphi_i(x)\, dy\, dx.$$
Replacing $k(\cdot,\cdot)$ by its approximation $\tilde k(\cdot,\cdot)$, we get
$$\tilde K_{ij} := \int_{\Omega_i} \int_{\Omega_j} \frac{\partial}{\partial n_y} \tilde k(x,y)\, \varphi_j(y)\, \varphi_i(x)\, dy\, dx = \sum_{\nu \in I_\tau} \sum_{\mu \in I_\sigma} k(x^\tau_\nu, x^\sigma_\mu) \int_{\Omega_i} p^\tau_\nu(x)\, \varphi_i(x)\, dx \int_{\Omega_j} \frac{\partial}{\partial n_y} p^\sigma_\mu(y)\, \varphi_j(y)\, dy,$$
so by defining $V^{\sigma,\mathrm{DLP}} \in \mathbb{R}^{\sigma \times I_\sigma}$ as
$$V^{\sigma,\mathrm{DLP}}_{j\mu} := \int_{\Omega_j} \frac{\partial}{\partial n_y} p^\sigma_\mu(y)\, \varphi_j(y)\, dy,$$
we once more get the representation
$$\tilde K_{ij} = \left(V^\tau\, S^{\tau,\sigma}\, (V^{\sigma,\mathrm{DLP}})^T\right)_{ij},$$
similar to (3.4).

This means that for the typical variants of the integral operator (single layer potential, double layer potential and its transpose, hypersingular operator) and for the typical discretisation methods (collocation and Galerkin), the matrices $S^{\tau,\sigma}$ and $B^{\tau',\tau}$ stay the same; only the matrices $V^\tau$ corresponding to the leaves of the cluster tree have to be modified. The construction leads directly to H²-matrices, so if tensor products of polynomials of degree up to $p$ are used for a kernel function defined in $\mathbb{R}^d$, the rank will be $k = p^d$ and the complexity of the matrix-vector multiplication will be $O(nk) = O(np^d)$.

6.3 Variable Order Approximation

In [9], a variation of the H²-technique is investigated that varies the polynomial order according to the size of the clusters: For small clusters, a lower-order approximation is sufficient, while larger clusters require a higher order. The same approach can be used in the context of our method: If the polynomial order corresponding to a son cluster $\tau'$ is lower than that of its father, equation (3.5) yields only an approximation
$$p^\tau_\nu(x) \approx \tilde p^\tau_\nu(x) := \sum_{\nu' \in I_{\tau'}} p^\tau_\nu(x^{\tau'}_{\nu'})\, p^{\tau'}_{\nu'}(x)$$
of the possibly higher-order polynomial $p^\tau_\nu$ by lower-order polynomials $p^{\tau'}_{\nu'}$. If the growth of the polynomial order is "slow enough", e.g., only polynomial with respect to the level of the clusters, then the storage requirements and the complexity of the discretisation and of the matrix-vector multiplication can be reduced to $O(n)$, i.e., to optimal complexity.
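Before turning to the experiments, the exponential convergence predicted by Lemma 5.1 and Remark 5.2 is easy to observe numerically. The sketch below (our illustration, reusing the interpolation helpers from Section 3 with $d = 1$; the interval pair is chosen so that (2.1) holds with $\eta = 0.8$) shows the relative interpolation error decaying roughly like $\eta^{p+1}$.

```python
import numpy as np
# Reuses cheb_points and lagrange_eval from the sketches in Section 3.

kernel = lambda x, y: 1.0 / np.abs(x - y)    # asymptotically smooth model kernel
a_t, b_t = 0.0, 1.0                          # B_tau
a_s, b_s = 1.625, 2.625                      # B_sigma: diam = 1 = 2*0.8*dist
x = np.linspace(a_t, b_t, 80)
y = np.linspace(a_s, b_s, 80)
exact = kernel(x[:, None], y[None, :])

for p in range(1, 10):
    xt = cheb_points(a_t, b_t, p)
    xs = cheb_points(a_s, b_s, p)
    S = kernel(xt[:, None], xs[None, :])
    approx = lagrange_eval(xt, x) @ S @ lagrange_eval(xs, y).T
    err = np.max(np.abs(approx - exact)) / np.max(np.abs(exact))
    print(f"p = {p}: relative error = {err:.2e}")   # decays roughly like eta^(p+1)
```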

7 Numerical Experiments

We applied our technique to a simple model problem: the discretisation of the single layer potential on the unit circle by a Galerkin approach using a regular grid of piecewise constant functions. The results are reported in Table 4.

Table 4: Unit circle, single layer potential, $\eta = 0.8$, $p(\ell) = 1 + (\ell_{\max} - \ell)$.

n        Build/s   MVM/s   ||A - Ã||_2 / ||A||_2   Memory per DoF
1024     1.05      0.02    0.000483583             47
2048     2.30      0.06    0.00026483              4605
4096     4.97      0.14    0.000140073             4929
8192     10.21     0.31    0.0000725354            5162
16384    21.01     0.62    0.0000370742            5324
32768    40.96     1.22    0.0000187955            5434
65536    83.54     2.54                            5507
131072   167.79    5.16                            5554
262144   338.22    10.27                           5584
524288   675.67    20.8                            5602

We can see the linear growth of the time for the construction of the matrix and for the matrix-vector multiplication. The relative approximation error appears to be proportional to the discretisation error. The memory requirements per degree of freedom are bounded.

A Multidimensional Interpolation Estimate

We denote the Chebyshev points in the interval $[-1,1]$ by $(\xi_i)_{i=0}^p$, i.e., we have
$$\xi_i = \cos\left(\frac{2i+1}{2p+2}\,\pi\right),$$
and the corresponding Lagrange polynomials by $(p_i)_{i=0}^p$. The one-dimensional interpolation operator is given by
$$\tilde I^p : C[-1,1] \to P_p, \qquad u \mapsto \sum_{i=0}^p u(\xi_i)\, p_i.$$
This operator satisfies the error estimate
$$\|\tilde I^p u - u\|_{\infty,[-1,1]} \le \frac{2^{-p}}{(p+1)!}\, \|u^{(p+1)}\|_{\infty,[-1,1]} \tag{A.1}$$
and the stability estimate
$$\|\tilde I^p u\|_{\infty,[-1,1]} \le C \log(p+1)\, \|u\|_{\infty,[-1,1]}. \tag{A.2}$$
We are interested in a result for the $d$-dimensional tensor product interpolation operator $I^p_B$ on a box
$$B := \prod_{j=1}^d [a_j, b_j].$$
The interpolation error can be bounded as follows:

Lemma A.1 (Tensor product interpolation) We have

$$\|I^p_B u - u\|_{\infty,B} \le \frac{2^{-p}}{(p+1)!} \sum_{k=1}^d (C\log(p+1))^{k-1} \left(\frac{b_k - a_k}{2}\right)^{p+1} \|\partial_k^{p+1} u\|_{\infty,B} \le \frac{4^{-p}}{2\,(p+1)!}\, (C\log(p+1))^d\, \operatorname{diam}(B)^{p+1} \sum_{j=1}^d \|\partial_j^{p+1} u\|_{\infty,B} \tag{A.3}$$
for $u \in C^{p+1}(B)$.

Proof. We denote by $\Phi_j$ the transformation from the unit interval to $[a_j, b_j]$,
$$\Phi_j : [-1,1] \to [a_j,b_j], \qquad t \mapsto \tfrac{1}{2}\left((1+t)\,b_j + (1-t)\,a_j\right).$$
For each $j \in \{1,\ldots,d\}$, we introduce the operator
$$I^p_j : C(B) \to C(B), \qquad u \mapsto \left(x \mapsto \sum_{i=0}^p u(x_1,\ldots,x_{j-1},\Phi_j(\xi_i),x_{j+1},\ldots,x_d)\, p_i(\Phi_j^{-1}(x_j))\right),$$
interpolating functions in the $j$-th coordinate direction only. The estimates (A.1) and (A.2) can be applied to $I^p_j$ and take the form
$$\|I^p_j u - u\|_{\infty,B} \le \frac{2^{-p}}{(p+1)!} \left(\frac{b_j-a_j}{2}\right)^{p+1} \|\partial_j^{p+1} u\|_{\infty,B}, \tag{A.4}$$
$$\|I^p_j u\|_{\infty,B} \le C\log(p+1)\, \|u\|_{\infty,B}. \tag{A.5}$$
The tensor product interpolation operator can be written as
$$I^p_B = I^p_1 \otimes \cdots \otimes I^p_d = \prod_{j=1}^d I^p_j.$$
Due to this product representation and the telescoping identity
$$\prod_{j=1}^d I^p_j\, u - u = \sum_{k=1}^d \left(\prod_{j=1}^{k-1} I^p_j\right)\left(I^p_k u - u\right),$$
we can derive the following estimate:
$$\|I^p_B u - u\|_{\infty,B} \le \sum_{k=1}^d \left(\prod_{j=1}^{k-1} \|I^p_j\|\right) \|I^p_k u - u\|_{\infty,B} \overset{\text{(A.5)}}{\le} \sum_{k=1}^d (C\log(p+1))^{k-1}\, \|I^p_k u - u\|_{\infty,B} \overset{\text{(A.4)}}{\le} \frac{2^{-p}}{(p+1)!} \sum_{k=1}^d (C\log(p+1))^{k-1} \left(\frac{b_k-a_k}{2}\right)^{p+1} \|\partial_k^{p+1} u\|_{\infty,B}.$$
The second estimate in (A.3) follows since $(b_k - a_k)/2 \le \operatorname{diam}(B)/2$ and, assuming $C\log(p+1) \ge 1$, $(C\log(p+1))^{k-1} \le (C\log(p+1))^d$.

References

[1] W. Dahmen and R. Schneider: Wavelets on manifolds I: Construction and domain decomposition, SIAM J. Math. Anal. 31 (1999), 184-230.

[2] K. Giebermann: Multilevel approximation of boundary integral operators, Computing 67 (2001), 183-207.

[3] L. Grasedyck: Theorie und Anwendungen Hierarchischer Matrizen, PhD thesis, Universität Kiel, 2001.

[4] W. Hackbusch: A sparse matrix arithmetic based on H-matrices. Part I: Introduction to H-matrices, Computing 62 (1999), 89-108.

[5] W. Hackbusch and B. Khoromskij: A sparse matrix arithmetic based on H-matrices. Part II: Application to multi-dimensional problems, Computing 64 (2000), 21-47.

[6] W. Hackbusch and B. Khoromskij: A sparse H-matrix arithmetic: General complexity estimates, J. Comp. Appl. Math. 125 (2000), 479-501.

[7] W. Hackbusch, B. Khoromskij, and S. Sauter: On H²-matrices. In: Lectures on Applied Mathematics, H. Bungartz, R. Hoppe, and C. Zenger, eds., 2000, pp. 9-29.

[8] W. Hackbusch and Z. P. Nowak: On the fast matrix multiplication in the boundary element method by panel clustering, Numerische Mathematik 54 (1989), 463-491.

[9] S. Sauter: Variable order panel clustering, Computing 64 (2000), 223-261.

[10] E. Tyrtyshnikov: Mosaic-skeleton approximations, Calcolo 33 (1996), 47-57.