Efficient approximation of random fields for numerical applications


Helmut Harbrecht, Michael Peters, Markus Siebenmorgen
Mathematisches Institut, Universität Basel, Rheinsprung 21, 4051 Basel, Switzerland
(Correspondence: michael.peters@unibas.ch. This research has been supported by the Swiss National Science Foundation (SNSF) through the project "Rapid Solution of Boundary Value Problems on Stochastic Domains".)

SUMMARY

This article is dedicated to the rapid computation of separable expansions for the approximation of random fields. We consider approaches based on techniques from the approximation of non-local operators on the one hand and on the pivoted Cholesky decomposition on the other hand. In particular, we provide an a-posteriori error estimate for the pivoted Cholesky decomposition in terms of the trace. Numerical examples are provided to validate and quantify the presented methods.

1. INTRODUCTION

In this article, we present and compare two different approaches for the approximation of random fields in $L^2_P(\Omega; H^p(D))$ for a spatial domain $D \subset \mathbb{R}^d$ and a separable, complete probability space $(\Omega, \mathcal{F}, P)$. Stochastic fields appear, for example, in the modeling of diffusion problems with random data, see e.g. [1], and in machine learning, see e.g. [2]. To make a stochastic field $a(x, \omega)$ feasible for numerical computations in a stochastic Galerkin or stochastic collocation method, see e.g. [3-7] and the references therein, one has to separate the spatial variable $x$ from the stochastic variable $\omega$. Since $L^2_P(\Omega; H^p(D)) \simeq L^2_P(\Omega) \otimes H^p(D)$, see e.g. [8], this can be accomplished by computing a basis representation of $a$ in $L^2_P(\Omega) \otimes H^p(D)$. A very common approach to obtain such a representation is the Karhunen-Loève expansion, cf. [1, 9], which can be regarded as the linear operator analogue of the singular value decomposition of matrices. The main task in the computation of a Karhunen-Loève expansion is the solution of a symmetric and positive semidefinite eigen-problem. In this context, approaches to efficiently compute the Karhunen-Loève expansion have been made by means of the fast multipole method (FMM) based on interpolation, cf. [10], in [11], and with the aid of H-matrices, cf. [12], in [13]. The idea in these articles is to provide a data-sparse representation of the covariance operator, which is then used to solve the related eigen-problem numerically by a Krylov subspace method, cf. [14]. Of course, other algorithms for the efficient approximation of non-local operators, like the adaptive cross approximation (ACA), cf. [15, 16], or the wavelet Galerkin scheme (WGS), cf. [17, 18], can be considered as well for the representation of the covariance operator. Nevertheless, the major drawback of these approaches is that the number of eigenvalues to be computed has to be known in advance, which might be a strong assumption in practice.

To overcome this obstruction, we present here an alternative approach based on the pivoted Cholesky decomposition (PCD).

The PCD is an established tool in the simulation of Gaussian processes and in the computation of low-rank approximations to covariance matrices, see e.g. [2, 19, 20]. It can be interpreted as a single-block ACA with applicable total pivoting, cf. [21]. Hence, only the main diagonal of the discretized operator has to be precomputed, which can be performed in essentially linear complexity, i.e. linear up to possible poly-logarithmic terms, if the quadrature proposed in [22] is applied to discretize the operator. Then, in each step of the algorithm, the quality of the approximation with respect to the stochastic field is controllable by means of the trace. If the desired accuracy is achieved, the algorithm stops with an $M$-term approximation to the operator. If $M$ is substantially smaller than the dimension of the ansatz space, we end up with a remarkable computational speed-up. The related Karhunen-Loève expansion might then be computed in a post-processing step. In this case, the PCD yields a full but relatively small eigen-problem if the operator under consideration exhibits a certain smoothness. This eigen-problem might be solved numerically by e.g. the QR-algorithm, cf. [23].

Now the following question arises: which approach is more efficient? We will try to answer this question numerically by comparing the PCD with methods borrowed from the approximation of non-local operators. We employ here ACA for the data-sparse approximation of the covariance operator, which results in a fast matrix-vector product. Thus, a Krylov subspace method is feasible to compute the desired eigenvalues of largest magnitude; we use the implicitly restarted Arnoldi method (IRAM), cf. [24, 25, 26]. Finally, we would like to emphasize that, although we focus here on the application to random fields, the presented methods are also applicable in the more general case of approximating bivariate functions in $L^2(D_1) \otimes L^2(D_2)$ for two domains $D_1 \subset \mathbb{R}^{d_1}$ and $D_2 \subset \mathbb{R}^{d_2}$.

The rest of this article is structured as follows. Section 2 is devoted to the approximation of random fields. We start by introducing the Karhunen-Loève expansion. After this, we introduce error bounds for the approximation of random fields in terms of operator traces. Especially, we discuss error estimates including the discretization and the truncation error. We also provide bounds for the decay of the covariance operator's eigenvalues. In Section 3, we provide the theoretical background for the pivoted Cholesky decomposition. To that end, we consider separable representations of random fields which are more general than the Karhunen-Loève expansion. After this, we establish error estimates for the approximation of random fields by the pivoted Cholesky decomposition. These estimates are essential for the a-posteriori control of the approximation error. Section 4 introduces a special class of covariance functions based on the Matérn kernel functions. We choose this class of covariance functions for our numerical tests since we a-priori know the decay rate of the respective eigenvalues. In particular, we are also able to analytically compute the eigenfunctions and eigenvalues in the case of the unit sphere $\mathbb{S}^2$. Thus, these kernels provide an excellent benchmark to compare both approaches. Section 5 is dedicated to testing the numerical performance of the methods under consideration. We start by giving some information on the implementation of the considered methods and especially introduce in brief the ACA. Then, we solve the eigenvalue problem for covariance operators related to some of the Matérn kernels from Section 4 on different geometries.
Finally, we sum up the results presented within this article in Section 6.

In the following, in order to avoid the repeated use of generic but unspecified constants, by $C \lesssim D$ we mean that $C$ can be bounded by a multiple of $D$, independently of parameters which $C$ and $D$ may depend on. Obviously, $C \gtrsim D$ is defined as $D \lesssim C$, and $C \eqsim D$ as $C \lesssim D$ and $C \gtrsim D$.

2. APPROXIMATION OF RANDOM FIELDS

Let $(\Omega, \mathcal{F}, P)$ be a complete probability space with $\sigma$-field $\mathcal{F} \subset 2^\Omega$ and probability measure $P$. In order to ensure that $L^2_P(\Omega)$ is separable, we also assume that $\Omega$ is a separable set. Furthermore, let $D \subset \mathbb{R}^d$ for $d = 2, 3$ be a sufficiently smooth and bounded domain. For $p \geq 0$, the Lebesgue-Bochner space $L^2_P(\Omega; H^p(D))$ consists of all maps $v\colon \Omega \to H^p(D)$ that satisfy

$\|v\|_{L^2_P(\Omega; H^p(D))} := \Big( \int_\Omega \|v(\cdot, \omega)\|^2_{H^p(D)} \,\mathrm{d}P(\omega) \Big)^{1/2} < \infty.$  (1)

In the sequel, it is convenient to identify $L^2_P(\Omega; H^p(D))$ according to $L^2_P(\Omega; H^p(D)) \simeq H^p(D) \otimes L^2_P(\Omega)$. For further details on Lebesgue-Bochner spaces, see e.g. [27].

2.1. Karhunen-Loève expansion

Let $a \in L^2_P(\Omega; H^p(D))$ be a random field. We define the related centered random field $a_0(x, \omega)$ via

$a_0(x, \omega) := a(x, \omega) - \bar a(x)$ with $\bar a(x) := \int_\Omega a(x, \omega) \,\mathrm{d}P(\omega)$  (2)

and the corresponding Hilbert-Schmidt operators, i.e.

$\mathcal{S}\colon L^2_P(\Omega) \to H^p(D)$ with $(\mathcal{S}u)(x) = \int_\Omega a_0(x, \omega) u(\omega) \,\mathrm{d}P(\omega)$ for $u \in L^2_P(\Omega)$

and its adjoint

$\mathcal{S}^\star\colon H^p(D) \to L^2_P(\Omega)$ with $(\mathcal{S}^\star u)(\omega) = \int_D a_0(y, \omega) u(y) \,\mathrm{d}y$ for $u \in H^p(D)$.

Then, we especially find that $\mathcal{S}\mathcal{S}^\star\colon H^p(D) \to H^p(D)$ is given by

$(\mathcal{S}\mathcal{S}^\star u)(x) = \int_D \int_\Omega a_0(x, \omega) a_0(y, \omega) \,\mathrm{d}P(\omega)\, u(y) \,\mathrm{d}y =: (\mathcal{C}u)(x),$  (3)

which is the covariance operator related to the stochastic field $a$. Obviously, the Hilbert-Schmidt norms of $\mathcal{S}$ and $\mathcal{S}^\star$ correspond to the $L^2_P(\Omega; L^2(D))$-norm of $a_0$. The products of Hilbert-Schmidt operators form the trace-class of operators, cf. [28].

Definition 2.1. A bounded linear operator $\mathcal{A}\colon L^2(D) \to L^2(D)$ is of trace-class if

$\operatorname{Tr}\mathcal{A} := \sum_{m=1}^\infty \big( (\mathcal{A}^\star\mathcal{A})^{1/2} \varphi_m, \varphi_m \big)_{L^2(D)} < \infty$

holds for an arbitrary orthonormal basis $\{\varphi_m\}_m$ of $L^2(D)$.

For the positive and symmetric operator $\mathcal{C}$, i.e. $(\mathcal{C}u, u)_{L^2(D)} = (\mathcal{S}^\star u, \mathcal{S}^\star u)_{L^2_P(\Omega)} \geq 0$, it holds $\operatorname{Tr}\mathcal{C} = \sum_{m=1}^\infty (\mathcal{C}\varphi_m, \varphi_m)_{L^2(D)}$. Moreover, we derive

$\operatorname{Tr}\mathcal{C} = \sum_{m=1}^\infty (\mathcal{C}\varphi_m, \varphi_m)_{L^2(D)} = \sum_{m=1}^\infty \|\mathcal{S}^\star\varphi_m\|^2_{L^2_P(\Omega)} = \sum_{m=1}^\infty \int_\Omega \big( a_0(\cdot, \omega), \varphi_m \big)^2_{L^2(D)} \,\mathrm{d}P(\omega) = \int_\Omega \|a_0(\cdot, \omega)\|^2_{L^2(D)} \,\mathrm{d}P(\omega) = \|a_0\|^2_{L^2_P(\Omega; L^2(D))}.$  (4)

Notice that for continuous correlation kernels we have in particular

$\operatorname{Tr}\mathcal{C} = \int_D k(x, x) \,\mathrm{d}x$ with $k(x, y) := \int_\Omega a_0(x, \omega) a_0(y, \omega) \,\mathrm{d}P(\omega),$  (5)

which is a consequence of Mercer's theorem, cf. [29]. For more details on trace-class operators, we refer the reader to [28, 30].

A very common representation of random fields in terms of the covariance operator's eigen-pairs $\{(\lambda_m, \varphi_m)\}_m$ is given by the Karhunen-Loève expansion.

Definition 2.2. Let $a \in L^2_P(\Omega; L^2(D))$ be a random field. The representation

$a(x, \omega) = \bar a(x) + \sum_{m=1}^\infty \sqrt{\lambda_m}\, \varphi_m(x) X_m(\omega)$  (6)

is called the Karhunen-Loève expansion with respect to $a$. Here, $\bar a(x)$ denotes the mean of $a$ as defined in (2). The random variables $\{X_m\}_m$ are given according to

$X_m(\omega) := \frac{1}{\sqrt{\lambda_m}} (\mathcal{S}^\star\varphi_m)(\omega) = \frac{1}{\sqrt{\lambda_m}} \int_D a_0(x, \omega) \varphi_m(x) \,\mathrm{d}x$

and satisfy $(X_m, X_n)_{L^2_P(\Omega)} = \delta_{m,n}$ as well as $(X_m, 1)_{L^2_P(\Omega)} = 0$.

The Karhunen-Loève expansion can be regarded as the continuous analogue to the singular value decomposition of matrices. For numerical purposes, we have to truncate the series in (6) appropriately after $M \in \mathbb{N}$ terms. The question how small $M$ can actually be chosen in order to achieve a certain precision is closely related to the decay of the eigenvalues of $\mathcal{C}$, which depends on the smoothness index $p$. Results on the decay of the eigenvalues have been established for periodic functions already in [31]. Nevertheless, since we do not want to restrict ourselves to this situation, we refer here to the more general results in [32].

Theorem 2.3. Let $a \in L^2_P(\Omega; H^p(D))$ with $p > d/2$. Then, the eigenvalues of the covariance operator $\mathcal{C}\colon H^p(D) \to H^p(D)$ decay like $\lambda_m \lesssim m^{-2p/d}$ as $m \to \infty$, and it holds

$\varepsilon(M) := \sum_{m=M+1}^\infty \lambda_m \lesssim M^{1 - 2p/d}.$

2.2. Finite element approximation

For the approximation of spatial functions in $L^2(D)$, we employ piecewise polynomial (discontinuous) finite elements of order $s$. To that end, we introduce a family of quasi-uniform triangulations $\mathcal{T}_h$ for $D$ with mesh width $h$ and define the spaces

$V_h^s := \{ v_h\colon D \to \mathbb{R} : v_h|_T \text{ is a polynomial of order } s \text{ for all } T \in \mathcal{T}_h \} \subset L^2(D).$  (7)

Then, given a function $v \in H^t(D)$ with $0 \leq t \leq s$, the Bramble-Hilbert lemma yields the approximation estimate

$\|v - Q_h v\|_{L^2(D)} := \inf_{v_h \in V_h^s} \|v - v_h\|_{L^2(D)} \lesssim h^t \|v\|_{H^t(D)}$  (8)

uniformly in $h$, see e.g. [33, 34]. For the spatial approximation of $a(x, \omega)$, i.e.

$a_h(x, \omega) := (Q_h a)(x, \omega) = (Q_h \bar a)(x) + \sum_{m=1}^\infty \sqrt{\lambda_m}\, (Q_h\varphi_m)(x) X_m(\omega),$

we obtain in terms of the trace the following approximation result in $V_h^s$. Even though this result has already been derived in [11], we shall present here a proof which employs another technique required for our considerations later on.

Theorem 2.4. Let $N = \dim V_h^s$, let $\lambda_1 \geq \lambda_2 \geq \dots \geq 0$ be the eigenvalues of the covariance operator $\mathcal{C}$, and let $\lambda_{1,h} \geq \lambda_{2,h} \geq \dots \geq \lambda_{N,h} \geq 0$ be those of $\mathcal{C}_h := Q_h \mathcal{C} Q_h$. Then, it holds

$\|a_0 - Q_h a_0\|^2_{L^2_P(\Omega; L^2(D))} = \operatorname{Tr}\mathcal{C} - \operatorname{Tr}\mathcal{C}_h$

and therefore

$\|a_0 - Q_h a_0\|^2_{L^2_P(\Omega; L^2(D))} = \sum_{m=1}^N (\lambda_m - \lambda_{m,h}) + \sum_{m=N+1}^\infty \lambda_m.$

Proof. Let $\{\varphi_m\}_m$ be an orthonormal basis of $L^2(D)$ such that either $\varphi_m \in \operatorname{Im} Q_h$ or $\varphi_m \in \operatorname{Im}(I - Q_h)$ holds. Therefore, we obtain $(\mathcal{S}^\star(I - Q_h)\varphi_m, \mathcal{S}^\star Q_h\varphi_m)_{L^2_P(\Omega)} = 0$. Thus, we infer by (4) that

$\|a_0 - Q_h a_0\|^2_{L^2_P(\Omega; L^2(D))} = \sum_{m=1}^\infty \big[ (\mathcal{S}^\star\varphi_m, \mathcal{S}^\star\varphi_m)_{L^2_P(\Omega)} - (\mathcal{S}^\star Q_h\varphi_m, \mathcal{S}^\star Q_h\varphi_m)_{L^2_P(\Omega)} \big] = \operatorname{Tr}\mathcal{C} - \operatorname{Tr}\mathcal{C}_h.$
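The trace identity of Theorem 2.4 is easy to reproduce numerically. The following minimal sketch is our own illustration rather than the article's code: it discretizes an exponential (Matérn-1/2) covariance on $D = [0, 1]$ by simple midpoint collocation instead of a finite element Galerkin projection, and confirms that the squared truncation error of the discrete Karhunen-Loève expansion equals the neglected trace.

```python
# Illustration only (not the authors' code): trace identity for a discretized
# covariance operator, using Nystroem-type midpoint collocation on D = [0, 1].
import numpy as np

N = 200
x = (np.arange(N) + 0.5) / N                 # midpoints of a uniform grid
w = 1.0 / N                                  # quadrature weight per cell

# exponential (Matern-1/2) covariance kernel, correlation length 0.1 (assumed)
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.1)
C = w * K                                    # discretized covariance operator

lam = np.linalg.eigvalsh(C)[::-1]            # eigenvalues in decreasing order

# truncating the discrete Karhunen-Loeve expansion after M terms leaves a
# squared L^2-error equal to the neglected trace sum_{m > M} lambda_m
M = 20
print("trace:", C.trace(), " neglected trace:", lam[M:].sum())
print("relative trace error for M = 20:", lam[M:].sum() / C.trace())
```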

For the rest of this article, we refer to $\{(\lambda_m, \varphi_m)\}_m$ as the eigen-pairs of $\mathcal{C}$ (in decreasing order) and to $\{(\lambda_{m,h}, \varphi_{m,h})\}_{m=1}^N$ as the eigen-pairs of $\mathcal{C}_h$. Moreover, it is convenient to set $\bar a_h(x) := (Q_h \bar a)(x)$ and $a_{0,h} := a_h - \bar a_h$. By the application of the theorem and the approximation estimate (8), it is straightforward to show the following corollary.

Corollary 2.5. The trace error satisfies $0 \leq \operatorname{Tr}\mathcal{C} - \operatorname{Tr}\mathcal{C}_h \lesssim h^{2\min\{s,p\}}$ if $a \in L^2_P(\Omega; H^p(D))$, where the hidden constant involves the $L^2_P(\Omega; H^p(D))$-norm of $a$.

Theorem 2.4 remains valid if we introduce the additional orthogonal projection $P_h\colon V_h^s \to U$ onto an $M$-dimensional subspace $U \subset V_h^s$. The related projected stochastic field is given by $a_{h,M} := \bar a_h + P_h a_{0,h} = Q_h \bar a + P_h Q_h a_0$ and its covariance according to $\mathcal{C}_{h,M} := P_h \mathcal{C}_h P_h$. We arrive at the subsequent approximation result.

Theorem 2.6. Let $\mathcal{C}_h = Q_h \mathcal{C} Q_h$, $\mathcal{C}_{h,M} = P_h \mathcal{C}_h P_h$ and $a_{h,M} = \bar a_h + P_h a_{0,h}$. Then, there holds

$\|a - a_{h,M}\|_{L^2_P(\Omega; L^2(D))} \lesssim h^{\min\{s,p\}} + \sqrt{\operatorname{Tr}\mathcal{C}_h - \operatorname{Tr}\mathcal{C}_{h,M}},$  (9)

where the hidden constant involves the $L^2_P(\Omega; H^p(D))$-norm of $a$.

Proof. By Theorem 2.4 and Corollary 2.5, it holds

$\|a - a_{h,M}\|_{L^2_P(\Omega; L^2(D))} \leq \|(I - Q_h)\bar a\|_{L^2(D)} + \|a_0 - a_{0,h}\|_{L^2_P(\Omega; L^2(D))} + \|a_{0,h} - P_h a_{0,h}\|_{L^2_P(\Omega; L^2(D))} \lesssim h^{\min\{s,p\}} + \sqrt{\operatorname{Tr}\mathcal{C} - \operatorname{Tr}\mathcal{C}_h} + \sqrt{\operatorname{Tr}\mathcal{C}_h - \operatorname{Tr}\mathcal{C}_{h,M}} \lesssim h^{\min\{s,p\}} + \sqrt{\operatorname{Tr}\mathcal{C}_h - \operatorname{Tr}\mathcal{C}_{h,M}}.$

The theorem indicates that, after fixing the ansatz space $V_h^s$, the approximation error of the stochastic field is controllable in terms of the discretized operators $\mathcal{C}_h$ and $\mathcal{C}_{h,M}$. The optimal choice of $P_h$ in terms of minimizing the trace error is the orthogonal projection onto the dominant invariant subspace of $\mathcal{C}_h$, i.e. $U_{M,h} := \operatorname{span}\{\varphi_{1,h}, \dots, \varphi_{M,h}\}$, corresponding to the $M$ dominant eigenvalues of $\mathcal{C}_h$. The related Karhunen-Loève expansion then reads

$a_{h,M}(x, \omega) = \bar a_h(x) + \sum_{m=1}^M \sqrt{\lambda_{m,h}}\, \varphi_{m,h}(x) X_m(\omega),$  (10)

where the random variables are given according to

$X_m(\omega) := \frac{1}{\sqrt{\lambda_{m,h}}} \int_D (P_h Q_h a_0)(x, \omega)\, \varphi_{m,h}(x) \,\mathrm{d}x.$  (11)

Notice that, in this setting, the discretization of the stochastic field implies a change of the stochastic model induced by (11). If, however, $U_{M,h}$ and thus $P_h$ are not known exactly, they have to be approximated appropriately. This induces an additional error, and we have to assume that $\lambda_{M+1,h}$ is distinct from $\lambda_{M,h}$, cf. [35, 36]. Nevertheless, any subspace is feasible as long as the difference $\operatorname{Tr}\mathcal{C}_h - \operatorname{Tr}\mathcal{C}_{h,M}$ becomes small for moderate sizes of $M$.

2.3. Approximation of eigenvalues

In the remainder of this section, we briefly consider how well the decay of the eigenvalues $\{\lambda_m\}_m$ of $\mathcal{C}$ is reflected by the Ritz values $\lambda_{1,h}, \dots, \lambda_{N,h}$ related to $\mathcal{C}_h$. The approximation error of the Ritz values is measured in terms of the gap between the invariant subspace $U_m = \operatorname{span}\{\varphi_1, \dots, \varphi_m\} \subset L^2(D)$ of $\mathcal{C}$ and the approximation space $V_h^s$, i.e.

$\theta(U_m) := \sup_{u \in U_m,\, \|u\|_{L^2(D)} = 1} \|(I - Q_h)u\|_{L^2(D)}.$

In order to control the gap, we have to take the eigenfunctions' regularity into account. By [32], it holds for $a \in H^p(D) \otimes L^2_P(\Omega)$ that the eigenfunctions $\{\varphi_m\}_m$ of the covariance operator $\mathcal{C}$ satisfy $\|\varphi_m\|_{H^t(D)} \lesssim \lambda_m^{-t/(2p)}$ for $0 \leq t \leq \min\{s,p\}$ and $\lambda_m \neq 0$. Thus, the orthogonal projection of $\varphi_m$ onto $V_h^s$ yields the error estimate

$\|(I - Q_h)\varphi_m\|_{L^2(D)} \lesssim h^t \|\varphi_m\|_{H^t(D)} \lesssim \lambda_m^{-t/(2p)} h^t.$  (12)

Now, we can estimate the gap as follows.

Lemma 2.7. Let $U_m = \operatorname{span}\{\varphi_1, \dots, \varphi_m\} \subset L^2(D)$ be an invariant subspace of $\mathcal{C}$. Then, it holds for $0 \leq t \leq \min\{s,p\}$ and $\lambda_1, \dots, \lambda_m \neq 0$ that

$\theta(U_m) \lesssim \sqrt{m}\, \lambda_m^{-t/(2p)} h^t.$  (13)

Proof. Let $u = \sum_{i=1}^m \alpha_i \varphi_i$ with $\|u\|_{L^2(D)} = 1$. Thus, it holds $\sum_{i=1}^m \alpha_i^2 = 1$. Then, with $\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_m)$, we have

$\theta(U_m)^2 = \sup_{\|\boldsymbol{\alpha}\|_2 = 1} \Big\| \sum_{i=1}^m \alpha_i (I - Q_h)\varphi_i \Big\|^2_{L^2(D)} \leq \sup_{\|\boldsymbol{\alpha}\|_2 = 1} \Big( \sum_{i=1}^m |\alpha_i|\, \|(I - Q_h)\varphi_i\|_{L^2(D)} \Big)^2 \lesssim \sup_{\|\boldsymbol{\alpha}\|_2 = 1} \Big( \sum_{i=1}^m |\alpha_i|\, \lambda_i^{-t/(2p)} h^t \Big)^2 \leq m\, \lambda_m^{-t/p} h^{2t},$

where we used (12) in the second to last step and the Cauchy-Schwarz inequality together with $\lambda_i \geq \lambda_m$ in the last step.

Remark 2.8. In order to achieve convergence for the $m$-th eigenvalue, we have to guarantee $\theta(U_m) < 1$, which imposes a restriction on the mesh width $h$ of the discretization. Moreover, we have $m \leq N = \dim V_h^s$.

From [37] we finally obtain a convergence result which relates the eigenvalues' rate of approximation to the gap.

Theorem 2.9. Let $U_m = \operatorname{span}\{\varphi_1, \dots, \varphi_m\}$ be an invariant subspace of $\mathcal{C}$ such that $\dim(Q_h U_m) = m$. Then, the approximation $\lambda_{i,h}$ to the $i$-th eigenvalue by the Rayleigh-Ritz method, i.e. $\mathcal{C}_h \varphi_{i,h} = \lambda_{i,h} \varphi_{i,h}$, satisfies the estimate

$0 \leq \lambda_i - \lambda_{i,h} \lesssim \lambda_i\, \theta(U_i)^2$ for all $i \leq m$.  (14)

Thus, we can approximate the eigenvalues in $V_h^s$ according to

$0 \leq \lambda_i - \lambda_{i,h} \lesssim i\, \lambda_i^{(p-t)/p} h^{2t}$ for all $i \leq m$ and $0 \leq t \leq \min\{s,p\}$.

Especially, for $s \leq p$, the eigenvalues of $\mathcal{C}_h$ exhibit a similar rate of decay as the eigenvalues of $\mathcal{C}$ up to a relative error of $i\, \lambda_i^{-s/p} h^{2s}$.

3. THE PIVOTED CHOLESKY DECOMPOSITION

Based on the observation in Theorem 2.6 and the subsequent discussion, we consider in this section a more general approach for the representation of a random field. For this purpose, we rather refer here to the separable decomposition

$a_{h,M}(x, \omega) = \bar a_h(x) + \sum_{m=1}^M \psi_{m,h}(x) Y_m(\omega)$  (15)

than to the orthogonal decomposition (10). In the expansion (15), we assume that $\{\psi_{m,h}\}_{m=1}^M \subset V_h^s$ with appropriately modified random variables $\{Y_m\}_{m=1}^M$.

Algorithm 1: Pivoted Cholesky decomposition ([21])
Data: matrix $A = [a_{i,j}] \in \mathbb{R}^{N \times N}$ and error tolerance $\varepsilon > 0$
Result: low-rank approximation $A_M = \sum_{i=1}^M l_i l_i^\top$ such that $\operatorname{trace}(A - A_M) \leq \varepsilon$
begin
  set $M := 1$;
  set $d := \operatorname{diag}(A)$ and $\text{error} := \|d\|_1$;
  initialize $\pi := [1, 2, \dots, N]$;
  while $\text{error} > \varepsilon$ do
    set $i := \arg\max\{d_{\pi_j} : j = M, M+1, \dots, N\}$;
    swap $\pi_M$ and $\pi_i$;
    set $l_{M,\pi_M} := \sqrt{d_{\pi_M}}$;
    for $M + 1 \leq i \leq N$ do
      compute $l_{M,\pi_i} := \Big( a_{\pi_M,\pi_i} - \sum_{j=1}^{M-1} l_{j,\pi_M} l_{j,\pi_i} \Big) \Big/ l_{M,\pi_M}$;
      update $d_{\pi_i} := d_{\pi_i} - l_{M,\pi_i}^2$;
    compute $\text{error} := \sum_{i=M+1}^N d_{\pi_i}$;
    increase $M := M + 1$;
end

One possibility to obtain a separable expansion (15) is to compute the pivoted Cholesky decomposition of the coefficient matrix of $\mathcal{C}_h$ with respect to a basis of $V_h^s$. To that end, let $\Phi(x) := [\phi_1(x), \dots, \phi_N(x)]$ denote an orthonormal basis of $V_h^s$. Then, the coefficient matrix of $\mathcal{C}_h$ with respect to $\Phi$ is given by

$\mathbf{C} = \big[ (\mathcal{C}_h \phi_j, \phi_i)_{L^2(D)} \big]_{i,j=1}^N \in \mathbb{R}^{N \times N}.$  (16)

For each finite dimensional ansatz space, the matrix $\mathbf{C}$ is symmetric and positive semidefinite. Thus, $\mathbf{C}$ exhibits a (possibly pivoted) Cholesky decomposition. By pivoting the Cholesky decomposition (see Algorithm 1), we achieve numerical stability on the one hand, cf. [38, 39], and, if the eigenvalues of $\mathbf{C}$ decay sufficiently fast, a low-rank approximation on the other hand, cf. [21]. Especially, the approximation error of the (truncated) pivoted Cholesky decomposition is a-posteriori controllable in terms of the (discrete) trace, i.e. $\operatorname{trace}(\mathbf{A}) := \sum_{i=1}^N \langle \mathbf{A}e_i, e_i \rangle = \sum_{i=1}^N a_{i,i}$, where $e_1, \dots, e_N$ denotes the canonical basis of $\mathbb{R}^N$.
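For illustration, the following sketch implements Algorithm 1 with NumPy. It is a simplified Python rendering of ours, not the authors' C implementation; in particular, entries are supplied by a callback so that only the diagonal and the pivot rows of the matrix are ever evaluated.

```python
import numpy as np

def pivoted_cholesky(get_entry, diag, eps):
    """Algorithm 1: returns L with A ~ L @ L.T and trace(A - L L^T) <= eps.

    get_entry(i, j) yields a single matrix entry; diag is the main diagonal."""
    N = len(diag)
    d = np.asarray(diag, dtype=float).copy()
    pi = np.arange(N)                         # pivot permutation
    cols = []                                 # columns l_1, ..., l_M
    M = 0
    error = d.sum()                           # trace of the full matrix
    while error > eps:
        i = M + int(np.argmax(d[pi[M:]]))     # total pivoting on the diagonal
        pi[M], pi[i] = pi[i], pi[M]
        l = np.zeros(N)
        l[pi[M]] = np.sqrt(d[pi[M]])
        row = np.array([get_entry(pi[M], pi[k]) for k in range(M + 1, N)])
        if M > 0:
            Lm = np.stack(cols)               # previous columns, shape (M, N)
            row = row - Lm[:, pi[M]] @ Lm[:, pi[M + 1:]]
        l[pi[M + 1:]] = row / l[pi[M]]
        d[pi[M + 1:]] -= l[pi[M + 1:]] ** 2   # update Schur complement diagonal
        error = d[pi[M + 1:]].sum()           # a-posteriori trace error
        cols.append(l)
        M += 1
    return np.stack(cols).T                   # N x M factor

# usage sketch on a small dense test matrix
A = np.exp(-np.abs(np.subtract.outer(np.arange(50.0), np.arange(50.0))) / 5.0)
L = pivoted_cholesky(lambda i, j: A[i, j], np.diag(A), eps=1e-8)
print(L.shape, np.abs(A - L @ L.T).max())
```

Note that the trace-based stopping criterion comes for free from the updated diagonal, which is exactly what makes the a-posteriori error control of the following subsections practical.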

3.1. Separable expansions of random fields

In the following, we establish the connection between the approximation to the random field obtained by the pivoted Cholesky decomposition and the Karhunen-Loève expansion of $Q_h a(x, \omega)$. We denote the spectral decomposition related to $\mathbf{C}$ by $\mathbf{C} = \sum_{i=1}^N \lambda_i v_i v_i^\top$ with $\lambda_i \in [0, \infty)$ and $v_i \in \mathbb{R}^N$. Therefore, the Karhunen-Loève expansion of $Q_h a(x, \omega)$ is given by

$a_h(x, \omega) = \Phi(x) \mathbf{a}_h + \sum_{i=1}^N \sqrt{\lambda_i}\, \Phi(x) v_i X_i(\omega)$

with respect to the orthonormal basis $\Phi$ of $V_h^s$ and $\mathbf{a}_h := [(\bar a, \phi_i)_{L^2(D)}]_{i=1}^N \in \mathbb{R}^N$. Moreover, the random variables $\{X_i\}_{i=1}^N$ are given by (11), where $P_h = I$ is chosen as the identity mapping on $V_h^s$. This representation can be rewritten in matrix notation as

$a_h(x, \omega) - \Phi(x)\mathbf{a}_h =: \Phi(x) V \Sigma X(\omega)$  (17)

with $V := [v_1, \dots, v_N]$, $\Sigma := \operatorname{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_N})$ and $X(\omega) := [X_1(\omega), \dots, X_N(\omega)]^\top$. The matrix $(V\Sigma)^\top \in \mathbb{R}^{N \times N}$ from (17) exhibits a QR-decomposition $QL^\top = (V\Sigma)^\top$, or $LQ^\top = V\Sigma$, respectively. Here, $Q$ denotes an orthogonal matrix, i.e. $Q^\top Q = I \in \mathbb{R}^{N \times N}$, and $L \in \mathbb{R}^{N \times N}$ is a lower triangular matrix. We shall next define the transformed random vector $Y(\omega) := Q^\top X(\omega)$. Then, $Y(\omega)$ also consists of $N$ uncorrelated and centered random variables, since it holds

$\int_\Omega Y(\omega) Y^\top(\omega) \,\mathrm{d}P(\omega) = Q^\top \int_\Omega X(\omega) X^\top(\omega) \,\mathrm{d}P(\omega)\, Q = Q^\top I Q = I.$

That the random variables $Y_i(\omega)$ are also centered follows from the fact that they are weighted sums of centered random variables. Thus, we obtain a representation equivalent to (17) according to

$\Phi(x) V \Sigma X(\omega) = \Phi(x) L Y(\omega),$

where the change of basis $Q$ only acts on the basis of $L^2_P(\Omega)$. Moreover, we observe

$\mathbf{C} = V\Sigma(V\Sigma)^\top = LQ^\top Q L^\top = LL^\top.$

Since $L$ is a lower triangular matrix, we thus end up with the Cholesky decomposition of $\mathbf{C}$. In the following, without loss of generality, we will especially assume that $LL^\top$ corresponds to the pivoted Cholesky decomposition of $\mathbf{C}$. Using the Cholesky decomposition of $\mathbf{C}$, we obtain the separable representation

$a_h(x, \omega) = \Phi(x)\mathbf{a}_h + \sum_{i=1}^N \Phi(x) l_i Y_i(\omega)$  (18)

for $a_h(x, \omega)$ with $L = [l_1, \dots, l_N]$, whereas the related truncated Cholesky decomposition leads to the truncated expansion

$a_{h,M}(x, \omega) = \Phi(x)\mathbf{a}_h + \sum_{i=1}^M \Phi(x) l_i Y_i(\omega).$

Notice that this is exactly the representation (15) with $\psi_i(x) = \Phi(x) l_i$.

Remark 3.1. The separable representation (18) of the stochastic field is based on the knowledge of an appropriate matrix $R \in \mathbb{R}^{N \times N}$, a square root of the coefficient matrix, such that $\mathbf{C} = RR^\top$. It is known that for two different square roots, i.e. $\mathbf{C} = RR^\top = \tilde R \tilde R^\top$, there exists an orthogonal matrix $Q \in \mathbb{R}^{N \times N}$ such that $\tilde R = RQ$. The change of the representation (17) due to the application of $Q$ is then performed by the change of the basis functions in $L^2_P(\Omega)$, i.e. $Y(\omega) := Q^\top X(\omega)$. Thus, any square root of $\mathbf{C}$ yields a separable representation of $a_h(x, \omega)$. Nevertheless, we focus on the pivoted Cholesky decomposition here.

3.2. Approximation by the pivoted Cholesky decomposition

The approximation error of a given stochastic field induced by truncating the pivoted Cholesky decomposition is controllable in accordance with the following theorem.

Theorem 3.2. For the coefficient matrix $\mathbf{C} \in \mathbb{R}^{N \times N}$ given by (16), let $\mathbf{C}_M = L_M L_M^\top \in \mathbb{R}^{N \times N}$ denote its (truncated) pivoted Cholesky decomposition computed by Algorithm 1 such that $\operatorname{trace}(\mathbf{C} - \mathbf{C}_M) < \varepsilon$ holds for some $\varepsilon > 0$. Then, for the related stochastic fields, we have the error estimate

$\|a_h - a_{h,M}\|^2_{L^2_P(\Omega; L^2(D))} < \varepsilon.$

Proof. Let $\mathbf{C} = LL^\top$ denote the pivoted Cholesky decomposition of $\mathbf{C}$. We define

$(\mathcal{S}_h^\star u)(\omega) := \int_D u(x)\, \Phi(x) L Y(\omega) \,\mathrm{d}x$

and $P_h\colon V_h^s \to \operatorname{span}\{\phi_1(x), \dots, \phi_M(x)\}$ the orthogonal projection onto the space spanned by the first $M$ basis functions. Then, it holds in complete analogy to the proof of Theorem 2.4 that

$\|a_h - a_{h,M}\|^2_{L^2_P(\Omega; L^2(D))} = \sum_{i=1}^N \big[ (\mathcal{S}_h^\star\phi_i, \mathcal{S}_h^\star\phi_i)_{L^2_P(\Omega)} - (\mathcal{S}_h^\star P_h\phi_i, \mathcal{S}_h^\star P_h\phi_i)_{L^2_P(\Omega)} \big] = \sum_{i=1}^N \langle L^\top e_i, L^\top e_i \rangle - \sum_{i=1}^M \langle L^\top e_i, L^\top e_i \rangle = \operatorname{trace}(\mathbf{C} - \mathbf{C}_M) < \varepsilon.$

The theorem states that the choice $\varepsilon \eqsim h^{2\min\{p,s\}}$ in the pivoted Cholesky decomposition guarantees, together with inequality (9), the (optimal) error estimate

$\|a - a_{h,M}\|_{L^2_P(\Omega; L^2(D))} \lesssim h^{\min\{p,s\}}.$

The major advantage of this approach is that the coefficient matrix $\mathbf{C}$ never has to be assembled in full. It is sufficient to provide access to single entries of this matrix while processing the pivoted Cholesky decomposition. The error in the approximation of the random field $a$ is then a-posteriori controllable by the trace.

Given that the pivoted Cholesky decomposition for $\mathbf{C}$ truncates with $M \ll N$ terms and $\mathbf{C}_M = L_M L_M^\top \in \mathbb{R}^{N \times N}$, where $L_M \in \mathbb{R}^{N \times M}$, the computation of the related Karhunen-Loève expansion is performed with complexity $\mathcal{O}(M^2 N)$, cf. [21]. This can be achieved by computing the eigenvalues of $L_M^\top L_M \in \mathbb{R}^{M \times M}$, which coincide with those of $\mathbf{C}_M$. Then, if $v_1, \dots, v_M$ denote the orthonormal eigenvectors of the small eigen-problem, the eigenvectors of $\mathbf{C}_M$ are given by $L_M v_1, \dots, L_M v_M$ and we have

$(L_M v_i)^\top (L_M v_j) = v_i^\top L_M^\top L_M v_j = \lambda_i \delta_{i,j}$ for all $i, j = 1, \dots, M.$  (19)

Thus, the related Karhunen-Loève decomposition is given by

$a_{h,M}(x, \omega) = \bar a_h(x) + \sum_{i=1}^M \Phi(x) L_M v_i\, \tilde X_i(\omega).$  (20)

If the laws of the random variables $\tilde X_i(\omega)$ are known, we obtain the relation

$\tilde X(\omega) = [v_1, \dots, v_M]^\top Y(\omega)$

with $v_i$ from (20). Otherwise, the related random variables $\tilde X_i(\omega)$ can be determined by a maximum likelihood estimate, cf. [2]. Notice that for the important Gaussian case $X(\omega) \sim [\mathcal{N}(0,1)]^N$, we have $Q^\top X(\omega) \sim [\mathcal{N}(0,1)]^N$ for any orthogonal transform $Q$, and thus $\tilde X$ has the same law as $X$.
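The post-processing step (19)-(20) is equally compact in code. The following sketch, again our own NumPy illustration building on the factor returned by the pivoted Cholesky sketch above, computes the eigen-pairs of $\mathbf{C}_M$ from the small $M \times M$ eigen-problem; the tolerance for discarding numerically vanishing eigenvalues is our own safeguard.

```python
import numpy as np

def kl_from_cholesky_factor(L, rtol=1e-12):
    """Eigen-pairs of C_M = L @ L.T from the M x M problem L.T @ L, cf. (19)."""
    mu, V = np.linalg.eigh(L.T @ L)           # small eigen-problem, ascending
    mu, V = mu[::-1], V[:, ::-1]              # sort decreasingly
    keep = mu > rtol * mu[0]                  # drop numerically vanishing modes
    mu, V = mu[keep], V[:, keep]
    U = (L @ V) / np.sqrt(mu)                 # normalized eigenvectors of C_M
    return mu, U

# with L from the pivoted Cholesky sketch: verify the orthonormality (19)
mu, U = kl_from_cholesky_factor(L)
print(np.abs(U.T @ U - np.eye(U.shape[1])).max())
```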

4. THE MATÉRN CLASS OF KERNELS

For our numerical tests, we consider a special subset of Hilbert-Schmidt kernels, namely the Matérn class of kernel functions, cf. [40]. They are very often used as covariance kernels for the definition of stochastic fields. In accordance with [2], they are defined as follows.

Definition 4.1. Let $r := \|x - y\|_2$ and $\ell \in (0, \infty)$. Then, the Matérn covariance function of order $\nu > 0$ is given by

$k_\nu(r) := \frac{2^{1-\nu}}{\Gamma(\nu)} \Big( \frac{\sqrt{2\nu}\, r}{\ell} \Big)^\nu K_\nu\Big( \frac{\sqrt{2\nu}\, r}{\ell} \Big).$  (21)

Here, $\Gamma$ denotes the gamma function and $K_\nu$ denotes the modified Bessel function of the second kind of order $\nu$, cf. [41]. The expression (21) simplifies if $\nu = p + 1/2$ with $p \in \mathbb{N}$. In this case, [2] provides

$k_{p+1/2}(r) = \exp\Big( -\frac{\sqrt{2\nu}\, r}{\ell} \Big) \frac{p!}{(2p)!} \sum_{i=0}^p \frac{(p+i)!}{i!\,(p-i)!} \Big( \frac{\sqrt{8\nu}\, r}{\ell} \Big)^{p-i}.$

Especially, we have

$\nu = 1/2:\quad k_{1/2}(r) = \exp\Big( -\frac{r}{\ell} \Big),$
$\nu = 3/2:\quad k_{3/2}(r) = \Big( 1 + \frac{\sqrt{3}\, r}{\ell} \Big) \exp\Big( -\frac{\sqrt{3}\, r}{\ell} \Big),$
$\nu = 5/2:\quad k_{5/2}(r) = \Big( 1 + \frac{\sqrt{5}\, r}{\ell} + \frac{5 r^2}{3 \ell^2} \Big) \exp\Big( -\frac{\sqrt{5}\, r}{\ell} \Big),$
$\nu = 7/2:\quad k_{7/2}(r) = \Big( 1 + \frac{\sqrt{7}\, r}{\ell} + \frac{14 r^2}{5 \ell^2} + \frac{7\sqrt{7}\, r^3}{15 \ell^3} \Big) \exp\Big( -\frac{\sqrt{7}\, r}{\ell} \Big),$
$\nu = 9/2:\quad k_{9/2}(r) = \Big( 1 + \frac{3 r}{\ell} + \frac{27 r^2}{7 \ell^2} + \frac{18 r^3}{7 \ell^3} + \frac{27 r^4}{35 \ell^4} \Big) \exp\Big( -\frac{3 r}{\ell} \Big),$
$\nu = \infty:\quad k_\infty(r) = \exp\Big( -\frac{r^2}{2 \ell^2} \Big).$  (22)

A visualization of these kernels for different values of $\nu$ is given in Figure 1.

Figure 1. The Matérn kernels $k_\nu$ for different values of the smoothness parameter $\nu$.

Obviously, the Sobolev smoothness of the kernel $k_\nu$ is controlled by the smoothness parameter $\nu$: for increasing values of $\nu$, the respective kernel function $k_\nu$ exhibits successively more regularity. Especially, the eigenvalues of the Matérn correlation kernels decay like

$\lambda_m \leq C m^{-(1 + 2\nu/d)}$  (23)

for some $C > 0$, cf. [42]. Thus, since the decay of the covariance operator's eigenvalues is known in advance, these kernels are very well suited for numerical examples.

Obviously, the Matérn kernels provide rotational symmetry, i.e. they are invariant under isometries of $D$, since they depend only on the distance of the points $x$ and $y$. Thus, we obtain analytic expressions for the eigenvalues of the underlying Hilbert-Schmidt operators if we choose $D = \mathbb{S}^{d-1}$ to be the unit sphere in $\mathbb{R}^d$. More precisely, we may apply the Funk-Hecke formula, cf. [43], which reads as follows.

Theorem 4.2. Let $x \in \mathbb{S}^{d-1}$ and $f \in C([-1, 1])$. Then it holds

$\int_{\mathbb{S}^{d-1}} f(x^\top y)\, Y_m(y) \,\mathrm{d}\sigma_y = \lambda_m Y_m(x)$

with

$\lambda_m = |\mathbb{S}^{d-2}| \int_{-1}^1 P_m(d; t)\, f(t)\, (1 - t^2)^{(d-3)/2} \,\mathrm{d}t.$

Here, $Y_m$ corresponds to a spherical harmonic of order $m$ and $P_m(d; t)$ denotes the polynomial

$P_m(d; t) := m!\, \Gamma\Big( \frac{d-1}{2} \Big) \sum_{i=0}^{\lfloor m/2 \rfloor} \Big( -\frac{1}{4} \Big)^i \frac{(1 - t^2)^i\, t^{m-2i}}{i!\, (m - 2i)!\, \Gamma\big( i + \frac{d-1}{2} \big)}.$

A proof of this theorem can be found in [43]. Especially, for the case $d = 3$, the polynomials $P_m(3; t)$ correspond to the Legendre polynomials, cf. [43]. Notice that the Funk-Hecke formula applies to all kernel functions on $\mathbb{S}^{d-1}$ which depend only on the Euclidean distance $r(x, y) = \|x - y\|_2$. This is easily seen due to $r(x, y)^2 = \|x - y\|_2^2 = 2 - 2 x^\top y$ for all $x, y \in \mathbb{S}^{d-1}$, so that $r$ is a function of $x^\top y$ alone.

Figure 2 shows the distribution of the Matérn kernels' eigenvalues for $\nu = 1/2, 3/2, 5/2, 9/2$ on $\mathbb{S}^2$ down to a prescribed order of magnitude for a fixed correlation length $\ell$. The constant $C$ is estimated by a least-squares fit of the ratio between the rate given by formula (23) for $C = 1$ and the exact eigenvalues given by Theorem 4.2. The obtained values of $C$ for each kernel under consideration are denoted in the legend of Figure 2. The plot indicates that the fitted rates perfectly match the asymptotic behavior of the eigenvalues.

Figure 2. Decay of the eigenvalues of the Matérn kernels on $\mathbb{S}^2$ with related fits.
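For $d = 3$, i.e. on the sphere $\mathbb{S}^2$, the Funk-Hecke formula reduces to $\lambda_m = 2\pi \int_{-1}^{1} P_m(t)\, k_\nu(\sqrt{2 - 2t}) \,\mathrm{d}t$ with the classical Legendre polynomials $P_m$, each eigenvalue occurring with multiplicity $2m + 1$. The following sketch, an illustration of ours in which the correlation length $\ell = 1$ and the quadrature order are assumptions, evaluates these eigenvalues for the Matérn-3/2 kernel by Gauss-Legendre quadrature and checks them against the exact trace $4\pi$ and the decay rate (23).

```python
# Illustration: Matern eigenvalues on S^2 via the Funk-Hecke formula,
#   lambda_m = 2 pi * int_{-1}^{1} P_m(t) k_nu(sqrt(2 - 2 t)) dt,
# with multiplicity 2m + 1 per order m. Assumes correlation length ell = 1.
import numpy as np
from numpy.polynomial.legendre import leggauss, Legendre

def matern_32(r, ell=1.0):
    s = np.sqrt(3.0) * r / ell
    return (1.0 + s) * np.exp(-s)

t, w = leggauss(400)                          # Gauss-Legendre rule on [-1, 1]
f = matern_32(np.sqrt(np.maximum(2.0 - 2.0 * t, 0.0)))

m = np.arange(40)
lam = np.array([2.0 * np.pi * np.sum(w * Legendre.basis(k)(t) * f) for k in m])

# the trace sum_m (2m + 1) lambda_m approximates Tr C = |S^2| = 4 pi, cf. (5)
print(np.sum((2 * m + 1) * lam), 4.0 * np.pi)

# decay check: (23) with surface dimension d = 2 gives lambda ~ n^{-(1 + nu)}
# in the sorted index n ~ m^2, i.e. lambda_m ~ m^{-(2 + 2 nu)} = m^{-5} here
print(lam[10::10] * m[10::10] ** 5.0)         # should level off to a constant
```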

5. NUMERICAL RESULTS

The numerical tests in this section are performed on parametric surfaces $\Gamma \subset \mathbb{R}^3$. These surfaces have recently been considered in the context of solving boundary integral equations, cf. [44] and the references therein. The following subsection provides some details on the numerical implementation of the considered methods.

5.1. Implementation

For the sake of completeness, we start by briefly recalling the idea of ACA. For more details, we refer to [15] and the references therein. For the application of ACA, we exploit that the correlation kernel $k(x, y)$ of the covariance operator $\mathcal{C}\colon H^p(D) \to H^p(D)$ is asymptotically smooth, i.e. the kernel is smooth away from the diagonal $\{(x, y) \in D \times D : x = y\}$. Then, the coefficient matrix $\mathbf{C}$ from (16) can be partitioned into admissible, i.e. compressible, matrix blocks corresponding to the far-field of $\mathcal{C}$ and non-admissible, i.e. non-compressible, matrix blocks corresponding to the near-field of $\mathcal{C}$. ACA provides a means to compress the admissible matrix blocks of $\mathbf{C}$. A visualization of the block partitioning and the related ranks of $\mathbf{C}$ in case of the Matérn-9/2 kernel on $\mathbb{S}^2$ is depicted in Figure 3.

Figure 3. Partition of the matrix $\mathbf{C}$ for the Matérn-9/2 kernel with inscribed ranks.

In each admissible matrix block, we approximate $\mathbf{C}$ by a truncated, partially pivoted Gaussian elimination, cf. [16]. To this end, we define the vectors $l_m, u_m \in \mathbb{R}^n$ by the following iterative scheme, where $\mathbf{C}_{\mathrm{adm}} = [c_{i,j}]_{i,j=1}^n$ denotes the admissible matrix block under consideration: for $m = 1, 2, \dots$ set

$u_m = \hat u_m / [\hat u_m]_{j_m}$ with $\hat u_m = [c_{i_m,j}]_{j=1}^n - \sum_{j=1}^{m-1} [l_j]_{i_m} u_j$ and $l_m = [c_{i,j_m}]_{i=1}^n - \sum_{i=1}^{m-1} [u_i]_{j_m} l_i.$  (24)

A criterion to guarantee the convergence of the algorithm is to choose the pivot element located in the $(i_m, j_m)$-position as the maximum element in modulus of the remainder $\mathbf{C}_{\mathrm{adm}} - L_m U_m$, where we define the matrices $L_m := [l_1, \dots, l_m]$ and $U_m := [u_1, \dots, u_m]^\top$. This would require the assembly of the whole matrix block $\mathbf{C}_{\mathrm{adm}}$, which is not feasible in practice. Therefore, we employ another pivoting strategy: we choose $j_m$ such that $[\hat u_m]_{j_m}$ is the largest element in modulus of the row $\hat u_m$. We finally stop the iteration if the criterion

$\|l_{m+1}\|_2 \|u_{m+1}\|_2 \leq \varepsilon\, \|L_m U_m\|_F$

is met for some desired accuracy $\varepsilon > 0$. Here and in the following, $\|\cdot\|_F$ denotes the Frobenius norm. Notice that the stopping criterion supposes a saturation assumption, i.e. the assumption that the error is reduced in each step by a constant factor. Compressing each block with the prescribed accuracy $\varepsilon$ yields the overall error estimate $\|\mathbf{C} - \tilde{\mathbf{C}}\|_F \lesssim \varepsilon \|\mathbf{C}\|_F$, where $\tilde{\mathbf{C}}$ denotes the compressed matrix. Obviously, the complexity for the computation of the rank-$m$ approximation $L_m U_m$ to the block $\mathbf{C}_{\mathrm{adm}}$ is $\mathcal{O}(m^2 n)$ and the storage cost is $\mathcal{O}(mn)$. The latter can be further reduced by the application of a singular value decomposition and by neglecting non-relevant singular values.

Notice that (24) combined with total pivoting would result, for symmetric matrix blocks, in an algorithm quite similar to the pivoted Cholesky decomposition. Nevertheless, for PCD, we do not have to partition the system matrix into far- and near-field, but we directly apply Algorithm 1 to $\mathbf{C}$. In this sense, we may think of PCD as a single-block ACA with total pivoting. Here, the total pivoting is not prohibitively expensive since it is a-priori known that the pivots are located along the main diagonal of $\mathbf{C}$. Furthermore, we have, in contrast to ACA, a rigorous stopping criterion based on the quantity $\operatorname{trace}(\mathbf{C} - L_m L_m^\top)$.

The implementations of both ACA and PCD rely on the same single-scale code, which means that they use the same quadrature routines for the assembly of the entries of the matrix $\mathbf{C}$. In case of ACA, we end up with a data-sparse representation $\tilde{\mathbf{C}}$ of $\mathbf{C}$. Thus, to obtain a representation of the approximate stochastic field $a_{h,M}$ similar to (10), we still have to compute the dominant eigen-pairs of $\tilde{\mathbf{C}}$. The representation of $\tilde{\mathbf{C}}$ yields a fast matrix-vector product. Therefore, we employ ARPACK, cf. [25], to solve the eigen-problem for the compressed matrix $\tilde{\mathbf{C}}$. The size of the Krylov subspace in ARPACK is chosen as twice the number of desired eigenvalues, which is a reasonable choice according to [25]. Both methods have been implemented in the C programming language, cf. [45]. Furthermore, in the implementation of ACA, we have employed level 2 and level 3 BLAS routines (http://www.netlib.org/blas/) in the assembly of the matrix and in the matrix-vector product whenever possible. Additionally, exploiting the symmetry of $\mathbf{C}$, we only assembled the lower triangular part of the matrix $\mathbf{C}$ for ACA. Notice that a LAPACK-style implementation (http://www.netlib.org/lapack/) of the PCD, cf. [46], is not applicable since it relies on the assembly of the entire matrix $\mathbf{C}$, which is not feasible for large values of $N$: already for $N \approx 10^5$, storing the full matrix $\mathbf{C}$ would require about 80GB of memory in 8-byte double precision.

5.2. Numerical examples

All computations are carried out single-threaded on a computing server with two Intel(R) Xeon(R) CPUs. Furthermore, we set the correlation length $\ell$ of the Matérn kernels to the same fixed value in each example.
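To make the iteration (24) from Section 5.1 concrete, the following sketch implements the partially pivoted ACA with the row-maximum pivoting and the saturation-based stopping criterion described above. It is a Python illustration of ours, not the article's C implementation; in particular, $\|L_m U_m\|_F$ is only estimated from the accumulated rank-one terms.

```python
import numpy as np

def aca(get_row, get_col, n_rows, n_cols, eps):
    """Low-rank approximation C_adm ~ L @ U from entry access only, cf. (24)."""
    rows, cols = [], []                       # vectors l_m and u_m
    used = set()                              # row pivots consumed so far
    frob2 = 0.0                               # estimate of ||L_m U_m||_F^2
    i = 0
    for _ in range(min(n_rows, n_cols)):
        u = get_row(i).astype(float)
        for l_, u_ in zip(rows, cols):        # subtract previous rank-one terms
            u -= l_[i] * u_
        j = int(np.argmax(np.abs(u)))         # pivot: largest entry of the row
        if abs(u[j]) < 1e-14:
            break                             # block numerically exhausted
        u /= u[j]
        l = get_col(j).astype(float)
        for l_, u_ in zip(rows, cols):
            l -= u_[j] * l_
        used.add(i)
        rows.append(l)
        cols.append(u)
        frob2 += (l @ l) * (u @ u)
        if np.linalg.norm(l) * np.linalg.norm(u) <= eps * np.sqrt(frob2):
            break                             # saturation-based stopping
        mask = np.ones(n_rows, dtype=bool)    # next row pivot: largest entry
        mask[list(used)] = False              # of l_m among the unused rows
        i = int(np.argmax(np.abs(l) * mask))
    return np.array(rows).T, np.array(cols)

# usage on a smooth, far-field-like block of an exponential kernel
x = np.linspace(0.0, 1.0, 120)
y = np.linspace(3.0, 4.0, 100)
B = np.exp(-np.abs(x[:, None] - y[None, :]))
L, U = aca(lambda i: B[i].copy(), lambda j: B[:, j].copy(), 120, 100, 1e-8)
print(L.shape, np.abs(B - L @ U).max())
```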
For the spatial discretization, we choose piecewise constant finite elements, i.e. the ansatz space is $V_h^1$, cf. (7). For ACA, we set the truncation error $\varepsilon \eqsim h^2$. Thus, we expect for both approaches a linear rate of convergence for the stochastic field in terms of the (continuous) traces, i.e.

$\|a - a_{h,M}\|_{L^2_P(\Omega; L^2(D))} \lesssim h + \sqrt{\operatorname{Tr}\mathcal{C}_h - \operatorname{Tr}\mathcal{C}_{h,M}}.$

Therefore, we will measure the error by the quantity $\sqrt{(\operatorname{Tr}\mathcal{C}_h - \operatorname{Tr}\mathcal{C}_{h,M}) / \operatorname{Tr}\mathcal{C}_h}$. Notice that we consider here the relative error in order to make the error independent of the scaling of the eigenvalues, which depends on the size of the domain, cf. (5).

First example

As a benchmark, we consider the three-dimensional unit sphere $\mathbb{S}^2 \subset \mathbb{R}^3$, represented by congruent patches, see Figure 4.

Figure 4. The unit sphere $\mathbb{S}^2$ represented by patches.

Thus, with the knowledge from the preceding section, we can compute the exact eigenvalues and eigenfunctions of the Matérn covariance functions as reference. Furthermore, we can estimate the truncation error due to (23). For the truncation error related to the Matérn covariance with smoothness parameter $\nu$, it holds that

$\sqrt{ \sum_{m=M+1}^\infty \lambda_m } \leq \sqrt{ \int_M^\infty C x^{-(1+2\nu/d)} \,\mathrm{d}x } = \sqrt{ \frac{Cd}{2\nu} }\, M^{-\nu/d}.$  (25)

Notice that the dimension is essentially $d = 2$ here, since we restrict the Matérn kernels to the unit sphere $\mathbb{S}^2$. Thus, to bound the truncation error of the Karhunen-Loève expansion by $h$, we have to ensure that

$\sqrt{ \frac{Cd}{2\nu} }\, M^{-\nu/d} \lesssim h, \quad\text{i.e.}\quad M \gtrsim \Big( \frac{Cd}{2\nu h^2} \Big)^{d/(2\nu)}.$

With the estimation of the constant $C$ at hand, cf. Figure 2, we could now compute the related length of the Karhunen-Loève expansion. Unfortunately, this approach yields very large numbers of eigen-pairs to be approximated by ACA. Therefore, we choose another approach: we consider for each respective kernel the sum of those eigenvalues whose magnitude exceeds a prescribed threshold as an approximation to the actual trace of the kernel, i.e.

$\int_{\mathbb{S}^2} k_\nu(x, x) \,\mathrm{d}\sigma_x = \sum_{m=1}^{M_{\max}} \lambda_m + \varepsilon_\nu,$

cf. (5), where $M_{\max}$ is the largest index such that $\lambda_{M_{\max}}$ still exceeds the threshold. The resulting truncation error $\varepsilon_\nu$ is computable due to the knowledge of the exact traces, which are equal to $4\pi$ for every $\nu$, and turns out to be negligibly small for each of the kernels under consideration.

The rank on each level $j$ with mesh width $h_j$ is then determined according to

$M_j = \operatorname{arg\,min}_{k \in \{1, \dots, M_{\max}\}} \Big\{ \sum_{m=1}^{M_{\max}} \lambda_m - \sum_{m=1}^k \lambda_m < h_j^2 \sum_{m=1}^{M_{\max}} \lambda_m \Big\},$  (26)

that is, the smallest rank for which the trace error relative to the scaling of the eigenvalues falls below $h_j^2$. The finest level $j$ which we consider here is 8, resulting in 393216 finite elements. For the levels $j = 1, \dots, 8$ and $\nu = 1/2, 3/2, 5/2, 9/2$, the related cut-off parameters $M_j$ are found in Table I. The number in brackets denotes the size necessary to resolve clusters of eigenvalues by approximating only complete subspaces related to the multiplicity of the respective eigenvalue. This is proposed in [25] in order to achieve the optimal performance of ARPACK.

Table I. Different values for the cut-off parameter $M_j$ on the unit sphere $\mathbb{S}^2$.

Table II shows the ranks determined by PCD. The numbers in front of the brackets correspond to the recompressed ranks; the numbers within the brackets denote the original ranks. As it turns out, the ranks computed by PCD are rather optimal in the sense that they reflect the estimated length of the Karhunen-Loève expansion determined by formula (26). Especially for increasing smoothness of the kernel function, the determined rank gets successively better.

Table II. Ranks determined by PCD on the unit sphere $\mathbb{S}^2$.

Remark 5.1. We end up with the spectral decomposition of the approximate covariance $\mathcal{C}_{h,M}$ when we solve the eigen-problem (19) for the PCD. By truncating the obtained decomposition (20) with the prescribed relative accuracy $h_j$, we achieve an a-posteriori recompression of the PCD. This procedure may at most double the approximation error, but it considerably reduces the rank on average in our computations for this article.

The error plots and the related computational times for the numerical experiments on the unit sphere are depicted in Figure 5 and in Figure 6, respectively. Unfortunately, the computations of ACA as well as of PCD with recompression for $\nu = 1/2$ on level 8, i.e. for 393216 finite elements, could not be carried out since the available main memory was insufficient.
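The rank selection (26) itself is cheap to evaluate once the eigenvalues are known. A small sketch of ours with synthetic eigenvalues following the decay (23):

```python
import numpy as np

def cutoff_rank(lam, h):
    """Smallest k such that the neglected trace is below h^2 times the total
    trace, cf. (26); lam must be sorted decreasingly."""
    lam = np.asarray(lam, dtype=float)
    total = lam.sum()
    remainder = total - np.cumsum(lam)        # neglected trace after k terms
    return int(np.argmax(remainder < h**2 * total)) + 1

# synthetic eigenvalues decaying like (23) for nu = 3/2 on a surface (d = 2)
lam = np.arange(1, 5001, dtype=float) ** -2.5
for j in range(1, 7):
    print(j, cutoff_rank(lam, 2.0 ** -j))     # M_j grows like h_j^{-d/nu}
```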

Figure 5. Numerical results (errors) on the unit sphere $\mathbb{S}^2$ for the Matérn kernels with $\nu = 1/2, 3/2, 5/2, 9/2$: trace error versus level $j$ for ACA (full subspaces), ACA, PCD, and PCD with recompression, together with the expected asymptotics.

Figure 5 shows the trace error for each particular kernel. The expected rate $h_j$ is indicated in the plots by the dashed black line. The magenta colored line with boxes shows the error for ACA with clusters of eigenvalues resolved (full subspaces), whereas the red line with circles shows the error for ACA with the exact number of eigenvalues computed by (26). The error of the PCD is indicated by the blue line with squares, and, finally, the error of the PCD with recompression is indicated by the cyan colored line with circles. It turns out that all four methods provide the expected rate of convergence in this example. For overview purposes, we have chosen the same colors and markers for each particular method in the subsequent visualizations.

Figure 6 shows the computational times for every method and each particular kernel. There seems to be no significant difference in the times for ACA with clusters of eigenvalues resolved and ACA with the exact number of eigenvalues from (26) for all kernels under consideration. Nevertheless, we observe that PCD is substantially faster than ACA for the smoother kernels, i.e. $\nu = 3/2, 5/2, 9/2$, and still noticeably faster for $\nu = 1/2$. Furthermore, we observe that the computation time consumed by ACA for $\nu = 3/2, 5/2, 9/2$ is mostly caused by the assembly of the matrix, indicated by the green colored line with diamonds, whereas the computation time is governed by the eigenvalue computation for $\nu = 1/2$.

Second example

In our second example, we consider the plate geometry shown in Figure 7. It is a rectangle with inscribed, equi-spaced circular holes, represented by patches. Here, the computations are carried out on a hierarchy of levels $j$ in complete analogy to the first example. Figure 8 contains a visualization of the four orthonormal eigenfunctions corresponding to the four largest eigenvalues of a Matérn kernel. In this example, we do not know the number of eigenvalues necessary to achieve the desired precision with ACA and ARPACK. Therefore, we use here the ranks provided by PCD with recompression as reference. The respective values are found in Table III; again, the numbers in front of the brackets correspond to the recompressed ranks of PCD and the numbers within the brackets denote the original ranks.

Figure 6. Numerical results (computational times) on the unit sphere $\mathbb{S}^2$: assembly time for ACA, total times for ACA (full subspaces) and ACA, and times for PCD and PCD with recompression.

Figure 7. The plate geometry and its patch representation.

The error plots and the related computational times for the numerical experiments on the plate geometry are presented in Figure 9 and in Figure 10, respectively. The trace error for each particular kernel, i.e. $\nu = 1/2, 3/2, 5/2, 9/2, \infty$, and the different methods is found in Figure 9. Again, PCD provides exactly the expected rate of convergence. The behavior of ACA is not as monotone as in the previous example. In case of the smoother kernels, i.e. $\nu = 5/2, 9/2, \infty$, the rate of convergence deteriorates in the last step. For $\nu = 3/2$, we have a contrary behavior: the rate of convergence increases towards the finer levels and in the last step. Finally, we observe for $\nu = 1/2$ an increased rate of convergence on the coarser levels on the one hand and an increase of the error in the last step on the other hand. Possibly, these effects are caused by a lack of resolution of the faster oscillating eigenfunctions which are involved in the deflation process of the implicitly restarted Arnoldi method, and by the resulting impact on the computation of the sought eigenvalues.

Figure 8. First four orthonormal eigenfunctions on the plate geometry for a Matérn kernel.

Table III. Ranks determined by PCD on the plate geometry for $\nu = 1/2, 3/2, 5/2, 9/2, \infty$ (recompressed ranks in front of the brackets, original ranks within the brackets).

Figure 10 shows the computational times for every method and each particular kernel. Here, the times for the recompression of the PCD are rather moderate due to the low ranks. Nevertheless, the benefit of the recompression is relatively small here, especially for the smoother kernels, cf. Table III. Again, PCD outperforms ACA by a substantial factor, even in the case of $\nu = 1/2$. This is due to the coupling of the number of eigenvalues to be computed to the ranks provided by PCD. Furthermore, as in the previous example, the computation time for ACA is governed by the assembly of the matrix for $\nu = 3/2, 5/2, 9/2, \infty$ and by the eigenvalue computation for $\nu = 1/2$.

6. CONCLUDING REMARKS

The present article is devoted to the efficient approximation of random fields for numerical applications. It is state of the art to compute a separable representation of the random field under consideration. A very common approach to determine such a representation is the (truncated) Karhunen-Loève expansion. Here, one has to solve the eigen-problem for the related covariance operator. We have tackled this task by combining the adaptive cross approximation and ARPACK. Nevertheless, a major drawback of this approach is that the number of eigen-pairs to be approximated has to be known in advance.

Figure 9. Numerical results (errors) on the plate geometry for the Gaussian kernel and the Matérn kernels with $\nu = 1/2, 3/2, 5/2, 9/2$: trace error versus level $j$ for ACA, PCD, and PCD with recompression, together with the expected asymptotics.

This might be a problem in practice since the correct number is not known beforehand in many applications. With the pivoted Cholesky decomposition, we provide a method which overcomes this obstruction. Due to the knowledge of the discretized covariance operator's main diagonal, we are able to a-posteriori control the approximation error in terms of the trace. If, for the application at hand, an orthogonal decomposition of the stochastic field is required, this can be realized relatively cheaply by the pivoted Cholesky decomposition in a post-processing step. The numerical experiments suggest that both approaches provide the optimal rate of convergence. In the comparison of the computational times, we observe however that the pivoted Cholesky decomposition is the superior method.

REFERENCES

1. Ghanem R, Spanos P. Stochastic Finite Elements: A Spectral Approach. Springer: New York, 1991.
2. Rasmussen CE, Williams CKI. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press: Cambridge, 2006.

Figure 10. Numerical results (computational times) on the plate geometry for the Gaussian kernel and the Matérn kernels with $\nu = 1/2, 3/2, 5/2, 9/2$: assembly time for ACA, total time for ACA, and times for PCD and PCD with recompression.

3. Babuška I, Nobile F, Tempone R. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Journal on Numerical Analysis 2007; 45(3):1005-1034.
4. Babuška I, Tempone R, Zouraris G. Galerkin finite element approximations of stochastic elliptic partial differential equations. SIAM Journal on Numerical Analysis 2004; 42(2):800-825.
5. Frauenfelder P, Schwab C, Todor R. Finite elements for elliptic problems with stochastic coefficients. Computer Methods in Applied Mechanics and Engineering 2005; 194(2-5):205-228.
6. Matthies H, Keese A. Galerkin methods for linear and nonlinear elliptic stochastic partial differential equations. Computer Methods in Applied Mechanics and Engineering 2005; 194(12-16):1295-1331.
7. Schwab C, Gittelson CJ. Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs. Acta Numerica 2011; 20:291-467.
8. Light WA, Cheney EW. Approximation Theory in Tensor Product Spaces. Lecture Notes in Mathematics, Springer: Berlin-Heidelberg, 1985.
9. Loève M. Probability Theory I+II. 4th edn., Graduate Texts in Mathematics, Springer: New York, 1977.
10. Giebermann K. Multilevel approximation of boundary integral operators. Computing 2001; 67(3):183-207.
11. Schwab C, Todor R. Karhunen-Loève approximation of random fields by generalized fast multipole methods. Journal of Computational Physics 2006; 217(1):100-122.
12. Hackbusch W. Hierarchische Matrizen: Algorithmen und Analysis. Springer: Heidelberg, 2009.
13. Eiermann M, Ernst OG, Ullmann E. Computational aspects of the stochastic finite element method. Computing and Visualization in Science 2007; 10(1):3-15.

14. Saad Y. Numerical Methods for Large Eigenvalue Problems. Manchester University Press: Manchester, 1992.
15. Bebendorf M, Rjasanow S. Adaptive low-rank approximation of collocation matrices. Computing 2003; 70(1):1-24.
16. Bebendorf M. Approximation of boundary element matrices. Numerische Mathematik 2000; 86(4):565-589.
17. Dahmen W, Harbrecht H, Schneider R. Compression techniques for boundary integral equations. Asymptotically optimal complexity estimates. SIAM Journal on Numerical Analysis 2006; 43(6):2251-2271.
18. Harbrecht H, Schneider R. Wavelet Galerkin schemes for boundary integral equations. Implementation and quadrature. SIAM Journal on Scientific Computing 2006; 27(4):1347-1370.
19. Beebe NHF, Linderberg J. Simplifications in the generation and transformation of two-electron integrals in molecular calculations. International Journal of Quantum Chemistry 1977; 12(4):683-705.
20. Foster L, Waagen A, Aijaz N, Hurley M, Luis A, Rinsky J, Satyavolu C, Way MJ, Gazis P, Srivastava A. Stable and efficient Gaussian process calculations. The Journal of Machine Learning Research 2009; 10:857-882.
21. Harbrecht H, Peters M, Schneider R. On the low-rank approximation by the pivoted Cholesky decomposition. Applied Numerical Mathematics 2012; 62(4):428-440.
22. Sauter SA, Schwab C. Quadrature for hp-Galerkin BEM in R³. Numerische Mathematik 1997; 78(2):211-258.
23. Golub GH, Van Loan CF. Matrix Computations. 4th edn., Johns Hopkins University Press: Baltimore, 2013.
24. Lehoucq RB, Sorensen DC. Deflation techniques for an implicitly restarted Arnoldi iteration. SIAM Journal on Matrix Analysis and Applications 1996; 17(4):789-821.
25. Lehoucq RB, Sorensen DC, Yang C. ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods (Software, Environments, Tools). SIAM: Philadelphia, 1998.
26. Sorensen DC. Implicit application of polynomial filters in a k-step Arnoldi method. SIAM Journal on Matrix Analysis and Applications 1992; 13(1):357-385.
27. Diestel J, Uhl JJ. Vector Measures. Mathematical Surveys and Monographs, American Mathematical Society, 1977.
28. Schatten R. Norm Ideals of Completely Continuous Operators. Ergebnisse der Mathematik und ihrer Grenzgebiete, Springer: Berlin-Göttingen-Heidelberg, 1960.
29. Mercer J. Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London, Series A 1909; 209:415-446.
30. Schatten R. A Theory of Cross-Spaces. Princeton University Press: Princeton, 1950.
31. Temlyakov VN. Approximations of functions with bounded mixed derivative. Trudy Matematicheskogo Instituta im. V.A. Steklova 1986; 178.
32. Griebel M, Harbrecht H. Approximation of bi-variate functions: singular value decomposition versus sparse grids. IMA Journal of Numerical Analysis 2014; 34(1):28-54.
33. Braess D. Finite Elements: Theory, Fast Solvers, and Applications in Solid Mechanics. 2nd edn., Cambridge University Press: Cambridge, 2001.
34. Brenner S, Scott L. The Mathematical Theory of Finite Element Methods. 3rd edn., Springer: Berlin, 2008.
35. Knyazev A. New estimates for Ritz vectors. Mathematics of Computation 1997; 66(219):985-995.
36. Ovtchinnikov E. Cluster robust error estimates for the Rayleigh-Ritz approximation I: Estimates for invariant subspaces. Linear Algebra and its Applications 2006; 415(1):167-187.
37. D'yakonov EG, McCormick SF. Optimization in Solving Elliptic Problems. CRC Press: Boca Raton, 1996.
38. Higham NJ. Analysis of the Cholesky decomposition of a semi-definite matrix.
In Reliable Numerical Computation, Oxford University Press: Oxford, 1990; 161-185.
39. Higham NJ. Accuracy and Stability of Numerical Algorithms. 2nd edn., Society for Industrial and Applied Mathematics: Philadelphia, 2002.
40. Matérn B. Spatial Variation. Springer Lecture Notes in Statistics, Springer: New York, 1986.
41. Abramowitz M, Stegun IA. Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables. Applied Mathematics Series, Dover Publications, 1964.
42. Graham I, Kuo F, Nichols J, Scheichl R, Schwab C, Sloan I. Quasi-Monte Carlo finite element methods for elliptic PDEs with log-normal random coefficient. SAM-Report, 2013.
43. Müller C. Analysis of Spherical Symmetries in Euclidean Spaces. Applied Mathematical Sciences Series, Springer: Berlin-Heidelberg, 1998.
44. Harbrecht H, Peters M. Comparison of fast boundary element methods on parametric surfaces. Computer Methods in Applied Mechanics and Engineering 2013; 261-262:39-55.
45. Kernighan BW, Ritchie DM. The C Programming Language. Prentice-Hall: Upper Saddle River, NJ, 1988.
46. Lucas C. LAPACK-style codes for level 2 and level 3 pivoted Cholesky factorizations. LAPACK Working Note, 2004.