Multilevel accelerated quadrature for elliptic PDEs with random diffusion Mathematisches Institut Universität Basel Switzerland
Overview
Computation of the Karhunen-Loève expansion
Elliptic PDE with uniformly elliptic random diffusion
Parameterized problem
Discretization
Multilevel quadrature
Extension to elliptic PDEs with log-normal random diffusion
Numerical results
Part. Karhunen-Loève expansion
Karhunen-Loève expansion
To separate spatial and stochastic variables, approximate the stochastic field $a \in L^2_P(\Omega; L^2(D))$ by the truncated Karhunen-Loève expansion
\[ a(x,\omega) \approx E[a](x) + \sum_{i=1}^{M} \sqrt{\lambda_i}\,\varphi_i(x)\,Y_i(\omega) \]
with orthogonal collections $\{\varphi_i\} \subset L^2(D)$ and $\{Y_i\} \subset L^2_P(\Omega)$.
The Karhunen-Loève expansion involves the computation of the dominant eigenpairs $(\lambda_i, \varphi_i)$ of the integral operator
\[ (C\varphi_i)(x) = \int_D \operatorname{Cov}[a](x,y)\,\varphi_i(y)\,dy = \lambda_i \varphi_i(x), \quad x \in D, \]
with covariance kernel
\[ \operatorname{Cov}[a](x,y) = \int_\Omega \big(a(x,\omega) - E[a](x)\big)\big(a(y,\omega) - E[a](y)\big)\,dP(\omega) \in L^2(D \times D). \]
The eigenvalue problem for a nonlocal operator requires fast methods.
Theorem: If $a \in L^2_P(\Omega; H^p(D))$, then the eigenvalues $\{\lambda_m\}_{m \in \mathbb{N}}$ of $C$ decay like $\lambda_m \lesssim m^{-(2p+d)/d}$ as $m \to \infty$.
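Discretization I
Consider a family of quasi-uniform triangulations $\mathcal{T}_h$ for $D$ of mesh width $h$ and the spaces
\[ V_h^r := \{ v_h : D \to \mathbb{R} : v_h|_T \text{ is a polynomial of order } r \text{ for all } T \in \mathcal{T}_h \} \subset L^2(D). \]
Then, given a function $v \in H^p(D)$ with $p \le r$, the approximation estimate
\[ \|v - Q_h v\|_{L^2(D)} := \inf_{v_h \in V_h^r} \|v - v_h\|_{L^2(D)} \lesssim h^p \|v\|_{H^p(D)} \]
holds.
Theorem. Let $N = \dim V_h^r$, let $\lambda_1 \ge \lambda_2 \ge \dots$ be the eigenvalues of the covariance operator $C$ and $\lambda_{1,h} \ge \lambda_{2,h} \ge \dots \ge \lambda_{N,h}$ those of $C_h = Q_h C Q_h$. Then, with $a_0 = a - E[a]$, it holds
\[ \|a_0 - Q_h a_0\|^2_{L^2_P(\Omega; L^2(D))} = \operatorname{trace} C - \operatorname{trace} C_h \lesssim h^{2\min\{p,r\}} \|a_0\|^2_{L^2_P(\Omega; H^p(D))} \]
and therefore
\[ \|a_0 - Q_h a_0\|^2_{L^2_P(\Omega; L^2(D))} = \sum_{m=1}^{N} (\lambda_m - \lambda_{m,h}) + \sum_{m=N+1}^{\infty} \lambda_m. \]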
Discretization II
The theorem remains valid if we introduce the additional orthogonal projection $P_h : V_h^r \to U$ onto an $M$-dimensional subspace $U \subset V_h^r$. The related projected random field is given by
\[ a_{h,M} := a_h + P_h a_{0,h} = Q_h E[a] + P_h Q_h a_0 \]
and its covariance by $C_{h,M} := P_h C_h P_h$.
Theorem: Let $C_h = Q_h C Q_h$, $C_{h,M} = P_h C_h P_h$ and $a_{h,M} = a_h + P_h a_{0,h}$. Then, there holds
\[ \|a - a_{h,M}\|^2_{L^2_P(\Omega; L^2(D))} \lesssim \operatorname{trace} C_h - \operatorname{trace} C_{h,M} + h^{2\min\{p,r\}}, \]
where the hidden constant involves the $L^2_P(\Omega; H^p(D))$-norm of $a$.
Discretization by finite elements yields a generalized eigenvalue problem.
Use the Lanczos method to compute the dominant eigenpairs (ARPACK).
Apply a fast boundary element method to provide a fast matrix-vector product (ACA).
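Alternative approach
Generalized eigenvalue problem:
\[ A x = \lambda B x, \quad A = [(C\varphi_i, \varphi_j)]_{i,j}, \quad B = [(\varphi_i, \varphi_j)]_{i,j}. \]
Inserting the low-rank approximation by the pivoted Cholesky decomposition
\[ A \approx A_m := L_m L_m^T, \quad L_m \in \mathbb{R}^{n \times m}, \]
leads to
\[ L_m L_m^T x = \lambda B x \quad\Longleftrightarrow\quad B^{-1/2} L_m L_m^T B^{-1/2} \tilde{x} = \lambda \tilde{x}, \quad \tilde{x} = B^{1/2} x. \]
Since the nonzero eigenvalues of $M M^T$ and $M^T M$ coincide, we can replace the large eigenvalue problem by a small one with the matrix $L_m^T B^{-1} L_m \in \mathbb{R}^{m \times m}$:
\[ L_m^T B^{-1} L_m \hat{x} = \lambda \hat{x}, \quad x = B^{-1} L_m \hat{x}. \]
Error estimate (Bauer/Fike), with $\lambda_k^{(m)}$ denoting the eigenvalues obtained from $A_m$:
\[ |\lambda_k - \lambda_k^{(m)}| \le \|B^{-1/2}(A - A_m)B^{-1/2}\| \lesssim \|A - A_m\|, \quad k = 1, 2, \dots, m. \]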
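The reduction to a small eigenvalue problem can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation behind the slides: the one-dimensional grid, the Gaussian kernel, the lumped (diagonal) mass matrix `B`, and the helper name `pivoted_cholesky` are all hypothetical choices made for the example.

```python
import numpy as np

def pivoted_cholesky(A, tol):
    """Return L with A ~ L @ L.T; stop once the trace of the residual is <= tol."""
    n = A.shape[0]
    d = np.diag(A).astype(float).copy()    # diagonal of the residual A - L L^T
    L = np.zeros((n, 0))
    while d.sum() > tol and L.shape[1] < n:
        p = int(np.argmax(d))               # pivot: largest remaining diagonal entry
        col = (A[:, p] - L @ L[p, :]) / np.sqrt(d[p])
        L = np.column_stack([L, col])
        d = np.maximum(d - col**2, 0.0)     # update residual diagonal, clip round-off
    return L

# Hypothetical setup: Gaussian kernel on a uniform 1D grid, nodal quadrature.
x = np.linspace(0.0, 1.0, 200)
h = x[1] - x[0]
K = np.exp(-(x[:, None] - x[None, :]) ** 2)
B = h * np.eye(len(x))                      # assumption: lumped mass matrix
A = B @ K @ B                               # stands in for [(C phi_i, phi_j)]

L = pivoted_cholesky(A, tol=1e-12)
S = L.T @ np.linalg.solve(B, L)             # small m x m matrix L^T B^{-1} L
lam_small = np.sort(np.linalg.eigvalsh(S))[::-1]
```

In this toy setting `L` has far fewer columns than `A` has rows, so the dense eigensolver only touches an $m \times m$ matrix; the eigenvectors of the large problem would follow from $x = B^{-1} L_m \hat{x}$.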
Eigenvalue computation
[Figure: approximate spectrum for the Gauss kernel and for the exponential kernel, each computed for four values of the tolerance ε.]
Covariance kernels of Matérn class
[Figure: Matérn kernels for several smoothness parameters ν, and the decay of the corresponding eigenvalues over m together with fitted rates.]
Matérn kernels for $\nu = p + 1/2$:
\[ k_\nu(r) = \exp\Big(-\frac{\sqrt{2\nu}\,r}{\ell}\Big) \frac{p!}{(2p)!} \sum_{i=0}^{p} \frac{(p+i)!}{i!\,(p-i)!} \Big(\frac{\sqrt{8\nu}\,r}{\ell}\Big)^{p-i}. \]
$\nu = 1/2$ corresponds to the exponential kernel $k_{1/2}(r) = \exp(-\|x-y\|/\ell)$.
$\nu = \infty$ corresponds to the Gaussian kernel $k_\infty(r) = \exp\big(-\|x-y\|^2/(2\ell^2)\big)$.
These are the generating kernels for the Sobolev spaces on $\mathbb{R}^d$.
The eigenvalues behave like $\lambda_m \sim m^{-(2\nu+d)/d}$.
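The half-integer Matérn formula is easy to evaluate directly. The following sketch assumes the standard closed form for $\nu = p + 1/2$ with correlation length $\ell$; the function name `matern_half_integer` is made up for this example.

```python
import math
import numpy as np

def matern_half_integer(r, p, ell=1.0):
    """Matern kernel k_nu(r) for nu = p + 1/2, evaluated at distances r >= 0."""
    nu = p + 0.5
    r = np.asarray(r, dtype=float)
    s = np.sqrt(8.0 * nu) * r / ell
    # Polynomial factor: sum over i of (p+i)! / (i! (p-i)!) * s^(p-i)
    poly = sum(
        math.factorial(p + i) / (math.factorial(i) * math.factorial(p - i)) * s ** (p - i)
        for i in range(p + 1)
    )
    prefactor = math.factorial(p) / math.factorial(2 * p)
    return np.exp(-np.sqrt(2.0 * nu) * r / ell) * prefactor * poly

r = np.linspace(0.0, 3.0, 7)
k_exp = matern_half_integer(r, p=0)   # nu = 1/2: exponential kernel
k_32 = matern_half_integer(r, p=1)    # nu = 3/2
```

For `p=0` the sum collapses to a single term and the kernel reduces to $\exp(-r/\ell)$; for `p=1` it gives $(1 + \sqrt{3}\,r/\ell)\exp(-\sqrt{3}\,r/\ell)$.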
Numerical examples for the sphere
[Tables: results for ACA and ARPACK and for the pivoted Cholesky decomposition, for four Matérn smoothness parameters ν up to ν = 9/2.]
[Figures, one panel per Matérn kernel: trace error with curves "error ACA (full subspaces)", "error ACA", "error PCD", "error PCD (with recompress)" and "Asymptotics"; time (s) with curves "time assembly ACA", "total time ACA (full subspaces)", "total time ACA", "time PCD" and "time PCD (with recompress)".]
Numerical examples for a plate geometry
[Tables: results for four Matérn smoothness parameters ν up to ν = 9/2.]
[Figures, one panel per Matérn kernel: trace error with curves "error ACA", "error PCD", "error PCD (with recompress)" and "Asymptotics"; time (s) with curves "time assembly ACA", "total time ACA", "time PCD" and "time PCD (with recompress)".]
Part. Uniformly elliptic diffusion
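Problem statement
Elliptic boundary value problems can be solved with high accuracy, provided that the input data are known exactly.
The practical significance of highly accurate numerical solutions is limited due to inexact input data.
Model problem (elliptic problem with random coefficient):
\[ -\operatorname{div}_x\big(a(x,\omega)\nabla_x u(x,\omega)\big) = f(x) \quad \text{for } (x,\omega) \in D \times \Omega, \]
\[ u(x,\omega) = 0 \quad \text{for } (x,\omega) \in \partial D \times \Omega, \]
where
$D \subset \mathbb{R}^d$ is a convex, polygonal domain or $C^2$-smooth ($d = 2, 3$),
$(\Omega, \Sigma, P)$ is a complete probability space, and
\[ 0 < a_{\min} := \operatorname*{ess\,inf}_{x \in D} a(x,\omega) \le \operatorname*{ess\,sup}_{x \in D} a(x,\omega) =: a_{\max} < \infty \quad \text{almost surely.} \]
Quantities of interest:
general functional: $\operatorname{QoI}[u](x) = \int_\Omega F\big(u(x,\omega)\big)\,dP(\omega)$
expectation: $E[u](x) = \int_\Omega u(x,\omega)\,dP(\omega)$
variance: $\operatorname{Var}[u](x) = \int_\Omega \big(u(x,\omega) - E[u](x)\big)^2\,dP(\omega) = E[u^2](x) - E[u]^2(x)$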
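Parameterized problem
Assume a finite Karhunen-Loève expansion of $a$, i.e.,
\[ a(x,\omega) = E[a](x) + \sum_{k=1}^{m} \sigma_k \varphi_k(x) Y_k(\omega), \]
where $Y_k : \Omega \to [-1,1]$ are independent random variables with density functions $\rho_k$.
By introducing coordinates $y \in \square := [-1,1]^m$, we obtain the parameterized diffusion coefficient
\[ a(x,y) = E[a](x) + \sum_{k=1}^{m} \sigma_k \varphi_k(x) y_k \]
and thus arrive at the parametrized problem
\[ -\operatorname{div}_x\big(a(x,y)\nabla_x u(x,y)\big) = f(x) \quad \text{for } (x,y) \in D \times \square, \qquad u(x,y) = 0 \quad \text{for } (x,y) \in \partial D \times \square. \]
We are thus looking for
\[ \operatorname{QoI}[u](x) = \int_\square F\big(u(x,y)\big)\rho(y)\,dy, \quad \text{where } \rho(y) = \prod_{k=1}^{m} \rho_k(y_k). \]
The map $y \mapsto u(y)$ is analytic as a mapping $\square \to H_0^1(D)$ provided that the eigenfunctions in the KL expansion are in $W^{1,\infty}(D)$ (Cohen/DeVore/Schwab).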
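Quadrature in the stochastic variable
We shall provide a sequence of quadrature formulae
\[ Q_l : L^2_\rho(\square; X) \to X, \quad Q_l v = \sum_{i=1}^{N_l} \omega_{l,i}\, v(\cdot, \xi_{l,i})\, \rho(\xi_{l,i}) \]
for the high-dimensional Bochner integral
\[ \operatorname{Int} : L^2_\rho(\square; X) \to X, \quad \operatorname{Int} v = \int_\square v(\cdot, y)\rho(y)\,dy, \]
where $X \subset L^2(D)$ denotes a Banach space. It is supposed to provide the error bound
\[ \|(\operatorname{Int} - Q_l) v\|_X \lesssim \varepsilon_l \|v\|_{\mathcal{H}(\square; X)}, \quad \varepsilon_l = 2^{-l}, \]
uniformly in $l \in \mathbb{N}$, where $\mathcal{H}(\square; X) \subset L^2_\rho(\square; X)$ is a suitable Bochner space.
Examples:
Monte Carlo method
quasi-Monte Carlo method
tensor product quadrature
sparse grid quadrature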
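Quadrature in the stochastic variable
MC method. Satisfies $\varepsilon_l = N_l^{-1/2}$ with respect to the root mean square error for $\mathcal{H}(\square; X) = L^2_\rho(\square; X)$.
Quasi MC method. Let the densities $\rho_k$ be in $W^{1,\infty}([-1,1])$. Then, it typically holds $\varepsilon_l = N_l^{-1}(\log N_l)^m$ and $\mathcal{H}(\square; X) = W^{1,1}_{\mathrm{mix}}(\square; X)$, where
\[ \|v\|_{W^{1,1}_{\mathrm{mix}}(\square; X)} := \sum_{\|q\|_\infty \le 1} \int_\square \Big\| \frac{\partial^{|q|}}{\partial y_1^{q_1} \cdots \partial y_m^{q_m}} v(y) \Big\|_X dy < \infty. \]
Gaussian quadrature. If $v : \square \to X$ is analytical, then a tensor product Gaussian quadrature rule gives $\varepsilon_l = \exp(-b N_l^{1/m})$ and $\mathcal{H}(\square; X) = L^\infty(\square; X)$. In fact, the polynomial chaos approach can be interpreted as a dimension weighted tensor product Gaussian quadrature rule where the weights depend on the numbers $\{\sigma_k \|\varphi_k\|_{L^\infty(D)}\}$.
Clenshaw-Curtis quadrature. Let the densities $\rho_k$ be in $W^{r,\infty}([-1,1])$ and let $v : \square \to X$ have mixed regularity of order $r$ with respect to the parameter $y$, i.e.
\[ \|v\|_{W^{r,\infty}_{\mathrm{mix}}(\square; X)} := \max_{\|\alpha\|_\infty \le r} \|\partial_y^\alpha v\|_{L^\infty(\square; X)} < \infty. \]
Then, a sparse tensor product Clenshaw-Curtis quadrature rule leads to the convergence rate $\varepsilon_l = N_l^{-r}(\log N_l)^m$ and $\mathcal{H}(\square; X) = W^{r,\infty}_{\mathrm{mix}}(\square; X)$.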
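Finite element approximation in the spatial variable
Introduce a sequence of quasi-uniform triangulations $\mathcal{T}_l$ of $D$ with $h_l \sim 2^{-l}$.
Define the finite element spaces
\[ S_l(D) := \{ v \in C(D) : v|_\tau \in \mathcal{P}_1 \text{ for all } \tau \in \mathcal{T}_l \text{ and } v(x) = 0 \text{ for all nodes } x \in \partial D \}. \]
For any $y \in \square$, the finite element solution to the problem
\[ \text{find } u_l(y) \in S_l(D) \text{ such that } \int_D a(x,y)\,\nabla u_l(y) \cdot \nabla v_l\,dx = \int_D f v_l\,dx \quad \text{for all } v_l \in S_l(D) \]
satisfies the error estimates
\[ \|u(y) - u_l(y)\|_{H^1(D)} \lesssim 2^{-l} \|f\|_{L^2(D)}, \qquad \|u^2(y) - u_l^2(y)\|_{W^{1,1}(D)} \lesssim 2^{-l} \|f\|^2_{L^2(D)} \]
with constants depending only on the ratio of $a_{\min}$ and $a_{\max}$.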
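Variance reduction method
For the sake of simplicity, assume that the output functional $F$ is a linear functional $F : H_0^1(D) \to \mathbb{R}$.
Use the multilevel splitting for a given function $u_j \in V_j$:
\[ u_j = u_0 + (u_1 - u_0) + \dots + (u_{j-1} - u_{j-2}) + (u_j - u_{j-1}), \]
where the increments $u_l - u_{l-1}$ decay in $H^1(D)$ as the level $l$ increases. By linearity,
\[ F(u_j) = F(u_0) + F(u_1 - u_0) + \dots + F(u_{j-1} - u_{j-2}) + F(u_j - u_{j-1}). \]
Idea. Exploit this decay for quadrature:
\[ \int_\square F\big(u_j(y)\big)\rho(y)\,dy = \int_\square F\big(u_0(y)\big)\rho(y)\,dy + \int_\square F\big((u_1 - u_0)(y)\big)\rho(y)\,dy + \dots + \int_\square F\big((u_{j-1} - u_{j-2})(y)\big)\rho(y)\,dy + \int_\square F\big((u_j - u_{j-1})(y)\big)\rho(y)\,dy. \]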
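Review of sparse grids
Consider two multiscale sequences:
\[ V_0^{(1)} \subset V_1^{(1)} \subset V_2^{(1)} \subset \dots \subset \mathcal{H}_1, \qquad V_0^{(2)} \subset V_1^{(2)} \subset V_2^{(2)} \subset \dots \subset \mathcal{H}_2. \]
The starting point for the sparse grid construction are the multilevel decompositions
\[ V_j^{(1)} = W_0^{(1)} \oplus W_1^{(1)} \oplus \dots \oplus W_j^{(1)}, \qquad V_j^{(2)} = W_0^{(2)} \oplus W_1^{(2)} \oplus \dots \oplus W_j^{(2)}. \]
The sparse tensor product space is given by
\[ \widehat{V}_j = \bigoplus_{l + l' \le j} W_l^{(1)} \otimes W_{l'}^{(2)}. \]
Given that the dimensions of $\{V_l^{(1)}\}$ and $\{V_l^{(2)}\}$ form geometric series, $\widehat{V}_j$ contains essentially only $\max\{\dim V_j^{(1)}, \dim V_j^{(2)}\}$ degrees of freedom.
$\widehat{V}_j$ offers nearly the same approximation power as $V_j^{(1)} \otimes V_j^{(2)}$, provided that the object to be approximated offers some extra smoothness by means of mixed regularity.
[Figure: arrangement of the tensor product spaces $W_l^{(1)} \otimes W_{l'}^{(2)}$ contained in the sparse grid space.]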
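Alternative representation of sparse grids
[Figure: the sparse grid space arranged by the spaces $W_l^{(1)} \otimes W_{l'}^{(2)}$ and, alternatively, by the spaces $W_l^{(1)} \otimes V_{j-l}^{(2)}$.]
Factoring out with respect to the first component, one can rewrite the sparse grid space according to
\[ \widehat{V}_j = \bigoplus_{l=0}^{j} \bigoplus_{l'=0}^{j-l} W_l^{(1)} \otimes W_{l'}^{(2)} = \bigoplus_{l=0}^{j} W_l^{(1)} \otimes V_{j-l}^{(2)} = \sum_{l=0}^{j} \big(V_l^{(1)} \ominus V_{l-1}^{(1)}\big) \otimes V_{j-l}^{(2)} \tag{SG1} \]
and give up the nestedness of $\{V_l^{(2)}\}$. This is the sparse grid combination technique.
Likewise, we can give up the nestedness of $\{V_l^{(1)}\}$ in the representation
\[ \widehat{V}_j = \bigoplus_{l'=0}^{j} \bigoplus_{l=0}^{j-l'} W_l^{(1)} \otimes W_{l'}^{(2)} = \bigoplus_{l=0}^{j} V_{j-l}^{(1)} \otimes W_l^{(2)} = \sum_{l=0}^{j} V_{j-l}^{(1)} \otimes \big(V_l^{(2)} \ominus V_{l-1}^{(2)}\big). \tag{SG2} \]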
Multilevel quadrature estimator
For our problem under consideration, the sequence $\{V_l^{(1)}\}$ corresponds now to finite element spaces of increasing resolution and $\{V_l^{(2)}\}$ refers to a sequence of quadrature rules of increasing precision.
With respect to the representation (SG1), we arrive at
\[ \int_\square F\big(u(y)\big)\rho(y)\,dy \approx \sum_{l=0}^{j} Q_{j-l}\, \Delta F_l\big(u(y)\big), \tag{MLQ1} \]
where $\Delta F_l\big(u(y)\big) := F\big(u_l(y)\big) - F\big(u_{l-1}(y)\big)$ and $F\big(u_{-1}(y)\big) := 0$.
Likewise, with respect to the representation (SG2), we obtain
\[ \int_\square F\big(u(y)\big)\rho(y)\,dy \approx \sum_{l=0}^{j} \Delta Q_l\, F\big(u_{j-l}(y)\big), \tag{MLQ2} \]
where $\Delta Q_l := Q_l - Q_{l-1}$ and $Q_{-1} := 0$.
Theorem. There holds the identity
\[ \sum_{l=0}^{j} Q_{j-l}\, \Delta F_l\big(u(y)\big) = \sum_{l=0}^{j} \Delta Q_l\, F\big(u_{j-l}(y)\big). \]
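The identity between the two estimators can be checked numerically on a toy problem. The sketch below is only an illustration under assumptions that are not from the slides: `F_level` is a hypothetical stand-in for $F(u_l(y))$ with an artificial level error $2^{-(l+1)}$, the density is uniform on $[-1,1]$, and $Q_l$ is taken to be a Gauss-Legendre rule with $l+1$ nodes.

```python
import numpy as np

def F_level(l, y):
    """Hypothetical stand-in for F(u_l(y)): exp(y) up to a level error 2^{-(l+1)}."""
    if l < 0:
        return np.zeros_like(y)            # convention F(u_{-1}) := 0
    return np.exp(y) * (1.0 - 2.0 ** (-(l + 1)))

def Q(l, g):
    """Gauss-Legendre rule with l+1 nodes on [-1, 1] for the density rho = 1/2."""
    if l < 0:
        return 0.0                         # convention Q_{-1} := 0
    nodes, weights = np.polynomial.legendre.leggauss(l + 1)
    return float(np.sum(weights * g(nodes)) * 0.5)

def mlq1(j):
    """Sum over l of Q_{j-l} applied to F(u_l) - F(u_{l-1})."""
    return sum(
        Q(j - l, lambda y, l=l: F_level(l, y) - F_level(l - 1, y))
        for l in range(j + 1)
    )

def mlq2(j):
    """Sum over l of (Q_l - Q_{l-1}) applied to F(u_{j-l})."""
    return sum(
        Q(l, lambda y, l=l: F_level(j - l, y)) - Q(l - 1, lambda y, l=l: F_level(j - l, y))
        for l in range(j + 1)
    )

j = 6
estimate_1, estimate_2 = mlq1(j), mlq2(j)
exact = np.sinh(1.0)                       # integral of exp(y)/2 over [-1, 1]
```

Both functions combine the same products $Q_{l'} F(u_l)$ with $l + l' \le j$, merely grouped differently, which is why the two estimates agree up to round-off.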
Complexity
Theorem. Denote the cost of computing $u_l$ for a specific $y$, including the cost of evaluating $F$, by $C_l$, where $C_l \sim \sigma C_{l-1}$ ($\sigma > 1$). Moreover, let the sequence of quadrature formulae be nested, satisfying $N_l \sim \theta N_{l-1}$ ($\theta > 1$). Then, the cost to compute (MLQ1) is of order
\[ \Big(1 + \frac{1}{\sigma}\Big) N_0 C_0 \sum_{l=0}^{j} \theta^{j-l} \sigma^{l}, \]
whereas the cost to compute (MLQ2) is of order
\[ N_0 C_0 \sum_{l=0}^{j} \theta^{j-l} \sigma^{l}. \]
Typically, we have $\sigma = 2, 4, 8$ for $n = 1, 2, 3$. Thus, we save about 33% work in 1D and still 11.1% in 3D.
The cost improvement comes for free.
The factor $\sigma$ might be improved by adaptive mesh refinement, resulting in a larger gain.
The use of non-nested quadratures for (MLQ2) yields the same cost as (MLQ1). This would result in the sparse grid combination technique.
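The two cost expressions differ only by the factor $1 + 1/\sigma$, so the relative saving of the reversed estimator is $1/(\sigma + 1)$ regardless of $\theta$ and $j$. A small sketch that evaluates both formulas; the function names and the normalization $N_0 = C_0 = 1$ are assumptions made for the example.

```python
def cost_sum(j, sigma, theta, N0=1.0, C0=1.0):
    """N0 * C0 * sum_{l=0}^{j} theta^(j-l) * sigma^l, the cost order of the reversed estimator."""
    return N0 * C0 * sum(theta ** (j - l) * sigma ** l for l in range(j + 1))

def cost_traditional(j, sigma, theta, N0=1.0, C0=1.0):
    """Cost order of the traditional estimator: two solves (levels l and l-1) per sample."""
    return (1.0 + 1.0 / sigma) * cost_sum(j, sigma, theta, N0, C0)

def relative_saving(j, sigma, theta):
    """Fraction of work saved by the reversed estimator with nested quadrature."""
    return 1.0 - cost_sum(j, sigma, theta) / cost_traditional(j, sigma, theta)

# sigma = 2, 4, 8 as on the slide; theta = 4 and j = 5 are arbitrary illustrative values.
savings = {sigma: relative_saving(5, sigma, 4.0) for sigma in (2.0, 4.0, 8.0)}
```

Here `savings` holds roughly 0.333, 0.2 and 0.111 for the three values of $\sigma$.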
Error estimation for the multilevel quadrature I
The error estimation of the multilevel quadrature is based on the generic estimate
\[ \|(\operatorname{Int} - Q_{l'})(u^p - u_l^p)\|_X \lesssim 2^{-(l + l')} \|f\|^p_{L^2(D)}, \tag{GE} \]
where $X = H_0^1(D)$ if $p = 1$ and $X = W_0^{1,1}(D)$ if $p = 2$.
Theorem. Let $\{Q_l\}$ be a sequence of quadrature rules that satisfies the generic error estimate (GE). Then, the error of the multilevel estimator for the mean and the second moment is bounded by
\[ \Big\| \operatorname{Int} u^p - \sum_{l=0}^{j} \Delta Q_l\, u_{j-l}^p \Big\|_X \lesssim 2^{-j} j\, \|f\|^p_{L^2(D)}. \]
Remark. Here, we consider the equilibration of the accuracy. Instead, one can also consider the equilibration of the cost-benefit ratio or the equilibration of the cost.
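Error estimation for the multilevel quadrature II
Proof. We shall apply the following multilevel splitting of the error:
\[ \Big\| \operatorname{Int} u^p - \sum_{l=0}^{j} \Delta Q_l\, u_{j-l}^p \Big\|_X = \Big\| \operatorname{Int} u^p - Q_j u^p + \sum_{l=0}^{j} \Delta Q_l\, u^p - \sum_{l=0}^{j} \Delta Q_l\, u_{j-l}^p \Big\|_X \le \|\operatorname{Int} u^p - Q_j u^p\|_X + \sum_{l=0}^{j} \big\| \Delta Q_l \big(u^p - u_{j-l}^p\big) \big\|_X. \]
The first term simply reflects the quadrature error on the finest quadrature level and is bounded according to
\[ \|\operatorname{Int} u^p - Q_j u^p\|_X \lesssim 2^{-j} \|f\|^p_{L^2(D)}. \]
In view of the generic estimate (GE), the term inside the sum satisfies
\[ \big\| \Delta Q_l \big(u^p - u_{j-l}^p\big) \big\|_X \le \big\| (\operatorname{Int} - Q_l)\big(u^p - u_{j-l}^p\big) \big\|_X + \big\| (\operatorname{Int} - Q_{l-1})\big(u^p - u_{j-l}^p\big) \big\|_X \lesssim 2^{-(l + j - l)} \|f\|^p_{L^2(D)} + 2^{-(l - 1 + j - l)} \|f\|^p_{L^2(D)} \lesssim 2^{-j} \|f\|^p_{L^2(D)}. \]
Thus, we arrive at the assertion as follows:
\[ \Big\| \operatorname{Int} u^p - \sum_{l=0}^{j} \Delta Q_l\, u_{j-l}^p \Big\|_X \lesssim 2^{-j} \|f\|^p_{L^2(D)} + \sum_{l=0}^{j} 2^{-j} \|f\|^p_{L^2(D)} \lesssim 2^{-j} j\, \|f\|^p_{L^2(D)}. \]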
Some remarks
The representation (MLQ1) yields the traditional multilevel quadrature, often motivated as a variance reduction method.
The representation (MLQ2) is simple to implement and requires only a black box finite element solver. In particular, adaptivity can be easily employed.
Since the Monte Carlo quadrature needs no regularity, the multilevel Monte Carlo quadrature always works. Nevertheless, for the problem under consideration, it can be shown that
\[ \big\| \partial_y^\alpha \big(u^p - u_l^p\big)(y) \big\|_X \lesssim 2^{-l}\, |\alpha|!\, c^{|\alpha|}\, \|f\|^p_{L^2(D)} \quad \text{for all } \alpha \in \mathbb{N}^m. \]
This holds also for log-normally distributed diffusion coefficients.
The method is easy to parallelize if one distributes specific samples levelwise to the different cores.
If one is interested in the two-point correlation instead of the variance, one can use a sparse tensor product approximation.
The cost complexity depends on the specific quadrature rule:
[Table: cost complexity of the quadrature methods MLMC, MLQMC and ML(S)GQ for three spatial dimensions d.]
Numerical results: Analytical example on the unit ball We solve div ( a(y) u(y) ) = in H (D) where α(y) = ( i= ( y ) ) i. Thus, it holds E[u(y)] = E and therefore [ i= ( y ) ] i = E[u](x) = ( x ). Error - - MLQMC MLCC MLMC Asymptotics h.9.88..... h
Numerical results: A more complex geometry I We solve div ( a(y) u(y) ) = in H (D) where α(x,y) = + exp( x ) ( sin(πx )y + sin(πx )y + sin(πx )y + 8 sin(πx )sin(πx )y + sin(πx )sin(πx )y + ) sin(πx )sin(πx )y. A reference solution is computed on a mesh with 9 9 tetrahedrons (h =.9) by samples of the Halton sequence.
Numerical results: A more complex geometry II
[Figure: error versus mesh width h for the expectation and for the variance; curves: MLQMC, MLCC, MLMC and Asymptotics.]
Part. Log-normal diffusion
Problem with log-normal diffusion coefficient Problem: (elliptic problem with log-normally distributed random coefficient) div x ( a(x,ω) x u(x,ω) ) = f (x) u(x,ω) = (x,ω) D Ω (x,ω) D Ω where a(x,ω) = exp ( b(x,ω) ) for a Gaussian random field b where D R d is a convex, polygonal domain or C -smooth (d =,) (Ω, Σ, P) is a complete probability space We assume that a L P (Ω;C (D) ). Then, it holds < a min (ω) := essinf x D with /a min,a max L P (Ω). a(x,ω) esssupa(x,ω) =: a max (ω) < almost surely x D Quantities of interest: general functional: QoI[u](x) = F ( u(x,ω) ) dp(ω) Ω expectation: E[u](x) = u(x,ω)dp(ω) Ω [ ] variance: Var[u](x) = u(x,ω) E[u](x) dp(ω) = Eu (x) E[u] (x) Ω
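Parameterized problem
Assume a finite Karhunen-Loève expansion of $b$, i.e.,
\[ b(x,\omega) = E[b](x) + \sum_{k=1}^{m} \sigma_k \varphi_k(x) Y_k(\omega) \]
with independent Gaussian random variables $Y_k \sim \mathcal{N}(0,1)$.
By introducing coordinates $y \in \mathbb{R}^m$, we obtain the parameterized diffusion coefficient
\[ a(x,y) = \exp\Big( E[b](x) + \sum_{k=1}^{m} \sigma_k \varphi_k(x) y_k \Big) \]
and thus arrive at the parametrized problem
\[ -\operatorname{div}_x\big(a(x,y)\nabla_x u(x,y)\big) = f(x) \quad \text{for } (x,y) \in D \times \mathbb{R}^m, \qquad u(x,y) = 0 \quad \text{for } (x,y) \in \partial D \times \mathbb{R}^m. \]
We are thus looking for
\[ \operatorname{QoI}[u](x) = \int_{\mathbb{R}^m} F\big(u(x,y)\big)\rho(y)\,dy, \quad \text{where } \rho(y) = (2\pi)^{-m/2} \exp(-\|y\|^2/2). \]
The map $y \mapsto u(y)$ is analytic as a mapping $\mathbb{R}^m \to H_0^1(D)$ provided that the eigenfunctions in the KL expansion are in $W^{1,\infty}(D)$ (Schwab/Hoang).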
Quadrature in the stochastic variable We shall provide a sequence of quadrature formulae for the Bochner integral Q l : L ρ(r m ;X ) X, Q l v = Int : L ρ(r m ;X ) X, Intv = N l ω l,i v(,ξ l,i ) i= R m v(,y)ρ(y)dy where X L (D) denotes a Banach space. It is supposed to provide the error bound (Int Q l )v X ε l v H (R m ;X ) uniformly in l N, where H (R m ;X ) L ρ(r m,x) is a suitable Bochner space. MC method. Satisfies ε l = N / l in the mean for H (R m ;X ) = Lρ(R m ;X ). Quasi MC method. Yields typically ε l = Nl (logn l ) m and H (R m ;X ) =W, mix (Rm ;X ). requires auxiliary density function exp( y /) to ensure convergence! Hermite Gaussian quadrature. If v : R m X is analytical, then the tensor product Hermite Gaussian quadrature rule gives ε l = exp ( bn /(m) l ) and H (R m ;X ) := C σ (R m ;X ) = { v: R m X, v is continuous and max y R m σ(y)v(,y) X < }.
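A tensor product Gauss-Hermite rule for the Gaussian density can be sketched with NumPy's probabilists' Hermite nodes. This is an illustrative sketch, not the quadrature code behind the slides: the integrand $\exp(\sum_k \sigma_k y_k)$ is a hypothetical stand-in for a log-normal quantity with a known mean, and `gauss_hermite_expectation` is a made-up helper name.

```python
from itertools import product

import numpy as np

def gauss_hermite_expectation(g, m, n):
    """Approximate E[g(Y)] for Y ~ N(0, I_m) by an n^m-point tensor Gauss-Hermite rule."""
    # hermegauss uses the weight exp(-x^2/2); dividing by sqrt(2*pi) yields the N(0,1) density.
    x, w = np.polynomial.hermite_e.hermegauss(n)
    w = w / np.sqrt(2.0 * np.pi)
    total = 0.0
    for idx in product(range(n), repeat=m):
        i = list(idx)
        total += np.prod(w[i]) * g(x[i])    # product weight times integrand at the tensor node
    return total

sigma = np.array([0.5, 0.25])               # hypothetical values of sigma_k * phi_k(x)
approx = gauss_hermite_expectation(lambda y: np.exp(sigma @ y), m=2, n=10)
exact = np.exp(0.5 * np.sum(sigma**2))      # mean of a log-normal variable
```

The number of nodes grows like $n^m$, which is why the tensor product rule is only practical for a small number $m$ of parameters.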
Finite element approximation in the spatial variable Introduce a sequence of quasi-uniform triangulations T l of D with h l l. Define the finite element spaces S l (D) := {v C(D) : v τ P for all τ T l and v(x) = for all nodes x D} For any y, the finite element solution to the problem find u l (y) S l (D) such that α(x,y) u l(y) v l dx = D D f v l dx for all v l S l (D) satisfies the error estimates u(y) u l (y) H (D) l α max (y) α min (y) f L (D) u (y) u l (y) W, (D) l α max (y) α min (y) f L (D). Midpoint rule is sufficient for the spatial quadrature of the diffusion coefficient!
Numerical results: finite dimensional stochastics Problem. Seek u(ω) H (D) such that div ( exp(α(ω)) u(ω) ) = in D where α(x,ω) = χ B (x)ψ (ω) + χ B (x)ψ (ω) + χ B (x)ψ (ω) + χ B (x)ψ (ω) E α (x) =, Cov α (x,y) = χ B (x)χ B (y) i=
Numerical results: finite dimensional stochastics
[Figure: error versus level l for the expectation and for the second moment; curves: MC, QMC for three values of δ, GQ and Asymptotics.]
Infinite dimensional stochastics: m Karhunen-Loéve expansion: ( m ) a(x,y) = exp E[b](x) + σ k ϕ k (x)y k k= If the number m of terms tends to, the Monte Carlo method does not care. The quasi Monte Carlo method does not care either! Theorem: Define γ k := σ k ϕ k L (D) and assume that γ k k ε for some ε >. Then, the QMC quadrature using Halton points for approximating the expectation of the solution u is polynomial tractable. More precisely, for each δ > there exists a constant such that the error of the N-point QMC quadrature using Halton points satisfies E[u] Q N u H (D) mn +δ f L (D), E u Q N u W, (D) mn +δ f L (D).
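For illustration, Halton points can be generated by the radical inverse function and mapped to Gaussian samples through the inverse normal distribution function. The sketch below rests on assumptions that are not from the slides: it uses the plain (unscrambled) Halton sequence starting at index 1, the inverse-CDF transform, no auxiliary density, and a toy integrand with a known mean; the function names are invented.

```python
import math
from statistics import NormalDist

PRIMES = [2, 3, 5, 7, 11, 13]               # bases for the first six dimensions

def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i >= 1 in the given base."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv

def halton_gaussian(n, m):
    """First n Halton points in (0,1)^m, mapped componentwise to N(0,1) by the inverse CDF."""
    nd = NormalDist()
    return [
        [nd.inv_cdf(radical_inverse(i, PRIMES[k])) for k in range(m)]
        for i in range(1, n + 1)
    ]

def qmc_expectation(g, n, m):
    """Equal-weight QMC estimate of E[g(Y)] for Y ~ N(0, I_m)."""
    return sum(g(y) for y in halton_gaussian(n, m)) / n

sigma = [0.5, 0.25]                          # hypothetical coefficients
estimate = qmc_expectation(lambda y: math.exp(sigma[0] * y[0] + sigma[1] * y[1]), n=4096, m=2)
exact = math.exp(0.5 * (sigma[0] ** 2 + sigma[1] ** 2))
```

Starting the sequence at index 1 avoids the point 0, whose image under the inverse CDF would be infinite.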
Numerical results: infinite dimensional stochastics Problem. Seek u(ω) H(D) such that div exp(a(ω)) u(ω) = in D where D = (, ), E[a](x) =, and Cov[a](x, y) = exp( kx yk/). expectation second moment Error Error MC QMC (δ=) QMC (δ=.) QMC (δ=.) GQ Asymptotics l l 8 MC QMC (δ=) QMC (δ=.) QMC (δ=.) GQ Asymptotics l l 8
Part. Summary
Summary
We have shown that the multilevel Monte Carlo method is nothing but the sparse grid combination technique.
We have used this knowledge to extend the multilevel Monte Carlo method to general multilevel quadrature methods.
We have reversed the construction of the multilevel quadrature, which enables us to give up the nestedness of the spatial approximation spaces.
The cost can be considerably reduced by application of nested quadrature formulae.
Any quadrature that provides the generic estimate
\[ \|(\operatorname{Int} - Q_{l'})(u^p - u_l^p)\|_X \lesssim 2^{-(l + l')} \|f\|^p_{L^2(D)} \quad \text{for } p = 1, 2 \]
is feasible for a multilevel quadrature method. This requires mixed regularity!
If also the spatial problem provides mixed regularity, then one can apply a higher dimensional sparse grid approach (multi-index Monte Carlo method).