Non-Intrusive Solution of Stochastic and Parametric Equations


Non-Intrusive Solution of Stochastic and Parametric Equations

Hermann G. Matthies (a), Loïc Giraldi (b), Alexander Litvinenko (c), Dishi Liu (d), and Anthony Nouy (b)

(a) Institute of Scientific Computing, TU Braunschweig, Brunswick, Germany
(b) École Centrale de Nantes, GeM, Nantes, France
(c) KAUST, Thuwal, Saudi Arabia
(d) Institute of Aerodynamics and Flow Control, DLR, Brunswick, Germany

wire@tu-bs.de, http://www.wire.tu-bs.de

Overview

1. Parametric equations
2. Stochastic model problem
3. Plain vanilla Galerkin
4. To be or not to be intrusive
5. Numerical comparison
6. Galerkin and low-rank tensor approximation
7. Non-intrusive computation
8. Numerical examples

General mathematical setup

Consider an operator equation for a physical system modelled by $A$:

$$A(p; u) = f(p), \qquad u \in \mathcal{U},\; f \in \mathcal{F},$$

with $\mathcal{U}$ the space of states and $\mathcal{F} = \mathcal{U}^*$ the dual space of actions / forcings. Operator and rhs depend on parameters $p$, and the equation is well posed for all $p \in \mathcal{P}$.

Assume an iterative solver convergent for all values of $p$: iterate for $k = 0, 1, \dots$

$$u^{(k+1)}(p) = S\big(p;\, u^{(k)}(p),\, R(p; u^{(k)}(p))\big), \qquad u^{(k)}(p) \to u^*(p),$$

where $S$ is one cycle of the solver, and the residuum is

$$R(u^{(k)}) := R(p; u^{(k)}(p)) := f(p) - A(p; u^{(k)}).$$

When the residuum vanishes, $R(p; u^*(p)) = 0$, the mapping $S$ has a fixed point: $u^*(p) = S(p; u^*(p), 0)$.
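To make this abstract setup concrete, here is a minimal Python sketch; the scalar operator, damping factor, and all names below are illustrative assumptions, not from the talk:

```python
import numpy as np

# Toy instance of the abstract setup: scalar "operator" A(p; u) = (2 + p) u,
# rhs f(p) = 1 + p, so the exact solution is u*(p) = (1 + p) / (2 + p).
def A(p, u): return (2.0 + p) * u
def f(p):    return 1.0 + p
def R(p, u): return f(p) - A(p, u)      # residuum R(p; u) = f(p) - A(p; u)

def S(p, u, res):
    # one cycle of a damped Richardson solver; omega = 0.5 makes this a
    # contraction for p in [0, 1], since |1 - 0.5 (2 + p)| <= 0.5
    return u + 0.5 * res

p, u = 0.3, 0.0
for k in range(60):
    u = S(p, u, R(p, u))                # u^(k+1) = S(p; u^(k), R(p; u^(k)))
print(u - (1.0 + p) / (2.0 + p))        # fixed point: u*(p) = S(p; u*(p), 0)
```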

Model stochastic problem

[Figure: 2D aquifer model geometry, with sources, outflow, and Dirichlet b.c. on parts of the boundary.]

2D model with stochastic data, $p \in \mathcal{P}$:

$$-\nabla \cdot \big(\kappa(x, p)\, \nabla u(x, p)\big) = f(x, p) \;\; \text{in } G \subset \mathbb{R}^d, \quad \text{plus b.c.:}$$
$$\kappa(x, p)\, \nabla u(x, p) \cdot n = g(x, p) \;\; \text{on } \Gamma \subset \partial G,$$

with $\kappa$ a stochastic conductivity and $f$, $g$ stochastic sinks and sources. One $p$ is a realisation of $\kappa, f, g$.
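For concreteness, a minimal finite-difference sketch of a 1D analogue of this model problem; the toy conductivity, the deterministic source, and the function name are assumptions made only for illustration:

```python
import numpy as np

# 1D analogue: -(kappa(x, p) u'(x, p))' = f(x) on (0, 1), u(0) = u(1) = 0,
# discretised by finite differences; kappa is evaluated at cell midpoints.
def solve_diffusion(p, n=50):
    x = np.linspace(0.0, 1.0, n + 1)
    xm = 0.5 * (x[:-1] + x[1:])                   # cell midpoints
    kappa = 1.0 + 0.5 * p * np.sin(np.pi * xm)    # toy parametric conductivity
    h = 1.0 / n
    main = (kappa[:-1] + kappa[1:]) / h**2        # interior diagonal entries
    off = -kappa[1:-1] / h**2                     # neighbour couplings
    K = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    f = np.ones(n - 1)                            # deterministic source term
    u = np.linalg.solve(K, f)
    return x, np.concatenate(([0.0], u, [0.0]))   # reattach boundary values

x, u = solve_diffusion(p=0.3)                     # one parameter realisation
```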

Preconditioned residual

In the iteration set $u^{(k+1)} = u^{(k)} + \Delta u^{(k)}$ with

$$\Delta u^{(k)} := S\big(p; u^{(k)}, R(p; u^{(k)})\big) - u^{(k)},$$

and usually $P(\Delta u^{(k)}) = R(p; u^{(k)})$, so that (with the list of arguments shortened)

$$S(p; u^{(k)}) = u^{(k)} + P^{-1}\big(R(p; u^{(k)})\big).$$

Here $P$ is some preconditioner, which may depend on $p$, the iteration counter $k$, and the current iterate $u^{(k)}$; e.g. in Newton's method $P = D_u A(p; u^{(k)})$.
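A short sketch of Newton's choice $P = D_u A(p; u^{(k)})$ on a small made-up nonlinear system; the system itself is an assumption, chosen only to show the update $\Delta u = P^{-1} R$:

```python
import numpy as np

# Illustrative 2x2 nonlinear system with Newton's preconditioner P = D_u A.
def A(p, u):
    return np.array([u[0] + p * u[1]**2,
                     u[1] + p * u[0] * u[1]])

def f(p):
    return np.array([1.0, 0.5])

def jacobian(p, u):
    # P = D_u A(p; u), the Jacobian of A with respect to u
    return np.array([[1.0,      2.0 * p * u[1]],
                     [p * u[1], 1.0 + p * u[0]]])

p, u = 0.2, np.zeros(2)
for k in range(20):
    res = f(p) - A(p, u)                          # R(p; u^(k))
    du = np.linalg.solve(jacobian(p, u), res)     # du = P^{-1} R
    u = u + du                                    # u^(k+1) = u^(k) + du
print(u, f(p) - A(p, u))                          # residual ~ 0 at fixed point
```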

Iteration

Algorithm:
    start with some initial guess $u^{(0)}$; $k \leftarrow 0$
    while no convergence do
        compute $\Delta u^{(k)} \leftarrow S(p; u^{(k)}, R(p; u^{(k)})) - u^{(k)}$
        $u^{(k+1)} \leftarrow u^{(k)} + \Delta u^{(k)}$
        $k \leftarrow k + 1$
    end while

Uniform contraction: for all $p, u, v$:

$$\big\| S(p; u(p), R(p; u(p))) - S(p; v(p), R(p; v(p))) \big\|_{\mathcal{U}} \le \varrho\, \| u(p) - v(p) \|_{\mathcal{U}}, \qquad \varrho < 1.$$

Discretisation I

Let $\mathcal{S} \subset \mathbb{R}^{\mathcal{P}}$ be an appropriate Hilbert space of real-valued functions on $\mathcal{P}$, and look for the solution in the tensor space $\mathsf{U} := \mathcal{U} \otimes \mathcal{S}$, so that $u(p) = \sum_\iota u_\iota\, \varsigma_\iota(p)$. Normally one would first discretise $\mathcal{U}$ by choosing a finite-dimensional $\mathcal{U}_N \subset \mathcal{U}$, but the results here are independent of that.

Direct integration: to compute a Quantity of Interest (QoI),

$$Q(u) = \int_{\mathcal{P}} Q(u(p), p)\, \mu(dp) \approx \sum_z w_z\, Q(u(p_z), p_z),$$

the integrand, and hence $u(p_z)$, has to be computed for all $p_z$: expensive, but decoupled, non-intrusive solves. We want to replace $u$ by a proxy / meta-model or emulator $u(p) \approx u_M(p)$.
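A small sketch of direct integration of a QoI by quadrature, assuming the uniform measure on $[-1, 1]$ and Gauss-Legendre points; the solver call u(p) and the QoI are stand-ins:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Direct integration: Q(u) = int Q(u(p), p) mu(dp) ~ sum_z w_z Q(u(p_z), p_z).
def u(p):                     # stand-in for one full (expensive) solve at p
    return (1.0 + p) / (2.0 + p)

def QoI(up, p):               # an example quantity of interest
    return up**2

pts, wts = leggauss(8)        # Gauss-Legendre points / weights on [-1, 1]
wts = wts / 2.0               # normalise to the uniform probability measure
Q = sum(w * QoI(u(pz), pz) for pz, w in zip(pts, wts))
print(Q)                      # each u(p_z) is a decoupled, non-intrusive solve
```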

Discretisation II

(Further) discretise $\mathcal{U} \otimes \mathcal{S}$ by choosing $\mathcal{S}_M = \mathrm{span}\{\Psi_\alpha(p)\} \subset \mathcal{S}$, to give $\mathcal{U} \otimes \mathcal{S}_M = \mathsf{U}_M \subset \mathcal{U} \otimes \mathcal{S} = \mathsf{U}$. Ansatz:

$$u_M(p) = \sum_\alpha u_\alpha\, \Psi_\alpha(p) \in \mathsf{U}_M.$$

Often $\Psi_\alpha(p) = \Psi_\alpha(\theta(p))$, where the $\theta(p) = [\dots, \theta_\ell(p), \dots]$ are independent. If $\Psi_\alpha(\theta) = \prod_\ell \psi_{\alpha_\ell}(\theta_\ell)$ with the multi-index $\alpha = (\dots, \alpha_\ell, \dots)$, then $\mathcal{S}_M = \bigotimes_\ell \mathcal{S}_{M,\ell}$ allows for higher-order tensor structure.

Computation is simplest when the $\Psi_\alpha$ are orthogonal (orthonormal), e.g. in the inner product

$$\langle \phi, \varphi \rangle_{\mathcal{S}} = \int_{\mathcal{P}} \phi(p)\, \varphi(p)\, \mu(dp).$$

How to determine the unknown coefficients $u_\alpha$?
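For instance, assuming the uniform measure on $[-1, 1]$, scaled Legendre polynomials give such an orthonormal basis; a sketch that checks this by quadrature:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

# Psi_n = sqrt(2n + 1) P_n is orthonormal w.r.t. mu(dp) = dp / 2 on [-1, 1].
def psi(n):
    return Legendre.basis(n) * np.sqrt(2 * n + 1)

pts, wts = leggauss(16)
wts = wts / 2.0                                   # probability measure
G = np.array([[np.sum(wts * psi(m)(pts) * psi(n)(pts)) for n in range(5)]
              for m in range(5)])
print(np.allclose(G, np.eye(5)))                  # Gram matrix = identity
```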

Solution procedures I

Project either the solution $u(p)$ or the residuum $R(p; u(p))$.

Interpolation: determine the $u_\alpha$ by the interpolating condition

$$\forall p_\beta: \quad u(p_\beta) \overset{!}{=} u_M(p_\beta) = \sum_\alpha u_\alpha\, \Psi_\alpha(p_\beta).$$

Simplest when the Kronecker-δ property $\Psi_\alpha(p_\beta) = \delta_{\alpha\beta}$ is satisfied. Solve the equation at the interpolation points $p_\beta$: decoupled, non-intrusive solves.

Pseudo-spectral projection: simple, as the $\Psi_\alpha$ are orthonormal. Compute the projection inner product (an integral) by quadrature:

$$u_\alpha = \int_{\mathcal{P}} \Psi_\alpha(p)\, u(p)\, \mu(dp) \approx \sum_z w_z\, \Psi_\alpha(p_z)\, u(p_z);$$

solve the equation at the quadrature points $p_z$: decoupled, non-intrusive solves.
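A sketch of the pseudo-spectral projection for the toy scalar problem used earlier; the orthonormal Legendre basis and the stand-in solver are assumptions carried over from the previous snippets:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

# u_alpha ~ sum_z w_z Psi_alpha(p_z) u(p_z): one decoupled solve per point p_z.
def u(p):                                   # stand-in solver call
    return (1.0 + p) / (2.0 + p)

def psi(n):                                 # orthonormal Legendre basis
    return Legendre.basis(n) * np.sqrt(2 * n + 1)

M = 6
pts, wts = leggauss(M + 1)
wts = wts / 2.0
u_alpha = np.array([np.sum(wts * psi(a)(pts) * u(pts)) for a in range(M + 1)])

def u_M(p):                                 # the proxy / emulator
    return sum(ua * psi(a)(p) for a, ua in enumerate(u_alpha))

print(abs(u_M(0.3) - u(0.3)))               # small surrogate error
```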

Solution procedures II

The mapping $u(\cdot) \mapsto u_M(\cdot)$ is a projection $\Pi$. To describe a general projection, choose $\hat{\mathcal{S}}_M = \mathrm{span}\{\Phi_\alpha(p)\}$; the projection is orthogonal to $\hat{\mathcal{S}}_M$:

$$\forall \varphi \in \hat{\mathcal{S}}_M: \quad \langle (I - \Pi) u, \varphi \rangle = 0, \quad \text{i.e. } \mathrm{im}(I - \Pi) \perp \hat{\mathcal{S}}_M.$$

Approximation properties are determined by $\mathcal{S}_M$, stability by $\hat{\mathcal{S}}_M$.

Collocation / interpolation: solve the equation at collocation / interpolation points $p_\beta$, i.e. $\Phi_\beta(p) = \delta(p - p_\beta)$:

$$R\big(p_\beta; u_M(p_\beta)\big) = R\Big(p_\beta; \sum_\alpha u_\alpha \Psi_\alpha(p_\beta)\Big) \overset{!}{=} 0.$$

With the Kronecker-δ property this reads $R(p_\beta; u_\beta) = 0$, the same as interpolation: decoupled, non-intrusive solves. One has to worry about the norm $\|\Pi\|$: the norm of the collocation projector $\Pi_C$ may grow with $M$.

Projectors

The pseudo-spectral projector $\Pi_P$ is orthogonal, i.e. $\|\Pi_P\| = 1$. This means $\hat{\mathcal{S}}_M = \mathcal{S}_M$, normally with $\Phi_\alpha = \Psi_\alpha$.

Galerkin: apply Galerkin weighting:

$$\forall \beta: \quad \langle \Phi_\beta(p), R(p; u_M(p)) \rangle = \Big\langle \Phi_\beta(p), R\Big(p; \sum_\alpha u_\alpha \Psi_\alpha(p)\Big) \Big\rangle = 0.$$

These are coupled equations; is this intrusive? When solved in a partitioned way, with the residua computed by quadrature, it is non-intrusive: it needs only residua at quadrature points. To keep the norm of the projector as small as possible (Bubnov-Galerkin), choose the orthogonal projection $\Phi_\alpha = \Psi_\alpha$.

Galerkin on the iteration equation

Trick: project the iteration equation. Set $u^{(k)}(p) = \sum_\alpha u^{(k)}_\alpha \Psi_\alpha(p)$ and collect $\mathbf{u}^{(k)} = [\dots, u^{(k)}_\beta, \dots]$. Then

$$u^{(k+1)} = u^{(k)} + \Delta u^{(k)} = S\big(u^{(k)}, R(u^{(k)})\big) \;\Rightarrow\; \mathbf{u}^{(k+1)} = \mathbf{u}^{(k)} + \Delta\mathbf{u}^{(k)},$$

with

$$\Delta\mathbf{u}^{(k)} := \Big[\dots,\; \big\langle \Psi_\beta,\, S\big(p; u^{(k)}(p), R(p; u^{(k)}(p))\big) \big\rangle,\; \dots\Big] - \mathbf{u}^{(k)}.$$

Define a mapping $\mathbf{S}(\mathbf{u})$:

$$\mathbf{S}(\mathbf{u}) := \Big[\dots,\; \Big\langle \Psi_\beta,\, S\Big(p; \sum_\alpha u_\alpha \Psi_\alpha(p),\, R\big(p; \sum_\alpha u_\alpha \Psi_\alpha(p)\big)\Big) \Big\rangle,\; \dots\Big];$$

then $\Delta\mathbf{u}^{(k)} = \mathbf{S}(\mathbf{u}^{(k)}) - \mathbf{u}^{(k)}$ and $\mathbf{u}^{(k+1)} = \mathbf{u}^{(k)} + \Delta\mathbf{u}^{(k)} = \mathbf{S}(\mathbf{u}^{(k)})$.

Convergence

    start with some initial guess $\mathbf{u}^{(0)}$; $k \leftarrow 0$
    while no convergence do
        compute $\Delta\mathbf{u}^{(k)}$ as above
        $\mathbf{u}^{(k+1)} \leftarrow \mathbf{u}^{(k)} + \Delta\mathbf{u}^{(k)}$
        $k \leftarrow k + 1$
    end while

This is a nonlinear block Jacobi algorithm.

Theorem: the mapping $\mathbf{S}$ has the same contraction factor $\varrho$. This means that the simple nonlinear block Jacobi algorithm converges as before.

The myth about intrusiveness

Folklore: Galerkin methods are intrusive. They can be, but they don't have to be. Question: to be or not to be intrusive?

The stochastic Galerkin condition for the iteration equation requires $\mathbf{S}(\mathbf{u}^{(k)})$, approximated by quadrature:

$$\mathbf{S}(\mathbf{u}^{(k)}) \approx \mathbf{S}_Z(\mathbf{u}^{(k)}) = \Big[\dots,\; \sum_z \upsilon_z\, \Psi_\beta(p_z)\, S\big(p_z;\, u^{(k)}(p_z),\, R(p_z; u^{(k)}(p_z))\big),\; \dots\Big],$$

to give $\Delta_Z\mathbf{u}^{(k)} = \mathbf{S}_Z(\mathbf{u}^{(k)}) - \mathbf{u}^{(k)}$. This requires only the evaluation of the preconditioned residuum, i.e. one iteration with the solver, at each $p_z$. The theorem still holds with $\Delta\mathbf{u}^{(k)}$ replaced by $\Delta_Z\mathbf{u}^{(k)}$ in the algorithm.
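A compact sketch of this non-intrusive Galerkin iteration for the toy scalar problem; the basis, quadrature, and solver cycle are the assumed ones from the earlier snippets:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

# Projected block Jacobi: evaluate u^(k) at the p_z, run one solver cycle S
# per point, and project back onto the basis to obtain S_Z(u^(k)).
M = 6
pts, wts = leggauss(M + 1)
wts = wts / 2.0
Psi = np.array([(Legendre.basis(a) * np.sqrt(2 * a + 1))(pts)
                for a in range(M + 1)])           # basis values at the p_z

def R(p, u):                                      # toy residuum as before
    return (1.0 + p) - (2.0 + p) * u

def S(p, u, res):                                 # one damped solver cycle
    return u + 0.5 * res

u_coef = np.zeros(M + 1)                          # coefficients u_alpha
for k in range(100):
    u_pts = Psi.T @ u_coef                        # u^(k)(p_z)
    s_pts = S(pts, u_pts, R(pts, u_pts))          # one solver cycle per p_z
    u_coef = Psi @ (wts * s_pts)                  # project back: S_Z(u^(k))
print(np.abs(Psi.T @ u_coef - (1.0 + pts) / (2.0 + pts)).max())
```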

Numerical example

[Figure: resistor network with six numbered nodes and resistors R.]

$$A(p; u) := K u + \lambda_1(p_1)\,(u^T u)\, u = \lambda_2(p_2)\, f_0 =: f(p), \qquad f_0 := [1, 0, 0, 0, 0]^T.$$

Numerical example: specification

           Case 1      Case 2        Case 3       Case 4
λ₁(p₁)     p₁ + 2      p₁ + 1.1      p₁ + 2       sin(4 p₁ + 2)
λ₂(p₂)     p₂ + 25     25 p₂ + 0.5   10 p₂ + 30   10 sin(p₂) + 30
c.o.v.     2.5e-2      2.9e+1        1.7e-1       2.2e-1

[Figure: RMSE (about 1e-8 to 1e-2) vs. the solver convergence criterion ε_tol (1e-10 to 1e0) for 2nd- to 5th-order polynomials.]

Numerical results

order m   solver calls   ε(L²(u))          ε(L¹(u))          ε(L²(R_u))
          P    / G       P      / G        P      / G        P      / G
2         79   / 90      6.1e-5 / 6.1e-5   3.5e-5 / 3.5e-5   4.1e-5 / 4.1e-5
3         161  / 192     3.9e-6 / 3.9e-6   2.3e-6 / 2.3e-6   2.6e-6 / 2.6e-6
4         284  / 325     2.7e-7 / 2.7e-7   1.6e-7 / 1.6e-7   1.8e-7 / 1.8e-7
5         458  / 540     2.0e-8 / 2.0e-8   1.2e-8 / 1.2e-8   1.4e-8 / 1.4e-8

Low-rank approximation: write $\mathbf{u} := [\dots, u_\alpha, \dots] = (u_{\alpha,n})$, i.e.

$$\mathbf{u} = \sum_{\alpha,n} u_{\alpha,n}\, e_\alpha \otimes e_n \approx \mathbf{u}_r = \sum_{j=1}^{r} w_j \otimes \eta_j.$$

Use faster global methods than block Jacobi, e.g. quasi-Newton. Try to keep a low-rank tensor approximation throughout, from the input fields to the output solution.
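For a coefficient array with two indices, the best rank-r approximation is given by the truncated SVD; a short sketch with made-up data:

```python
import numpy as np

# u = (u_{alpha,n}) as a matrix; the truncated SVD gives the best rank-r
# approximation u_r = sum_j s_j w_j eta_j^T (outer products).
rng = np.random.default_rng(0)
U = rng.standard_normal((40, 6)) @ rng.standard_normal((6, 30))   # rank 6
W, s, Vt = np.linalg.svd(U, full_matrices=False)
r = 4
U_r = (W[:, :r] * s[:r]) @ Vt[:r, :]          # keep the r largest terms
print(np.linalg.norm(U - U_r), np.linalg.norm(s[r:]))   # equal: discarded part
```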

Successive rank-one updates (SR1U)

Assume a functional $J(p; u)$ such that $A(p; u) - f(p) = \delta_u J(p; u) = 0$, so that solving the equation is equivalent to minimising $J$ for each $p$. Build the solution rank-one term by rank-one term: with the already computed $u_r := \sum_{j=1}^{r-1} w_j\, \eta_j$, add the new term $w_r\, \eta_r$ through

$$\min_{w_r, \eta_r} J(u_r + w_r \eta_r) \;\Leftrightarrow\; \delta_{w,\eta} J(u_r + w_r \eta_r) = 0:$$

successive rank-one updates (SR1U), also known as proper generalised decomposition (PGD). This Galerkin procedure solves only small problems, and often gives good approximations with small $r$.

Low-rank approximation (basic PGD)

Define $J_r(w_r, \eta_r) := J(u_r(p) + w_r \eta_r)$. The new $w_r$ and $\eta_r$ are found via the system

$$\delta_w J_r(w_r, \eta_r) = 0, \qquad \delta_\eta J_r(w_r, \eta_r) = 0, \qquad \|w_r\| = 1.$$

Block-Jacobi solver (a runnable sketch follows below):
    $u_1 \leftarrow 0$; $\eta_1 \leftarrow 1$; $w_1 \leftarrow 0$
    for $r = 1, \dots$ until $u_r + w_r \eta_r$ is accurate enough do
        while no convergence do
            $\eta_r \leftarrow \eta_r / \|\eta_r\|$; solve $\delta_w J_r(w_r, \eta_r) = 0$ for $w_r$
            $w_r \leftarrow w_r / \|w_r\|$; solve $\delta_\eta J_r(w_r, \eta_r) = 0$ for $\eta_r$
        end while
        $u_{r+1} \leftarrow u_r + w_r \eta_r$
    end for
    Output: a basic (greedy) low-rank approximation $u_r$.
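The promised sketch, for the quadratic functional $J(p; u) = \tfrac12 u^T A(p) u - f^T u$ of a parametric SPD linear system; the matrices, quadrature, and storage of $\eta_r$ by its values at the points $p_z$ are all assumptions made for illustration:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Basic PGD / SR1U by alternating (block-Jacobi) solves for A(p) u = f,
# A(p) = A0 + p A1 symmetric positive definite on p in [-1, 1].
rng = np.random.default_rng(0)
n = 20
B = rng.standard_normal((n, n))
A0 = B @ B.T + n * np.eye(n)                  # dominant SPD part
A1 = 0.3 * np.eye(n)                          # parametric perturbation
f = rng.standard_normal(n)
Ap = lambda p: A0 + p * A1

pts, wts = leggauss(9)
wts = wts / 2.0
U = np.zeros((n, len(pts)))                   # u_r evaluated at the p_z

for r in range(4):                            # rank-one by rank-one
    w, eta = rng.standard_normal(n), np.ones(len(pts))
    for sweep in range(20):                   # inner block-Jacobi sweeps
        eta = eta / np.linalg.norm(eta)
        # delta_w J_r = 0: (sum_z v_z eta_z^2 A(p_z)) w = sum_z v_z eta_z R_z
        K = sum(wz * ez**2 * Ap(pz) for pz, wz, ez in zip(pts, wts, eta))
        b = sum(wz * ez * (f - Ap(pz) @ U[:, z])
                for z, (pz, wz, ez) in enumerate(zip(pts, wts, eta)))
        w = np.linalg.solve(K, b)
        w = w / np.linalg.norm(w)
        # delta_eta J_r = 0, pointwise at each p_z
        eta = np.array([w @ (f - Ap(pz) @ U[:, z]) / (w @ Ap(pz) @ w)
                        for z, pz in enumerate(pts)])
    U = U + np.outer(w, eta)                  # u_{r+1} = u_r + w_r eta_r

print(max(np.linalg.norm(f - Ap(pz) @ U[:, z]) for z, pz in enumerate(pts)))
```

The printed residual norm decreases as the rank grows, illustrating the greedy build-up.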

Non-intrusive residual for PGD

Non-intrusive approximation of the first equation:

$$\delta_w J_r(w_r, \eta_r) = 0 \;\Leftrightarrow\; \langle \delta_u J(u_r + w_r \eta_r), \eta_r \rangle_{\mathcal{S}} = 0 \text{ in } \mathcal{U}:$$
$$0 = \int_{\mathcal{P}} R\big(p; u_r(p) + w_r \eta_r(p)\big)\, \eta_r(p)\, \mu(dp) \approx \sum_z \upsilon_z\, R\big(p_z; u_r(p_z) + w_r \eta_r(p_z)\big)\, \eta_r(p_z).$$

Second equation:

$$\delta_\eta J_r(w_r, \eta_r) = 0 \;\Leftrightarrow\; \langle \delta_u J(u_r + w_r \eta_r), w_r \rangle_{\mathcal{U}} = 0 \text{ in } \mathcal{S}: \quad \forall \lambda \in \mathcal{S}:$$
$$0 = \int_{\mathcal{P}} \big\langle R\big(p; u_r(p) + w_r \eta_r(p)\big), w_r \big\rangle_{\mathcal{U}}\, \lambda(p)\, \mu(dp) \approx \sum_z \upsilon_z\, \big\langle R\big(p_z; u_r(p_z) + w_r \eta_r(p_z)\big), w_r \big\rangle_{\mathcal{U}}\, \lambda(p_z).$$
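The same rank-one update can be driven entirely by black-box residual evaluations, matching the quadrature formulas above; a hedged sketch with toy data, where scipy's general-purpose root finders stand in for the quasi-Newton solver mentioned later:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.optimize import fsolve, brentq

# One non-intrusive rank-one update: both stationarity conditions are solved
# using only evaluations of the residual Res(p, u) = f - A(p) u.
n = 8
rng = np.random.default_rng(1)
B = rng.standard_normal((n, n))
A0 = B @ B.T + n * np.eye(n)
f = rng.standard_normal(n)
Res = lambda p, u: f - (A0 + 0.3 * p * np.eye(n)) @ u   # treated as black box

pts, wts = leggauss(7)
wts = wts / 2.0
U = np.zeros((n, len(pts)))                   # current u_r at the p_z (here 0)
eta = np.ones(len(pts))

for sweep in range(10):
    eta = eta / np.linalg.norm(eta)
    # first equation: sum_z v_z Res(p_z; u_r + w eta_z) eta_z = 0, solve for w
    g = lambda w: sum(wz * ez * Res(pz, U[:, z] + w * ez)
                      for z, (pz, wz, ez) in enumerate(zip(pts, wts, eta)))
    w = fsolve(g, np.ones(n))
    w = w / np.linalg.norm(w)
    # second equation: <Res(p_z; u_r + w eta_z), w> = 0, one scalar root per p_z
    eta = np.array([brentq(lambda e: Res(pz, U[:, z] + w * e) @ w, -1e3, 1e3)
                    for z, pz in enumerate(pts)])
U = U + np.outer(w, eta)
print(max(np.linalg.norm(Res(pz, U[:, z])) for z, pz in enumerate(pts)))
```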

Recent improvements

- Increase $u_r$ by more than one term at a time (e.g. 5-10 terms), at the price of larger systems to be solved.
- Use a faster algorithm than block Jacobi, e.g. quasi-Newton methods (here BFGS).
- Use previous iterates as control variates, so that fewer integration points are needed per iteration.
- Increase the accuracy of the integration (the number of integration points) as the iteration converges.

The preconditioner matrix is chosen to be the linear part $B$ of the Hessian of the functional $J$. The low-rank approximations are also compared to the full-rank Galerkin approximation computed with the block-Jacobi algorithm introduced in [8], with a stagnation criterion of $10^{-10}$. The comparison is made in Table 1 for total degrees d = 2, 3, 4, 5 and ranks r = 1, ..., 5.

                             d = 2      d = 3      d = 4      d = 5
Block-Jacobi solver [8]      5.14e-5    3.31e-6    2.31e-7    1.70e-8
Basic PGD (Algorithm 1)
  r = 1                      2.34e-3    2.34e-3    2.34e-3    2.34e-3
  r = 2                      9.67e-5    8.22e-5    8.22e-5    8.22e-5
  r = 3                      5.14e-5    3.39e-6    8.03e-7    7.78e-7
  r = 4                      5.14e-5    3.31e-6    2.34e-7    3.63e-8
  r = 5                      5.14e-5    3.31e-6    2.31e-7    1.71e-8
Improved PGD (Algorithm 3)
  r = 1                      2.34e-3    2.34e-3    2.34e-3    2.34e-3
  r = 2                      5.14e-5    3.31e-6    2.85e-7    1.95e-7
  r = 3                      5.14e-5    3.31e-6    2.31e-7    1.79e-8
  r = 4                      5.14e-5    3.31e-6    2.31e-7    1.79e-8
  r = 5                      5.14e-5    3.31e-6    2.31e-7    1.76e-8

Table 1: Relative error of the approximations from the block-Jacobi solver, the basic PGD, and the improved algorithm, for different total degrees d and ranks r.

The greedy approximation gives satisfying results, even if the result is not optimal compared with an approximation obtained by direct optimisation in low-rank subsets. For the rest of this section we focus on d = 5 and measure the efficiency of the different algorithms by counting the number of calls to the residual $R(p_z; u_r(p_z)) = f(p_z) - A(p_z; u_r(p_z))$. The results are reported in Table 2.

                                   r = 1      r = 2      r = 3      r = 4      r = 5
Basic PGD (Algorithm 1)
  Relative error                   2.34e-3    8.22e-5    7.78e-7    3.63e-8    1.71e-8
  Residual calls                   1044       2160       3096       3816       4464
Improved algorithm (Algorithm 3)
  Relative error                   2.34e-3    1.95e-7    1.79e-8    1.79e-8    1.79e-8
  Residual calls                   1044       2304       2700       2844       3024

Table 2: Number of calls to the residual and corresponding relative error for different ranks r, for the basic PGD and the improved algorithm.

Both algorithms behave similarly at the beginning, up to r = 2. From r = 3 on, Algorithm 3 is the more efficient one for computing the low-rank approximation. However, the block-Jacobi solver requires only 540 calls to the residual. This suggests that the classical algorithms for computing low-rank approximations of the solutions of nonlinear equations must be reconsidered in terms of efficiency and intrusiveness, and that different approaches must be proposed.

Obstacle example

[Figure 2: Obstacle g(p; x) and solution u(p; x) as functions of x and p [3], plotted over x, p in [0, 1].]

Obstacle example: convergence

[Figure 3: Relative error (from 1e0 down to about 1e-9) with respect to the rank r (up to 80) of the approximation, for the SVD of the L²-projection, Algorithm 1, and Algorithm 3.]

Conclusion

- Parametric problems can be emulated.
- Galerkin methods can be non-intrusive.
- Convergence can be accelerated by faster global algorithms.
- For efficiency, try to use sparse representations throughout; an ansatz in low-rank tensor products saves storage as well as computation.
- PGD / SR1U is inherently a Galerkin procedure; it too can be non-intrusive.
- Low-rank tensor representations can be very accurate.