Stochastic Collocation Methods for Polynomial Chaos: Analysis and Applications

Stochastic Collocation Methods for Polynomial Chaos: Analysis and Applications Dongbin Xiu Department of Mathematics, Purdue University Support: AFOSR FA955-8-1-353 (Computational Math) SF CAREER DMS-64535 (Computational Math) DOE DE-FC52-8A28617 (PSAAP)

Overview Generalized polynomial chaos Stochastic collocation Lagrange interpolation Pseudo spectral gpc Applications Bayesian inverse problem Data assimilation

Stochastic PDE: Uncertainty Quantification via gpc u nz (, txz, ) = L( u ), (, T ] D R t B( u) =, [, T] D R u = u( x, Z), { t = } D R th -order gpc expansion: nz + u (,, ) ˆ t x Z uk(, t x) Φ k( Z), # of basis= n k = uˆ = E[ u( Z) Φ ( Z)] = u( Z) Φ ( Z ) ρ( Z ) dz,, k k k k n n z z z Orthogonal basis: E Z ( Z) ( Z) Φ Φ Φ ( Z) Φ ( Z) ρ( Z) dz = δ i j i j ij Optimality: u u = inf nz u Ψ L 2 2 ρ Z Ψ Π L ρ Z ( ) ( )

Generalized Polynomial Chaos (gpc) Basis functions: Hermite polynomials: seminal work by R. Ghanem Global orthogonal polynomials (Xiu & Karniadakis, 2) Wavelet basis (Le Maitre et al, 4) Piecewise basis (Babuska et al 4, Wan & Karniadakis, 5) Implementations: Stochastic Galerkin Stochastic collocation Properties: Rigorous mathematics High accuracy, fast covergence Curse-of-dimensionality

Stochastic Collocation Collocation: To satisfy governing equations at nodes Sampling: (solution statistics only) Random (Monte Carlo) Deterministic (lattice rule, tensor grid, cubature) Stochastic collocation: To construct polynomial approximations Lagrange interpolation Can not be constructed for any given nodes Interpolation error hard to control Pseudo spectral Utilize gpc polynomial basis Becomes multivariate integration Response surface method Multivariate interpolation Many ad hoc approaches

Stochastic Collocation Lagrange Interpolation odal set: Lagrange interpolation: Θ = Q i Q nz { Z } R 1 i= Q Q j j u ( Z) u( Z ) Lj( Z) Li( Z ) = δij, 1 i, j Q j= 1 Solution: for j=1,,q, u (, txz j, ) = L( u ), in (, T ] D, t B( u) =, [, T] D, j u = u ( x, Z ), { t = } D Tensor product: i1 n ( U U i z ) Sparse grid (Smolyak): q + 1 i q 1 q i q i i i 1 ( 1) ( U U nz ) (Xiu & Hesthaven, SIAM J. Sci. Comput., 5)

Stochastic Collocation: Pseudo Spectral Approach th -order gpc projection u (,, ) ˆ t x Z = uk(, t x) Φk( Z), uˆ = u( Z) Φ ( Z) ρ ( Z) dz. k k gpc-collocation approximation w (, t x, Z) = wˆ (, t x) Φ ( Z), k = k j j j k k k j= 1 k Q wˆ = u( t, x, Z ) Φ ( Z ) α u( Z) Φ ( Z) ρ( Z) dz k = wˆ (, t x) uˆ (, t x), Q k k Aliasing Error: ε u w Q L 2 ( Z ) ρ (Xiu, Comm. Comput Phys, vol. 2, 7)

gpc-collocation: Algorithm j j Q { Z α } j = 1 nz 1. Choose a nodal set, in R Deterministic solver 2. Solve for each j = 1,, Q, u (, txz j, ) = L( u ), in (, T ] D, t B( u) =, [, T] D, j u = u ( x, Z ), { t = } D 3. Evaluate the approximate gpc expansion coefficient Q j k k j j α k j= 1 wˆ = u( t, x, Z ) Φ ( Z ), ; th 4. Construct the -order gpc approximation w ( t, x, Z) = wˆ Φ ( Z). k = 1 k k Post-process Error bound (Xiu, 7): 2 2 2 2 ( ) ε u w ε + ε + M ε C Δ 2 L ( Z) Q Q ρ 1/2 Error Finite-term projection error + aliasing error + umerical error

Parameter Estimation: Bayesian Inverse Approach Stochastic PDE: u nz (, txz, ) = L( u ), (, T ] D R t B( u) =, [, T] D R u = u( x, Z), { t = } D R n n z z Solution: u utxz (,, ):[, T] D R n n z R Prior distribution: n z π ( z) = π ( z ) Z i i i= 1 Estimation of the prior distribution Requires direct measurements of the parameters o/not enough direct measurements? (Use experience/intuition ) How to take advantage of measurements of other variables?

Bayesian Inference Data: n d = G( Z) + e, e R d is i.i.d. Posterior distribution: d π ( d Z) π ( Z) π ( Z) π( Z d) = π( d Z) π( Z) dz Likelihood function: d LZ ( ) π( d Z) = π ( d G( Z)) n i= 1 e i i otes: Difficult to manipulate Classical sampling approaches can be time consuming (MCMC, etc) GPC (Galerkin) based approach: (Marzouk, ajm, Rahn, JCP, 7) gpc approximation: Properties: d L ( Z) π ( Z) π ( Z) = L ( Z) π ( Z) dz d L ( Z) π ( d Z) = π ( d G ( Z)) ei i, i i= 1 Allows direct sampling in term of Z with arbitrarily large samples (Virtually) no additional computational cost forward problem solver only Convergence seems natural n i

Convergence of gpc Bayesian Inference Kullback-Leibler divergence: π1( z) D( π1 π2) π1( z)log dz π ( z) 2 Observation error: 2 e (, σ I), i.i.d. ormal Theorem π d 2. If the gpc expansion G converges to G in Lπ, then the posterior density d converges to π in the sense Moreover, if G( Z) G ( Z) i i, 2 π z then for sufficiently large, L d d ( ) D π π,. D α C, 1 i n, α >, C independent of, d d 2 ( π π ) otes: Fast (exponential) convergence rate is retained Factor of 2 in the convergence rates α. d Z (Marzouk & Xiu, Comm. Comput. Phys. 8)

Parameter Estimation: Supersensitivity Example Burgers equation : Boundary conditions : 2 u u u + u = ν, x 1,1 2 t x x [ ] u( 1) = 1 + δ( Z); u(1) = 1; < δ<< 1 Deterministic results with no uncertainty (Xiu & Karniadakis, Int. J. umer. Eng., 4) 1% uncertainty in left BC

25 exact posterior gpc, p=4 gpc, p=8 δ true 2 p(δ data ) 15 1 5.1.2.3.4.5.6.7.8.9.1 Prior distribution is uniform Measurement noise: e~(,.5 2 ) δ

1 2 error 1 3 D(π π ) G G L 2 6 7 8 9 1 11 12 13 14 15 16 Factor = 2.1 (theory = 2)

Parameter Estimation: Step Function Assume the forward model is a step function Posterior distribution is discontinuous Gibb s oscillations exist Slow convergence with global gpc basis functions 1.2 1 3 2.5 exact posterior gpc posterior.8 2 G(z) or G (z).6.4 π d (z) 1.5 1.2.5 exact forward solution gpc approximation.2 1.8.6.4.2.2.4.6.8 1 z Forward model and its approximation 1.8.6.4.2.2.4.6.8 1 z Posterior distribution and its approximation

D(π π ) 1.1 G G L 2 1.2 1.3 1.4 error 1.5 1.6 1.7 1.8 1.9 2 4 6 8 1 12 14 16 18 2 p Factor = 1.99 (theory = 2)

Kalman Filter for Data Assimilation t m True state (unknown): u R, m 1 Forecast: Observation: Analysis: f du f (, tz) = F(, tu ), t (, T] dt f u (, Z) = u t m d = Hu + ε R, H : R R a f f u = u + K( d Hu ) K = P H ( HP H + R) f T f T 1 (Kalman gain matrix) P = E ( u u )( u u ) f f t f t T R E εε T = Properties: Straightforward for linear dynamic equations Extension to nonlinear equations: Extended KF (EKF) Optimal for Gaussian Explicit calculation of covariance can be costly

Ensemble Kalman Filter (EnKF) Ensemble: f f i ( u ) u (, ), = 1,, i tz i M ( d) = d+ (), ε i i = 1,, i a f f ( ) ( ) e i ( ) u = u + K ( ), i = 1,, M i i d H u i f T f T K = P H ( HP Η + R ) e e e e 1 f f P f = f f T f e ( u u )( u u ) P R εε T R e = Properties: nonlinear dynamics sampling errors Measurement. Can be eliminated by square-root filter (EnSRF) Solution states. Computational cost is of great concern

Error Analysis of the EnKF Assimilation step size: Δ T = t t n+ 1 n Lemma (local error): e M ε + ΔK ε + ΔK H ε f n+ 1 Δt Δt p ( Δ, σ ) O t α M = I KH Δ K = Ke K Theorem (global error): n E E + e exp Λ t ( ) n k n k= 1 Λ ΔT 1 ote the inverse dependence on assimilation step size (Li & Xiu, vol. 197, CMAME 8)

EnKF Example: Linear Wave Equation Model description: Linear advection equation; Periodic domain of length L=1; Wave speed = 1; grid spacing=1; time step = 1; True states are sampled from a Gaussian process, with zero mean and unit variance, and a spatial decorrelation length of 2. The dimension of random space is n z =5. Four measurements uniformly in space are made every 5 time units. Measurement variance is.1. o model error. 1 5 1 2 3 4 5 6 7 8 9 1 1 5 1 2 3 4 5 6 7 8 9 1 8 6 4 2 ( xz, ) R t=5 t=1 t=5 1 2 3 4 5 6 7 8 9 1 R 5 Long-term Wave propagation

Error Behavior of EnKF 5 5.5 T=1 T=5.15 T=1 T=5 6.1 log (Error) 6.5 7 Error.5 7.5 8 8.5 4.5 5 5.5 6 6.5 7 7.5 8 log () w.r.t. ensemble size.5.1.15.2.25.3.35.4.45.5 Standard Deviation of Measurements w.r.t. data noise level

EnKF Error Behavior 6.5 7 7.5 EnKF EnSRF qensrf 2 qensrf 3 8 log(error) 8.5 9 9.5 1 1.5 11 11.5.5 1 1.5 2 2.5 3 3.5 4 log(δ T) qensrf: EnSRF combined with deterministic sampling using optimal cubature

GPC Collocation based Kalman Filter Errors of assimilated results T=1 T=5 T=1, T=1,5 EnKF (=1) 5.3 1-3 3.4 1-3 2.3 1-3 1.7 1-3 EnKF (=1 3 ) 1.5 1-3 1. 1-3 7.4 1-4 6.2 1-4 EnKF (=1 4 ) 4.6 1-4 2.8 1-4 2.4 1-4 1.9 1-4 gpc-kf (=51) 3.5 1-4 1.6 1-4 1.1 1-4 9. 1-5 gpc-kf (=1) 1.8 1-4 7.9 1-5 5.6 1-5 4.6 1-5 5 dimensional random space (Li & Xiu, vol. 197, CMAME 8)

Accuracy Improvement of EnKF via gpc Use cubature equally weighted Use pseudo-spectral gpc Analytical expression in Z f f u (, tz ) ˆ u () t ( Z k Φk ), k = Statistics ( ) u = uˆ, P = uˆ uˆ f f f f k k < k T (Li & Xiu, J. Comput. Phys. 8)

GPC Based Ensemble Kalman Filter Ensemble: f f i ( ) k k ( ) u = u ˆ () t Φ Z, i = 1,, M, M 1 i k = Square-root update: f f f ' a a a ( ) ( ) ( ) ( ) u = u + u, u = u + u, i = 1,, M i i i i Mean state update: u a f ( f = u + K d Hu), Perturbation update: a f f ( u) ( u) K H( u) ' ' ' = +, i = 1,, M i i i ' f T f T K = P H ( HP Η + R) 1 T 1 f T f T f T = + + + ( ) ( ) K PH HPH R HPH R R 1

Example: onlinear Population Dynamics du dt f f u f f = r 1 u, u () = u A 2.8 2.7 2.6 True state Measurement Estimate 2.5 2.4 2.3 2.2 2.1 2.2.4.6.8 1 Time

Error Convergence 3 4 5 Mean Standard deviation 2 Mean Standard deviation log(error) 6 7 8 log(error) 4 6 9 8 1 1 11 1 2 3 4 5 6 7 8 9 Degree of the gpc expansion () 12 4 5 6 7 8 9 1 11 12 13 umber of quardrature points (Q) =8, Q=1 is sufficient

Comparison: gpc-kf vs EnKF 2 Mean (gpc EnSRF) Std (gpc EnSRF) Mean (EnSRF) Std (EnSRF) 4 log(error) 6 8 1 12 1 2 3 4 5 6 log(ensemble size)

onlinear System Example: Lorenz Equations dx dt dy dt dz dt = σ ( y x) = ρ x y xz = xy β z σ = 1, ρ = 28, β = 8/ 3 ( x, y, z ) = (1.5887, 1.531271, 25.4691) 2 X 2 2 4 6 8 1 12 14 16 18 2 Time 5 Y 5 2 4 6 8 1 12 14 16 18 2 Time 6 4 Z 2 2 4 6 8 1 12 14 16 18 2 Time Small deviation in initial condition (.1 in x ) causes large deviation

Qualitative Comparison: gpc EnKF vs EnKF 3 2.5 =2 (gpc EnSRF) =3 (gpc EnSRF) EnSRF Estimate True state 2 1.5 1.5 5 1 15 2 Time GPC EnKF: Q=5 3 =125 EnKF: ensemble size = 14

Summary Point selection is crucial for the efficacy of stochastic collocation GPC expansion is much more than a forward UQ method Bayesian inverse (Marzouk & Xiu, Comm. Comput. Phys, 8) Kalman filter for data assimilation (Li & Xiu, CMAME 8; JCP 8)