Parametric Problems, Stochastics, and Identification


Parametric Problems, Stochastics, and Identification
Hermann G. Matthies (a), B. Rosić (a,b), O. Pajonk (a,c), A. Litvinenko (a)
(a) TU Braunschweig, (b) University of Kragujevac, (c) SPT Group, Hamburg
wire@tu-bs.de, http://www.wire.tu-bs.de
$Id: 12_Valencia.tex,v 3.1.1.4 2013/03/05 18:26:52 hgm Exp $

Overview
1. Parameter identification
2. Parametric forward problem
3. Bayesian updating, inverse problems
4. Tensor approximation
5. Bayesian computation
6. Examples

To fix ideas: example problem
(Figure: 2D aquifer geometry.)
Simple stationary 2D model of groundwater flow with stochastic data:
$-\nabla_x \cdot (\kappa(x,\omega)\, \nabla_x u(x,\omega)) = f(x,\omega)$ & b.c., $x \in G \subset \mathbb{R}^d$,
$\kappa(x,\omega)\, \nabla_x u(x,\omega) \cdot n = g(x,\omega)$, $x \in \Gamma \subset \partial G$, $\omega \in \Omega$.
The parameter $q(x,\omega) = \log \kappa(x,\omega)$ is uncertain, i.e. the stochastic conductivity $\kappa$, as well as the sinks and sources $f$ and $g$.

Realisation of $\kappa(x,\omega)$
(Figure: one realisation of the conductivity field.)

Mathematical setup
Consider an operator equation, a physical system modelled by $A$:
$A(u) = f$, $u \in \mathcal{U}$, $f \in \mathcal{F}$, i.e. $\forall v \in \mathcal{U}: \langle A(u), v\rangle = \langle f, v\rangle$,
where $\mathcal{U}$ is the space of states and $\mathcal{F} = \mathcal{U}^*$ the dual space of actions / forcings.
Solution operator: $u = U(f)$, the inverse of $A$.
The operator depends on parameters $q \in \mathcal{Q}$, hence the state $u$ is also a function of $q$:
$A(u; q) = f(q) \;\Rightarrow\; u = U(f; q)$.
Measurement operator $Y$ with values in $\mathcal{Y}$: $y = Y(q; u) = Y(q, U(f; q))$.
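
To make the abstract objects concrete, a minimal Python sketch (the scalar model, the Newton iteration, and the measurement operator are hypothetical illustrations, not the talk's example): it realises the solution operator $u = U(f; q)$ for a toy equation $A(u; q) = f$ and evaluates a measurement $y = Y(q, u)$.

```python
def U(f, q, u0=0.0, tol=1e-12):
    """Solution operator u = U(f; q) for the toy model A(u; q) = q*u + u**3 = f,
    realised by Newton's method (hypothetical model, for illustration only)."""
    u = u0
    for _ in range(100):
        r = q * u + u**3 - f          # residual A(u; q) - f
        du = -r / (q + 3 * u**2)      # Newton correction
        u += du
        if abs(du) < tol:
            break
    return u

def Y(q, u):
    """Hypothetical measurement operator y = Y(q, u)."""
    return u**2

q, f = 2.0, 1.0
u = U(f, q)
print("state u =", u, " measurement y =", Y(q, u))
```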

Forward parametric problem
Parametric elements: operator $A(\cdot\,; q)$, rhs $f(q)$, state $u(q)$, generically $r(q)$.
The goal are representations of $r(q) \in \mathcal{W}$, i.e. of the map $r : \mathcal{Q} \to \mathcal{W}$.
Help comes from an inner product on a subspace $\mathcal{R} \subseteq \mathbb{R}^{\mathcal{Q}}$ of functions on $\mathcal{Q}$; in case $\mathcal{Q}$ is a measure / probability space, $\mathcal{R} = L_2$.
To each parametric element corresponds the linear map $R : \mathcal{W} \ni \hat r \mapsto \langle \hat r, r(\cdot)\rangle_{\mathcal{W}} \in \mathcal{R}$.
The key is the self-adjoint positive map $C = R^* R : \mathcal{W} \to \mathcal{W}$.
Spectral factorisation of $C$ leads to the Karhunen-Loève representation, a tensor product representation; it corresponds to the SVD of $R$ (a.k.a. POD).
Each factorisation $C = B^* B$ leads to a tensor representation (example: smoothed white noise), a one-to-one correspondence between factorisations and representations.
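
On samples, the Karhunen-Loève / POD construction amounts to an SVD; a minimal numpy sketch with a synthetic (hypothetical) sampled parametric response, truncated to a low-rank, separated representation:

```python
import numpy as np

# Column m stands for r(q_m) in R^n, the response at parameter sample q_m
# (synthetic data of exact rank 5, used only for illustration).
rng = np.random.default_rng(0)
n, M = 200, 50
R_samples = rng.standard_normal((n, 5)) @ rng.standard_normal((5, M))

# SVD of the sampled map R gives the Karhunen-Loeve / POD representation
# r(q_m) ~ sum_i s[i] * W[:, i] * Vt[i, m], a tensor-product (separated) form.
W, s, Vt = np.linalg.svd(R_samples, full_matrices=False)

# Truncate to the modes carrying 99% of the "energy" (low-rank approximation).
k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.99)) + 1
R_lowrank = W[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(k, np.linalg.norm(R_samples - R_lowrank) / np.linalg.norm(R_samples))
```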

Setting for the identification process
General idea: we observe / measure a system whose structure we know in principle. The system behaviour depends on some quantities (parameters) which we do not know: uncertainty. We model (the uncertainty in) our knowledge in a Bayesian setting, as a probability distribution on the parameters. We start with what we know a priori, then perform a measurement. This gives new information, used to update our knowledge (identification). The update in a probabilistic setting works with conditional probabilities (Bayes's theorem). Repeated measurements lead to better identification.

Inverse problem
For given $f$, the measurement $y$ is just a function of $q$. This function is usually not invertible: an ill-posed problem, the measurement $y$ does not contain enough information. In the Bayesian framework the state of knowledge is modelled in a probabilistic way: the parameters $q$ are uncertain and assumed to be random. The Bayesian setting allows updating / sharpening of the information about $q$ when a measurement is performed. The problem of updating the distribution, i.e. the state of knowledge of $q$, becomes well-posed. It can be applied successively; each new measurement $y$ and forcing $f$ (which may also be uncertain) will provide new information.

Model with uncertainties
For simplicity assume that $\mathcal{Q}$ is a Hilbert space and that $q(\omega)$ has finite variance: $q \in \mathcal{Q} \otimes \mathcal{S}$ with $\mathcal{S} := L_2(\Omega)$, so that $q \in L_2(\Omega, \mathcal{Q}) = \mathcal{Q} \otimes L_2(\Omega) = \mathcal{Q} \otimes \mathcal{S} =: \mathsf{Q}$.
The system model is now $A(u(\omega); q(\omega)) = f(\omega)$ a.s. in $\omega \in \Omega$; the state $u = u(\omega)$ becomes a $\mathcal{U}$-valued random variable (RV), an element of the tensor space $\mathsf{U} = \mathcal{U} \otimes \mathcal{S}$.
As a variational statement: $\forall v \in \mathsf{U}: \ \mathbb{E}\big(\langle A(u(\cdot); q(\cdot)), v\rangle\big) = \mathbb{E}\big(\langle f(\cdot), v\rangle\big) =: \langle\!\langle f, v\rangle\!\rangle$.
This leads to a well-posed stochastic PDE (SPDE).

Representation of randomness
Parameters $q$ are modelled as $\mathcal{Q}$-valued (a vector space) RVs on some probability space $(\Omega, \mathfrak{A}, \mathbb{P})$, with expectation operator $\mathbb{E}(q) = \bar q$.
The RVs $q : \Omega \to \mathcal{Q}$ (and $u(q)$) may be represented in the following ways:
Samples: the best known representation, i.e. $q(\omega_1), \ldots, q(\omega_N), \ldots$
Distribution of $q$: the push-forward measure $q_*\mathbb{P}$ on $\mathcal{Q}$.
Moments of $q$, like $\mathbb{E}(q \otimes \cdots \otimes q)$ (mean, covariance, ...).
Functional / spectral: functions of other (known) RVs, like Wiener's polynomial chaos, i.e. $q(\omega) = q(\theta_1(\omega), \ldots, \theta_M(\omega), \ldots) =: q(\theta)$.
Sampling and functional representation work with vectors, which allows linear algebra in the computation.

Computational approaches
The representation determines the algorithms:
Distributions: Kolmogorov / Fokker-Planck equations. Needs new software; the deterministic solver $u = S(f, q)$ is not used.
Moments: new (sometimes difficult) equations. Needs new software; the deterministic solver is mostly not used.
Sampling: domain of direct integration methods, (quasi) Monte Carlo, sparse (Smolyak) grids, etc. Obviously non-intrusive; software interface: solve.
Functional / spectral: (1) Interpolation / collocation, based on samples of the solution, non-intrusive, solve interface. (2) Galerkin, at first sight intrusive, but with quadrature it is also non-intrusive; preconditioned residual interface. Allows greedy rank-1 updates.
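
A minimal sketch of the non-intrusive sampling approach through a solve interface; the deterministic solver and the parameter distribution are hypothetical stand-ins:

```python
import numpy as np

def solve(f, q):
    """Stand-in for the deterministic solver u = S(f, q); hypothetical model."""
    return f / (1.0 + np.exp(q))

rng = np.random.default_rng(1)
N = 10_000
q_samples = rng.normal(0.0, 0.5, size=N)                   # samples of the uncertain parameter
u_samples = np.array([solve(1.0, q) for q in q_samples])   # one solver call per sample

print("mean(u) =", u_samples.mean(), " var(u) =", u_samples.var())
```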

Conditional probability and expectation
With the state $u \in \mathsf{U} = \mathcal{U} \otimes \mathcal{S}$ a RV, the quantity to be measured $y(\omega) = Y(q(\omega), u(\omega)) \in \mathsf{Y} := \mathcal{Y} \otimes \mathcal{S}$ is also uncertain, a random variable.
A new measurement $z$ is performed, composed from the true value $y \in \mathcal{Y}$ and a random error $\epsilon$: $z(\omega) = y + \epsilon(\omega) \in \mathsf{Y}$.
Classically, Bayes's theorem gives the conditional probability
$\mathbb{P}(I_q \mid M_z) = \dfrac{\mathbb{P}(M_z \mid I_q)}{\mathbb{P}(M_z)}\,\mathbb{P}(I_q)$;
the expectation with this posterior measure is the conditional expectation.
Kolmogorov starts from the conditional expectation $\mathbb{E}(\cdot \mid M_z)$, and from this obtains the conditional probability via $\mathbb{P}(I_q \mid M_z) = \mathbb{E}(\chi_{I_q} \mid M_z)$.

Update
The conditional expectation is defined as the orthogonal projection onto the closed subspace $L_2(\Omega, \mathbb{P}, \sigma(z))$:
$\mathbb{E}(q \mid \sigma(z)) := P_{\mathsf{Q}_\infty} q = \operatorname{argmin}_{\tilde q \in L_2(\Omega, \mathbb{P}, \sigma(z))} \|q - \tilde q\|^2_{L_2}$.
The subspace $\mathsf{Q}_\infty := L_2(\Omega, \mathbb{P}, \sigma(z))$ represents the available information; the estimate minimises $\Phi(\cdot) := \|q - (\cdot)\|^2$ over $\mathsf{Q}_\infty$. More general loss functions than the mean square error are possible.
The update, also called the assimilated value, $q_a(\omega) := P_{\mathsf{Q}_\infty} q = \mathbb{E}(q \mid \sigma(z))$, is a $\mathcal{Q}$-valued RV and represents the new state of knowledge after the measurement.
Reduction of variance (Pythagoras): $\|q\|^2_{L_2} = \|q - q_a\|^2_{L_2} + \|q_a\|^2_{L_2}$.
Doob-Dynkin: $\mathsf{Q}_\infty = \{\varphi \in \mathsf{Q} : \varphi = \phi \circ Y,\ \phi \text{ measurable}\}$.
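
The conditional expectation as an $L_2$ projection can be illustrated numerically: projecting onto polynomials of the measurement is a least-squares fit over samples, and the Pythagoras identity can be checked empirically. The scalar model and noise level below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20_000
q = rng.normal(2.0, 1.0, N)                 # prior samples of the parameter
y = q**3                                    # hypothetical measurement operator Y(q)
z = y + rng.normal(0.0, 0.5, N)             # noisy measurement RV

# Approximate E(q | sigma(z)) in the subspace Q_n of polynomials in z (n = 3);
# the least-squares fit is exactly the empirical L2-orthogonal projection.
n = 3
V = np.vander(z, n + 1, increasing=True)    # columns 1, z, z^2, z^3
H, *_ = np.linalg.lstsq(V, q, rcond=None)
q_a = V @ H                                 # q_a(z) ~ E(q | z)

# Pythagoras / variance reduction: <q^2> = <(q - q_a)^2> + <q_a^2>
print(np.mean(q**2), np.mean((q - q_a)**2) + np.mean(q_a**2))
```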

Important points I
The probability measure $\mathbb{P}$ is not the object of desire; that is the distribution of $q$, a measure on $\mathcal{Q}$, the push-forward of $\mathbb{P}$: $q_*\mathbb{P}(E) := \mathbb{P}(q^{-1}(E))$ for measurable $E \subseteq \mathcal{Q}$.
Bayes's original formula changes $\mathbb{P}$ and leaves $q$ as is. Kolmogorov's conditional expectation changes $q$ and leaves $\mathbb{P}$ as is. In both cases the update is a new $q_*\mathbb{P}$.
$\mathbb{P}$ (a probability measure) lives on the positive part of the unit sphere, whereas $q$ is free in a vector space. This will allow the use of (multi-)linear algebra and tensor approximations.

Important points II
Identification process:
Use the forward problem $A(u(\omega); q(\omega)) = f(\omega)$ to forecast the new state $u_f(\omega)$ and the measurement $y_f(\omega) = Y(q(\omega), u_f(\omega))$.
Perform the minimisation of the loss function to obtain the update map / filter.
Use the innovation in the inverse problem to update the forecast $q_f$, obtaining the assimilated (updated) $q_a$ with the update map.
All operations are in vector spaces; use tensor approximations throughout.

Approximation
Minimisation is equivalent to orthogonality: find $\varphi \in \mathcal{L}_0(\mathcal{Y}, \mathcal{Q})$ such that
$\forall p \in \mathsf{Q}: \ \langle D_{q_a}\Phi(q_a(\varphi)), p\rangle_{L_2} = \langle q - q_a, p\rangle_{L_2} = 0$.
Approximation of $\mathsf{Q}_\infty$: take $\mathsf{Q}_n \subset \mathsf{Q}_\infty$,
$\mathsf{Q}_n := \{\varphi \in \mathsf{Q} : \varphi = \psi_n \circ Y,\ \psi_n \text{ an } n\text{-th degree polynomial}\}$,
i.e. $\varphi = {}^0H + {}^1H\, Y + \cdots + {}^kH\, Y^k + \cdots + {}^nH\, Y^n$, where ${}^kH \in \mathcal{L}^k_s(\mathcal{Y}, \mathcal{Q})$ is symmetric and $k$-linear.
With $q_a(\varphi) = q_a\big(({}^0H, \ldots, {}^kH, \ldots, {}^nH)\big) = \sum_{k=0}^n {}^kH\, z^k = P_{\mathsf{Q}_n} q$,
orthogonality implies for $l = 0, \ldots, n$: $\ D_{{}^lH}\,\Phi\big(q_a({}^0H, \ldots, {}^kH, \ldots, {}^nH)\big) = 0$.

Determining the n-th degree Bayesian update
With the abbreviations $\langle p \otimes v^{\otimes k}\rangle := \mathbb{E}(p \otimes v^{\otimes k}) = \int_\Omega p(\omega) \otimes v(\omega)^{\otimes k}\, \mathbb{P}(d\omega)$ and
$\langle {}^kH\, z^{\otimes(l+k)}\rangle := \langle z^{\otimes l} \otimes ({}^kH\, z^{\otimes k})\rangle = \mathbb{E}\big(z^{\otimes l} \otimes ({}^kH\, z^{\otimes k})\big)$,
we have for the unknowns $({}^0H, \ldots, {}^kH, \ldots, {}^nH)$:
$l = 0:\quad {}^0H + \cdots + {}^kH\langle z^{\otimes k}\rangle + \cdots + {}^nH\langle z^{\otimes n}\rangle = \langle q\rangle$,
$l = 1:\quad {}^0H\langle z\rangle + \cdots + {}^kH\langle z^{\otimes(1+k)}\rangle + \cdots + {}^nH\langle z^{\otimes(1+n)}\rangle = \langle q \otimes z\rangle$,
$\ \vdots$
$l = n:\quad {}^0H\langle z^{\otimes n}\rangle + \cdots + {}^kH\langle z^{\otimes(n+k)}\rangle + \cdots + {}^nH\langle z^{\otimes 2n}\rangle = \langle q \otimes z^{\otimes n}\rangle$,
a linear system with a symmetric positive definite Hankel operator matrix $(\langle z^{\otimes(l+k)}\rangle)_{l,k}$.
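
For a scalar parameter and a scalar measurement the system above reduces to an ordinary Hankel moment matrix; a minimal sketch in which the exact moments are replaced by empirical ones over samples (the distributions and measurement model are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 50_000, 3
q = rng.gamma(2.0, 1.0, N)                       # hypothetical scalar parameter
z = np.sqrt(q) + 0.1 * rng.standard_normal(N)    # hypothetical scalar measurement

# Hankel moment matrix M[l, k] = <z^(l+k)> and right-hand side b[l] = <q z^l>.
M = np.array([[np.mean(z**(l + k)) for k in range(n + 1)] for l in range(n + 1)])
b = np.array([np.mean(q * z**l) for l in range(n + 1)])

H = np.linalg.solve(M, b)                        # coefficients 0H, ..., nH
q_a = sum(H[k] * z**k for k in range(n + 1))     # n-th degree update map applied to z
print("H =", H)
```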

Bayesian update in components
For short, $l = 0, \ldots, n$: $\quad \sum_{k=0}^n {}^kH\,\langle z^{\otimes(l+k)}\rangle = \langle q \otimes z^{\otimes l}\rangle$.
For finite dimensional spaces, or after discretisation, in components (or à la Penrose in symbolic index notation): let $q = (q^m)$, $z = (z^j)$, and ${}^kH = ({}^kH^m_{j_1 \ldots j_k})$; then for $l = 0, \ldots, n$:
$\langle z_{j_1} \cdots z_{j_l}\rangle\,({}^0H^m) + \cdots + \langle z_{j_1} \cdots z_{j_l}\, z_{j_{l+1}} \cdots z_{j_{l+k}}\rangle\,({}^kH^m_{j_{l+1} \ldots j_{l+k}}) + \cdots + \langle z_{j_1} \cdots z_{j_l}\, z_{j_{l+1}} \cdots z_{j_{l+n}}\rangle\,({}^nH^m_{j_{l+1} \ldots j_{l+n}}) = \langle q^m\, z_{j_1} \cdots z_{j_l}\rangle$.
(Einstein summation convention used.) The matrix does not depend on $m$: it is identically block diagonal.

Special cases
For $n = 0$ (constant functions): $q_a = {}^0H = \langle q\rangle$ $(= \mathbb{E}(q))$.
For $n = 1$ the approximation is with affine functions:
${}^0H + {}^1H\langle z\rangle = \langle q\rangle$
${}^0H\langle z\rangle + {}^1H\langle z \otimes z\rangle = \langle q \otimes z\rangle$
$\Rightarrow$ (remember that $[\operatorname{cov}_{qz}] = \langle q \otimes z\rangle - \langle q\rangle \otimes \langle z\rangle$)
${}^1H\big(\langle z \otimes z\rangle - \langle z\rangle \otimes \langle z\rangle\big) = \langle q \otimes z\rangle - \langle q\rangle \otimes \langle z\rangle$
${}^1H = [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1}$ (Kalman's solution),
${}^0H = \langle q\rangle - [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1}\langle z\rangle$,
and finally $q_a = {}^0H + {}^1H z = \langle q\rangle + [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1}(z - \langle z\rangle)$.

Case with prior information
Here we have prior information $\mathsf{Q}_f$, a prior estimate $q_f(\omega)$ (the forecast), and measurements $z$ generating via $Y$ a subspace $\mathsf{Q}_y \subset \mathsf{Q}$.
We now need the projection onto $\mathsf{Q}_a = \mathsf{Q}_f + \mathsf{Q}_y$, with reformulation as an orthogonal direct sum:
$\mathsf{Q}_a = \mathsf{Q}_f + \mathsf{Q}_y = \mathsf{Q}_f \oplus (\mathsf{Q}_y \ominus \mathsf{Q}_f) = \mathsf{Q}_f \oplus \tilde{\mathsf{Q}}$.
The update / conditional expectation / assimilated value is the orthogonal projection $q_a = q_f + P_{\tilde{\mathsf{Q}}}\, q = q_f + \tilde q$, where $\tilde q$ is the innovation.
Compute $q_a$ by approximating: $\mathsf{Q}_n \subset \tilde{\mathsf{Q}}$. We now take $n = 1$.

Simplification
The case $n = 1$, linear functions, projecting onto $\mathsf{Q}_1$, is well known: it is the linear minimum variance estimate $\hat q_a$.
Theorem (generalisation of Gauss-Markov): $\hat q_a(\omega) = q_f(\omega) + {}^1H\,(z(\omega) - y_f(\omega))$,
where the linear Kalman gain operator ${}^1H : \mathcal{Y} \to \mathcal{Q}$ is
${}^1H := [\operatorname{cov}_{qz}][\operatorname{cov}_{zz}]^{-1} = [\operatorname{cov}_{qy}][\operatorname{cov}_{yy} + \operatorname{cov}_{\epsilon\epsilon}]^{-1}$.
(The normal Kalman filter is a special case.)
Or in the tensor space $\mathsf{Q} = \mathcal{Q} \otimes \mathcal{S}$: $\ \hat q_a = q_f + ({}^1H \otimes I)(z - y_f)$.
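
A sample-based sketch of this linear (Gauss-Markov / Kalman) update in the spirit of an ensemble implementation; the forecast ensemble, measurement model, observed value, and noise level are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5_000
q_f = rng.normal(1.0, 0.5, (N, 1))                      # forecast ensemble of q
y_f = 2.0 * q_f + 0.1 * rng.standard_normal((N, 1))     # forecast measurement Y(q, u_f)
eps = 0.2 * rng.standard_normal((N, 1))                 # measurement-error samples
z_obs = 2.6                                             # hypothetical observed value

# Kalman gain  1H = cov_qy [cov_yy + cov_eps]^{-1}  estimated from the samples.
C = np.cov(np.hstack([q_f, y_f]).T)
K = C[0, 1] / (C[1, 1] + np.var(eps))

# Assimilated ensemble, RV-wise: q_a = q_f + 1H (z - y_f), with z = z_obs + eps.
q_a = q_f + K * (z_obs + eps - y_f)
print(q_f.mean(), q_a.mean(), q_f.var(), q_a.var())
```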

Deterministic model, discretisation, solution
Remember the operator equation: $A(u) = f$, $u \in \mathcal{U}$, $f \in \mathcal{F}$.
Solution is usually by first discretising, $A(u) = f$, $u \in \mathcal{U}_N \subset \mathcal{U}$, $f \in \mathcal{F}_N = \mathcal{U}_N^* \subset \mathcal{F}$,
and then an (iterative) numerical solution process $u_{k+1} = S(u_k)$, $\lim_{k\to\infty} u_k = u$.
$S$ evaluates (preconditioned) residua $f - A(u_k)$.
Similarly for the model with uncertainty, $A(u(\omega); q(\omega)) = f(\omega)$: assume $\{v_j\}_{j=1}^N$ is a basis in $\mathcal{U}_N$; then the approximate solution in $\mathcal{U}_N \otimes \mathcal{S}$ is $u(\omega) = \sum_{j=1}^N u_j(\omega)\, v_j$.

Discretisation by functional approximation
Choose a subspace $\mathcal{S}_B \subset \mathcal{S}$ with basis $\{X_\beta\}_{\beta=1}^B$ and make the ansatz $u_j(\omega) \approx \sum_\beta u_j^\beta X_\beta(\omega)$ for each $u_j(\omega)$, giving
$u(\omega) = \sum_{j,\beta} u_j^\beta X_\beta(\omega)\, v_j = \sum_{j,\beta} u_j^\beta\, X_\beta(\omega) \otimes v_j$.
The solution is in the tensor product $\mathcal{U}_{N,B} := \mathcal{U}_N \otimes \mathcal{S}_B \subset \mathcal{U} \otimes \mathcal{S} = \mathsf{U}$.
The state $u(\omega)$ is represented by the tensor $\mathbf{u} := \mathbf{u}_N^B := \{u_j^\beta\}_{j=1,\ldots,N}^{\beta=1,\ldots,B}$ ($\beta$ is usually a multi-index), and similarly for all other quantities.
The fully discrete forward model is obtained by weighting the residual with $\Xi_\alpha$, with the ansatz inserted:
$\forall \alpha: \quad \Big\langle \Xi_\alpha(\omega),\ f(\omega) - A\Big(\textstyle\sum_{j,\beta} u_j^\beta X_\beta(\omega) v_j;\ q(\omega)\Big)\Big\rangle_{\mathcal{S}} = 0$.
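
A small sketch of evaluating such a tensor of coefficients $u_j^\beta$ at a sample of the stochastic variable, assuming (purely for illustration) a one-dimensional probabilists' Hermite basis for the $X_\beta$ and synthetic coefficients:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

# Fully discrete state as a coefficient tensor u[j, beta]:
# u(omega) = sum_{j, beta} u[j, beta] * X_beta(theta(omega)) * v_j.
rng = np.random.default_rng(6)
N_space, B = 50, 6
u = rng.standard_normal((N_space, B)) * (0.5 ** np.arange(B))   # synthetic coefficients

def evaluate(u, theta):
    """Contract the stochastic index: He_beta(theta) for beta = 0..B-1, then u @ X."""
    X = np.array([hermeval(theta, np.eye(B)[b]) for b in range(B)])
    return u @ X                      # nodal values of u for this sample of theta

print(evaluate(u, theta=0.3)[:5])
```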

Stochastic forward problem
This is generally a coupled system of equations for $\mathbf{u} = \{u_j^\beta\}$: $\ \mathbf{A}(\mathbf{u}; \mathbf{q}) = \mathbf{f}$, $\ \mathbf{y} = \mathbf{Y}(\mathbf{q}; \mathbf{u})$.
If $\Xi_\alpha(\cdot) = \delta(\cdot - \omega_\alpha)$, the system decouples: collocation / interpolation; one may use the original solver $S$ for each $\omega_\alpha$ (obviously non-intrusive).
If $\Xi_\alpha(\cdot) = X_\alpha(\cdot)$, one obtains Bubnov-Galerkin conditions; with numerical integration this also uses the original solver $S$ and is also non-intrusive.
In a greedy rank-one update tensor solver one uses Bubnov-Galerkin conditions (proper generalised decomposition (PGD) / successive rank-1 update (SR1U) / alternating least squares (ALS)); this is also possible by non-intrusive use of the original $S$.
For the update, $\mathbf{H}_1 = {}^1H \otimes I$ is computed analytically ($X_\beta$ = Hermite basis):
$[\operatorname{cov}_{qy}] = \sum_{\alpha > 0} \alpha!\; \mathbf{q}^\alpha (\mathbf{y}^\alpha)^T; \qquad [\operatorname{cov}_{yy}] = \sum_{\alpha > 0} \alpha!\; \mathbf{y}^\alpha (\mathbf{y}^\alpha)^T$.
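
In the scalar case with a one-dimensional Hermite basis, the analytic covariance formulas above reduce to weighted sums over the PCE coefficients (for multi-indices $\alpha$, $\alpha!$ is the product of the factorials of its entries); a minimal sketch with synthetic coefficients:

```python
import numpy as np
from math import factorial

# Hypothetical PCE coefficients of q and y in a 1D probabilists' Hermite basis;
# index alpha = 0, 1, 2, ..., with alpha = 0 the mean term.
q_c = np.array([1.0, 0.5, 0.1, 0.02])
y_c = np.array([2.0, 1.1, 0.3, 0.05])

# cov(q, y) = sum_{alpha > 0} alpha! * q_alpha * y_alpha, since E[H_alpha^2] = alpha!.
cov_qy = sum(factorial(a) * q_c[a] * y_c[a] for a in range(1, len(q_c)))
cov_yy = sum(factorial(a) * y_c[a]**2 for a in range(1, len(y_c)))

# Gain without measurement error; a cov_eps term would be added to cov_yy if present.
K = cov_qy / cov_yy
print(cov_qy, cov_yy, K)
```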

Important points III
The update is formulated in vector spaces; this makes a tensor representation possible.
Parametric problems lead to tensor (or separated) representations.
Sparse approximation by low-rank representation: possible for the forward problem (progressively or iteratively), and possible for the inverse problem.
The low-rank approximation can be kept throughout the update.

Example 1: identification of a multi-modal distribution
Setup: scalar RV $x$ with a non-Gaussian, multi-modal truth $p(x)$; Gaussian prior; Gaussian measurement errors.
Aim: identification of $p(x)$. 10 updates of $N = 10, 100, 1000$ measurements.

Example 2: Lorenz-84 chaotic model
Setup: non-linear, chaotic system $\dot u = f(u)$, $u = [x, y, z]$. Small uncertainties in the initial conditions $u_0$ have a large impact.
Aim: sequentially identify the state $u_t$.
Methods: PCE representation with PCE updating, and sampling representation with Ensemble Kalman Filter (EnKF) updating.
(Figure: Poincaré cut for $x = 1$.)
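
A minimal sketch of the ensemble forecast step for the Lorenz-84 model, assuming the standard parameter values $a = 0.25$, $b = 4$, $F = 8$, $G = 1$; the initial uncertainty, ensemble size, and integrator settings are illustrative, not the talk's configuration:

```python
import numpy as np

def lorenz84(u, a=0.25, b=4.0, F=8.0, G=1.0):
    """Right-hand side of the Lorenz-84 model u' = f(u), u = [x, y, z]."""
    x, y, z = u
    return np.array([-y**2 - z**2 - a * x + a * F,
                     x * y - b * x * z - y + G,
                     b * x * y + x * z - z])

def rk4_step(u, dt):
    """One classical Runge-Kutta step."""
    k1 = lorenz84(u)
    k2 = lorenz84(u + 0.5 * dt * k1)
    k3 = lorenz84(u + 0.5 * dt * k2)
    k4 = lorenz84(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Small uncertainty in u_0 grows quickly: propagate an ensemble to t = 10.
rng = np.random.default_rng(5)
ensemble = np.array([1.0, 0.0, 0.0]) + 0.01 * rng.standard_normal((100, 3))
dt = 0.01
for _ in range(1000):
    ensemble = np.array([rk4_step(u, dt) for u in ensemble])
print(ensemble.mean(axis=0), ensemble.std(axis=0))
```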

Example 2: Lorenz-84 PCE representation
PCE: variance reduction and shift of the mean at the update points. The skewed structure is clearly visible and is preserved by the updates.

Example 2: Lorenz-84 non-Gaussian identification
(Figure: PCE vs. EnKF updates; legend: truth, measurement, prior, posterior.)

Example 3: diffusion, schematic representation
(Diagram: forward problem, from $f$ and the model $A(q, u)$ to the state $u = S(q, f)$; inverse problem, from the measurement $z = Y(q, u)$ back to the parameter $q$.)

Measurement patches
(Figure: measurement patch configurations with 447, 239, 120, and 10 patches.)

Convergence plot of updates
(Figure: relative error $\varepsilon_a$ versus the number of sequential updates (0 to 4), for 447, 239, 120, 60, and 10 measurement patches.)

Forecast and assimilated PDFs
(Figure: probability densities of the forecast $\kappa_f$ and the assimilated $\kappa_a$ over $\kappa$.)

Spatial error distribution
(Figure: panels a) and b) show the relative error $\varepsilon_a$ [%], panel c) shows $I$ [%], plotted over the domain.)

Example 4: plate with hole
Forward problem: comparison of the mean values of the total displacement for the deterministic, initial, and stochastic configurations.

Relative variance of the shear modulus estimate
Relative RMSE of the variance [%] after the 4th update, with measurements in 10% equally distributed measurement points.

Probability density of the shear modulus
Comparison of prior and posterior distributions.

Conclusion
Parametric problems lead to tensor representations.
Inverse problems are treated via Bayes's theorem; the Bayesian update is a projection.
For efficiency, try to use sparse representations throughout; an ansatz in low-rank tensor products saves storage as well as computation.
The Bayesian update is compatible with a low-rank representation.