Scalable algorithms for optimal experimental design for infinite-dimensional nonlinear Bayesian inverse problems

1 Scalable algorithms for optimal experimental design for infinite-dimensional nonlinear Bayesian inverse problems
Alen Alexanderian (Math/NC State), Omar Ghattas (ICES/UT-Austin), Noémi Petra (Applied Math/UC Merced), Georg Stadler (Courant/NYU)
Data Assimilation and Inverse Problems, Mathematics Institute, University of Warwick, Coventry, United Kingdom
February 23, 2016

2 The conceptual optimal experimental design problem
Data: $d \in \mathbb{R}^q$
Model parameter field: $m = \{m(x)\}_{x \in D}$
Bayesian inverse problem: use the data $d$ to infer the model parameter field $m$ in the Bayesian sense, i.e., update the prior state of knowledge on $m$.
Optimal experimental design: how to design the observation system for $d$ to optimally infer $m$?

3 Some relevant references on OED for PDEs
- E. Haber, L. Horesh, L. Tenorio, Numerical methods for experimental design of large-scale linear ill-posed inverse problems, Inverse Problems, 24.
- E. Haber, L. Horesh, L. Tenorio, Numerical methods for the design of large-scale nonlinear discrete ill-posed inverse problems, Inverse Problems, 26.
- E. Haber, Z. Magnant, C. Lucero, L. Tenorio, Numerical methods for A-optimal designs with a sparsity constraint for ill-posed inverse problems, Computational Optimization and Applications, 1-22.
- Q. Long, M. Scavino, R. Tempone, S. Wang, Fast estimation of expected information gains for Bayesian experimental designs based on Laplace approximations, Computer Methods in Applied Mechanics and Engineering, 259:24-39.
- Q. Long, M. Scavino, R. Tempone, S. Wang, A Laplace method for under-determined Bayesian optimal experimental designs, Computer Methods in Applied Mechanics and Engineering, 285.
- F. Bisetti, D. Kim, O. Knio, Q. Long, R. Tempone, Optimal Bayesian experimental design for priors of compact support with application to shock-tube experiments for combustion kinetics, International Journal for Numerical Methods in Engineering.
- X. Huan, Y. M. Marzouk, Simulation-based optimal Bayesian experimental design for nonlinear systems, Journal of Computational Physics, 232.
- X. Huan, Y. M. Marzouk, Gradient-based stochastic optimization methods in Bayesian experimental design, International Journal for Uncertainty Quantification, 4(6).

4 How we define an optimal design
Given locations $x_i$ of possible sensor sites, choose optimal weights $w_i$ for a subset (using sparsity constraints):
$$ \text{design} := \{ x_1, \dots, x_{n_s};\; w_1, \dots, w_{n_s} \} $$
[Figure: ground truth + sensor locations; MAP solution]
Obtain data, solve the Bayesian inverse problem: data + likelihood + prior => posterior.
We seek an A-optimal design: minimize the average posterior variance.

5 Bayesian inverse problem in Hilbert space (A. Stuart, 2010)
$D$: bounded domain; $H = L^2(D)$; $m \in H$: parameter
Parameter-to-observable map: $f : H \to \mathbb{R}^q$
Additive Gaussian noise: $d = f(m) + \eta$, $\eta \sim \mathcal{N}(0, \Gamma_{\text{noise}})$
Likelihood: $\pi_{\text{like}}(d \mid m) \propto \exp\big\{ -\tfrac{1}{2} (f(m) - d)^T \Gamma_{\text{noise}}^{-1} (f(m) - d) \big\}$
Gaussian prior measure: $\mu_{\text{prior}} = \mathcal{N}(m_{\text{pr}}, \mathcal{C}_{\text{prior}})$
Bayes' theorem (in infinite dimensions): $\dfrac{d\mu_{\text{post}}}{d\mu_{\text{prior}}} \propto \pi_{\text{like}}(d \mid m)$
A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica, 2010.

6 The experimental design and a weighted inference problem
$$ \text{design} := \{ x_1, \dots, x_{n_s};\; w_1, \dots, w_{n_s} \} $$
Want $w_i \in \{0, 1\}$: relax to $0 \le w_i \le 1$, then threshold back to $w_i \in \{0, 1\}$.
$w$-weighted data likelihood:
$$ \pi_{\text{like}}(d \mid m; w) \propto \exp\Big\{ -\tfrac{1}{2} (f(m) - d)^T W_\sigma (f(m) - d) \Big\} $$
$f$: parameter-to-observable map, $W := \operatorname{diag}(w_i)$, $W_\sigma := \frac{1}{\sigma^2_{\text{noise}}} W$.
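
To make the role of the weights concrete, here is a minimal Python sketch (with hypothetical names) of the misfit term of the $w$-weighted likelihood. It assumes uncorrelated noise with a single variance $\sigma^2_{\text{noise}}$, as in the definition of $W_\sigma$ above.

```python
import numpy as np

def weighted_misfit(d, f_of_m, w, sigma_noise):
    """Misfit term of the w-weighted likelihood:
    0.5 * (f(m) - d)^T W_sigma (f(m) - d),  with W_sigma = diag(w) / sigma_noise^2."""
    r = f_of_m - d                         # residual at all candidate sensors
    return 0.5 * np.sum(w * r**2) / sigma_noise**2

# Toy usage: 5 candidate sensors, only sensors 0 and 3 active.
d = np.array([1.0, 0.2, -0.5, 0.7, 0.1])        # observed data
f_of_m = np.array([0.9, 0.3, -0.4, 0.8, 0.0])   # model predictions f(m)
w = np.array([1.0, 0.0, 0.0, 1.0, 0.0])         # design weights
print(weighted_misfit(d, f_of_m, w, sigma_noise=0.1))
```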

7 A-optimal design in Hilbert space
Covariance function: $c_{\text{post}}(x, y) = \operatorname{Cov}\{ m(x), m(y) \}$
Average posterior variance: $\dfrac{1}{|D|} \displaystyle\int_D c_{\text{post}}(x, x)\, dx$
Covariance operator: $[\mathcal{C}_{\text{post}} u](x) = \displaystyle\int_D c_{\text{post}}(x, y)\, u(y)\, dy$
Mercer's theorem: $\displaystyle\int_D c_{\text{post}}(x, x)\, dx = \operatorname{tr}(\mathcal{C}_{\text{post}})$
A-optimal design criterion: choose a sensor configuration to minimize $\operatorname{tr}(\mathcal{C}_{\text{post}})$.

8 Relationship to some other OED criteria
For Gaussian prior and noise, and a linear parameter-to-observable map:
Maximizing the expected information gain is equivalent to D-optimal design:
$$ \mathbb{E}_{\mu_{\text{prior}}}\big\{ \mathbb{E}_{d \mid m} \{ D_{\mathrm{KL}}(\mu_{\text{post}}, \mu_{\text{prior}}) \} \big\} = -\tfrac{1}{2} \log\det \Gamma_{\text{post}} + \tfrac{1}{2} \log\det \Gamma_{\text{prior}} $$
Minimizing the Bayes risk of the posterior mean is equivalent to A-optimal design:
$$ \mathbb{E}_{\mu_{\text{prior}}}\big\{ \mathbb{E}_{d \mid m} \{ \| m - m_{\text{post}}(d) \|^2 \} \big\} = \int\!\!\int \| m - m_{\text{post}}(d) \|^2\, \pi_{\text{like}}(d \mid m)\, dd\, \mu_{\text{prior}}(dm) = \operatorname{tr}(\Gamma_{\text{post}}) $$
For infinite dimensions, see: A. Alexanderian, P. Gloor, and O. Ghattas, On Bayesian A- and D-optimal experimental designs in infinite dimensions, Bayesian Analysis, to appear.
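
For the finite-dimensional linear-Gaussian case, a one-line sketch of why the Bayes risk of the posterior mean equals $\operatorname{tr}(\Gamma_{\text{post}})$:

```latex
% Sketch: in the linear-Gaussian case, conditionally on d the error
% m - m_post(d) is distributed as N(0, Gamma_post), independently of d, so
\mathbb{E}_{\mu_{\mathrm{prior}}}\,\mathbb{E}_{d\mid m}\,\|m - m_{\mathrm{post}}(d)\|^2
  = \mathbb{E}_{d}\,\mathbb{E}_{m\mid d}\,\|m - m_{\mathrm{post}}(d)\|^2
  = \mathbb{E}_{d}\,\operatorname{tr}(\Gamma_{\mathrm{post}})
  = \operatorname{tr}(\Gamma_{\mathrm{post}}).
```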

9 Challenges
- The infinite-dimensional Bayesian inverse problem is merely an inner problem within OED.
- We need the trace of the posterior covariance operator, which would be intractable in the large-scale setting.
- We approximate the posterior covariance by the inverse of the Hessian at the maximum a posteriori (MAP) point, but explicit construction of the Hessian is still very expensive.
- For a nonlinear inverse problem, the Hessian depends on the data, which are not available a priori.
- For efficient optimization, we need derivatives of the trace of the inverse Hessian with respect to the sensor weights.
- We seek scalable algorithms, i.e., those whose cost (in terms of forward PDE solves) is independent of the data dimension and the parameter dimension.

10 A-optimal design problem: Linear case
Linear parameter-to-observable map: $f(m) := Fm$
Gaussian prior measure: $\mu_{\text{prior}} = \mathcal{N}(m_{\text{prior}}, \mathcal{C}_{\text{prior}})$
Gaussian prior and noise $\Rightarrow$ Gaussian posterior:
$$ m_{\text{post}} = \arg\min_m\; \underbrace{\tfrac{1}{2} \big\| W^{1/2} (Fm - d) \big\|^2_{\Gamma_{\text{noise}}^{-1}} + \tfrac{1}{2} \big\langle \mathcal{C}_{\text{prior}}^{-1} (m - m_{\text{prior}}),\, m - m_{\text{prior}} \big\rangle}_{\mathcal{J}(m)\, :=\, \text{negative log-posterior}} $$
OED problem: $\mathcal{C}_{\text{post}}(w) = \mathcal{H}^{-1}(w)$, with $\mathcal{H}(w) = F^* W_\sigma F + \mathcal{C}_{\text{prior}}^{-1}$:
$$ \min_w\; \operatorname{tr}\Big[ \big( F^* W_\sigma F + \mathcal{C}_{\text{prior}}^{-1} \big)^{-1} \Big] + \text{penalty} $$
A. Alexanderian, N. Petra, G. Stadler, O. Ghattas, A-optimal design of experiments for infinite-dimensional Bayesian linear inverse problems with regularized $\ell_0$-sparsification, SIAM Journal on Scientific Computing, 36(5):A2122-A2148.
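
As a sanity check of the linear-case criterion, the sketch below (Python, hypothetical names) evaluates $\operatorname{tr}\big[(F^T W_\sigma F + C_{\text{prior}}^{-1})^{-1}\big]$ for a small dense problem. Explicit inverses are of course only feasible at toy scale; the large-scale setting is exactly what the rest of the talk addresses.

```python
import numpy as np

def a_optimal_objective(F, C_prior, w, sigma_noise):
    """tr[(F^T W_sigma F + C_prior^{-1})^{-1}] for a small, dense linear problem.
    F: (q x n) parameter-to-observable map, C_prior: (n x n) prior covariance,
    w: length-q design weights."""
    W_sigma = np.diag(w) / sigma_noise**2
    H = F.T @ W_sigma @ F + np.linalg.inv(C_prior)   # posterior precision
    return np.trace(np.linalg.inv(H))                # sum of posterior variances

# Toy usage: 4 candidate observations of a 3-dimensional parameter.
rng = np.random.default_rng(0)
F = rng.standard_normal((4, 3))
C_prior = np.eye(3)
print(a_optimal_objective(F, C_prior, w=np.array([1.0, 0.0, 1.0, 0.0]), sigma_noise=0.1))
```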

11 A-optimal design problem: Nonlinear case
Nonlinear case: Gaussian prior and noise do NOT imply a Gaussian posterior.
Use a Gaussian approximation to the posterior: for given $w$ and $d$,
- compute the maximum a posteriori probability (MAP) estimate $m_{\text{MAP}}(w; d) := \arg\min_m \mathcal{J}(m, w, d)$;
- form the Gaussian approximation to the posterior at the MAP point: $\mathcal{N}\big( m_{\text{MAP}}(w; d),\, \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d), w; d ] \big)$.
Problem: the data $d$ are not available a priori.
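
The following small Python sketch illustrates the two steps above on a hypothetical two-parameter nonlinear map (the functions f and f_jac are made up for illustration). It uses a Gauss-Newton approximation of the Hessian at the MAP point rather than the full Hessian that the slides work with.

```python
import numpy as np
from scipy.optimize import minimize

def f(m):                                   # toy nonlinear parameter-to-observable map
    return np.array([np.exp(m[0]) + m[1], m[0] * m[1]])

def f_jac(m):                               # its Jacobian
    return np.array([[np.exp(m[0]), 1.0],
                     [m[1],         m[0]]])

def neg_log_post(m, d, w, sigma, C_prior_inv, m_prior):
    r = f(m) - d
    return 0.5 * np.sum(w * r**2) / sigma**2 \
         + 0.5 * (m - m_prior) @ C_prior_inv @ (m - m_prior)

d = np.array([2.0, 0.5]); w = np.array([1.0, 1.0]); sigma = 0.1
C_prior_inv = np.eye(2); m_prior = np.zeros(2)

# MAP point for the given design w and data d.
m_map = minimize(neg_log_post, x0=m_prior,
                 args=(d, w, sigma, C_prior_inv, m_prior)).x

# Gauss-Newton Hessian at the MAP point; its inverse is the covariance
# of the Gaussian (Laplace) approximation of the posterior.
J = f_jac(m_map)
H = J.T @ np.diag(w / sigma**2) @ J + C_prior_inv
Gamma_post = np.linalg.inv(H)
print(m_map, np.trace(Gamma_post))
```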

12 A-optimal designs for nonlinear inverse problems
General formulation: minimize the average posterior variance:
$$ \min_w\; \operatorname{tr}\big( \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d), w; d ] \big) $$

13 A-optimal designs for nonlinear inverse problems
General formulation: minimize the expected average posterior variance:
$$ \min_w\; \mathbb{E}_d \big\{ \operatorname{tr}\big( \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d), w; d ] \big) \big\} $$

14 A-optimal designs for nonlinear inverse problems
General formulation: minimize the expected average posterior variance:
$$ \min_w\; \mathbb{E}_{\mu_{\text{prior}}} \mathbb{E}_{d \mid m} \big\{ \operatorname{tr}\big( \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d), w; d ] \big) \big\} $$

15 A-optimal designs for nonlinear inverse problems
General formulation: minimize the expected average posterior variance:
$$ \min_w\; \mathbb{E}_{\mu_{\text{prior}}} \mathbb{E}_{d \mid m} \big\{ \operatorname{tr}\big( \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d), w; d ] \big) \big\} + \gamma P(w) $$
$P(w)$: sparsifying penalty function.
In practice: get data from a few training models $m_1, \dots, m_{n_d}$, where the $m_i$ are draws from the prior.
Training data: $d_i = f(m_i) + \eta_i$, $i = 1, \dots, n_d$.

16 A-optimal designs for nonlinear inverse problems
General formulation: minimize the expected average posterior variance:
$$ \min_w\; \mathbb{E}_{\mu_{\text{prior}}} \mathbb{E}_{d \mid m} \big\{ \operatorname{tr}\big( \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d), w; d ] \big) \big\} + \gamma P(w) $$
$P(w)$: sparsifying penalty function.
In practice: get data from a few training models $m_1, \dots, m_{n_d}$ ($m_i$: draws from the prior); training data $d_i = f(m_i) + \eta_i$, $i = 1, \dots, n_d$.
The problem to solve in practice:
$$ \min_w\; \frac{1}{n_d} \sum_{i=1}^{n_d} \operatorname{tr}\big( \mathcal{H}^{-1}[ m_{\text{MAP}}(w; d_i), w; d_i ] \big) + \gamma P(w) $$
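
A minimal Python sketch of generating the training data used in the sample-average objective above; sample_prior, forward_map, and the other names are hypothetical placeholders for the problem-specific components.

```python
import numpy as np

def make_training_data(sample_prior, forward_map, sigma_noise, n_d, seed=0):
    """Draw n_d training models from the prior and synthesize noisy data
    d_i = f(m_i) + eta_i for the sample-average OED objective."""
    rng = np.random.default_rng(seed)
    models, data = [], []
    for _ in range(n_d):
        m_i = sample_prior(rng)                              # draw from the prior
        y_i = forward_map(m_i)                               # f(m_i)
        d_i = y_i + sigma_noise * rng.standard_normal(y_i.shape)  # add Gaussian noise
        models.append(m_i)
        data.append(d_i)
    return models, data

# Toy usage with a hypothetical 3-parameter linear model observed at 4 sensors.
F = np.random.default_rng(1).standard_normal((4, 3))
models, data = make_training_data(
    sample_prior=lambda rng: rng.standard_normal(3),
    forward_map=lambda m: F @ m,
    sigma_noise=0.1, n_d=5)
print(len(data), data[0].shape)
```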

17 Randomized trace estimation
$A \in \mathbb{R}^{n \times n}$ symmetric.
Trace estimator: $\operatorname{tr}(A) \approx \dfrac{1}{n_{\text{tr}}} \displaystyle\sum_{i=1}^{n_{\text{tr}}} z_i^T A z_i$, with $z_i$ random vectors.
Gaussian trace estimator: $z_i$ independent draws from $\mathcal{N}(0, I)$. For $z \sim \mathcal{N}(0, I)$:
$$ \mathbb{E}\{ z^T A z \} = \operatorname{tr}(A), \qquad \operatorname{Var}\{ z^T A z \} = 2 \| A \|_F^2 $$
Clustered eigenvalues $\Rightarrow$ good approximation with small $n_{\text{tr}}$. This is an efficient means of approximating the trace of the inverse Hessian.
M. Hutchinson, A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines (1990).
H. Avron and S. Toledo, Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix (2011).
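
A short Python sketch of the Gaussian trace estimator described above; it needs only matrix-vector products with $A$ (here via a hypothetical apply_A callback), which is what makes it attractive when $A$ is an implicitly defined inverse Hessian.

```python
import numpy as np

def gaussian_trace_estimate(apply_A, n, n_tr, seed=0):
    """Randomized (Gaussian) trace estimator: tr(A) ~ (1/n_tr) sum_i z_i^T A z_i.
    apply_A(v) returns A @ v, so A never needs to be formed explicitly."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_tr):
        z = rng.standard_normal(n)          # z ~ N(0, I)
        total += z @ apply_A(z)
    return total / n_tr

# Toy check against the exact trace of a random SPD matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((200, 200))
A = B @ B.T                                 # symmetric positive semi-definite
print(gaussian_trace_estimate(lambda v: A @ v, n=200, n_tr=50), np.trace(A))
```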

18 Optimization problem for computing A-optimal design
OED objective function:
$$ \Psi(w) = \frac{1}{n_d} \sum_{i=1}^{n_d} \operatorname{tr}\big[ \mathcal{H}^{-1}(w; d_i) \big] $$

19 Optimization problem for computing A-optimal design
OED objective function:
$$ \Psi(w) = \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \big\langle z_k,\, \mathcal{H}^{-1}(w; d_i)\, z_k \big\rangle $$

20 Optimization problem for computing A-optimal design
OED objective function:
$$ \Psi(w) = \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \big\langle z_k,\, \underbrace{\mathcal{H}^{-1}(w; d_i)\, z_k}_{y_{ik}} \big\rangle $$
Auxiliary variable: $y_{ik} = \mathcal{H}^{-1}(w; d_i)\, z_k$

21 Optimization problem for computing A-optimal design
OED objective function:
$$ \Psi(w) = \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \big\langle z_k,\, \underbrace{\mathcal{H}^{-1}(w; d_i)\, z_k}_{y_{ik}} \big\rangle $$
Auxiliary variable: $y_{ik} = \mathcal{H}^{-1}(w; d_i)\, z_k$
OED optimization problem:
$$ \min_w\; \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \langle z_k, y_{ik} \rangle + \gamma P(w) $$
where
$$ m_{\text{MAP}}(w; d_i) = \arg\min_{m \in H} \mathcal{J}(m, w; d_i), \quad i = 1, \dots, n_d $$
$$ \mathcal{H}\big( m_{\text{MAP}}(w; d_i), w; d_i \big)\, y_{ik} = z_k, \quad i = 1, \dots, n_d,\; k = 1, \dots, n_{\text{tr}} $$
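
A Python sketch of evaluating the randomized-trace objective $\Psi(w)$ with matrix-free Hessian solves by conjugate gradients. The hessian_matvec callback is a hypothetical stand-in; in the PDE setting each application of the Hessian involves incremental forward and adjoint solves, as detailed on the later slides.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def oed_objective(hessian_matvec, z_vectors, n_d):
    """Psi(w) = 1/(n_d * n_tr) * sum_{i,k} <z_k, y_ik>, with H(w; d_i) y_ik = z_k
    solved matrix-free by CG. hessian_matvec(i, v) applies the Hessian for
    training datum d_i to the vector v."""
    n = z_vectors[0].size
    n_tr = len(z_vectors)
    psi = 0.0
    for i in range(n_d):
        H_i = LinearOperator((n, n), matvec=lambda v, i=i: hessian_matvec(i, v))
        for z in z_vectors:
            y, _ = cg(H_i, z)              # y_ik = H^{-1}(w; d_i) z_k
            psi += z @ y
    return psi / (n_d * n_tr)

# Toy usage: the same SPD "Hessian" for every training datum.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50)); A = B @ B.T + 50 * np.eye(50)
zs = [rng.standard_normal(50) for _ in range(10)]
print(oed_objective(lambda i, v: A @ v, zs, n_d=3))
```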

22 Application: subsurface flow
Forward problem:
$$ -\nabla \cdot (e^m \nabla u) = f \;\; \text{in } D, \qquad u = g \;\; \text{on } \Gamma_D, \qquad e^m \nabla u \cdot n = h \;\; \text{on } \Gamma_N $$
$u$: pressure field; $m$: log-permeability field (inversion parameter).
[Figure. Left: true parameter; Right: pressure field]
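
To make the forward model concrete, here is a minimal 1D finite-difference analogue of the Darcy problem above, a sketch under simplified assumptions: Dirichlet conditions at both endpoints (the slide uses mixed Dirichlet/Neumann), arithmetic averaging of the conductivity at cell interfaces, and a dense solve.

```python
import numpy as np

def solve_darcy_1d(m, f, g_left, g_right, L=1.0):
    """Finite-difference solve of -(e^m u')' = f on (0, L) with
    u(0) = g_left and u(L) = g_right. m and f are nodal arrays."""
    n = m.size
    h = L / (n - 1)
    kappa = np.exp(m)
    k_half = 0.5 * (kappa[:-1] + kappa[1:])   # conductivity at cell interfaces
    A = np.zeros((n, n)); b = f.copy() * h**2
    for j in range(1, n - 1):
        A[j, j - 1] = -k_half[j - 1]
        A[j, j]     =  k_half[j - 1] + k_half[j]
        A[j, j + 1] = -k_half[j]
    A[0, 0] = 1.0;  b[0] = g_left             # Dirichlet boundary conditions
    A[-1, -1] = 1.0; b[-1] = g_right
    return np.linalg.solve(A, b)

# Toy usage: constant source, heterogeneous log-permeability.
x = np.linspace(0.0, 1.0, 101)
u = solve_darcy_1d(m=np.sin(2 * np.pi * x), f=np.ones_like(x), g_left=0.0, g_right=0.0)
print(u.max())
```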

23 Bayesian inverse problem: Gaussian approximation
The MAP point is the solution to
$$ \min_m\; \mathcal{J}(m, w; d) := \tfrac{1}{2} (Bu - d)^T W_\sigma (Bu - d) + \tfrac{1}{2} \langle m - m_{\text{prior}},\, m - m_{\text{prior}} \rangle_E $$
where
$$ -\nabla \cdot (e^m \nabla u) = f \;\; \text{in } D, \qquad u = g \;\; \text{on } \Gamma_D, \qquad e^m \nabla u \cdot n = 0 \;\; \text{on } \Gamma_N $$
$B$ is the observation operator.
Cameron-Martin inner product: $\langle m_1, m_2 \rangle_E = \langle \mathcal{C}_{\text{prior}}^{-1/2} m_1,\, \mathcal{C}_{\text{prior}}^{-1/2} m_2 \rangle$ for $m_1, m_2 \in \operatorname{range}(\mathcal{C}_{\text{prior}}^{1/2})$.
Covariance of the Gaussian approximation to the posterior: with $\bar m = m_{\text{MAP}}(w; d)$,
$$ \mathcal{C} = \mathcal{H}(\bar m, w; d)^{-1} = \big[ \nabla^2_m \mathcal{J}(\bar m, w; d) \big]^{-1} $$

24 Optimization problem for A-optimal design
$$ \min_{w \in [0,1]^{n_s}}\; \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \langle z_k, y_{ik} \rangle + \gamma P(w) $$
where, for $i = 1, \dots, n_d$ and $k = 1, \dots, n_{\text{tr}}$, the MAP optimality system (the last equation is $\nabla_m \mathcal{J}(m_i) = 0$):
$$ -\nabla \cdot (e^{m_i} \nabla u_i) = f $$
$$ -\nabla \cdot (e^{m_i} \nabla p_i) = -B^* W_\sigma (B u_i - d_i) $$
$$ \mathcal{C}_{\text{prior}}^{-1} (m_i - m_{\text{pr}}) + e^{m_i} \nabla u_i \cdot \nabla p_i = 0 $$
and the Hessian system $\mathcal{H}(m_i)\, y_{ik} = z_k$ (with incremental state/adjoint variables $v_{ik}$, $q_{ik}$):
$$ -\nabla \cdot (e^{m_i} \nabla v_{ik}) = \nabla \cdot (y_{ik}\, e^{m_i} \nabla u_i) $$
$$ -\nabla \cdot (e^{m_i} \nabla q_{ik}) = \nabla \cdot (y_{ik}\, e^{m_i} \nabla p_i) - B^* W_\sigma B v_{ik} $$
$$ \mathcal{C}_{\text{prior}}^{-1} y_{ik} + y_{ik}\, e^{m_i} \nabla u_i \cdot \nabla p_i + e^{m_i} \big( \nabla v_{ik} \cdot \nabla p_i + \nabla u_i \cdot \nabla q_{ik} \big) = z_k $$

25 The Lagrangian for the OED problem
$$ \mathcal{L}_{\text{OED}}\big( w, \{u_i\}, \{m_i\}, \{p_i\}, \{v_{ik}\}, \{q_{ik}\}, \{y_{ik}\}, \{u_i^*\}, \{m_i^*\}, \{p_i^*\}, \{v_{ik}^*\}, \{q_{ik}^*\}, \{y_{ik}^*\} \big) := \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \langle z_k, y_{ik} \rangle + \gamma P(w) $$
$$ + \sum_{i=1}^{n_d} \big[ \langle e^{m_i} \nabla u_i, \nabla u_i^* \rangle - \langle f, u_i^* \rangle - \langle h, u_i^* \rangle_{\Gamma_N} \big] $$
$$ + \sum_{i=1}^{n_d} \big[ \langle e^{m_i} \nabla p_i, \nabla p_i^* \rangle + \langle B^* W_\sigma (B u_i - d_i), p_i^* \rangle \big] $$
$$ + \sum_{i=1}^{n_d} \big[ \langle m_i - m_{\text{pr}}, m_i^* \rangle_E + \langle m_i^* e^{m_i} \nabla u_i, \nabla p_i \rangle \big] $$
$$ + \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \big[ \langle e^{m_i} \nabla v_{ik}, \nabla v_{ik}^* \rangle + \langle y_{ik}\, e^{m_i} \nabla u_i, \nabla v_{ik}^* \rangle \big] $$
$$ + \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \big[ \langle e^{m_i} \nabla q_{ik}, \nabla q_{ik}^* \rangle + \langle y_{ik}\, e^{m_i} \nabla p_i, \nabla q_{ik}^* \rangle + \langle B^* W_\sigma B v_{ik}, q_{ik}^* \rangle \big] $$
$$ + \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \big[ \langle y_{ik}^* e^{m_i} \nabla v_{ik}, \nabla p_i \rangle + \langle y_{ik}, y_{ik}^* \rangle_E + \langle y_{ik}^* e^{m_i} \nabla u_i, \nabla q_{ik} \rangle + \langle y_{ik}^* y_{ik}\, e^{m_i} \nabla u_i, \nabla p_i \rangle - \langle z_k, y_{ik}^* \rangle \big] $$

26 The adjoint OED problem and the gradient
The adjoint OED problem for $\{u_i^*\}, \{m_i^*\}, \{p_i^*\}, \{v_{ik}^*\}, \{q_{ik}^*\}, \{y_{ik}^*\}$:
$$ B^* W_\sigma B q_{ik}^* - \nabla \cdot (y_{ik}^* e^{m_i} \nabla p_i) - \nabla \cdot (e^{m_i} \nabla v_{ik}^*) = 0 $$
$$ e^{m_i} \nabla q_{ik}^* \cdot \nabla p_i + \mathcal{C}_{\text{prior}}^{-1} y_{ik}^* + y_{ik}^* e^{m_i} \nabla u_i \cdot \nabla p_i + e^{m_i} \nabla u_i \cdot \nabla v_{ik}^* = \frac{1}{n_d\, n_{\text{tr}}}\, z_k $$
$$ -\nabla \cdot (e^{m_i} \nabla q_{ik}^*) - \nabla \cdot (y_{ik}^* e^{m_i} \nabla u_i) = 0 $$
$$ B^* W_\sigma B p_i^* - \nabla \cdot (m_i^* e^{m_i} \nabla p_i) - \nabla \cdot (e^{m_i} \nabla u_i^*) = b_i^1 $$
$$ e^{m_i} \nabla p_i^* \cdot \nabla p_i + \mathcal{C}_{\text{prior}}^{-1} m_i^* + m_i^* e^{m_i} \nabla u_i \cdot \nabla p_i + e^{m_i} \nabla u_i \cdot \nabla u_i^* = b_i^2 $$
$$ -\nabla \cdot (e^{m_i} \nabla p_i^*) - \nabla \cdot (m_i^* e^{m_i} \nabla u_i) = b_i^3 $$
Gradient for the OED problem (elementwise products over the sensor entries):
$$ g(w) = \sum_{i=1}^{n_d} \Gamma_{\text{noise}}^{-1} (B u_i - d_i) \odot B p_i^* + \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \Gamma_{\text{noise}}^{-1} (B v_{ik}) \odot (B q_{ik}^*) $$
$q_{ik}^*$, $y_{ik}^*$, and $v_{ik}^*$ can be eliminated: $q_{ik}^* = v_{ik}$, $y_{ik}^* = y_{ik}$, $v_{ik}^* = q_{ik}$.

27 Computing the OED objective function and its gradient
Solve the inverse problems: $m_i(w)$, $u_i(m_i(w))$, and $p_i(m_i(w))$, $i = 1, \dots, n_d$.
Solve for $(v_{ik}, y_{ik}, q_{ik})$, $i = 1, \dots, n_d$, $k = 1, \dots, n_{\text{tr}}$:
$$ -\nabla \cdot (e^{m_i} \nabla v_{ik}) = \nabla \cdot (y_{ik}\, e^{m_i} \nabla u_i) $$
$$ -\nabla \cdot (e^{m_i} \nabla q_{ik}) = \nabla \cdot (y_{ik}\, e^{m_i} \nabla p_i) - B^* W_\sigma B v_{ik} $$
$$ \mathcal{C}_{\text{prior}}^{-1} y_{ik} + y_{ik}\, e^{m_i} \nabla u_i \cdot \nabla p_i + e^{m_i} \big( \nabla v_{ik} \cdot \nabla p_i + \nabla u_i \cdot \nabla q_{ik} \big) = z_k $$
Objective function evaluation: $\Psi(w) = \dfrac{1}{n_d\, n_{\text{tr}}} \displaystyle\sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \langle z_k, y_{ik} \rangle$
Solve for $(p_i^*, m_i^*, u_i^*)$, $i = 1, \dots, n_d$:
$$ B^* W_\sigma B p_i^* - \nabla \cdot (m_i^* e^{m_i} \nabla p_i) - \nabla \cdot (e^{m_i} \nabla u_i^*) = b_i^1 $$
$$ e^{m_i} \nabla p_i^* \cdot \nabla p_i + \mathcal{C}_{\text{prior}}^{-1} m_i^* + m_i^* e^{m_i} \nabla u_i \cdot \nabla p_i + e^{m_i} \nabla u_i \cdot \nabla u_i^* = b_i^2 $$
$$ -\nabla \cdot (e^{m_i} \nabla p_i^*) - \nabla \cdot (m_i^* e^{m_i} \nabla u_i) = b_i^3 $$
Gradient:
$$ g(w) = \sum_{i=1}^{n_d} \Gamma_{\text{noise}}^{-1} (B u_i - d_i) \odot B p_i^* - \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \Gamma_{\text{noise}}^{-1} (B v_{ik}) \odot (B v_{ik}) $$

28 Computational cost: the number of PDE solves
$r$: rank of the Hessian of the data misfit; independent of the mesh (beyond a certain resolution) and independent of the sensor dimension (beyond a certain dimension).
Cost (in PDE solves) of evaluating the OED objective function and gradient:
1. Cost of solving the inverse problems: $2\, r\, n_d\, N_{\text{Newton}}$
2. Cost of computing the OED objective: $2\, r\, n_{\text{tr}}\, n_d$
3. Cost of computing the OED gradient: $2\, n_{\text{tr}}\, n_d + 2\, r\, n_d$
A low-rank approximation to the misfit Hessian reduces the cost in (2) and (3) to $2\, r\, n_d$.
OED optimization problem: solved via a quasi-Newton interior-point method; the number of quasi-Newton iterations is insensitive to the parameter/sensor dimension.
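
The low-rank approximation mentioned in the last bullet can be computed with a standard randomized eigendecomposition; the sketch below (Python, matrix-free, with hypothetical names such as apply_Hmisfit and oversample) illustrates the idea, though it is not necessarily the exact variant used by the authors.

```python
import numpy as np

def randomized_low_rank(apply_Hmisfit, n, rank, oversample=10, seed=0):
    """Randomized truncated eigendecomposition of a symmetric PSD misfit
    Hessian, H_misfit ~ V diag(lam) V^T, using only matvecs. Each matvec
    with the true misfit Hessian costs two PDE solves, so the total cost
    is roughly 2*(rank + oversample) PDE solves."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((n, rank + oversample))
    Y = np.column_stack([apply_Hmisfit(Omega[:, j]) for j in range(Omega.shape[1])])
    Q, _ = np.linalg.qr(Y)                        # orthonormal basis for the range
    T = Q.T @ np.column_stack([apply_Hmisfit(Q[:, j]) for j in range(Q.shape[1])])
    lam, S = np.linalg.eigh(T)                    # small dense eigenproblem
    idx = np.argsort(lam)[::-1][:rank]
    return Q @ S[:, idx], lam[idx]                # eigenvectors V, eigenvalues lam

# Toy check on a matrix with a rapidly decaying spectrum.
n = 300
U, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((n, n)))
eigs = 10.0 ** (-np.arange(n) / 10.0)
H = (U * eigs) @ U.T
V, lam = randomized_low_rank(lambda v: H @ v, n, rank=20)
print(np.max(np.abs(lam - eigs[:20]) / eigs[:20]))   # relative eigenvalue error
```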

29 The prior and training models
Prior knowledge: parameter value at a few points, correlation.
[Figure: truth, prior mean, prior variance]
Prior mean: smooth least-squares fit to point measurements:
$$ m_{\text{pr}} = \arg\min_m\; \tfrac{1}{2} \langle m, A m \rangle + \frac{\alpha}{2} \sum_{i=1}^{N} \int_D \delta_i(x)\, \big( m(x) - m_{\text{true}}(x) \big)^2\, dx $$
Prior covariance: $\mathcal{C}_{\text{prior}} = \big( A + \alpha \sum_{i=1}^{N} \delta_i \big)^{-2}$, with $A m = -\nabla \cdot (D \nabla m)$.
Draws from the prior are used to generate the training data for OED.

30 Computing an optimal design with sparsification
Optimization problem:
$$ \min_w\; \frac{1}{n_d\, n_{\text{tr}}} \sum_{i=1}^{n_d} \sum_{k=1}^{n_{\text{tr}}} \langle z_k, y_{ik} \rangle + \gamma P_\varepsilon(w) $$
where
$$ m_{\text{MAP}}(w; d_i) = \arg\min_{m \in H} \mathcal{J}(m, w; d_i), \qquad \mathcal{H}\big( m_{\text{MAP}}(w; d_i), w; d_i \big)\, y_{ik} = z_k $$
Here $n_d = 5$, $n_{\text{tr}} = 20$.
Continuation $\ell_0$-penalty: $P_\varepsilon(w) := \sum_{i=1}^{n_s} f_\varepsilon(w_i)$; solve the problem with decreasing $\varepsilon$.
[Figures: the functions $f_\varepsilon$; the optimal weight vector $w$]
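
A small Python sketch of the continuation idea. The specific form of $f_\varepsilon$ below is a hypothetical choice (the slides do not spell it out), and solve_relaxed_oed and eps_schedule are placeholder names for the inner relaxed OED solver and the continuation schedule.

```python
import numpy as np

def f_eps(w, eps):
    """One possible smooth approximation of the l0 'indicator' on [0, 1]:
    f_eps(w) -> 1{w > 0} pointwise as eps -> 0 (hypothetical choice)."""
    return w**2 / (w**2 + eps**2)

def penalty(w, eps):
    """Continuation l0-penalty P_eps(w) = sum_i f_eps(w_i)."""
    return np.sum(f_eps(w, eps))

def solve_with_continuation(solve_relaxed_oed, w0, eps_schedule=(1.0, 0.1, 0.01)):
    """Solve a sequence of relaxed OED problems with decreasing eps,
    warm-starting each solve from the previous weights."""
    w = w0
    for eps in eps_schedule:
        w = solve_relaxed_oed(w, eps)
    return w

# Tiny illustration of how the penalty sharpens as eps decreases.
w = np.array([0.0, 0.05, 0.5, 1.0])
for eps in (1.0, 0.1, 0.01):
    print(eps, penalty(w, eps))
```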

31 Effectiveness of the optimal design
Numerical test: how well can we recover the true model from true data?
[Figure: Optimal, Random, Truth. Designs with 10 sensors; comparing the MAP point (top row) and the posterior variance (bottom row).]

32 Effectiveness of the optimal design
Numerical test: how well can we recover the true model from true data?
[Table: $\operatorname{tr}(\Gamma_{\text{post}}(w))$ and $E_{\text{rel}}(w)$ for random vs. optimal designs with 10 and 20 sensors]
$$ d = f(m_{\text{true}}) + \eta, \qquad \Gamma_{\text{post}}(w) := \mathcal{H}^{-1}\big( m_{\text{MAP}}(w; d), w; d \big), \qquad E_{\text{rel}}(w) := \frac{\| m_{\text{MAP}}(w; d) - m_{\text{true}} \|}{\| m_{\text{true}} \|} $$

33 Effectiveness of the optimal design
Numerical test: how well do we do on average?
[Table: $\Psi(w)$ and $E_{\text{rel}}(w)$ for random vs. optimal designs with 10 and 20 sensors]
$$ E_{\text{rel}}(w) = \frac{1}{n_d} \sum_{i=1}^{n_d} \frac{\| m_{\text{MAP}}(w; d_i) - m_i \|}{\| m_i \|}, \qquad \Psi(w) = \frac{1}{n_d} \sum_{i=1}^{n_d} \operatorname{tr}\big( \mathcal{H}^{-1}( m_{\text{MAP}}(w; d_i), w; d_i ) \big) $$
The $d_i$ are generated from $n_d = 50$ draws of $m_i$ from the prior (different from the $n_d = 5$ used to solve the OED problem).

34 Scalability with respect to parameter/sensor dimension
[Plots: cost of objective function/gradient evaluation and of the OED optimization (inner CG iterations, outer CG iterations, BFGS iterations) versus parameter dimension (500 to 5,000) and versus sensor dimension.]

35 Inversion of a realistic permeability field
$$ -\nabla \cdot (e^m \nabla u) = f \;\; \text{in } D, \qquad u = 0 \;\; \text{on } \Gamma_D, \qquad e^m \nabla u \cdot n = 0 \;\; \text{on } \Gamma_N $$
$\Gamma_D$: at the corners; $\Gamma_N$: the rest of the boundary; point source at the center.
Log-permeability field from the SPE Comparative Solution Project; injection well at the center, production wells at the four corners.
Parameter (and state) dimension: $n = 10{,}202$; potential sensor locations: $n_s = 128$.

36 SPE model: the prior
[Figure: prior mean, standard deviation, and a random draw]

37 SPE model: inference with the optimal design
[Figure: posterior samples, truth, posterior mean]

38 SPE model: Compare optimal vs. random design
[Figure: optimal and random designs with 32 sensors; the average posterior variance is reported for each design]
Observed data collected using the true parameter on a high-resolution mesh.

39 Challenges revisited
- Infinite-dimensional inference: proper choice of prior, mass-weighted inner product.
- Need the trace of the posterior covariance (inverse of the Hessian: large, dense, expensive matvecs): randomized trace estimator, Gaussian approximation in the nonlinear case.
- In a nonlinear inverse problem, the Hessian depends on the data (not available a priori): random draws from the prior to generate training data.
- Optimal design has combinatorial complexity in general: $w$-weighted formulation, relax weights to $0 \le w_j \le 1$, continuation to recover 0/1 weights.
- Conventional OED algorithms are intractable for large-scale problems: infinite-dimensional formulation, gradient-based optimization, scalable algorithms.
- The Gaussian assumption (on the posterior of the inverse problem) remains via the linearized parameter-to-observable map; current work is aimed at higher-order approximations.
A. Alexanderian, N. Petra, G. Stadler, and O. Ghattas, A fast and scalable method for A-optimal design of experiments for infinite-dimensional Bayesian nonlinear inverse problems, SIAM Journal on Scientific Computing, 38(1):A243-A272.
