An example of Bayesian reasoning
Computational Methods in Inverse Problems


Consider the one-dimensional deconvolution problem with various degrees of prior information. Model:
$$ g(t) = \int a(t-s)\,f(s)\,ds + e(t), \qquad a(t) \to 0 \ \text{as } |t| \to \infty \ \text{(rapidly)}. $$
The problem, even without discretization errors, is ill-posed.

Analytic solution

Noiseless model:
$$ g(t) = \int a(t-s)\,f(s)\,ds. $$
Apply the Fourier transform,
$$ \hat g(k) = \int e^{-ikt} g(t)\,dt. $$
The Convolution Theorem implies
$$ \hat g(k) = \hat a(k)\,\hat f(k), $$
so, by inverse Fourier transform,
$$ f(t) = \frac{1}{2\pi}\int e^{ikt}\,\frac{\hat g(k)}{\hat a(k)}\,dk. $$

Example: Gaussian kernel, noisy data:
$$ a(t) = \frac{1}{\sqrt{2\pi\omega^2}}\exp\Big(-\frac{t^2}{2\omega^2}\Big). $$
In the frequency domain,
$$ \hat a(k) = \exp\Big(-\frac{\omega^2 k^2}{2}\Big). $$
If $\hat g(k) = \hat a(k)\hat f(k) + \hat e(k)$, the estimate $f_{\rm est}(t)$ of $f$ based on the Convolution Theorem is
$$ f_{\rm est}(t) = f(t) + \frac{1}{2\pi}\int \hat e(k)\exp\Big(\frac{\omega^2 k^2}{2} + ikt\Big)\,dk, $$
which may not even be well defined unless $\hat e(k)$ decays superexponentially at high frequencies.
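
To see this instability numerically, here is a minimal sketch (Python/NumPy, using the discrete FFT as a stand-in for the continuous transform; the grid size, kernel width and noise level are illustrative assumptions, not values from the course material):

```python
import numpy as np

# Naive Fourier deconvolution: divide the noisy data spectrum by \hat a(k).
n = 128
t = np.linspace(0, 1, n, endpoint=False)
omega = 0.05                                      # assumed kernel width
f = np.where(t <= 0.7, 0.6 * t, 1.4 * (1 - t))    # smooth test signal

k = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / n)      # angular frequencies
a_hat = np.exp(-0.5 * omega**2 * k**2)            # transfer function of the Gaussian kernel

rng = np.random.default_rng(0)
g_hat = a_hat * np.fft.fft(f) + np.fft.fft(1e-3 * rng.standard_normal(n))

f_naive = np.real(np.fft.ifft(g_hat / a_hat))     # noise amplified by exp(omega^2 k^2 / 2)
print(np.abs(f_naive).max())                      # vastly larger than max|f| = 0.42
```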

Discretized problem

Finite sampling, finite interval:
$$ g(t_j) = \int_0^1 a(t_j - s)\,f(s)\,ds + e_j, \qquad 1 \le j \le m. $$
Approximate the integral by a quadrature rule; use the piecewise constant approximation with uniformly distributed sampling,
$$ \int_0^1 a(t_j - s)\,f(s)\,ds \approx \frac{1}{n}\sum_{k=1}^{n} a(t_j - s_k)\,f(s_k) = \sum_{k=1}^{n} a_{jk} x_k, $$
where $s_k = k/n$, $x_k = f(s_k)$, and $a_{jk} = \frac{1}{n}a(t_j - s_k)$.

Discrete equation

Additive Gaussian noise, likelihood:
$$ b = Ax + e, \qquad b_j = g(t_j). $$
Stochastic extension: $X$, $E$ and $B$ are random variables,
$$ B = AX + E. $$
Assume that $E$ is Gaussian white noise with variance $\sigma^2$,
$$ E \sim \mathcal N(0, \sigma^2 I), \quad\text{or}\quad \pi_{\rm noise}(e) \propto \exp\Big(-\frac{1}{2\sigma^2}\|e\|^2\Big), $$
where $\sigma^2$ is assumed known. Likelihood density:
$$ \pi(b \mid x) \propto \exp\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2\Big). $$
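
A minimal sketch of this discrete setup in Python/NumPy (the course refers to a Matlab file, Deconvolution.m; the grid sizes, kernel width, noise level and test signal below are assumed stand-ins, not the course's values):

```python
import numpy as np

# Discretized convolution model b = A x + e on [0, 1].
n, m = 160, 160            # numbers of unknowns and measurements (assumed)
omega = 0.05               # kernel width (assumed)
sigma = 0.01               # noise standard deviation (assumed)

s = np.arange(1, n + 1) / n                      # quadrature nodes s_k = k/n
t = np.arange(1, m + 1) / m                      # sampling points t_j

def kernel(u):
    return np.exp(-u**2 / (2 * omega**2)) / np.sqrt(2 * np.pi * omega**2)

A = kernel(t[:, None] - s[None, :]) / n          # a_jk = (1/n) a(t_j - s_k)

# Test 1 input: triangular pulse with slopes 0.6 and -1.4.
x_true = np.where(s <= 0.7, 0.6 * s, 1.4 * (1 - s))

rng = np.random.default_rng(0)
b = A @ x_true + sigma * rng.standard_normal(m)  # noisy data
```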

Prior according to the information available

Prior information: The signal $f$ varies rather slowly, the slope being not more than of the order one, and vanishes at $t = 0$.

Write an autoregressive Markov model (ARMA):
$$ X_j = X_{j-1} + \sqrt{\theta}\,W_j, \qquad W_j \sim \mathcal N(0,1), \quad X_0 = 0. $$

[Figure: one random-walk step from $x_{j-1}$ to $x_j$, showing the standard deviation of the increment.]

How do we choose the variance $\theta$ of the innovation process?

Use the slope information: the slope is
$$ S_j = n(X_j - X_{j-1}) = n\sqrt{\theta}\,W_j, $$
and
$$ P\{\,|S_j| < 2n\sqrt{\theta}\,\} \approx 0.95, $$
$2n\sqrt{\theta}$ being two standard deviations. A reasonable choice would then be $2n\sqrt{\theta} = 1$, or $\theta = \frac{1}{4n^2}$.

The ARMA model in matrix form: $X_j - X_{j-1} = \sqrt{\theta}\,W_j$, $1 \le j \le n$, gives
$$ LX = \sqrt{\theta}\,W, \qquad W \sim \mathcal N(0, I), $$
where
$$ L = \begin{bmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{bmatrix}. $$
Prior density: $(1/\sqrt{\theta})\,LX = W$, so
$$ \pi_{\rm prior}(x) \propto \exp\Big(-\frac{1}{2\theta}\|Lx\|^2\Big), \qquad \theta = \frac{1}{4n^2}. $$

Posterior density

From Bayes' formula, the posterior density is
$$ \pi_{\rm post}(x) = \pi(x \mid b) \propto \pi_{\rm prior}(x)\,\pi(b \mid x) \propto \exp\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2 - \frac{1}{2\theta}\|Lx\|^2\Big). $$
This is a Gaussian density, whose maximum is the Maximum A Posteriori (MAP) estimate,
$$ x_{\rm MAP} = \arg\min\Big\{\frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac{1}{2\theta}\|Lx\|^2\Big\}, $$
or, equivalently, the least squares solution of the system
$$ \begin{bmatrix} \frac{1}{\sigma}A \\[1mm] \theta^{-1/2}L \end{bmatrix} x = \begin{bmatrix} \frac{1}{\sigma}b \\[1mm] 0 \end{bmatrix}. $$
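
Continuing the sketch above, the MAP estimate can be computed as the least squares solution of the stacked system (an illustration only, not the course implementation):

```python
import numpy as np

# First-difference matrix L encoding X_j - X_{j-1} with the convention X_0 = 0.
L = np.eye(n) - np.eye(n, k=-1)

theta = 1.0 / (4 * n**2)                         # prior variance from the slope argument

# Stacked system [A/sigma; L/sqrt(theta)] x = [b/sigma; 0].
K = np.vstack([A / sigma, L / np.sqrt(theta)])
rhs = np.concatenate([b / sigma, np.zeros(n)])
x_map, *_ = np.linalg.lstsq(K, rhs, rcond=None)
```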

Posterior covariance: by completing the square,
$$ \frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac{1}{2\theta}\|Lx\|^2 = \frac12\Bigg( x^T\underbrace{\Big(\frac{1}{\sigma^2}A^TA + \frac{1}{\theta}L^TL\Big)}_{=\,\Gamma^{-1}} x - 2x^T\frac{1}{\sigma^2}A^Tb + \frac{1}{\sigma^2}\|b\|^2 \Bigg) $$
$$ = \frac12\Big(x - \Gamma\frac{1}{\sigma^2}A^Tb\Big)^T\Gamma^{-1}\Big(x - \Gamma\frac{1}{\sigma^2}A^Tb\Big) + \text{terms independent of } x. $$
Denote
$$ x_0 = \Gamma\frac{1}{\sigma^2}A^Tb = \Big(\frac{1}{\sigma^2}A^TA + \frac{1}{\theta}L^TL\Big)^{-1}\frac{1}{\sigma^2}A^Tb. $$

The posterior density is
$$ \pi_{\rm post}(x) \propto \exp\Big(-\frac12(x - x_0)^T\Gamma^{-1}(x - x_0)\Big), $$
or, in other words,
$$ \pi_{\rm post} = \mathcal N(x_0, \Gamma), \qquad \Gamma = \Big(\frac{1}{\sigma^2}A^TA + \frac{1}{\theta}L^TL\Big)^{-1}, $$
$$ x_0 = \Gamma\frac{1}{\sigma^2}A^Tb = \Big(\frac{1}{\sigma^2}A^TA + \frac{1}{\theta}L^TL\Big)^{-1}\frac{1}{\sigma^2}A^Tb = x_{\rm MAP}. $$

Example

Gaussian convolution kernel,
$$ a(t) = \frac{1}{\sqrt{2\pi\omega^2}}\exp\Big(-\frac{t^2}{2\omega^2}\Big), \qquad \omega = 0.05. $$
Number of discretization points = 160. See the attached Matlab file Deconvolution.m.

Test 1: Input function a triangular pulse,
$$ f(t) = \begin{cases} \alpha t, & t \le t_0 = 0.7, \\ \beta(1 - t), & t > t_0, \end{cases} $$
where $\alpha = 0.6$ and $\beta = \alpha t_0/(1 - t_0) = 1.4 > 1$. Hence, the prior information about the slopes is slightly too conservative.

Posterior credibility envelope

Plotting the MAP estimate and the 95% credibility envelope: if $X$ is distributed according to the Gaussian posterior density, $\mathrm{var}(X_j) = \Gamma_{jj}$. With 95% a posteriori probability,
$$ (x_0)_j - 2\sqrt{\Gamma_{jj}} < X_j < (x_0)_j + 2\sqrt{\Gamma_{jj}}. $$
Try the algorithm with various values of $\theta$.
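
Continuing the sketch, the posterior covariance and the pointwise envelope can be formed explicitly (for large n one would avoid the dense inverse; this is only an illustration):

```python
import numpy as np

# Posterior covariance Gamma = (A^T A / sigma^2 + L^T L / theta)^{-1}.
Gamma = np.linalg.inv(A.T @ A / sigma**2 + L.T @ L / theta)
x0 = Gamma @ (A.T @ b) / sigma**2                # posterior mean = MAP estimate

std = np.sqrt(np.diag(Gamma))                    # pointwise posterior standard deviations
lower, upper = x0 - 2 * std, x0 + 2 * std        # ~95% credibility envelope
```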

Test 2: Input function whose slope is locally larger than one:
$$ f(t) = 10\,(t - 0.5)\exp\Big(-\frac{(t - 0.5)^2}{2\delta^2}\Big), \qquad \delta^2 = 0.01. $$
The slope in the interval $[0.4, 0.6]$ is much larger than one, while outside the interval it is very small. The algorithm does not perform well.

Assume that we, from some source, have the following prior information:

Prior information: The signal is almost constant (the slope of the order 0.1, say) outside the interval $[0.4, 0.6]$, while within this interval the slope can be of the order 10.

Non-stationary ARMA model

Write an autoregressive Markov model (ARMA):
$$ X_j = X_{j-1} + \sqrt{\theta_j}\,W_j, \qquad W_j \sim \mathcal N(0,1), \quad X_0 = 0, $$
i.e., the variance of the innovation process varies.

How to choose the variances? Slope information: the slope is
$$ S_j = n(X_j - X_{j-1}) = n\sqrt{\theta_j}\,W_j \quad\text{and}\quad P\{\,|S_j| < 2n\sqrt{\theta_j}\,\} \approx 0.95, $$
$2n\sqrt{\theta_j}$ being two standard deviations. A reasonable choice would then be
$$ 2n\sqrt{\theta_j} = \begin{cases} 10, & 0.4 < t_j < 0.6, \\ 0.1, & \text{else.} \end{cases} $$

Define the variance vector $\theta \in \mathbb R^n$,
$$ \theta_j = \begin{cases} \dfrac{25}{n^2}, & \text{if } 0.4 < t_j < 0.6, \\[2mm] \dfrac{1}{400 n^2}, & \text{else.} \end{cases} $$
The ARMA model in matrix form:
$$ D^{-1/2}LX = W \sim \mathcal N(0, I), \qquad D = \mathrm{diag}(\theta). $$
This leads to the prior model
$$ \pi_{\rm prior}(x) \propto \exp\Big(-\frac12\|D^{-1/2}Lx\|^2\Big). $$

Posterior density

From Bayes' formula, the posterior density is
$$ \pi_{\rm post}(x) = \pi(x \mid b) \propto \pi_{\rm prior}(x)\,\pi(b \mid x) \propto \exp\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2 - \frac12\|D^{-1/2}Lx\|^2\Big). $$
This is a Gaussian density, whose maximum is the Maximum A Posteriori (MAP) estimate,
$$ x_{\rm MAP} = \arg\min\Big\{\frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac12\|D^{-1/2}Lx\|^2\Big\}, $$
or, equivalently, the least squares solution of the system
$$ \begin{bmatrix} \frac{1}{\sigma}A \\[1mm] D^{-1/2}L \end{bmatrix} x = \begin{bmatrix} \frac{1}{\sigma}b \\[1mm] 0 \end{bmatrix}. $$

Gaussian posterior density

Posterior mean and covariance: $\pi_{\rm post} = \mathcal N(x_0, \Gamma)$, where
$$ \Gamma = \Big(\frac{1}{\sigma^2}A^TA + L^TD^{-1}L\Big)^{-1}, \qquad x_0 = \Gamma\frac{1}{\sigma^2}A^Tb = \Big(\frac{1}{\sigma^2}A^TA + L^TD^{-1}L\Big)^{-1}\frac{1}{\sigma^2}A^Tb = x_{\rm MAP}. $$
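
In code, the non-stationary prior only changes the row weighting of L. A sketch continuing the setup above, with a variance vector matching the slide (the interval and values are the slide's; everything else is illustrative):

```python
import numpy as np

# Componentwise variances: large where steep behaviour is allowed, small elsewhere.
theta_vec = np.where((s > 0.4) & (s < 0.6), 25.0 / n**2, 1.0 / (400 * n**2))
D_inv_sqrt = np.diag(1.0 / np.sqrt(theta_vec))   # D^{-1/2}

K = np.vstack([A / sigma, D_inv_sqrt @ L])       # [A/sigma; D^{-1/2} L]
rhs = np.concatenate([b / sigma, np.zeros(n)])
x_map, *_ = np.linalg.lstsq(K, rhs, rcond=None)

# Posterior covariance with the weighted prior.
Gamma = np.linalg.inv(A.T @ A / sigma**2 + L.T @ np.diag(1.0 / theta_vec) @ L)
```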

Test 2 continued: Use the prior based on the non-stationary ARMA model. Try different values for the variance in the interval. Observe the effect on the posterior covariance envelopes.

Test 3: Go back to the stationary ARMA model, and assume this time that
$$ f(t) = \begin{cases} 1, & \tau_1 < t < \tau_2, \\ 0, & \text{else,} \end{cases} $$
where $\tau_1 = 5/16$, $\tau_2 = 9/16$. Observe that the prior is badly in conflict with reality. Notice that the posterior envelopes are very tight, but the true signal is not within them. Is it wrong to say: "The p% credibility envelopes contain the true signal with probability p%"?

Smoothness prior and discontinuities

Assume that the prior information is:

Prior information: The signal is almost constant (the slope of the order 0.1, say), but it suffers a jump of order one somewhere around $t_{50} = 5/16$ and $t_{90} = 9/16$.

Build an ARMA model based on this information:
$$ \theta_j = \frac{1}{400n^2}, \quad \text{except around } j = 50 \text{ and } j = 90, $$
and adjust the variances around the jumps to correspond to the jump amplitude, e.g., $\theta_j = \frac14$, corresponding to standard deviation $\sqrt{\theta_j} = 0.5$. Notice the large posterior variance around the jumps!

Prior based on incorrect information

Note: I avoid the term "incorrect prior": the prior is what we believe a priori; it is not right or wrong, but the evidence may turn out to be against it or to support it. In simulations, it is easy to judge a prior incorrect, because we control the truth.

Test 4: What happens if we have incorrect information concerning the jumps? We believe that there is a third jump, which does not exist in the input, or the prior information concerning the jump locations is completely wrong.

A step up: unknown variance

Going back to the original problem, but with less prior information:

Prior information: The signal $f$ varies rather slowly, and vanishes at $t = 0$, but no particular information about the slope is available.

As before, start with the initial ARMA model,
$$ X_j = X_{j-1} + \sqrt{\theta}\,W_j, \qquad W_j \sim \mathcal N(0,1), \quad X_0 = 0, $$
or, in matrix form,
$$ LX = \sqrt{\theta}\,W, \qquad W \sim \mathcal N(0, I). $$
As before, we write the prior, pretending that we knew the variance,
$$ \pi_{\rm prior}(x \mid \theta) = C\exp\Big(-\frac{1}{2\theta}\|Lx\|^2\Big). $$

Normalizing constant: the integral of the density has to be one,
$$ 1 = C(\theta)\int_{\mathbb R^n}\exp\Big(-\frac{1}{2\theta}\|Lx\|^2\Big)\,dx, $$
and with the change of variables $x = \sqrt{\theta}\,z$, $dx = \theta^{n/2}dz$, we obtain
$$ 1 = \theta^{n/2}C(\theta)\underbrace{\int_{\mathbb R^n}\exp\Big(-\frac12\|Lz\|^2\Big)\,dz}_{\text{independent of }\theta}, $$
so we deduce that
$$ C(\theta) \propto \theta^{-n/2}, $$
and we may write
$$ \pi_{\rm prior}(x \mid \theta) \propto \exp\Big(-\frac{1}{2\theta}\|Lx\|^2 - \frac{n}{2}\log\theta\Big). $$

Stochastic extension

Since $\theta$ is not known, it is treated as a random variable $\Theta$. Any information concerning $\Theta$ is then coded into a prior probability density called the hyperprior, $\pi_{\rm hyper}(\theta)$.

The inverse problem is to infer on the pair of unknowns, $(X, \Theta)$. The joint prior density is
$$ \pi_{\rm prior}(x, \theta) = \pi_{\rm prior}(x \mid \theta)\,\pi_{\rm hyper}(\theta). $$
The posterior density of $(X, \Theta)$ is, by Bayes' formula,
$$ \pi_{\rm post}(x, \theta) \propto \pi_{\rm prior}(x, \theta)\,\pi(b \mid x) = \pi_{\rm prior}(x \mid \theta)\,\pi_{\rm hyper}(\theta)\,\pi(b \mid x). $$

Here, we assume that the only prior information concerning $\Theta$ is its positivity,
$$ \pi_{\rm hyper}(\theta) = \pi_+(\theta) = \begin{cases} 1, & \theta > 0, \\ 0, & \theta \le 0. \end{cases} $$
Note: this is, in fact, an improper density, since it is not integrable. In practice, we assume an upper bound that we hope will never play a role.

The posterior density is now
$$ \pi_{\rm post}(x, \theta) \propto \pi_+(\theta)\exp\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2 - \frac{1}{2\theta}\|Lx\|^2 - \frac{n}{2}\log\theta\Big). $$

Sequential optimization: iterative algorithm for the MAP estimate.

1. Initialize $\theta_0 > 0$, set $k = 1$.
2. Update $x$:
   $$ x_k = \arg\max\{\pi_{\rm post}(x \mid \theta_{k-1})\}. $$
3. Update $\theta$:
   $$ \theta_k = \arg\max\{\pi_{\rm post}(\theta \mid x_k)\}. $$
4. Increase $k$ by one and repeat from 2. until convergence.

Notice: $\pi_{\rm post}(x \mid \theta)$ means that we simply fix $\theta$.

Updating steps in practice:

Updating $x$: since $\theta = \theta_{k-1}$ is fixed, we simply have
$$ x_k = \arg\min\Big\{\frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac{1}{2\theta_{k-1}}\|Lx\|^2\Big\}, $$
which is the Good Old Least Squares Solution (GOLSQ).

Updating $\theta$: the likelihood does not depend on $\theta$, and so we have
$$ \theta_k = \arg\min\Big\{\frac{1}{2\theta}\|Lx_k\|^2 + \frac{n}{2}\log\theta\Big\}, $$
which is the zero of the derivative of the objective function,
$$ -\frac{1}{2\theta^2}\|Lx_k\|^2 + \frac{n}{2\theta} = 0, \qquad \theta_k = \frac{\|Lx_k\|^2}{n}. $$
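
A sketch of this alternating scheme, continuing the earlier setup (the initial value and the stopping tolerance are assumptions):

```python
import numpy as np

theta_k = 1.0                                     # assumed initial guess theta_0
for k in range(50):
    # x-update: least squares with theta fixed at its current value.
    K = np.vstack([A / sigma, L / np.sqrt(theta_k)])
    rhs = np.concatenate([b / sigma, np.zeros(n)])
    x_k, *_ = np.linalg.lstsq(K, rhs, rcond=None)

    # theta-update: zero of the derivative, theta = ||L x||^2 / n.
    theta_new = np.linalg.norm(L @ x_k)**2 / n
    if abs(theta_new - theta_k) < 1e-10 * theta_k:
        break
    theta_k = theta_new
```

At the fixed point, θ equals the mean squared increment of the reconstruction, so the prior variance is calibrated by the data rather than chosen in advance.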

Undetermined variances

Qualitative description of the problem: given a noisy indirect observation, recover a signal or an image that varies slowly except for an unknown number of jumps of unknown size and location.

Information concerning the jumps: the jumps should be sudden, suggesting that the variances should be mutually independent. There is no obvious preference of one location over others, therefore the components should be identically distributed. Only a few variances can be significantly large, while most of them should be small, suggesting a hyperprior that allows rare outliers.

Candidate probability densities: Gamma distribution, $\theta_j \sim \mathrm{Gamma}(\alpha, \theta_0)$,
$$ \pi_{\rm hyper}(\theta) \propto \prod_{j=1}^{n}\theta_j^{\alpha-1}\exp\Big(-\frac{\theta_j}{\theta_0}\Big), \qquad (1) $$
or inverse Gamma distribution, $\theta_j \sim \mathrm{InvGamma}(\alpha, \theta_0)$,
$$ \pi_{\rm hyper}(\theta) \propto \prod_{j=1}^{n}\theta_j^{-\alpha-1}\exp\Big(-\frac{\theta_0}{\theta_j}\Big). \qquad (2) $$

Random draws

[Figure: random draws.]

Estimating the MAP

Outline of the algorithm:

1. Initialize $\theta = \theta^0$, set $k = 1$.
2. Update the estimate of the increments: $x^k = \arg\max\,\pi(x, \theta^{k-1} \mid b)$.
3. Update the estimate of the variances $\theta$: $\theta^k = \arg\max\,\pi(x^k, \theta \mid b)$.
4. Increase $k$ by one and repeat from 2. until convergence.

In practice, minimize
$$ F(x, \theta \mid b) = -\log\big(\pi(x, \theta \mid b)\big), $$
which becomes, up to additive constants, when the hyperprior is the Gamma distribution,
$$ F(x, \theta \mid b) = \frac{1}{2\sigma^2}\|Ax - b\|^2 + \frac12\|D^{-1/2}Lx\|^2 + \sum_{j=1}^{n}\frac{\theta_j}{\theta_0} - \Big(\alpha - \frac32\Big)\sum_{j=1}^{n}\log\theta_j, $$
and, when the hyperprior is the inverse Gamma distribution,
$$ F(x, \theta \mid b) = \frac{1}{2\sigma^2}\|Ax - b\|^2 + \frac12\|D^{-1/2}Lx\|^2 + \theta_0\sum_{j=1}^{n}\frac{1}{\theta_j} + \Big(\alpha + \frac32\Big)\sum_{j=1}^{n}\log\theta_j, $$
where $D = D_\theta = \mathrm{diag}(\theta)$.

1. Updating $x$:
$$ x^k = \arg\min\Big(\frac{1}{2\sigma^2}\|Ax - b\|^2 + \frac12\|D^{-1/2}Lx\|^2\Big), \qquad D = D_{\theta^{k-1}}, $$
that is, $x^k$ is the least squares solution of the system
$$ \begin{bmatrix} (1/\sigma)A \\[1mm] D^{-1/2}L \end{bmatrix} x = \begin{bmatrix} (1/\sigma)b \\[1mm] 0 \end{bmatrix}. $$

2. Updating $\theta$ (Gamma hyperprior): $\theta_j^k$ satisfies
$$ \frac{\partial}{\partial\theta_j}F(x^k, \theta) = -\frac12\Big(\frac{z_j^k}{\theta_j}\Big)^2 + \frac{1}{\theta_0} - \Big(\alpha - \frac32\Big)\frac{1}{\theta_j} = 0, \qquad z^k = Lx^k, $$
which has an explicit solution,
$$ \theta_j^k = \theta_0\left(\eta + \sqrt{\frac{(z_j^k)^2}{2\theta_0} + \eta^2}\right), \qquad \eta = \frac12\Big(\alpha - \frac32\Big). $$
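
A sketch of the resulting iteration with the Gamma hyperprior, continuing the earlier setup (the hyperparameter values alpha and theta0 below are assumptions, chosen only to make the sketch run):

```python
import numpy as np

alpha, theta0 = 2.0, 1e-4                         # assumed hyperparameters
eta = 0.5 * (alpha - 1.5)                         # eta = (alpha - 3/2) / 2

theta_vec = np.full(n, theta0)                    # initial componentwise variances
for k in range(30):
    # x-update: least squares with D = diag(theta) fixed.
    K = np.vstack([A / sigma, L / np.sqrt(theta_vec)[:, None]])
    rhs = np.concatenate([b / sigma, np.zeros(n)])
    x_k, *_ = np.linalg.lstsq(K, rhs, rcond=None)

    # theta-update: explicit componentwise formula for the Gamma hyperprior.
    z = L @ x_k
    theta_vec = theta0 * (eta + np.sqrt(z**2 / (2 * theta0) + eta**2))
```

Large increments z_j inflate the corresponding variances, so the smoothing is relaxed exactly where the data demand a jump, while components with small increments are pulled back towards 2ηθ₀.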

Computed example: signal and data

Gaussian blur, noise level 2%. [Figure.]

MAP estimate, Gamma hyperprior

[Figure: panels labeled "Iteration 5", plotted against t.]

MAP estimate, Gamma hyperprior

[Figure.]

MAP estimate, Inverse Gamma hyperprior

[Figure: panels labeled "Iteration 5", plotted against t.]

MAP estimate, Inverse Gamma hyperprior

[Figure.]
