An example of Bayesian reasoning

Consider the one-dimensional deconvolution problem with various degrees of prior information.
Model:
$$ g(t) = \int a(t-s)\, f(s)\, ds + e(t), \qquad a(t) \to 0 \ \text{as}\ |t| \to \infty \ \text{(rapidly)}. $$
The problem, even without discretization errors, is ill-posed.
Analytic solution

Noiseless model:
$$ g(t) = \int a(t-s)\, f(s)\, ds. $$
Apply the Fourier transform,
$$ \hat g(k) = \int e^{-ikt} g(t)\, dt. $$
The Convolution Theorem implies
$$ \hat g(k) = \hat a(k)\, \hat f(k), $$
so, by the inverse Fourier transform,
$$ f(t) = \frac{1}{2\pi} \int e^{itk}\, \frac{\hat g(k)}{\hat a(k)}\, dk. $$
Example: Gaussian kernel, noisy data:
$$ a(t) = \frac{1}{\sqrt{2\pi\omega^2}}\exp\!\Big(-\frac{t^2}{2\omega^2}\Big). $$
In the frequency domain,
$$ \hat a(k) = \exp\!\Big(-\frac{\omega^2 k^2}{2}\Big). $$
If $\hat g(k) = \hat a(k)\hat f(k) + \hat e(k)$, the estimate $f_{\rm est}(t)$ of $f$ based on the Convolution Theorem is
$$ f_{\rm est}(t) = f(t) + \frac{1}{2\pi}\int \hat e(k)\,\exp\!\Big(\frac{\omega^2 k^2}{2} + ikt\Big)\, dk, $$
which may not even be well defined unless $\hat e(k)$ drops off superexponentially at high frequencies.
Finite sampling, finite interval: discretized problem
$$ g(t_j) = \int a(t_j - s)\, f(s)\, ds + e_j, \qquad 1 \le j \le m. $$
Discrete approximation of the integral by a quadrature rule: use the piecewise constant approximation with uniformly distributed sampling,
$$ \int a(t_j - s) f(s)\, ds \approx \frac{1}{n}\sum_{k=1}^{n} a(t_j - s_k) f(s_k) = \sum_{k=1}^{n} a_{jk} x_k, $$
where $s_k = k/n$, $x_k = f(s_k)$, and $a_{jk} = \frac{1}{n}\, a(t_j - s_k)$.
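A minimal MATLAB sketch of this discretization (this is not the attached Deconvloution.m; the grid size and kernel width below are assumed values, chosen to be consistent with the example discussed later):

```matlab
% Sketch: discretized Gaussian convolution matrix A on [0,1].
% n and omega are assumed values, not read verbatim from the slides.
n     = 160;                              % number of discretization points
omega = 0.05;                             % kernel width
s     = (1:n)'/n;                         % quadrature nodes s_k = k/n
t     = s;                                % sampling points t_j (same grid)
kernel = @(u) exp(-u.^2/(2*omega^2)) / sqrt(2*pi*omega^2);
A = (1/n) * kernel(t - s');               % a_jk = (1/n) a(t_j - s_k)
```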
Discrete equation; additive Gaussian noise and likelihood

$$ b = Ax + e, \qquad b_j = g(t_j). $$
Stochastic extension: $X$, $E$ and $B$ are random variables,
$$ B = AX + E. $$
Assume that $E$ is Gaussian white noise with variance $\sigma^2$,
$$ E \sim \mathcal N(0, \sigma^2 I), \quad\text{or}\quad \pi_{\rm noise}(e) \propto \exp\!\Big(-\frac{1}{2\sigma^2}\|e\|^2\Big), $$
where $\sigma^2$ is assumed known. Likelihood density:
$$ \pi(b\mid x) \propto \exp\!\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2\Big). $$
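Continuing the sketch above, synthetic data from this model can be generated as follows; the true signal and noise level here are placeholder assumptions for illustration only:

```matlab
% Sketch: synthetic data b = A*x_true + e with Gaussian white noise.
% x_true and sigma are placeholder assumptions.
x_true = sin(pi*s).^2;                    % placeholder smooth true signal
sigma  = 0.01;                            % noise standard deviation
b = A*x_true + sigma*randn(n,1);
```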
Prior according to the information available

Prior information: The signal $f$ varies rather slowly, the slope being not more than of the order one, and vanishes at $t = 0$.

Write an autoregressive Markov model (ARMA):
$$ X_j = X_{j-1} + \sqrt{\theta}\, W_j, \qquad W_j \sim \mathcal N(0,1), \quad X_0 = 0. $$
How do we choose the variance $\theta$ of the innovation process? Use the slope information: the slope is
$$ S_j = n(X_j - X_{j-1}) = n\sqrt{\theta}\, W_j, \qquad P\{\, |S_j| < 2n\sqrt{\theta}\,\} \approx 0.95, $$
$2n\sqrt{\theta}$ being two standard deviations. A reasonable choice would then be $2n\sqrt{\theta} = 1$, or
$$ \theta = \frac{1}{4n^2}. $$
The ARMA model in matrix form: $X_j - X_{j-1} = \sqrt{\theta}\, W_j$, $1 \le j \le n$, gives
$$ LX = \sqrt{\theta}\, W, \qquad W \sim \mathcal N(0, I), \qquad
L = \begin{bmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{bmatrix}. $$
Prior density: $(1/\sqrt{\theta})\, LX = W$, so
$$ \pi_{\rm prior}(x) \propto \exp\!\Big(-\frac{1}{2\theta}\|Lx\|^2\Big), \qquad \theta = \frac{1}{4n^2}. $$
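In code (continuing the script above), the prior amounts to the first-order difference matrix and the scalar variance implied by the slope bound:

```matlab
% Sketch: first-order difference matrix L and innovation variance theta.
L = eye(n) - diag(ones(n-1,1), -1);       % (Lx)_j = x_j - x_{j-1}, with x_0 = 0
theta = 1/(4*n^2);                        % from the slope bound 2*n*sqrt(theta) = 1
```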
Posterior density

From Bayes' formula, the posterior density is
$$ \pi_{\rm post}(x) = \pi(x\mid b) \propto \pi_{\rm prior}(x)\,\pi(b\mid x)
\propto \exp\!\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2 - \frac{1}{2\theta}\|Lx\|^2\Big). $$
This is a Gaussian density, whose maximum is the Maximum A Posteriori (MAP) estimate,
$$ x_{\rm MAP} = \arg\min_x \Big\{ \frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac{1}{2\theta}\|Lx\|^2 \Big\}, $$
or, equivalently, the least squares solution of the system
$$ \begin{bmatrix} \sigma^{-1} A \\ \theta^{-1/2} L \end{bmatrix} x
 = \begin{bmatrix} \sigma^{-1} b \\ 0 \end{bmatrix}. $$
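Continuing the sketch, this stacked least squares problem is solved directly with the backslash operator:

```matlab
% Sketch: MAP estimate as the stacked least squares solution.
K     = [A/sigma; L/sqrt(theta)];
rhs   = [b/sigma; zeros(n,1)];
x_map = K \ rhs;                          % least squares solution
```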
Posterior covariance: by completing the square,
$$ \frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac{1}{2\theta}\|Lx\|^2
= \frac12\Big( x^{\mathsf T}\underbrace{\Big(\tfrac{1}{\sigma^2}A^{\mathsf T}A + \tfrac{1}{\theta}L^{\mathsf T}L\Big)}_{=\,\Gamma^{-1}} x
- 2\,x^{\mathsf T}\tfrac{1}{\sigma^2}A^{\mathsf T}b + \tfrac{1}{\sigma^2}\|b\|^2 \Big) $$
$$ = \frac12\Big(x - \Gamma\,\tfrac{1}{\sigma^2}A^{\mathsf T}b\Big)^{\mathsf T}\Gamma^{-1}\Big(x - \Gamma\,\tfrac{1}{\sigma^2}A^{\mathsf T}b\Big) + \text{terms independent of } x. $$
Denote
$$ \overline x = \Gamma\,\frac{1}{\sigma^2}A^{\mathsf T}b
 = \Big(\frac{1}{\sigma^2}A^{\mathsf T}A + \frac{1}{\theta}L^{\mathsf T}L\Big)^{-1}\frac{1}{\sigma^2}A^{\mathsf T}b. $$
The posterior density is
$$ \pi_{\rm post}(x) \propto \exp\!\Big(-\frac12 (x - \overline x)^{\mathsf T}\Gamma^{-1}(x - \overline x)\Big), $$
or in other words, $\pi_{\rm post} = \mathcal N(\overline x, \Gamma)$, with
$$ \Gamma = \Big(\frac{1}{\sigma^2}A^{\mathsf T}A + \frac{1}{\theta}L^{\mathsf T}L\Big)^{-1}, \qquad
\overline x = \Gamma\,\frac{1}{\sigma^2}A^{\mathsf T}b
 = \Big(\frac{1}{\sigma^2}A^{\mathsf T}A + \frac{1}{\theta}L^{\mathsf T}L\Big)^{-1}\frac{1}{\sigma^2}A^{\mathsf T}b = x_{\rm MAP}. $$
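At this problem size the posterior covariance can simply be formed explicitly (a sketch; for large $n$ one would avoid the explicit inverse):

```matlab
% Sketch: explicit posterior covariance and mean.
Gamma = inv(A'*A/sigma^2 + L'*L/theta);
x_bar = Gamma * (A'*b)/sigma^2;           % posterior mean = MAP estimate
```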
Example

Gaussian convolution kernel,
$$ a(t) = \frac{1}{\sqrt{2\pi\omega^2}}\exp\!\Big(-\frac{t^2}{2\omega^2}\Big), \qquad \omega = 0.05. $$
Number of discretization points: $n = 160$. See the attached Matlab file Deconvloution.m.

Test 1: Input function a triangular pulse,
$$ f(t) = \begin{cases} \alpha t, & t \le t_0, \\ \beta(1 - t), & t > t_0, \end{cases} \qquad t_0 = 0.7, $$
where $\alpha = 0.6$ and $\beta = \alpha t_0/(1 - t_0) = 1.4 > 1$. Hence, the prior information about the slopes is slightly too conservative.
Posterior credibility envelope

Plotting the MAP estimate and the 95% credibility envelope: if $X$ is distributed according to the Gaussian posterior density, $\operatorname{var}(X_j) = \Gamma_{jj}$. With 95% a posteriori probability,
$$ (\overline x)_j - 2\sqrt{\Gamma_{jj}} < X_j < (\overline x)_j + 2\sqrt{\Gamma_{jj}}. $$
Try the algorithm with various values of $\theta$.
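A sketch of how the envelope could be plotted from the covariance computed above:

```matlab
% Sketch: MAP estimate with the 95% posterior credibility envelope.
post_std = sqrt(diag(Gamma));
plot(t, x_bar, 'b-', t, x_bar + 2*post_std, 'r--', t, x_bar - 2*post_std, 'r--');
hold on; plot(t, x_true, 'k-'); hold off
legend('posterior mean / MAP', 'upper 95%', 'lower 95%', 'true signal');
```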
Test 2: Input function whose slope is locally larger than one:
$$ f(t) = 10\,(t - 0.5)\exp\!\Big(-\frac{(t-0.5)^2}{2\delta^2}\Big), \qquad \delta^2 = 0.01. $$
The slope in the interval $[0.4, 0.6]$ is much larger than one, while outside the interval it is very small. The algorithm does not perform well. Assume that we, from some source, have the following prior information:

Prior information: The signal is almost constant (the slope of the order 0.1, say) outside the interval $[0.4, 0.6]$, while within this interval, the slope can be of the order 10.
Non-stationary ARMA model

Write an autoregressive Markov model (ARMA):
$$ X_j = X_{j-1} + \sqrt{\theta_j}\, W_j, \qquad W_j \sim \mathcal N(0,1), \quad X_0 = 0, $$
i.e., the variance of the innovation process varies. How to choose the variances? Slope information: the slope is
$$ S_j = n(X_j - X_{j-1}) = n\sqrt{\theta_j}\, W_j, \qquad P\{\,|S_j| < 2n\sqrt{\theta_j}\,\} \approx 0.95, $$
$2n\sqrt{\theta_j}$ being two standard deviations. A reasonable choice would then be
$$ 2n\sqrt{\theta_j} = \begin{cases} 10, & 0.4 < t_j < 0.6, \\ 0.1, & \text{else}. \end{cases} $$
Define the variance vector $\theta \in \mathbb R^n$,
$$ \theta_j = \begin{cases} \dfrac{25}{n^2}, & 0.4 < t_j < 0.6, \\[4pt] \dfrac{1}{400\, n^2}, & \text{else}. \end{cases} $$
The ARMA model in matrix form:
$$ D^{-1/2} L X = W \sim \mathcal N(0, I), \qquad D = \operatorname{diag}(\theta). $$
This leads to the prior model
$$ \pi_{\rm prior}(x) \propto \exp\!\Big(-\frac12\|D^{-1/2} L x\|^2\Big). $$
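A sketch of the corresponding computation with componentwise variances (continuing the script; the numerical values follow the choices above):

```matlab
% Sketch: non-stationary prior variances and the weighted MAP solve.
theta_vec = ones(n,1)/(400*n^2);          % slope of order 0.1 outside
theta_vec(t > 0.4 & t < 0.6) = 25/n^2;    % slope of order 10 inside (0.4, 0.6)
Dinvhalf  = diag(1./sqrt(theta_vec));     % D^{-1/2}
x_map_ns  = [A/sigma; Dinvhalf*L] \ [b/sigma; zeros(n,1)];
```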
Posterior density

From Bayes' formula, the posterior density is
$$ \pi_{\rm post}(x) = \pi(x\mid b) \propto \pi_{\rm prior}(x)\,\pi(b\mid x)
\propto \exp\!\Big(-\frac{1}{2\sigma^2}\|b - Ax\|^2 - \frac12\|D^{-1/2}Lx\|^2\Big). $$
This is a Gaussian density, whose maximum is the Maximum A Posteriori (MAP) estimate,
$$ x_{\rm MAP} = \arg\min_x\Big\{\frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac12\|D^{-1/2}Lx\|^2\Big\}, $$
or, equivalently, the least squares solution of the system
$$ \begin{bmatrix} \sigma^{-1}A \\ D^{-1/2}L \end{bmatrix} x = \begin{bmatrix} \sigma^{-1}b \\ 0 \end{bmatrix}. $$
Gaussian posterior density

Posterior mean and covariance: $\pi_{\rm post} = \mathcal N(\overline x, \Gamma)$, where
$$ \Gamma = \Big(\frac{1}{\sigma^2}A^{\mathsf T}A + L^{\mathsf T}D^{-1}L\Big)^{-1}, \qquad
\overline x = \Gamma\,\frac{1}{\sigma^2}A^{\mathsf T}b
 = \Big(\frac{1}{\sigma^2}A^{\mathsf T}A + L^{\mathsf T}D^{-1}L\Big)^{-1}\frac{1}{\sigma^2}A^{\mathsf T}b = x_{\rm MAP}. $$
Test 2 continued: Use the prior based on the non-stationary ARMA model. Try different values for the variance in the interval. Observe the effect on the posterior covariance envelopes.

Test 3: Go back to the stationary ARMA model, and assume this time that
$$ f(t) = \begin{cases} 1, & \tau_1 < t < \tau_2, \\ 0, & \text{else}, \end{cases} $$
where $\tau_1 = 5/16$, $\tau_2 = 9/16$. Observe that the prior is badly in conflict with the reality. Notice that the posterior envelopes are very tight, but the true signal is not within them. Is it wrong to say: "The $p\%$ credibility envelopes contain the true signal with probability $p\%$"?
Smoothness prior and discontinuities

Assume that the prior information is:

Prior information: The signal is almost constant (the slope of the order 0.1, say), but it suffers a jump of order one somewhere around $t = 5/16$ and $t = 9/16$.

Build an ARMA model based on this information: $\theta_j = \frac{1}{400 n^2}$, except around the grid points closest to the jump locations, and adjust the variances around the jumps to correspond to the jump amplitude, e.g., $\theta_j = \frac14$, corresponding to standard deviation $\sqrt{\theta_j} = 0.5$. Notice the large posterior variance around the jumps!
Prior based on incorrect information

Note: I avoid the term "incorrect prior": the prior is what we believe a priori; it is not right or wrong, but evidence may turn out to be against it or to support it. In simulations, it is easy to judge a prior incorrect, because we control the truth.

Test 4: What happens if we have incorrect information concerning the jumps? We believe that there is a third jump, which does not exist in the input; or the prior information concerning the jump locations is completely wrong.
A step up: unknown variance

Going back to the original problem, but with less prior information:

Prior information: The signal $f$ varies rather slowly, and vanishes at $t = 0$, but no particular information about the slope is available.

As before, start with the initial ARMA model:
$$ X_j = X_{j-1} + \sqrt{\theta}\, W_j, \qquad W_j \sim \mathcal N(0,1), \quad X_0 = 0, $$
or, in matrix form,
$$ LX = \sqrt{\theta}\, W, \qquad W \sim \mathcal N(0, I). $$
As before, we write the prior, pretending that we knew the variance,
$$ \pi_{\rm prior}(x\mid\theta) = C \exp\!\Big(-\frac{1}{2\theta}\|Lx\|^2\Big). $$
Normalizing constant: the integral of the density has to be one,
$$ 1 = C(\theta)\int_{\mathbb R^n}\exp\!\Big(-\frac{1}{2\theta}\|Lx\|^2\Big)\,dx, $$
and with the change of variables $x = \sqrt{\theta}\, z$, $dx = \theta^{n/2}\,dz$, we obtain
$$ 1 = \theta^{n/2} C(\theta)\underbrace{\int_{\mathbb R^n}\exp\!\Big(-\frac12\|Lz\|^2\Big)\,dz}_{\text{independent of }\theta}, $$
so we deduce that $C(\theta)\propto\theta^{-n/2}$, and we may write
$$ \pi_{\rm prior}(x\mid\theta)\propto\exp\!\Big(-\frac{1}{2\theta}\|Lx\|^2-\frac n2\log\theta\Big). $$
Stochastic extension

Since $\theta$ is not known, it will be treated as a random variable $\Theta$. Any information concerning $\Theta$ is then coded into a prior probability density called the hyperprior, $\pi_{\rm hyper}(\theta)$. The inverse problem is to infer on the pair of unknowns $(X, \Theta)$. The joint prior density is
$$ \pi_{\rm prior}(x, \theta) = \pi_{\rm prior}(x\mid\theta)\,\pi_{\rm hyper}(\theta). $$
The posterior density of $(X, \Theta)$ is, by Bayes' formula,
$$ \pi_{\rm post}(x,\theta)\propto\pi_{\rm prior}(x,\theta)\,\pi(b\mid x)=\pi_{\rm prior}(x\mid\theta)\,\pi_{\rm hyper}(\theta)\,\pi(b\mid x). $$
Here, we assume that the only prior information concerning $\Theta$ is its positivity,
$$ \pi_{\rm hyper}(\theta)=\pi_+(\theta)=\begin{cases}1, & \theta>0,\\ 0, & \theta\le 0.\end{cases} $$
Note: this is, in fact, an improper density, since it is not integrable. In practice, we assume an upper bound that we hope will never play a role. The posterior density is now
$$ \pi_{\rm post}(x,\theta)\propto\pi_+(\theta)\exp\!\Big(-\frac{1}{2\sigma^2}\|b-Ax\|^2-\frac{1}{2\theta}\|Lx\|^2-\frac n2\log\theta\Big). $$
Sequential optimization: iterative algorithm for the MAP estimate.

1. Initialize $\theta = \theta_0 > 0$, $k = 1$.
2. Update $x$: $x^k = \arg\max\{\pi_{\rm post}(x\mid\theta^{k-1})\}$.
3. Update $\theta$: $\theta^k = \arg\max\{\pi_{\rm post}(\theta\mid x^k)\}$.
4. Increase $k$ by one and repeat from 2 until convergence.

Notice: $\pi_{\rm post}(x\mid\theta)$ means that we simply fix $\theta$.
Updating steps in practice:

Updating $x$: since $\theta = \theta^{k-1}$ is fixed, we simply have
$$ x^k = \arg\min_x\Big\{\frac{1}{2\sigma^2}\|b - Ax\|^2 + \frac{1}{2\theta^{k-1}}\|Lx\|^2\Big\}, $$
which is the Good Old Least Squares Solution (GOLSQ).

Updating $\theta$: the likelihood does not depend on $\theta$, and so we have
$$ \theta^k = \arg\min_\theta\Big\{\frac{1}{2\theta}\|Lx^k\|^2 + \frac n2\log\theta\Big\}, $$
which is the zero of the derivative of the objective function,
$$ -\frac{1}{2\theta^2}\|Lx^k\|^2 + \frac{n}{2\theta} = 0, \qquad \theta^k = \frac{\|Lx^k\|^2}{n}. $$
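A compact sketch of this alternation (continuing the script above; the initial value and iteration count are assumptions):

```matlab
% Sketch: alternating updates of x and the scalar variance theta.
theta_k = 1;                              % initial guess theta_0 > 0
for k = 1:20
    x_k = [A/sigma; L/sqrt(theta_k)] \ [b/sigma; zeros(n,1)];   % x-update
    theta_k = norm(L*x_k)^2 / n;                                % theta-update
end
```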
Undetermined variances

Qualitative description of the problem: given a noisy indirect observation, recover a signal or an image that varies slowly except for an unknown number of jumps of unknown size and location.

Information concerning the jumps:
- The jumps should be sudden, suggesting that the variances should be mutually independent.
- There is no obvious preference of one location over others, therefore the components should be identically distributed.
- Only a few variances can be significantly large, while most of them should be small, suggesting a hyperprior that allows rare outliers.
Candidate probability densities: Gamma distribution, $\theta_j \sim \mathrm{Gamma}(\alpha, \theta_0)$,
$$ \pi_{\rm hyper}(\theta)\propto\prod_{j=1}^n\theta_j^{\alpha-1}\exp\!\Big(-\frac{\theta_j}{\theta_0}\Big), \tag{1} $$
and inverse Gamma distribution, $\theta_j\sim\mathrm{InvGamma}(\alpha,\theta_0)$,
$$ \pi_{\rm hyper}(\theta)\propto\prod_{j=1}^n\theta_j^{-\alpha-1}\exp\!\Big(-\frac{\theta_0}{\theta_j}\Big). \tag{2} $$
Random draws

[Figure: random draws illustrating the two candidate hyperprior models.]
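Such draws can be generated as follows (a sketch; gamrnd requires the Statistics Toolbox, and the values of alpha and theta0 below are illustrative assumptions, not taken from the slides):

```matlab
% Sketch: random draws of the variance vector from the two hyperpriors.
alpha  = 2;  theta0 = 1e-4;               % illustrative hyperparameters
gamma_draw    = gamrnd(alpha, theta0, n, 1);      % theta_j ~ Gamma(alpha, theta0)
invgamma_draw = theta0 ./ gamrnd(alpha, 1, n, 1); % theta_j ~ InvGamma(alpha, theta0)
```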
Estimating the MAP

Outline of the algorithm is as follows:

1. Initialize $\theta = \theta_0$, $k = 1$.
2. Update the estimate of the increments, i.e., of $x$: $x^k = \arg\max_x \pi(x, \theta^{k-1}\mid b)$.
3. Update the estimate of the variances $\theta$: $\theta^k = \arg\max_\theta \pi(x^k, \theta\mid b)$.
4. Increase $k$ by one and repeat from 2 until convergence.
In practice, minimize
$$ F(x,\theta\mid b) = -\log\pi(x,\theta\mid b), $$
which becomes, up to additive constants and when the hyperprior is the Gamma distribution,
$$ F(x,\theta\mid b) = \frac{1}{2\sigma^2}\|Ax-b\|^2 + \frac12\|D^{-1/2}Lx\|^2 + \sum_{j=1}^n\frac{\theta_j}{\theta_0} - \Big(\alpha-\frac32\Big)\sum_{j=1}^n\log\theta_j, $$
and, when the hyperprior is the inverse Gamma distribution,
$$ F(x,\theta\mid b) = \frac{1}{2\sigma^2}\|Ax-b\|^2 + \frac12\|D^{-1/2}Lx\|^2 + \sum_{j=1}^n\frac{\theta_0}{\theta_j} + \Big(\alpha+\frac32\Big)\sum_{j=1}^n\log\theta_j, $$
where $D = D_\theta = \operatorname{diag}(\theta)$.
1. Updating $x$:
$$ x^k = \arg\min_x\Big(\frac{1}{2\sigma^2}\|Ax-b\|^2 + \frac12\|D^{-1/2}Lx\|^2\Big), \qquad D = D_{\theta^{k-1}}, $$
that is, $x^k$ is the least squares solution of the system
$$ \begin{bmatrix}(1/\sigma)A\\ D^{-1/2}L\end{bmatrix}x = \begin{bmatrix}(1/\sigma)b\\ 0\end{bmatrix}. $$
2. Updating $\theta$ (for the Gamma hyperprior): $\theta_j^k$ satisfies
$$ \frac{\partial}{\partial\theta_j}F(x^k,\theta) = -\frac12\Big(\frac{z_j^k}{\theta_j}\Big)^2 + \frac{1}{\theta_0} - \Big(\alpha-\frac32\Big)\frac{1}{\theta_j} = 0, \qquad z^k = Lx^k, $$
which has the explicit solution
$$ \theta_j^k = \theta_0\Bigg(\eta + \sqrt{\eta^2 + \frac{(z_j^k)^2}{2\theta_0}}\Bigg), \qquad \eta = \frac12\Big(\alpha - \frac32\Big). $$
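Putting the two updates together gives the following sketch of the iteration with the Gamma hyperprior (continuing the script above; alpha, theta0 and the iteration count are illustrative assumptions):

```matlab
% Sketch: alternating MAP iteration with componentwise variances,
% Gamma hyperprior; hyperparameter values are assumptions.
alpha = 2;  theta0 = 1e-4;
eta   = (alpha - 3/2)/2;
theta_vec = theta0*ones(n,1);             % initialization
for k = 1:10
    Dinvhalf = diag(1./sqrt(theta_vec));
    x_k = [A/sigma; Dinvhalf*L] \ [b/sigma; zeros(n,1)];    % x-update
    z   = L*x_k;                                            % increments
    theta_vec = theta0*(eta + sqrt(eta^2 + z.^2/(2*theta0)));% theta-update
end
```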
Computed example: signal and data

Gaussian blur, noise level 2%.
MAP estimate, Gamma hyperprior

[Figure: MAP estimate at iteration 5, two panels; horizontal axis $t$.]
MAP estimate, Gamma hyperprior

[Figure.]
MAP estimate, Inverse Gamma hyperprior

[Figure: MAP estimate at iteration 5, two panels; horizontal axis $t$.]
MAP estimate, Inverse Gamma hyperprior

[Figure.]