Inverse Problems in the Bayesian Framework
1 Inverse Problems in the Bayesian Framework Daniela Calvetti Case Western Reserve University Cleveland, Ohio Raleigh, NC, July 2016
2 Bayes Formula Stochastic model: two random variables $X \in \mathbb{R}^n$, $B \in \mathbb{R}^m$, where $B$ is the observed quantity and $X$ is the quantity of primary interest. Goal: find the posterior probability density
$\pi_X(x \mid b)$ = density of $X$, given the observation $B = b$, with $b = b_{\text{observed}}$.
3 Bayes Formula Prior information: given the prior density $\pi_X(x)$, encoding our prior information or prior belief about the possible values of $X$. Likelihood: assuming that $X = x$, what would the forward model predict for the distribution of $B$? Encode this information in $\pi_B(b \mid x)$. Evidence: with prior and likelihood, compute
$\pi_B(b) = \int_{\mathbb{R}^n} \pi_{XB}(x, b)\,dx = \int_{\mathbb{R}^n} \pi_B(b \mid x)\pi_X(x)\,dx$.
Bayes formula for probability densities:
$\pi_X(x \mid b) = \dfrac{\pi_B(b \mid x)\,\pi_X(x)}{\pi_B(b)}$.
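The three ingredients above can be exercised on a discrete toy problem. A minimal Python sketch (not from the lecture; all numbers illustrative) computes the evidence by marginalization and the posterior by Bayes' formula for a two-hypothesis model:

```python
import numpy as np

# Illustrative two-hypothesis example: prior pi_X, likelihood pi_B(b|x),
# evidence by marginalization, posterior by Bayes' formula.
prior = np.array([0.7, 0.3])           # pi_X(x) for x in {0, 1}
likelihood = np.array([0.1, 0.8])      # pi_B(b | x) for the observed b
evidence = np.sum(likelihood * prior)  # pi_B(b) = sum_x pi_B(b|x) pi_X(x)
posterior = likelihood * prior / evidence
print(posterior)                       # a proper density: sums to 1
```

Note how the observation shifts mass toward the hypothesis with the larger likelihood even though its prior weight is smaller.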
4 Linear Inverse Problems Consider the problem of estimating $x$ from $b = Ax + e$, $A \in \mathbb{R}^{m \times n}$. Stochastic extension: write $B = AX + E$, and assume the noise and prior models
$X \sim \mathcal{N}(0, D)$, $E \sim \mathcal{N}(0, C)$.
Usually it is assumed that $X$ and $E$ are independent; in particular,
$\mathrm{E}\{XE^T\} = \mathrm{E}\{X\}\,\mathrm{E}\{E\}^T = 0$.
5 Linear Inverse Problems However, this is not necessary, and we may have $\mathrm{E}\{XE^T\} = R \in \mathbb{R}^{n \times m}$. Define a new random variable
$Z = \begin{bmatrix} X \\ B \end{bmatrix} \in \mathbb{R}^{n+m}$.
Covariance matrix of $Z$:
$ZZ^T = \begin{bmatrix} X \\ B \end{bmatrix}\begin{bmatrix} X^T & B^T \end{bmatrix} = \begin{bmatrix} XX^T & XB^T \\ BX^T & BB^T \end{bmatrix}$.
Compute the expectation of this matrix.
6 Linear Inverse Problems Expectations: $\mathrm{E}\{XX^T\} = D$, and
$\mathrm{E}\{XB^T\} = \mathrm{E}\{X(AX + E)^T\} = \mathrm{E}\{XX^TA^T + XE^T\} = \mathrm{E}\{XX^T\}A^T + \mathrm{E}\{XE^T\} = DA^T + R$.
Furthermore,
$\mathrm{E}\{BX^T\} = \mathrm{E}\{XB^T\}^T = AD + R^T$,
7 Linear Inverse Problems and, finally,
$\mathrm{E}\{BB^T\} = \mathrm{E}\{(AX + E)(AX + E)^T\} = \mathrm{E}\{AXX^TA^T + EX^TA^T + AXE^T + EE^T\} = ADA^T + R^TA^T + AR + C$.
Conclusion:
$\mathrm{Cov}(Z) = \begin{bmatrix} D & DA^T + R \\ AD + R^T & ADA^T + R^TA^T + AR + C \end{bmatrix} = \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}$.
8 Schur Complements and Conditioning Given a Gaussian random variable
$Z = \begin{bmatrix} X \\ B \end{bmatrix} \in \mathbb{R}^{n+m}$
with covariance $\Gamma \in \mathbb{R}^{(n+m)\times(n+m)}$, what is the probability density of $X$, given $B = b$?
9 Schur Complements and Conditioning Assume that $X \sim \mathcal{N}(0, \Gamma)$, where $\Gamma \in \mathbb{R}^{n\times n}$ is a given SPD matrix. Partition $X$ as
$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \in \mathbb{R}^{k} \times \mathbb{R}^{n-k}$.
Question: assume that $X_2 = x_2$ is observed. What is the conditional probability density of $X_1$, $\pi_{X_1}(x_1 \mid x_2)$?
10 Schur Complements and Conditioning Write $\pi_X(x) = \pi_{X_1,X_2}(x_1, x_2)$. Bayes formula: the distribution of the unknown part $x_1$, provided that $x_2$ is known, is
$\pi_{X_1}(x_1 \mid x_2) \propto \pi_{X_1,X_2}(x_1, x_2)$, with $x_2 = x_{2,\text{observed}}$.
In terms of the Gaussian density,
$\pi_{X_1,X_2}(x_1, x_2) \propto \exp\left(-\tfrac{1}{2}\, x^T \Gamma^{-1} x\right)$. (1)
11 Schur Complements and Conditioning Partitioning of the covariance matrix:
$\Gamma = \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix} \in \mathbb{R}^{n\times n}$, (2)
where $\Gamma_{11} \in \mathbb{R}^{k\times k}$, $\Gamma_{22} \in \mathbb{R}^{(n-k)\times(n-k)}$, $k < n$, and $\Gamma_{12} = \Gamma_{21}^T \in \mathbb{R}^{k\times(n-k)}$.
12 Schur Complements and Conditioning Precision matrix: $B = \Gamma^{-1}$. Partition $B$:
$B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} \in \mathbb{R}^{n\times n}$. (3)
Quadratic form $x^TBx$ appearing in the exponential:
$Bx = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} B_{11}x_1 + B_{12}x_2 \\ B_{21}x_1 + B_{22}x_2 \end{bmatrix}$,
13 Schur Complements and Conditioning
$x^TBx = \begin{bmatrix} x_1^T & x_2^T \end{bmatrix}\begin{bmatrix} B_{11}x_1 + B_{12}x_2 \\ B_{21}x_1 + B_{22}x_2 \end{bmatrix} = x_1^T(B_{11}x_1 + B_{12}x_2) + x_2^T(B_{21}x_1 + B_{22}x_2)$
$= x_1^TB_{11}x_1 + 2x_1^TB_{12}x_2 + x_2^TB_{22}x_2$
$= (x_1 + B_{11}^{-1}B_{12}x_2)^T B_{11} (x_1 + B_{11}^{-1}B_{12}x_2) + \underbrace{x_2^T(B_{22} - B_{21}B_{11}^{-1}B_{12})x_2}_{\text{independent of } x_1}$.
14 Schur Complements and Conditioning From this key equation for conditional densities it follows that
$\pi_{X_1}(x_1 \mid x_2) \propto \exp\left(-\tfrac{1}{2}(x_1 + B_{11}^{-1}B_{12}x_2)^T B_{11} (x_1 + B_{11}^{-1}B_{12}x_2)\right)$.
Thus the conditional density is Gaussian, with mean and covariance matrix
$\bar{x}_1 = -B_{11}^{-1}B_{12}x_2$, $C = B_{11}^{-1}$.
Question: how to express these formulas in terms of $\Gamma$?
15 Schur Complements and Conditioning Consider a partitioned SPD matrix $\Gamma \in \mathbb{R}^{n\times n}$. For any $v \in \mathbb{R}^k$, $v \neq 0$,
$v^T\Gamma_{11}v = \begin{bmatrix} v^T & 0 \end{bmatrix}\begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}\begin{bmatrix} v \\ 0 \end{bmatrix} > 0$,
showing the positive definiteness of $\Gamma_{11}$. The same holds for $\Gamma_{22}$. In particular, $\Gamma_{11}$ and $\Gamma_{22}$ are invertible.
16 Schur Complements and Conditioning To calculate the inverse of $\Gamma$, we solve the equation $\Gamma x = y$ in block form. By partitioning,
$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \mathbb{R}^k \times \mathbb{R}^{n-k}$, $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \in \mathbb{R}^k \times \mathbb{R}^{n-k}$,
we have
$\Gamma_{11}x_1 + \Gamma_{12}x_2 = y_1$, $\Gamma_{21}x_1 + \Gamma_{22}x_2 = y_2$.
17 Schur Complements and Conditioning Eliminate $x_2$ from the second equation,
$x_2 = \Gamma_{22}^{-1}(y_2 - \Gamma_{21}x_1)$,
substitute back into the first equation:
$\Gamma_{11}x_1 + \Gamma_{12}\Gamma_{22}^{-1}(y_2 - \Gamma_{21}x_1) = y_1$,
and by rearranging the terms,
$(\Gamma_{11} - \Gamma_{12}\Gamma_{22}^{-1}\Gamma_{21})x_1 = y_1 - \Gamma_{12}\Gamma_{22}^{-1}y_2$.
Define the Schur complement of $\Gamma_{22}$:
$\tilde{\Gamma}_{22} = \Gamma_{11} - \Gamma_{12}\Gamma_{22}^{-1}\Gamma_{21}$.
18 Schur Complements and Conditioning It can be shown that $\tilde{\Gamma}_{22}$ must be invertible, and therefore
$x_1 = \tilde{\Gamma}_{22}^{-1}y_1 - \tilde{\Gamma}_{22}^{-1}\Gamma_{12}\Gamma_{22}^{-1}y_2$.
Similarly, interchanging the roles of $x_1$ and $x_2$,
$x_2 = \tilde{\Gamma}_{11}^{-1}y_2 - \tilde{\Gamma}_{11}^{-1}\Gamma_{21}\Gamma_{11}^{-1}y_1$,
where
$\tilde{\Gamma}_{11} = \Gamma_{22} - \Gamma_{21}\Gamma_{11}^{-1}\Gamma_{12}$
is the Schur complement of $\Gamma_{11}$. In matrix form:
$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \tilde{\Gamma}_{22}^{-1} & -\tilde{\Gamma}_{22}^{-1}\Gamma_{12}\Gamma_{22}^{-1} \\ -\tilde{\Gamma}_{11}^{-1}\Gamma_{21}\Gamma_{11}^{-1} & \tilde{\Gamma}_{11}^{-1} \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$.
19 Schur Complements and Conditioning Conclusion:
$\Gamma^{-1} = \begin{bmatrix} \tilde{\Gamma}_{22}^{-1} & -\tilde{\Gamma}_{22}^{-1}\Gamma_{12}\Gamma_{22}^{-1} \\ -\tilde{\Gamma}_{11}^{-1}\Gamma_{21}\Gamma_{11}^{-1} & \tilde{\Gamma}_{11}^{-1} \end{bmatrix} = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$.
Thus the conditional density $\pi_{X_1}(x_1 \mid x_2)$ is a Gaussian,
$\pi_{X_1}(x_1 \mid x_2) \sim \mathcal{N}(\bar{x}_1, C)$,
where
$\bar{x}_1 = -B_{11}^{-1}B_{12}x_2 = \Gamma_{12}\Gamma_{22}^{-1}x_2$ and $C = B_{11}^{-1} = \tilde{\Gamma}_{22}$.
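These identities are easy to verify numerically. A minimal Python sketch (illustrative, with a randomly generated SPD covariance) checks that $B_{11}^{-1} = \tilde{\Gamma}_{22}$ and that $-B_{11}^{-1}B_{12} = \Gamma_{12}\Gamma_{22}^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2
M = rng.standard_normal((n, n))
Gamma = M @ M.T + n * np.eye(n)        # a random SPD covariance matrix
G11, G12 = Gamma[:k, :k], Gamma[:k, k:]
G21, G22 = Gamma[k:, :k], Gamma[k:, k:]

B = np.linalg.inv(Gamma)               # precision matrix
B11, B12 = B[:k, :k], B[:k, k:]

schur = G11 - G12 @ np.linalg.inv(G22) @ G21        # Schur complement of Gamma_22
assert np.allclose(np.linalg.inv(B11), schur)       # C = B11^{-1} = tilde Gamma_22
assert np.allclose(-np.linalg.inv(B11) @ B12,
                   G12 @ np.linalg.inv(G22))        # conditional-mean matrix
print("Schur identities verified")
```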
20 Linear Inverse Problems For simplicity, let us assume that $R = 0$. The posterior density $\pi_X(x \mid b)$ is a Gaussian density, with mean and covariance
$\bar{x} = \Gamma_{12}\Gamma_{22}^{-1}b = DA^T(ADA^T + C)^{-1}b$,
$\Phi = \tilde{\Gamma}_{22} = \Gamma_{11} - \Gamma_{12}\Gamma_{22}^{-1}\Gamma_{21} = D - DA^T(ADA^T + C)^{-1}AD$.
21 Example: Numerical differentiation Discretization of
$f(t) = \int_0^t g(\tau)\,d\tau + \text{noise}$
leads to $b = Ax + e$, where
$A = \dfrac{1}{n}\begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ \vdots & & \ddots & \\ 1 & 1 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n\times n}$,
the lower triangular quadrature matrix approximating $\int_0^{t_j} g(\tau)\,d\tau \approx \frac{1}{n}\sum_{i=1}^{j} x_i$.
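A quick Python check of this discretization (illustrative; the lecture's computations are in MATLAB, and the right-endpoint quadrature is an assumption of this sketch): with $g(t) = \cos t$, applying $A$ to the samples of $g$ should approximate $\int_0^{t_j}\cos\tau\,d\tau = \sin t_j$ to $O(1/n)$.

```python
import numpy as np

n = 200
t = np.arange(1, n + 1) / n             # grid t_j = j/n
A = np.tril(np.ones((n, n))) / n        # lower triangular quadrature matrix
g = np.cos(t)                           # samples of the integrand
f = A @ g                               # approximate integral of g from 0 to t_j
err = np.max(np.abs(f - np.sin(t)))     # compare with the exact antiderivative
print("max quadrature error:", err)     # O(1/n) for a smooth integrand
```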
22 Stochastic extension Stochastic model: $B = AX + E$. Model for noise: independent components, $E_j \sim \mathcal{N}(0, \sigma^2)$, $1 \le j \le n$. Probability density:
$\pi_E(e) = \left(\dfrac{1}{2\pi\sigma^2}\right)^{n/2}\exp\left(-\dfrac{1}{2\sigma^2}\|e\|^2\right)$.
Likelihood:
$\pi_B(b \mid x) \propto \exp\left(-\dfrac{1}{2\sigma^2}\|b - Ax\|^2\right)$.
23 Autoregressive Prior Models Discrete model: $x_j = g(t_j)$, $t_j = j/n$, $0 \le j \le n$. Consider two possible prior models:
1. We know that $x_0 = 0$, and believe that the absolute value of the slope of $g$ is bounded by some $m_1 > 0$.
2. We know that $x_0 = x_n = 0$ and believe that the curvature of $g$ is bounded by some $m_2 > 0$.
24 Autoregressive models 1. Slope: $g'(t_j) \approx \dfrac{x_j - x_{j-1}}{h}$, $h = \dfrac{1}{n}$. Prior information: we believe that $\left|\dfrac{x_j - x_{j-1}}{h}\right| \le m_1$ with some uncertainty.
2. Curvature: $g''(t_j) \approx \dfrac{x_{j-1} - 2x_j + x_{j+1}}{h^2}$. Prior information: we believe that $\left|\dfrac{x_{j-1} - 2x_j + x_{j+1}}{h^2}\right| \le m_2$ with some uncertainty.
25 Autoregressive model In both cases, we assume that $x_j$ is a realization of a random variable $X_j$. Boundary conditions:
1. $X_0 = 0$ with certainty; probabilistic model for $X_j$, $1 \le j \le n$.
2. $X_0 = X_n = 0$ with certainty; probabilistic model for $X_j$, $1 \le j \le n-1$.
26 Autoregressive prior models 1. First order prior: $X_j = X_{j-1} + \gamma W_j$, $W_j \sim \mathcal{N}(0, 1)$, $\gamma = h\,m_1$.
2. Second order prior: $X_j = \tfrac{1}{2}(X_{j-1} + X_{j+1}) + \gamma W_j$, $W_j \sim \mathcal{N}(0, 1)$, $\gamma = \tfrac{1}{2}h^2 m_2$.
(Schematic: each $x_j$ is drawn with standard deviation $\gamma$ around the prediction from its neighbor(s).)
27 Matrix form: first order model System of equations:
$X_1 = X_1 - X_0 = \gamma W_1$
$X_2 - X_1 = \gamma W_2$
$\vdots$
$X_n - X_{n-1} = \gamma W_n$
In matrix form, $L_1X = \gamma W$, $W \sim \mathcal{N}(0, I_n)$, where
$L_1 = \begin{bmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{bmatrix} \in \mathbb{R}^{n\times n}$, $X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$, $W = \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_n \end{bmatrix}$.
28 Matrix form: second order model System of equations:
$X_2 - 2X_1 = X_2 - 2X_1 + X_0 = \gamma W_1$
$X_3 - 2X_2 + X_1 = \gamma W_2$
$\vdots$
$-2X_{n-1} + X_{n-2} = X_n - 2X_{n-1} + X_{n-2} = \gamma W_{n-1}$
In matrix form, $L_2X = \gamma W$, $W \sim \mathcal{N}(0, I_{n-1})$, where
$L_2 = \begin{bmatrix} -2 & 1 & & \\ 1 & -2 & 1 & \\ & \ddots & \ddots & \ddots \\ & & 1 & -2 \end{bmatrix} \in \mathbb{R}^{(n-1)\times(n-1)}$, $X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_{n-1} \end{bmatrix}$.
29 Prior density Given a model $LX = \gamma W$, $W \sim \mathcal{N}(0, I)$, that is,
$\pi_W(w) \propto \exp\left(-\tfrac{1}{2}\|w\|^2\right)$,
we conclude that
$\pi_X(x) \propto \exp\left(-\dfrac{1}{2\gamma^2}\|Lx\|^2\right) = \exp\left(-\dfrac{1}{2}x^T\left[\dfrac{1}{\gamma^2}L^TL\right]x\right)$.
The inverse of the covariance matrix, the precision matrix, is
$D^{-1} = \dfrac{1}{\gamma^2}L^TL$.
30 Testing a Prior Question: given a covariance $D$, how can we check whether the prior corresponds to our expectations? Symmetric decomposition of the precision matrix (let $\gamma = 1$ for simplicity): $D^{-1} = L^TL$. We know that $W = LX \sim \mathcal{N}(0, I)$. Sampling of $X$:
1. Draw a realization $w \sim \mathcal{N}(0, I_n)$.
2. Set $x = L^{-1}w$.
31 Random draws from priors Generate m draws from the prior using the Matlab command randn.

n = 100;           % number of discretization intervals
t = (0:1/n:1);
m = 5;             % number of draws

% First order model. Boundary condition X_0 = 0
L1 = diag(ones(1,n),0) - diag(ones(1,n-1),-1);
gamma = 1/n;       % m_1 = 1
W = gamma*randn(n,m);
X = L1\W;
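The second order model can be tested the same way. A Python sketch of the analogous experiment (an illustrative translation of the MATLAB above, assuming the tridiagonal $L_2$ and boundary conditions $X_0 = X_n = 0$, with $m_2 = 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 5                          # discretization intervals, number of draws
gamma = 0.5 / n**2                     # gamma = h^2 m_2 / 2, with m_2 = 1

# Second order model: tridiagonal L2 of size (n-1) x (n-1)
L2 = (np.diag(-2.0 * np.ones(n - 1))
      + np.diag(np.ones(n - 2), 1)
      + np.diag(np.ones(n - 2), -1))
W = gamma * rng.standard_normal((n - 1, m))
X = np.linalg.solve(L2, W)             # m draws from the second order prior
X_full = np.vstack([np.zeros(m), X, np.zeros(m)])  # reattach X_0 = X_n = 0
print(X_full.shape)                    # one column per draw
```

Plotting the columns of `X_full` against `t` shows smooth curves pinned to zero at both endpoints, which is exactly what the curvature-bounded prior with these boundary conditions should produce.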
32 Plots of the random draws
33 Belief envelopes Diagonal elements of the posterior covariance: $\Gamma_{jj} = \eta_j^2$ = posterior variance of $X_j$. Posterior belief:
$\bar{x}_j - 2\eta_j < X_j < \bar{x}_j + 2\eta_j$ with posterior probability 95%.
34 Bayesian solution Mean solution and 2 STD belief envelope
35 Linear Inverse Problems An alternative (but equivalent) formula for the posterior: prior
$\pi_X(x) \propto \exp\left(-\tfrac{1}{2}x^TD^{-1}x\right)$,
and likelihood
$\pi_B(b \mid x) \propto \exp\left(-\tfrac{1}{2}(b - Ax)^TC^{-1}(b - Ax)\right)$.
Posterior density:
$\pi_X(x \mid b) \propto \pi_X(x)\pi_B(b \mid x) \propto \exp\left(-\tfrac{1}{2}Q(x)\right)$,
36 Linear Inverse Problems Quadratic term in the exponential:
$Q(x) = (b - Ax)^TC^{-1}(b - Ax) + x^TD^{-1}x$.
Collect the terms of the same order in $x$ together:
$Q(x) = x^T\underbrace{(A^TC^{-1}A + D^{-1})}_{=M}x - 2x^TA^TC^{-1}b + b^TC^{-1}b$.
Complete the square:
$Q(x) = (x - M^{-1}A^TC^{-1}b)^TM(x - M^{-1}A^TC^{-1}b) + \ldots$
37 Linear Inverse Problems Conclusion: the posterior mean and covariance have the alternative expressions
$\bar{x} = (A^TC^{-1}A + D^{-1})^{-1}A^TC^{-1}b$ and $\Phi = (A^TC^{-1}A + D^{-1})^{-1}$.
The formula for $\bar{x}$ is also known as the Wiener filtered solution.
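The equivalence of the covariance form (slide 20) and this information form is a matrix identity (Sherman–Morrison–Woodbury), and can be checked numerically. A Python sketch with random SPD $C$ and $D$ (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, mdim = 5, 3
A = rng.standard_normal((mdim, n))
b = rng.standard_normal(mdim)
Mc = rng.standard_normal((mdim, mdim)); C = Mc @ Mc.T + np.eye(mdim)  # noise covariance
Md = rng.standard_normal((n, n));       D = Md @ Md.T + np.eye(n)     # prior covariance

# Covariance form (slide 20)
x_cov = D @ A.T @ np.linalg.solve(A @ D @ A.T + C, b)
Phi_cov = D - D @ A.T @ np.linalg.solve(A @ D @ A.T + C, A @ D)

# Information form (slide 37, Wiener filtered solution)
M = A.T @ np.linalg.inv(C) @ A + np.linalg.inv(D)
x_inf = np.linalg.solve(M, A.T @ np.linalg.solve(C, b))
Phi_inf = np.linalg.inv(M)

assert np.allclose(x_cov, x_inf)
assert np.allclose(Phi_cov, Phi_inf)
print("posterior formulas agree")
```

In practice one picks the form whose matrix to factor is smaller: the covariance form inverts an $m \times m$ matrix, the information form an $n \times n$ one.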
38 Tikhonov Regularization Revisited Consider the linear model $b = Ax + e$, $e \sim \mathcal{N}(0, C)$. Likelihood:
$\pi_B(b \mid x) \propto \exp\left(-\tfrac{1}{2}(b - Ax)^TC^{-1}(b - Ax)\right)$.
Assume a Gaussian prior: $X \sim \mathcal{N}(0, D)$, or, in terms of densities,
$\pi_X(x) \propto \exp\left(-\tfrac{1}{2}x^TD^{-1}x\right)$.
39 Tikhonov Regularization Revisited Bayes formula:
$\pi_X(x \mid b) \propto \pi_X(x)\pi_B(b \mid x) = \exp\left(-\tfrac{1}{2}x^TD^{-1}x - \tfrac{1}{2}(b - Ax)^TC^{-1}(b - Ax)\right)$.
By writing $D^{-1} = L^TL$, the negative of the exponent is
$H(x) = \tfrac{1}{2}\left(\|b - Ax\|_C^2 + \|Lx\|^2\right)$,
a Tikhonov functional.
40 Tikhonov Regularization Revisited Therefore,
$x_{\text{MAP}} = \operatorname{argmax}\{\pi_X(x \mid b)\} = \operatorname{argmin}\{\|b - Ax\|_C^2 + \|Lx\|^2\}$.
The Tikhonov regularization parameter is absorbed in the prior covariance matrix as well as in the noise covariance matrix.
41 Non-Gaussian models The Gaussian models are insufficient when: the forward model is non-linear; the prior is non-Gaussian; the noise is non-additive; the noise is non-Gaussian.
42 Monte Carlo Integration Assume that a probability density $\pi_X$ is given in $\mathbb{R}^n$. Problem: estimate numerically an integral of the type
$E\{f(X)\} = \int_{\mathbb{R}^n} f(x)\pi_X(x)\,dx$.
Expectation of $X$: $f(x) = x$. Covariance of $X$: $f(x) = (x - \bar{x})(x - \bar{x})^T$.
43 Monte Carlo Integration Difficulties with numerical integration using quadrature methods: we may not know the support of $\pi_X$ (support: the set in which the function is not vanishing) — where should we put our quadrature points? If $n$ is large, an integration grid becomes huge: $K$ points per direction means $K^n$ grid points. Try Monte Carlo integration!
44 Monte Carlo Integration Example: given a two-dimensional set $\Omega \subset \mathbb{R}^2$, estimate the area of $\Omega$. Raindrop integration: assume that $\Omega \subset Q = [0, a] \times [0, b]$. Draw points from the uniform density over $Q$:
$\{x^1, x^2, \ldots, x^N\}$, $x^j \sim \text{Uniform}(Q)$.
Estimate of the area $|\Omega|$:
$\dfrac{|\Omega|}{|Q|} = \dfrac{|\Omega|}{ab} \approx \dfrac{\#\{x^j \in \Omega\}}{N}$;
solve for $|\Omega|$.
45 Monte Carlo Integration The approximation corresponds to the Monte Carlo integral
$\dfrac{|\Omega|}{|Q|} = \dfrac{1}{|Q|}\int_Q \chi_\Omega(x)\,dx \approx \dfrac{1}{N}\sum_{j=1}^{N}\chi_\Omega(x^j)$,
where
$\chi_\Omega(x) = \begin{cases} 1 & \text{if } x \in \Omega, \\ 0 & \text{if } x \notin \Omega, \end{cases}$
and $1/N$ is the equal weight that every point $x^j$ has. Note that $\frac{1}{|Q|}\chi_Q(x)$ is the uniform density over $Q$.
46 Monte Carlo Integration Generalize: given a probability density $\pi_X$, write
$\int_{\mathbb{R}^n} f(x)\pi_X(x)\,dx \approx \dfrac{1}{N}\sum_{j=1}^{N} f(x^j)$,
where $\{x^1, x^2, \ldots, x^N\}$ is drawn independently from the probability distribution $\pi_X$. Problem: how does one draw from a probability density in $\mathbb{R}^n$?
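A one-line illustration in Python (not from the lecture): estimate $E\{X^2\} = 1$ for $X \sim \mathcal{N}(0, 1)$ by averaging $f(x^j) = (x^j)^2$ over independent draws; the error decays like $O(1/\sqrt{N})$ regardless of dimension.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000
x = rng.standard_normal(N)        # independent draws from pi_X = N(0, 1)
estimate = np.mean(x**2)          # Monte Carlo estimate of E{f(X)}, f(x) = x^2
print(estimate)                   # close to the exact value 1
```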
47 Sampling and Markov chains: Random walk Random walk is a process of moving around by taking random steps. Most elementary random walk:
1. Start at a point of your choice $x^0 \in \mathbb{R}^n$.
2. Draw a random vector $w^1 \sim \mathcal{N}(0, I)$ and set $x^1 = x^0 + \sigma w^1$.
3. Repeat the process: set $x^{k+1} = x^k + \sigma w^{k+1}$, $w^{k+1} \sim \mathcal{N}(0, I)$.
48 Sampling and Markov chains: Random walk In terms of random variables:
$X^{k+1} = X^k + \sigma W^{k+1}$, $W^{k+1} \sim \mathcal{N}(0, I_n)$.
The conditional density of $X^{k+1}$, given $X^k = x^k$, is
$\pi(x^{k+1} \mid x^k) = \left(\dfrac{1}{2\pi\sigma^2}\right)^{n/2}\exp\left(-\dfrac{1}{2\sigma^2}\|x^k - x^{k+1}\|^2\right) = q_k(x^k, x^{k+1})$.
The function $q_k$ is called the $k$th transition kernel.
49 Sampling and Markov chains: Random walk Since $q_0 = q_1 = q_2 = \ldots$, i.e., the step is always equally distributed, we call the random walk time invariant ($k$ = time). The chain $\{X^k,\ k = 0, 1, \ldots\}$ of random variables is called a discrete time stochastic process. Particular feature: the probability distribution of $X^k$ depends on the past only through the previous member $X^{k-1}$:
$\pi(x^{k+1} \mid x^0, x^1, \ldots, x^k) = \pi(x^{k+1} \mid x^k)$.
A stochastic process having this property is called a Markov chain.
50 Sampling and Markov chains: Random walk Given an arbitrary transition kernel $q$ and a random variable $X$ with probability density $\pi_X(x) = p(x)$, generate a new random variable $Y$ by using the kernel $q(x, y)$, that is, $\pi(y \mid x) = q(x, y)$. Question: what is the probability density of this new variable? The answer is found by marginalization,
$\pi_Y(y) = \int \pi(y \mid x)\pi_X(x)\,dx = \int q(x, y)p(x)\,dx$.
51 Sampling and Markov chains: Random walk If the probability density of the new variable is equal to that of the old one, that is,
$\int q(x, y)p(x)\,dx = p(y)$,
$p$ is called an invariant density of the transition kernel $q$. The classical problem in the theory of Markov chains is: given a transition kernel, find the corresponding invariant density.
52 Invariant density and sampling Recall the sampling problem: given a probability density $p = p(x)$, generate a sample that is distributed according to it. If we had a transition kernel $q$ with invariant density $p$, generating such a sample from $p(x)$ would be easy: start with some $x^0$; draw $x^1$ from $q(x^0, x^1)$; in general, given $x^k$, draw $x^{k+1}$ from $q(x^k, x^{k+1})$. Rephrasing the sampling problem: given a probability density $p$, find a kernel $q$ such that $p$ is its invariant density.
53 Metropolis Hastings algorithm Given a transition density $y \mapsto K(x, y)$, with $x \in \mathbb{R}^n$ the current point, consider a Markov process with two possibilities:
1. Stay at $x$ with probability $r(x)$, $0 \le r(x) < 1$;
2. Move by using the transition kernel $K(x, y)$.
54 Let $x$ and $y$ be realizations of random variables $X$, $Y$, with $\pi_X(x) = p(x)$, and $y$ generated according to the algorithm above. Question: what is the probability density of $Y$?
55 Let $A$ be the event that we opt for moving from $x$, and $\bar{A}$ the event of staying put. The probability of $Y \in B \subset \mathbb{R}^n$, together with a move, is
$P\{Y \in B,\ A \mid X = x\} = \int_B K(x, y)\,dy$.
The kernel $K$ is scaled so that
$P\{A \mid X = x\} = P\{Y \in \mathbb{R}^n,\ A \mid X = x\} = \int_{\mathbb{R}^n} K(x, y)\,dy = 1 - r(x)$. (4)
56 On the other hand, if we stay put, $Y \in B$ happens only if $X \in B$:
$P\{Y \in B,\ \bar{A} \mid X = x\} = r(x)\chi_B(x) = \begin{cases} r(x), & \text{if } x \in B, \\ 0, & \text{if } x \notin B, \end{cases}$
where $\chi_B$ is the characteristic function of $B$.
57 Hence, the total probability of arriving from $x$ to $B$ is
$P\{Y \in B \mid X = x\} = P\{Y \in B,\ A \mid X = x\} + P\{Y \in B,\ \bar{A} \mid X = x\} = \int_B K(x, y)\,dy + r(x)\chi_B(x)$.
58 Marginalize over $x$ and calculate the probability of $Y \in B$:
$P\{Y \in B\} = \int P\{Y \in B \mid X = x\}p(x)\,dx$
$= \int p(x)\left(\int_B K(x, y)\,dy\right)dx + \int \chi_B(x)r(x)p(x)\,dx$
$= \int_B\left(\int p(x)K(x, y)\,dx\right)dy + \int_B r(x)p(x)\,dx$
$= \int_B\left(\int p(x)K(x, y)\,dx + r(y)p(y)\right)dy$.
59 Since $P\{Y \in B\} = \int_B \pi_Y(y)\,dy$, we must have
$\pi_Y(y) = \int p(x)K(x, y)\,dx + r(y)p(y)$.
Our goal is then to find a kernel $K$ such that $\pi_Y(y) = p(y)$, that is,
$p(y) = \int p(x)K(x, y)\,dx + r(y)p(y)$,
or, equivalently,
$(1 - r(y))p(y) = \int p(x)K(x, y)\,dx$.
60 Substituting (4) in this formula, with the roles of $x$ and $y$ interchanged, we obtain
$\int p(y)K(y, x)\,dx = \int p(x)K(x, y)\,dx$.
This equation is called the balance equation. It holds, in particular, if the integrands are equal,
$p(y)K(y, x) = p(x)K(x, y)$.
The latter equation is known as the detailed balance equation.
61 Metropolis-Hastings algorithm Start by selecting a proposal distribution, or candidate generating kernel, $q(x, y)$. The kernel should be chosen so that generating a Markov chain with it is easy. A Gaussian kernel is a popular choice.
62 Metropolis-Hastings algorithm If $q$ satisfies the detailed balance equation, i.e.,
$p(y)q(y, x) = p(x)q(x, y)$,
we are done, since $p$ is an invariant density. More likely, the equality does not hold. If
$p(y)q(y, x) < p(x)q(x, y)$, (5)
force the detailed balance equation to hold by defining $K$ as
$K(x, y) = \alpha(x, y)q(x, y)$,
where $\alpha$ is chosen so that
$p(y)\alpha(y, x)q(y, x) = p(x)\alpha(x, y)q(x, y)$.
63 Metropolis-Hastings algorithm The kernel $\alpha$ need not be symmetric, so let $\alpha(y, x) = 1$. Now the other factor is uniquely determined: we must have
$\alpha(x, y) = \dfrac{p(y)q(y, x)}{p(x)q(x, y)} < 1$.
Observe that if the inequality (5) goes the other way, interchange the roles of $x$ and $y$, and let $\alpha(x, y) = 1$. In summary,
$K(x, y) = \alpha(x, y)q(x, y)$, $\alpha(x, y) = \min\left\{1, \dfrac{p(y)q(y, x)}{p(x)q(x, y)}\right\}$.
64 Metropolis-Hastings algorithm Draws in two phases:
1. Given $x$, draw $y$ using the transition kernel $q(x, y)$.
2. Calculate the acceptance ratio
$\alpha(x, y) = \dfrac{p(y)q(y, x)}{p(x)q(x, y)}$.
3. Flip the $\alpha$-coin: draw $t \sim \text{Uniform}([0, 1])$; if $\alpha > t$, accept $y$; otherwise stay where you are.
65 Metropolis-Hastings algorithm: Random walk proposal If $q(x, y) = q(y, x)$, the algorithm simplifies:
1. Given $x$, draw $y$ using the transition kernel $q(x, y)$.
2. Calculate the acceptance ratio
$\alpha(x, y) = \dfrac{p(y)}{p(x)}$.
3. Flip the $\alpha$-coin: draw $t \sim \text{Uniform}([0, 1])$; if $\alpha > t$, accept $y$; otherwise stay where you are.
66 Random walk Metropolis-Hastings
1. Set the sample size $N$. Pick an initial point $x^1$. Set $k = 1$.
2. Propose a new point, $y = x^k + \delta w$, $w \sim \mathcal{N}(0, I)$.
3. Compute the acceptance ratio $\alpha = \dfrac{\pi(y)}{\pi(x^k)}$.
4. Flip the $\alpha$-coin: draw $\xi \sim \text{Uniform}([0, 1])$;
4.1 if $\alpha \ge \xi$, accept: $x^{k+1} = y$;
4.2 if $\alpha < \xi$, stay put: $x^{k+1} = x^k$.
5. If $k < N$, increase $k \to k + 1$ and continue from 2; else stop.
67 Two steps Build the program in two steps: 1. Random walk sampling, no rejections 2. Add the rejection step
68 Sampling, no rejections

nsample = 10000;           % Sample size
x = [0;0];                 % Initial point
step = 0.1;                % Step size of the random walk
Sample = NaN(2,nsample);   % For memory allocation
Sample(:,1) = x;
for j = 2:nsample
    y = x + step*randn(2,1);
    % Accept unconditionally
    x = y;
    Sample(:,j) = x;
end
69 Add the α-coin Write the condition in logarithmic form:
$\dfrac{\pi(y)}{\pi(x)} > t \iff \log\pi(y) - \log\pi(x) > \log t$,
70
x = [0;0];                 % Initial point
step = 0.1;                % Step size of the random walk
Sample = NaN(2,nsample);   % For memory allocation
Sample(:,1) = x;
logpx = ...                % log of the (unnormalized) target density at x
for j = 2:nsample
    y = x + step*randn(2,1);
    logpy = ...            % log of the target density at the proposal y
    t = rand;
    if logpy - logpx > log(t)   % accept
        x = y;
        logpx = logpy;
    end
    Sample(:,j) = x;
end
71 If a move is not accepted, the previous point has to be repeated:
$\ldots, x^{k-1}, x^k, \underbrace{x^k, x^k}_{\text{rejections}}, x^{k+1}, x^{k+2}, \ldots$
The acceptance rate tells the relative rate of acceptances. Too low an acceptance rate: the chain does not move. Too high an acceptance rate: the chain is essentially a random walk and learns nothing of the underlying distribution (cf. raising kids). What is a good acceptance rate? Rule of thumb: 15%-35%.
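The full random walk Metropolis-Hastings loop above can be sketched compactly in Python (an illustrative stand-alone version, not the lecture's MATLAB; the 2D standard Gaussian target and the step size are assumptions chosen to give a reasonable acceptance rate):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalized log-density of a 2D standard Gaussian (illustrative target)
    return -0.5 * np.dot(x, x)

def rw_metropolis(log_p, x0, step, nsample, rng):
    """Random walk Metropolis: Gaussian proposal, log-form alpha-coin test."""
    x = np.asarray(x0, dtype=float)
    logpx = log_p(x)
    chain = np.empty((nsample, x.size))
    chain[0] = x
    accepted = 0
    for k in range(1, nsample):
        y = x + step * rng.standard_normal(x.size)    # propose
        logpy = log_p(y)
        if logpy - logpx > np.log(rng.uniform()):      # flip the alpha-coin
            x, logpx = y, logpy
            accepted += 1
        chain[k] = x                                   # repeat x on rejection
    return chain, accepted / (nsample - 1)

chain, rate = rw_metropolis(log_target, [5.0, 5.0], step=2.4,
                            nsample=20000, rng=rng)
print("acceptance rate:", rate)
print("mean after burn-in:", chain[2000:].mean(axis=0))
```

Discarding the first iterations (burn-in) before computing sample averages mirrors what the burn-in plots later in the lecture illustrate.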
72 Example: Inverse problem in chemical kinetics Reversible single reaction pair,
$A \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} B$.
Data: with known initial values, measure $[A](t_j)$, $1 \le j \le n$, for
$t_{\min} = t_1 < t_2 < \cdots < t_n = t_{\max}$.
The noisy observation model is
$b_j = [A](t_j) + e_j$, $e_j$ = additive noise, noise level = $\sigma$.
Inverse problem: estimate $k_1$ and $k_{-1}$.
73 Forward model: Mass Balance Equations Denote $c_1(t) = [A](t)$, $c_2(t) = [B](t)$. Assuming unit volume,
$\dfrac{dc_1}{dt} = -k_1c_1 + k_{-1}c_2$, $c_1(0) = c_{01}$,
$\dfrac{dc_2}{dt} = k_1c_1 - k_{-1}c_2$, $c_2(0) = c_{02}$,
or
$\dfrac{dc}{dt} = Kc$, $c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$,
where
$K = \begin{bmatrix} -k_1 & k_{-1} \\ k_1 & -k_{-1} \end{bmatrix}$.
74 Eigenvalues of $K$ are
$\lambda_1 = 0$, $\lambda_2 = -k_1 - k_{-1}$.
Time constant:
$\tau = \dfrac{1}{k_1 + k_{-1}}$.
Eigenvectors:
$v_1 = \begin{bmatrix} 1 \\ \delta \end{bmatrix}$, $v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, $\delta = \dfrac{k_1}{k_{-1}}$.
Solution: $c = \alpha v_1 + \beta v_2 e^{-t/\tau}$, $\alpha, \beta \in \mathbb{R}$.
75 Initial conditions imply
$\alpha = \dfrac{c_{01} + c_{02}}{1 + \delta}$, $\beta = \dfrac{\delta c_{01} - c_{02}}{1 + \delta}$.
In particular,
$c_1(t) = f(t; k) = \dfrac{c_{01} + c_{02}}{1 + \delta} + \dfrac{\delta c_{01} - c_{02}}{1 + \delta}e^{-t/\tau}$,
where
$k = \begin{bmatrix} k_1 \\ k_{-1} \end{bmatrix}$.
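The closed-form solution can be coded directly. A Python sketch (function and parameter names illustrative) that also checks the two limits $c_1(0) = c_{01}$ and $c_1(t) \to (c_{01} + c_{02})/(1 + \delta)$ as $t \to \infty$:

```python
import numpy as np

def c1(t, k1, km1, c01=1.0, c02=0.0):
    """Concentration [A](t) = f(t; k) for the reversible pair A <-> B."""
    delta = k1 / km1                    # delta = k_1 / k_{-1}
    tau = 1.0 / (k1 + km1)              # time constant
    alpha = (c01 + c02) / (1.0 + delta)
    beta = (delta * c01 - c02) / (1.0 + delta)
    return alpha + beta * np.exp(-t / tau)

k1, km1 = 1.0, 0.5                      # illustrative rate constants
assert np.isclose(c1(0.0, k1, km1), 1.0)                      # c_1(0) = c_01
assert np.isclose(c1(100.0, k1, km1), 1.0 / (1.0 + k1 / km1)) # equilibrium value
print(c1(np.array([0.0, 1.0, 2.0]), k1, km1))
```

Evaluating this function at the measurement times and adding Gaussian noise produces exactly the synthetic data used in the MCMC exploration below.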
76 Observation model: data consists of $n$ measurements of $c_1(t)$ corrupted by additive noise,
$b_j = f(t_j; k) + e_j$, $1 \le j \le n$.
Observation errors $e_j$ are mutually independent, zero mean, normally distributed: $e_j \sim \mathcal{N}(0, \sigma^2)$.
77 Likelihood density Assume that the noise is Gaussian white noise:
$\pi_{\text{noise}}(e) \propto \exp\left(-\dfrac{1}{2\sigma^2}\|e\|^2\right)$.
The likelihood density is
$\pi(b \mid k) \propto \exp\left(-\dfrac{1}{2\sigma^2}\sum_{j=1}^{n}(b_j - f(t_j; k))^2\right)$.
78 Posterior density Flat prior over an interval: we believe that $0 < k_1 \le K_1$, $0 < k_{-1} \le K_{-1}$, with some reasonable upper bounds. Write
$\pi_{\text{prior}}(k) \propto \chi_{[0,K_1]}(k_1)\,\chi_{[0,K_{-1}]}(k_{-1})$.
Posterior density by Bayes formula,
$\pi(k \mid b) \propto \pi_{\text{prior}}(k)\,\pi(b \mid k)$.
Contour plots of the posterior density?
79 Data $K_1 = 6$, $K_{-1} = 2$; five uniformly distributed measurements of $c_1(t)$ over the interval $[t_{\min}, t_{\max}]$: $t_{\min} = 0.1\tau$, $t_{\max} = 4.1\tau$ (left); $t_{\min} = 5\tau$, $t_{\max} = 9\tau$ (right)
80 Posterior densities $K_1 = 6$, $K_{-1} = 2$; five uniformly distributed measurements of $c_1(t)$ over the interval $[t_{\min}, t_{\max}]$: $t_{\min} = 0.1\tau$, $t_{\max} = 4.1\tau$ (left); $t_{\min} = 5\tau$, $t_{\max} = 9\tau$ (right). The hair cross indicates the value used for data generation.
81 MCMC exploration Generate the data: define
$t = \begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{bmatrix}$, $b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$,
where
$b_j = \dfrac{c_{01} + c_{02}}{1 + \delta_{\text{true}}} + \dfrac{\delta_{\text{true}}c_{01} - c_{02}}{1 + \delta_{\text{true}}}e^{-t_j/\tau_{\text{true}}} + e_j$,
$\delta_{\text{true}} = \dfrac{k_{1,\text{true}}}{k_{-1,\text{true}}}$, $\tau_{\text{true}} = \dfrac{1}{k_{1,\text{true}} + k_{-1,\text{true}}}$.
82 Random walk Metropolis-Hastings Start with the transient measurements. White noise proposal,
$k_{\text{prop}} = k + \delta w$, $w \sim \mathcal{N}(0, I)$.
Choose first $\delta = 0.1$, with different initial points $k^0 = (1, 2)$ or $k^0 = (5, 0.1)$. Acceptance rates with these values are of the order of 45%.
83 Scatter plots Transient measurements, $t_{\min} = 0.1\tau$, $t_{\max} = 4.1\tau$
84 Sample histories: first component. Initial value $k_1 = 1$ (left) and $k_1 = 5$ (right)
85 Sample histories: second component. Initial value $k_2 = 2$ (left) and $k_2 = 0.2$ (right)
86 Burn-in: first component. Initial value $k_1 = 1$ (left) and $k_1 = 5$ (right)
87 Burn-in: second component. Initial value $k_2 = 2$ (left) and $k_2 = 0.2$ (right)
88 Scatter plots: steady state measurement. $t_{\min} = 5\tau$, $t_{\max} = 9\tau$. Use the same step size as before. Initial point $(k_1, k_2) = (1, 2)$.
89 Sample histories, a.k.a. fuzzy worms. Increase the step size. Initial point $(k_1, k_2) = (1, 2)$. Acceptance remains high, about 55%.
90 Scatter plots, steady state data. Increase the step size. Initial point $(k_1, k_2) = (1, 2)$. Acceptance remains high, about 55%.
91 Fuzzy worms
An example of Bayesian reasoning Consider the one-dimensional deconvolution problem with various degrees of prior information. Model:
$g(t) = \int a(t - s)f(s)\,ds + e(t)$,
where $a(t) \to 0$ as $|t| \to \infty$ (rapidly).
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationBayesian Inference for DSGE Models. Lawrence J. Christiano
Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.
More informationIntroduction to Machine Learning
Introduction to Machine Learning 12. Gaussian Processes Alex Smola Carnegie Mellon University http://alex.smola.org/teaching/cmu2013-10-701 10-701 The Normal Distribution http://www.gaussianprocess.org/gpml/chapters/
More informationMarkov chain Monte Carlo Lecture 9
Markov chain Monte Carlo Lecture 9 David Sontag New York University Slides adapted from Eric Xing and Qirong Ho (CMU) Limitations of Monte Carlo Direct (unconditional) sampling Hard to get rare events
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationParameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn
Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation
More informationProbabilistic Graphical Models Lecture 17: Markov chain Monte Carlo
Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo Andrew Gordon Wilson www.cs.cmu.edu/~andrewgw Carnegie Mellon University March 18, 2015 1 / 45 Resources and Attribution Image credits,
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationKernel adaptive Sequential Monte Carlo
Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline
More information1 Introduction. Systems 2: Simulating Errors. Mobile Robot Systems. System Under. Environment
Systems 2: Simulating Errors Introduction Simulating errors is a great way to test you calibration algorithms, your real-time identification algorithms, and your estimation algorithms. Conceptually, the
More informationBayesian Inference for DSGE Models. Lawrence J. Christiano
Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.
More informationPhysics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester
Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability
More informationBayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference
1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE
More informationThe University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland),
The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland), Geoff Nicholls (Statistics, Oxford) fox@math.auckland.ac.nz
More informationMarkov Chains and MCMC
Markov Chains and MCMC CompSci 590.02 Instructor: AshwinMachanavajjhala Lecture 4 : 590.02 Spring 13 1 Recap: Monte Carlo Method If U is a universe of items, and G is a subset satisfying some property,
More informationx. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).
.8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics
More informationExpectation Propagation Algorithm
Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,
More informationVariational Learning : From exponential families to multilinear systems
Variational Learning : From exponential families to multilinear systems Ananth Ranganathan th February 005 Abstract This note aims to give a general overview of variational inference on graphical models.
More informationMonte Carlo Methods. Leon Gu CSD, CMU
Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte
More informationComputer Intensive Methods in Mathematical Statistics
Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1) Plan of
More informationMultiple Random Variables
Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos Contents Markov Chain Monte Carlo Methods Sampling Rejection Importance Hastings-Metropolis Gibbs Markov Chains
More informationA review of probability theory
1 A review of probability theory In this book we will study dynamical systems driven by noise. Noise is something that changes randomly with time, and quantities that do this are called stochastic processes.
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationBayesian rules of probability as principles of logic [Cox] Notation: pr(x I) is the probability (or pdf) of x being true given information I
Bayesian rules of probability as principles of logic [Cox] Notation: pr(x I) is the probability (or pdf) of x being true given information I 1 Sum rule: If set {x i } is exhaustive and exclusive, pr(x
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationSimulation - Lectures - Part III Markov chain Monte Carlo
Simulation - Lectures - Part III Markov chain Monte Carlo Julien Berestycki Part A Simulation and Statistical Programming Hilary Term 2018 Part A Simulation. HT 2018. J. Berestycki. 1 / 50 Outline Markov
More informationThe Multivariate Gaussian Distribution [DRAFT]
The Multivariate Gaussian Distribution DRAFT David S. Rosenberg Abstract This is a collection of a few key and standard results about multivariate Gaussian distributions. I have not included many proofs,
More informationSAMPLING ALGORITHMS. In general. Inference in Bayesian models
SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be
More informationResults: MCMC Dancers, q=10, n=500
Motivation Sampling Methods for Bayesian Inference How to track many INTERACTING targets? A Tutorial Frank Dellaert Results: MCMC Dancers, q=10, n=500 1 Probabilistic Topological Maps Results Real-Time
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV
More informationF denotes cumulative density. denotes probability density function; (.)
BAYESIAN ANALYSIS: FOREWORDS Notation. System means the real thing and a model is an assumed mathematical form for the system.. he probability model class M contains the set of the all admissible models
More informationProbabilistic numerics for deep learning
Presenter: Shijia Wang Department of Engineering Science, University of Oxford rning (RLSS) Summer School, Montreal 2017 Outline 1 Introduction Probabilistic Numerics 2 Components Probabilistic modeling
More informationDRAGON ADVANCED TRAINING COURSE IN ATMOSPHERE REMOTE SENSING. Inversion basics. Erkki Kyrölä Finnish Meteorological Institute
Inversion basics y = Kx + ε x ˆ = (K T K) 1 K T y Erkki Kyrölä Finnish Meteorological Institute Day 3 Lecture 1 Retrieval techniques - Erkki Kyrölä 1 Contents 1. Introduction: Measurements, models, inversion
More information16 : Approximate Inference: Markov Chain Monte Carlo
10-708: Probabilistic Graphical Models 10-708, Spring 2017 16 : Approximate Inference: Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Yuan Yang, Chao-Ming Yen 1 Introduction As the target distribution
More informationMarkov Chain Monte Carlo
Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible
More informationLecture 2: From Linear Regression to Kalman Filter and Beyond
Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation
More informationMobile Robotics II: Simultaneous localization and mapping
Mobile Robotics II: Simultaneous localization and mapping Introduction: probability theory, estimation Miroslav Kulich Intelligent and Mobile Robotics Group Gerstner Laboratory for Intelligent Decision
More informationECE 636: Systems identification
ECE 636: Systems identification Lectures 3 4 Random variables/signals (continued) Random/stochastic vectors Random signals and linear systems Random signals in the frequency domain υ ε x S z + y Experimental
More informationMARKOV CHAIN MONTE CARLO
MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with
More informationStat 516, Homework 1
Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball
More informationQuantifying Uncertainty
Sai Ravela M. I. T Last Updated: Spring 2013 1 Markov Chain Monte Carlo Monte Carlo sampling made for large scale problems via Markov Chains Monte Carlo Sampling Rejection Sampling Importance Sampling
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationMCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham
MCMC 2: Lecture 3 SIR models - more topics Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More informationECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering
ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering Lecturer: Nikolay Atanasov: natanasov@ucsd.edu Teaching Assistants: Siwei Guo: s9guo@eng.ucsd.edu Anwesan Pal:
More informationNumerical integration and importance sampling
2 Numerical integration and importance sampling 2.1 Quadrature Consider the numerical evaluation of the integral I(a, b) = b a dx f(x) Rectangle rule: on small interval, construct interpolating function
More informationMarkov Chain Monte Carlo
1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior
More informationBrownian Motion. 1 Definition Brownian Motion Wiener measure... 3
Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................
More informationReminder of some Markov Chain properties:
Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent
More informationComputational Science and Engineering (Int. Master s Program)
Computational Science and Engineering (Int. Master s Program) TECHNISCHE UNIVERSITÄT MÜNCHEN Master s Thesis SPARSE GRID INTERPOLANTS AS SURROGATE MODELS IN STATISTICAL INVERSE PROBLEMS Author: Ao Mo-Hellenbrand
More informationUse of probability gradients in hybrid MCMC and a new convergence test
Use of probability gradients in hybrid MCMC and a new convergence test Ken Hanson Methods for Advanced Scientific Simulations Group This presentation available under http://www.lanl.gov/home/kmh/ June
More informationMarkov Chain Monte Carlo
Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).
More informationECE531 Lecture 6: Detection of Discrete-Time Signals with Random Parameters
ECE531 Lecture 6: Detection of Discrete-Time Signals with Random Parameters D. Richard Brown III Worcester Polytechnic Institute 26-February-2009 Worcester Polytechnic Institute D. Richard Brown III 26-February-2009
More informationCSC 2541: Bayesian Methods for Machine Learning
CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 4 Problem: Density Estimation We have observed data, y 1,..., y n, drawn independently from some unknown
More informationBayesian Regression Linear and Logistic Regression
When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we
More information