Zig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017
1 Zig-Zag Monte Carlo. Joris Bierkens, Delft University of Technology. February 7, 2017.
2 Acknowledgements. Collaborators: Andrew Duncan, Paul Fearnhead, Antonietta Mira, Gareth Roberts. Financial support.
3 Outline
1. Motivation: Markov Chain Monte Carlo
2. One-dimensional Zig-Zag process
3. Multi-dimensional ZZP
4. Subsampling
5. Doubly intractable likelihood
4 Bayesian inference
In Bayesian inference we typically deal with a posterior density
\pi(x) = \pi(x; y) \propto L(y \mid x)\, \pi_0(x), \quad x \in \mathbb{R}^d,
where L(y \mid x) is the likelihood of the data y given parameter x \in \mathbb{R}^d, and \pi_0 is a prior density for x.
Quantities of interest are e.g.
the posterior mean \int x\, \pi(x)\, dx,
the posterior variance \int x^2 \pi(x)\, dx - \left( \int x\, \pi(x)\, dx \right)^2,
a tail probability \int \mathbf{1}_{\{x \geq c\}}\, \pi(x)\, dx.
All of these involve integrals of the form \int h(x)\, \pi(x)\, dx.
5 Evaluating \int h(x)\pi(x)\, dx
Possible approaches:
1. Explicit (analytic) integration. Rarely possible.
2. Numerical integration. Curse of dimensionality.
3. Monte Carlo. Draw independent samples (X_1, X_2, \dots) from \pi and use the law of large numbers:
\int h(x)\pi(x)\, dx = \lim_{K \to \infty} \frac{1}{K} \sum_{k=1}^K h(X_k).
Requires independent samples from \pi.
4. Markov Chain Monte Carlo. Construct an ergodic Markov chain (X_1, X_2, \dots) with invariant distribution \pi(x)\, dx and use Birkhoff's ergodic theorem, which yields the same limit as above.
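Approach 3 can be illustrated in a few lines. This is a sketch; the choice of h(x) = x^2 and target N(0, 1) is for illustration only, not taken from the slides.

```python
import random

random.seed(1)

# Plain Monte Carlo estimate of ∫ h(x) π(x) dx with π = N(0,1) and
# h(x) = x^2; the true value is the variance of a standard normal, i.e. 1.
K = 200_000
estimate = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(K)) / K
print(estimate)  # close to 1 by the law of large numbers
```

The standard error here decays like 1/\sqrt{K}, independently of dimension, which is exactly what makes Monte Carlo attractive compared with grid-based numerical integration.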
9 One-dimensional Zig-Zag process
Dynamics (continuous time):
Current state (X(t), \Theta(t)) \in \mathbb{R} \times \{-1, +1\}.
Move X(t) in direction \Theta(t) = \pm 1 until a switch occurs.
The switching intensity is \lambda(X(t), \Theta(t)).
10 Relation between switching rate and potential
Generator:
L f(x, \theta) = \theta\, \frac{\partial f}{\partial x}(x, \theta) + \lambda(x, \theta)\,(f(x, -\theta) - f(x, \theta)), \quad x \in \mathbb{R},\ \theta \in \{-1, +1\}.
Potential: U(x) = -\log \pi(x).
\pi is invariant if and only if \lambda(x, +1) - \lambda(x, -1) = U'(x) for all x.
Equivalently, \lambda(x, \theta) = \gamma(x) + \max(0, \theta U'(x)) with \gamma(x) \geq 0.
Example: Gaussian distribution N(0, \sigma^2):
Density \pi(x) \propto \exp(-x^2/(2\sigma^2));
Potential U(x) = x^2/(2\sigma^2);
Derivative U'(x) = x/\sigma^2;
Switching rates \lambda(x, \theta) = (\theta x/\sigma^2)_+ + \gamma(x).
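The Gaussian example admits an exact simulation: along a straight segment the switching rate is \lambda(s) = \max(0, a + s)/\sigma^2 with a = \theta x, whose integrated hazard can be inverted in closed form, so no discretization or thinning is needed. A minimal sketch (illustrative, with \gamma \equiv 0):

```python
import math, random

random.seed(42)

def zigzag_gauss(sigma=1.0, total_time=50_000.0):
    """Simulate the 1D zig-zag process targeting N(0, sigma^2).

    Along a segment started at x with direction theta, the rate is
    lambda(s) = max(0, a + s) / sigma^2 with a = theta * x, so the
    switching time solves H(T) = E ~ Exp(1) explicitly:
        T = -a + sqrt(max(a, 0)^2 + 2 sigma^2 E).
    Returns time averages of x and x^2 over the trajectory.
    """
    x, theta, t = 0.0, 1, 0.0
    s1 = s2 = 0.0  # accumulators for ∫ x dt and ∫ x^2 dt
    while t < total_time:
        a = theta * x
        e = random.expovariate(1.0)
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * sigma ** 2 * e)
        tau = min(tau, total_time - t)
        # exact integrals of x and x^2 along the linear segment
        s1 += x * tau + theta * tau ** 2 / 2.0
        s2 += x ** 2 * tau + theta * x * tau ** 2 + tau ** 3 / 3.0
        x += theta * tau
        theta = -theta
        t += tau
    return s1 / total_time, s2 / total_time

mean, second_moment = zigzag_gauss()
print(mean, second_moment)  # approximately 0 and sigma^2 = 1
```

The time averages are computed exactly per segment (the trajectory is piecewise linear), rather than by sampling the path on a grid.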
14 Proof of invariance of \pi \propto \exp(-U)
Recall
L f(x, \theta) = \theta\, \frac{\partial f}{\partial x}(x, \theta) + \lambda(x, \theta)\,(f(x, -\theta) - f(x, \theta)), \qquad \lambda(x, +1) - \lambda(x, -1) = U'(x).
Markov semigroup: P(t)f(x, \theta) = \mathbb{E}_{x, \theta}\, f(X(t), \Theta(t)).
\pi stationary means that
\sum_{\theta = \pm 1} \int P(t)f(x, \theta)\, \pi(x)\, dx = \sum_{\theta = \pm 1} \int f(x, \theta)\, \pi(x)\, dx, \quad f \in D(L),\ t \geq 0.
Differentiating gives the equivalent condition
\sum_{\theta = \pm 1} \int L f(x, \theta)\, \pi(x)\, dx = 0, \quad f \in D(L).
For the switching part of the generator:
\sum_{\theta = \pm 1} \int \lambda(x, \theta)\,(f(x, -\theta) - f(x, \theta))\, \pi(x)\, dx
= \int \{\lambda(x, +1)(f(x, -1) - f(x, +1)) + \lambda(x, -1)(f(x, +1) - f(x, -1))\}\, \pi(x)\, dx
= \int (f(x, -1) - f(x, +1))\,(\lambda(x, +1) - \lambda(x, -1))\, \pi(x)\, dx
= \int (f(x, -1) - f(x, +1))\, U'(x)\, \pi(x)\, dx
= -\int (f(x, -1) - f(x, +1))\, \pi'(x)\, dx   [since U'\pi = -\pi']
= \int \frac{d}{dx}(f(x, -1) - f(x, +1))\, \pi(x)\, dx   [integration by parts]
= -\sum_{\theta = \pm 1} \int \theta\, \frac{\partial f}{\partial x}(x, \theta)\, \pi(x)\, dx,
which exactly cancels the transport part of the generator.
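The stationarity condition \sum_{\theta = \pm 1} \int L f(x, \theta)\, \pi(x)\, dx = 0 can also be checked numerically for the Gaussian rates \lambda(x, \theta) = (\theta x)_+. A small sketch; the test function f(x, \theta) = \sin(x + \theta) is an arbitrary smooth choice, not from the slides:

```python
import numpy as np

# Numerical check of the stationarity condition
#   sum_theta ∫ L f(x, theta) pi(x) dx = 0
# for pi = N(0,1), rates lambda(x, theta) = max(0, theta * x),
# and the (arbitrary smooth) test function f(x, theta) = sin(x + theta).
x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]
pi = np.exp(-x ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def f(x, theta):
    return np.sin(x + theta)

def dfdx(x, theta):
    return np.cos(x + theta)

total = 0.0
for theta in (+1, -1):
    lam = np.maximum(0.0, theta * x)          # switching rate
    Lf = theta * dfdx(x, theta) + lam * (f(x, -theta) - f(x, theta))
    g = Lf * pi
    total += float(np.sum((g[1:] + g[:-1]) * 0.5) * dx)  # trapezoid rule

print(total)  # numerically zero
```

The residual is quadrature error only; the transport and switching contributions cancel analytically, as in the proof above.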
25 Use in Monte Carlo
(X(t), \Theta(t))_{t \geq 0} has invariant distribution proportional to \pi(x). If ergodic,
\lim_{T \to \infty} \frac{1}{T} \int_0^T h(X(s))\, ds = \int h(x)\, \pi(x)\, dx.
How to use in computations. Either:
numerically integrate \frac{1}{T} \int_0^T h(X(s))\, ds for some finite T > 0, or
define (X_1, X_2, \dots) by setting X_k = X(k\Delta) for some \Delta > 0, and use these as in traditional MCMC.
26 CLT for the 1D Zig-Zag process [B., Duncan, Limit theorems for the Zig-Zag process, 2016]
X(t) satisfies a Central Limit Theorem (CLT) for observable h if
\frac{1}{\sqrt{T}} \int_0^T [h(X_s) - \mathbb{E}_\pi h(X)]\, ds \Rightarrow N(0, \sigma_h^2).
[Figure: a zig-zag trajectory X(t) for a unimodal potential/density, decomposed into excursions S_i between switching times T_i.]
Set Y_i := \int_{T_{i-1}^+}^{T_i^+} h(X_s)\, ds, the integral over the i-th excursion. The CLT for the ZZP then follows essentially from the CLT for \sum_{i=1}^{N(t)} Y_i.
27 CLT for the 1D Zig-Zag process [B., Duncan, Limit theorems for the Zig-Zag process, 2016]
General formula for the asymptotic variance:
\sigma_h^2 = 2 \int (\lambda(x, +1) + \lambda(x, -1))\, \varphi'(x)^2\, \pi(x)\, dx,
where \varphi solves the Poisson equation L\varphi = \bar h := h - \pi(h).
Langevin diffusion: \sigma_h^2 = 2 \int \varphi'(x)^2\, \pi(x)\, dx, with L_{\mathrm{Langevin}}\varphi = \bar h.
Notable results:
Computational efficiency of the ZZP can be better than IID sampling for the Gaussian (oscillatory ACF).
Student-t distribution with \nu degrees of freedom: the Langevin diffusion satisfies a CLT for \nu > 2; the Zig-Zag process satisfies a CLT for \nu > 1.
28 Multi-dimensional Zig-Zag process
Target \pi(x) \propto \exp(-U(x)) on \mathbb{R}^d. Set of directions \theta \in \{-1, +1\}^d.
Switching rates \lambda_i(x, \theta) = (\theta_i\, \partial_i U(x))_+, for i = 1, \dots, d.
Observation: for a factorized target distribution \pi(x) = \prod_{i=1}^d \pi_i(x_i) with \pi_i(y) \propto \exp(-U_i(y)), the switching rates reduce to \lambda_i(x, \theta) = (\theta_i U_i'(x_i))_+.
Every component of the Zig-Zag process then mixes at O(1). Compare to RWM O(d), MALA O(d^{1/3}), HMC O(d^{1/4}).
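For a factorized target the multi-dimensional dynamics can be sketched directly: draw each coordinate's candidate switching time exactly (competing clocks), advance all coordinates to the earliest one, and flip only that coordinate's direction. An illustrative sketch for d independent N(0, 1) coordinates, reusing the 1D Gaussian hazard inversion:

```python
import math, random

random.seed(0)

def zigzag_multi(d=3, total_time=20_000.0):
    """Multi-dimensional zig-zag for d independent N(0,1) coordinates.

    Here lambda_i(x, theta) = (theta_i * x_i)_+ depends only on x_i, so
    each coordinate's candidate switch time can be drawn exactly; the
    next event is the minimum over coordinates, and only that coordinate
    flips its direction (all candidates are redrawn from the new state).
    """
    x = [0.0] * d
    theta = [1] * d
    t = 0.0
    s2 = [0.0] * d  # accumulators for ∫ x_i^2 dt
    while t < total_time:
        # exact candidate switch time per coordinate (hazard inversion)
        taus = []
        for i in range(d):
            a = theta[i] * x[i]
            e = random.expovariate(1.0)
            taus.append(-a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e))
        i0 = min(range(d), key=lambda i: taus[i])
        tau = min(taus[i0], total_time - t)
        for i in range(d):
            s2[i] += x[i] ** 2 * tau + theta[i] * x[i] * tau ** 2 + tau ** 3 / 3.0
            x[i] += theta[i] * tau
        theta[i0] = -theta[i0]
        t += tau
    return [s / total_time for s in s2]

second_moments = zigzag_multi()
print(second_moments)  # each entry close to Var = 1
```

The competing-clocks construction is valid because the first event time of the superposition of independent inhomogeneous Poisson clocks is the minimum of their individual first event times.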
33 Sampling the switching time
[Figure: graph of dU/dx against x, with the rate \lambda(x) and a dominating bound \Lambda(x).]
Switching rate: \lambda(x) = \max\left(0, \frac{dU}{dx}\right).
Choose a computational bound \Lambda(x) \geq \lambda(x) that is easy to simulate from.
Draw T according to P(T \geq t) = \exp\left(-\int_0^t \Lambda(X(s))\, ds\right).
Accept T as a switching time with probability \lambda(X(T))/\Lambda(X(T)); otherwise repeat (Poisson thinning).
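The thinning scheme can be sketched for a target with a bounded derivative of the potential. The target \pi(x) \propto 1/\cosh(x) here is a hypothetical choice, convenient because |U'(x)| = |\tanh(x)| \leq 1 gives the constant bound \Lambda = 1:

```python
import math, random

random.seed(7)

def zigzag_thinning(total_time=50_000.0):
    """Zig-zag via Poisson thinning for pi(x) ∝ 1/cosh(x).

    U(x) = log cosh(x), so U'(x) = tanh(x) and the switching rate
    lambda(x, theta) = max(0, theta * tanh(x)) is bounded by the constant
    computational bound Lambda = 1: propose events at rate 1 and accept
    a proposed switch with probability lambda / Lambda.
    """
    x, theta, t = 0.0, 1, 0.0
    s1 = s2 = 0.0
    while t < total_time:
        tau = min(random.expovariate(1.0), total_time - t)  # proposal, rate Lambda = 1
        s1 += x * tau + theta * tau ** 2 / 2.0
        s2 += x ** 2 * tau + theta * x * tau ** 2 + tau ** 3 / 3.0
        x += theta * tau
        t += tau
        rate = max(0.0, theta * math.tanh(x))
        if random.random() < rate:          # thinning: accept with prob lambda/Lambda
            theta = -theta
    return s1 / total_time, s2 / total_time

mean, second = zigzag_thinning()
print(mean, second)  # mean ≈ 0; second moment ≈ pi^2 / 4 ≈ 2.47
```

The tighter the bound \Lambda, the fewer rejected proposals; a loose bound keeps the process correct but wastes rate evaluations, which is exactly the "poor computational bound" curve in the scaling plots later on.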
37 Subsampling
Write U = \frac{1}{2}(U_1 + U_2).
[Figure: dU_1/dx and dU_2/dx against x, with their average m(x); the factor rates \lambda_1(x), \lambda_2(x) and a common bound \Lambda(x).]
Choose a computational bound \Lambda(x) \geq \lambda_i(x) for all i.
Draw T according to P(T \geq t) = \exp\left(-\int_0^t \Lambda(X(s))\, ds\right).
Draw I from \{1, 2\} uniformly.
Accept T with probability \lambda_I(X(T))/\Lambda(X(T)).
40 Subsampling
Intractable likelihood, big data: U(x) = \frac{1}{n} \sum_{i=1}^n U_i(x). If \pi(x) \propto \prod_{i=1}^n f(y_i \mid x)\, \pi_0(x), take
U_i(x) = -\log \pi_0(x) - n \log f(y_i \mid x).
Theorem. With subsampling, the Zig-Zag process has \exp(-U) as invariant density.
Proof: the effective switching rate is
\lambda(x, \theta) = \frac{1}{n} \sum_{i=1}^n \lambda_i(x, \theta) = \frac{1}{n} \sum_{i=1}^n (\theta U_i'(x))_+,
so
\lambda(x, +1) - \lambda(x, -1) = \frac{1}{n} \left\{ \sum_{i=1}^n (U_i'(x))_+ - \sum_{i=1}^n (-U_i'(x))_+ \right\} = \frac{1}{n} \sum_{i=1}^n U_i'(x) = U'(x).
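The theorem can be exercised on a toy conjugate model where the exact posterior is known. The model and the bound below are illustrative choices, not from the slides: a per-segment linear bound a + bs dominates every factor's rate, so each proposed switch touches only one data point.

```python
import math, random

random.seed(3)

# Zig-zag with subsampling for a toy Gaussian model: y_i ~ N(x, 1) with
# prior N(0, 1), so U_i(x) = x^2/2 + n (x - y_i)^2 / 2 (up to constants)
# and U_i'(x) = (n + 1) x - n y_i.  Along a segment x(s) = x + theta s,
#   theta * U_i'(x(s)) <= a + b s
# with a = (n+1)|x| + n max|y_i| and b = n + 1, a valid computational
# bound for every factor i.
n = 16
y = [random.gauss(0.8, 1.0) for _ in range(n)]
ymax = max(abs(v) for v in y)
post_mean = sum(y) / (n + 1)            # exact posterior is N(post_mean, 1/(n+1))

def next_candidate(a, b):
    """First event time of an inhomogeneous Poisson process with rate a + b s."""
    e = random.expovariate(1.0)
    return (-a + math.sqrt(a * a + 2.0 * b * e)) / b

x, theta, t = 0.0, 1, 0.0
T = 5_000.0
s1 = 0.0
while t < T:
    a = (n + 1) * abs(x) + n * ymax
    b = n + 1.0
    tau = min(next_candidate(a, b), T - t)
    s1 += x * tau + theta * tau ** 2 / 2.0
    bound = a + b * tau                  # bound evaluated at the proposed event time
    x += theta * tau
    t += tau
    i = random.randrange(n)              # subsample one factor
    rate_i = max(0.0, theta * ((n + 1) * x - n * y[i]))
    if random.random() < rate_i / bound:
        theta = -theta

avg = s1 / T
print(avg, post_mean)  # time average ≈ posterior mean
```

Note that only one U_i' is evaluated per proposed switch, which is the source of the O(1)-per-update cost; the price is a looser bound, hence more proposals than the full-data rate would need.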
44 Subsampling: scaling
Without subsampling: O(n) computations per O(1) update.
With naive subsampling: O(1) computations per O(1/n) update.
Subsampling with control variates: O(1) computations per O(1) update: super-efficient.
The control-variates approach depends on posterior contraction and requires finding a point close to the mode: O(n) start-up cost.
48 Control variates
U(x) = \frac{1}{n} \sum_{i=1}^n U_i(x). Let x^* denote (a point close to) the mode of the posterior distribution.
Naive subsampling: \lambda_i(x, \theta) = (\theta U_i'(x))_+.
Control variates: \lambda_i(x, \theta) = (\theta\, \{U_i'(x) + U'(x^*) - U_i'(x^*)\})_+.
If x is close to the mode then U_i'(x) - U_i'(x^*) is small (under assumptions on U), so each \lambda_i(x, \theta) is close to the ideal switching rate (\theta U'(x))_+.
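A quick numeric check that the control-variate rates still satisfy the invariance condition \frac{1}{n}\sum_i [\lambda_i(x, +1) - \lambda_i(x, -1)] = U'(x), using hypothetical quadratic factors:

```python
# Numeric check that the control-variate rates preserve the invariance
# condition (1/n) sum_i [lambda_i(x,+1) - lambda_i(x,-1)] = U'(x).
# Toy factors (hypothetical choice): U_i(x) = (x - c_i)^2 / 2, so
# U_i'(x) = x - c_i and U'(x) = x - mean(c).
c = [0.3, -1.2, 2.0, 0.7]
n = len(c)
x_star = sum(c) / n          # reference point: here exactly the mode

def Ui_prime(i, x):
    return x - c[i]

def U_prime(x):
    return sum(Ui_prime(i, x) for i in range(n)) / n

def lam_cv(i, x, theta):
    # control-variate rate: (theta [U'(x*) + U_i'(x) - U_i'(x*)])_+
    return max(0.0, theta * (U_prime(x_star) + Ui_prime(i, x) - Ui_prime(i, x_star)))

for x in (-2.0, 0.1, 1.5):
    diff = sum(lam_cv(i, x, +1) - lam_cv(i, x, -1) for i in range(n)) / n
    assert abs(diff - U_prime(x)) < 1e-12
print("invariance condition holds")
```

The identity holds because (v)_+ - (-v)_+ = v for every real v, so the averaged switching-rate difference recovers U'(x) exactly, whatever reference point x^* is used.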
49 [Figures: experiments with 100 observations.]
52 [Figures: experiments with 10,000 observations.]
55 Scaling in number of observations
Compared: Zig-Zag, Zig-Zag with subsampling, Zig-Zag with control variates, Zig-Zag with a poor computational bound.
[Figures: log_2(ESS / epoch) and log_2(ESS / second) against log_2(number of observations).]
57 Doubly intractable likelihood
In many applications, the distribution of interest \pi has the following form:
\pi(x; y) = \frac{\exp\left(\sum_{i=1}^d x_i s_i(y)\right)}{Z(y)\, M(x)}\, \pi_0(x), \quad x \in \mathbb{R}^d,
where y \in \{0, 1\}^n is a fixed observed realization of the forward model
p(y \mid x) = \frac{\exp\left(\sum_{i=1}^d x_i s_i(y)\right)}{M(x)};
s_i, i = 1, \dots, d, are statistics which characterize the distribution of the forward model, with weights x_1, \dots, x_d; and Z(y) is the usual normalization constant.
Computational problem: computation of M(x) is O(2^n):
M(x) = \sum_{y \in \{0, 1\}^n} \exp\left(\sum_{i=1}^d x_i s_i(y)\right).
58 Examples of doubly intractable likelihood
p(y \mid x) = \frac{\exp\left(\sum_{i=1}^d x_i s_i(y)\right)}{M(x)}, \quad x \in \mathbb{R}^d,\ y \in \{0, 1\}^n.
Ising model (physics, image analysis):
s_1(y) = y^T W y, where W is an interaction matrix;
s_2(y) = h^T y, where h represents an external magnetic field;
x_1, x_2 serve as inverse temperatures.
Exponential Random Graph Model:
random graphs over k vertices, with n := \frac{1}{2} k(k-1) possible edges;
y_1, \dots, y_n indicate the presence of an edge;
s_1(y): number of edges in the random graph;
s_2(y): e.g. number of triangles in the random graph.
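The Ising statistics themselves are cheap to evaluate for a single configuration; only M(x) is intractable. A small sketch on a 3x3 lattice, where the configuration, interaction weights, and field are hypothetical choices for illustration:

```python
import numpy as np

# Ising sufficient statistics on a 3x3 lattice: s1(y) = y^T W y counts
# jointly-active nearest-neighbour pairs via the interaction matrix W,
# and s2(y) = h^T y couples the configuration to an external field h.
k = 3
y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 1])  # one configuration in {0,1}^9

# W: nearest-neighbour interactions on the k x k grid
W = np.zeros((k * k, k * k))
for r in range(k):
    for c in range(k):
        i = r * k + c
        if c + 1 < k:
            W[i, i + 1] = W[i + 1, i] = 0.5   # symmetric pair, total weight 1
        if r + 1 < k:
            W[i, i + k] = W[i + k, i] = 0.5

h = np.ones(k * k)            # uniform external field

s1 = y @ W @ y                # interaction statistic: 4 jointly-active pairs
s2 = h @ y                    # field statistic: 6 active sites
print(s1, s2)
```

Computing M(x), by contrast, requires summing exp(x_1 s_1(y) + x_2 s_2(y)) over all 2^9 configurations here, and over 2^n in general, which is what makes the likelihood doubly intractable.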
59 The Zig-Zag process applied to doubly intractable likelihood
For simplicity, say x \in \mathbb{R} and ignore the prior distribution:
\pi(x; y) = \frac{\exp(x s(y))}{Z(y) M(x)}, \qquad M(x) = \sum_{z \in \{0, 1\}^n} \exp(x s(z)), \qquad x \in \mathbb{R},\ y \in \{0, 1\}^n,
so that U(x) = -\log \pi(x; y) = -x s(y) + \log M(x) (up to a constant).
For the derivative of U we find
U'(x) = -s(y) + \frac{d \log M(x)}{dx} = -s(y) + \frac{\sum_{z \in \{0, 1\}^n} \exp(x s(z))\, s(z)}{M(x)} = -s(y) + \mathbb{E}_x[s(Y)],
where Y is a realization of the forward model with parameter x.
62 The Zig-Zag process applied to doubly intractable likelihood
U'(x) = -s(y) + \mathbb{E}_x[s(Y)]. Switching rate complexity O(2^n). For x \in \mathbb{R}, \theta \in \{-1, +1\},
\lambda(x, \theta) = \max(\theta U'(x), 0) = \max(-\theta s(y) + \theta\, \mathbb{E}_x[s(Y)], 0).
Idea: use an unbiased estimate of \mathbb{E}_x[s(Y)].
Crude algorithm for determining the next switch:
1. Determine an upper bound \Lambda(x) for \lambda(x, \theta).
2. Generate a switching time according to P(T \geq t) = \exp\left(-\int_0^t \Lambda(X(r))\, dr\right).
3. Obtain an unbiased estimate \hat G of \frac{d}{dx} U(x).
4. Accept the switch with probability \max(0, \theta \hat G)/\Lambda(X(T)); otherwise repeat.
63 Unbiased estimation of \mathbb{E}_x[s(Y)]
Two possible approaches:
Perfect sampling / coupling from the past (Propp, Wilson, 1996): use Glauber dynamics in an ingenious way to obtain a sample Y which is distributed exactly according to the forward distribution p(\cdot \mid x). Disadvantages: not applicable to all discrete models; exponentially slow convergence in cold temperature regimes.
Unbiased MCMC sampling (Glynn, Rhee, 2014): introduce an \mathbb{N}-valued random variable N and define \Delta_i := s(Y_i) - s(\tilde Y_i), where (Y_i) and (\tilde Y_i) are two realizations of Glauber dynamics, correlated in a specific way. Unbiased estimate:
\hat G = \sum_{i=0}^{N} \frac{\Delta_i}{P(N \geq i)}.
Disadvantages: no global upper bound for the estimate; the variance may be extremely large.
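The Glynn-Rhee construction can be illustrated with known differences \Delta_i standing in for the coupled-chain increments s(Y_i) - s(\tilde Y_i) (a toy choice, not from the slides): with \Delta_i = 2^{-i} the infinite sum is exactly 2, and the randomly truncated estimator is unbiased for it.

```python
import random

random.seed(5)

# Glynn-Rhee in its simplest form: an unbiased estimator of an infinite
# sum S = sum_i Delta_i from a single random truncation level N,
#   G_hat = sum_{i=0}^{N} Delta_i / P(N >= i).
# Toy check with Delta_i = 2^-i, so S = 2 exactly.
def g_hat(p=0.5):
    # N ~ Geometric on {0, 1, 2, ...}; P(N >= i) = (1 - p)^i
    n = 0
    while random.random() < 1.0 - p:
        n += 1
    return sum(0.5 ** i / (1.0 - p) ** i for i in range(n + 1))

draws = [g_hat() for _ in range(100_000)]
avg = sum(draws) / len(draws)
print(avg)  # ≈ 2, the true infinite sum
```

Unbiasedness follows from Fubini: E[G_hat] = sum_i Delta_i * P(N >= i) / P(N >= i) = S. The variance caveat on the slide is visible here too: if the tail of P(N >= i) decays faster than Delta_i^2, the variance of the estimator blows up.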
65 Conclusions: Zig-Zag process
We can use piecewise deterministic Markov processes for sampling.
An unbiased estimate of the log-density gradient results in the correct invariant distribution.
Significantly better scaling than IID sampling for big data.
Doubly intractable likelihood: work in progress.
66 References
B., Roberts, A piecewise deterministic scaling limit of Lifted Metropolis-Hastings in the Curie-Weiss model, to appear in Annals of Applied Probability, 2015.
B., Fearnhead, Roberts, The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data, 2016.
B., Duncan, Limit theorems for the Zig-Zag process, 2016.
B., Fearnhead, Pollock, Roberts, Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo.
Thank you!
More informationMarkov Chain Monte Carlo (MCMC)
Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can
More informationAn introduction to Sequential Monte Carlo
An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning
More informationComputer intensive statistical methods
Lecture 13 MCMC, Hybrid chains October 13, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university MH algorithm, Chap:6.3 The metropolis hastings requires three objects, the distribution of
More informationAsymptotics and Simulation of Heavy-Tailed Processes
Asymptotics and Simulation of Heavy-Tailed Processes Department of Mathematics Stockholm, Sweden Workshop on Heavy-tailed Distributions and Extreme Value Theory ISI Kolkata January 14-17, 2013 Outline
More informationIntroduction to MCMC. DB Breakfast 09/30/2011 Guozhang Wang
Introduction to MCMC DB Breakfast 09/30/2011 Guozhang Wang Motivation: Statistical Inference Joint Distribution Sleeps Well Playground Sunny Bike Ride Pleasant dinner Productive day Posterior Estimation
More information6 Markov Chain Monte Carlo (MCMC)
6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution
More informationIntroduction to Rare Event Simulation
Introduction to Rare Event Simulation Brown University: Summer School on Rare Event Simulation Jose Blanchet Columbia University. Department of Statistics, Department of IEOR. Blanchet (Columbia) 1 / 31
More informationIntroduction to Stochastic Gradient Markov Chain Monte Carlo Methods
Introduction to Stochastic Gradient Markov Chain Monte Carlo Methods Changyou Chen Department of Electrical and Computer Engineering, Duke University cc448@duke.edu Duke-Tsinghua Machine Learning Summer
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationApril 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning
for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions
More informationFast Maximum Likelihood estimation via Equilibrium Expectation for Large Network Data
Fast Maximum Likelihood estimation via Equilibrium Expectation for Large Network Data Maksym Byshkin 1, Alex Stivala 4,1, Antonietta Mira 1,3, Garry Robins 2, Alessandro Lomi 1,2 1 Università della Svizzera
More informationAn ABC interpretation of the multiple auxiliary variable method
School of Mathematical and Physical Sciences Department of Mathematics and Statistics Preprint MPS-2016-07 27 April 2016 An ABC interpretation of the multiple auxiliary variable method by Dennis Prangle
More informationA = {(x, u) : 0 u f(x)},
Draw x uniformly from the region {x : f(x) u }. Markov Chain Monte Carlo Lecture 5 Slice sampler: Suppose that one is interested in sampling from a density f(x), x X. Recall that sampling x f(x) is equivalent
More informationGradient-based Monte Carlo sampling methods
Gradient-based Monte Carlo sampling methods Johannes von Lindheim 31. May 016 Abstract Notes for a 90-minute presentation on gradient-based Monte Carlo sampling methods for the Uncertainty Quantification
More informationMarkov Chain Monte Carlo Methods
Markov Chain Monte Carlo Methods John Geweke University of Iowa, USA 2005 Institute on Computational Economics University of Chicago - Argonne National Laboaratories July 22, 2005 The problem p (θ, ω I)
More informationLearning Energy-Based Models of High-Dimensional Data
Learning Energy-Based Models of High-Dimensional Data Geoffrey Hinton Max Welling Yee-Whye Teh Simon Osindero www.cs.toronto.edu/~hinton/energybasedmodelsweb.htm Discovering causal structure as a goal
More informationLECTURE 15 Markov chain Monte Carlo
LECTURE 15 Markov chain Monte Carlo There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte
More informationLecture 8: The Metropolis-Hastings Algorithm
30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:
More informationMonte Carlo Methods. Leon Gu CSD, CMU
Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte
More informationMCMC Sampling for Bayesian Inference using L1-type Priors
MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling
More informationPseudo-marginal Metropolis-Hastings: a simple explanation and (partial) review of theory
Pseudo-arginal Metropolis-Hastings: a siple explanation and (partial) review of theory Chris Sherlock Motivation Iagine a stochastic process V which arises fro soe distribution with density p(v θ ). Iagine
More informationKernel Adaptive Metropolis-Hastings
Kernel Adaptive Metropolis-Hastings Arthur Gretton,?? Gatsby Unit, CSML, University College London NIPS, December 2015 Arthur Gretton (Gatsby Unit, UCL) Kernel Adaptive Metropolis-Hastings 12/12/2015 1
More informationLecture 8: Bayesian Estimation of Parameters in State Space Models
in State Space Models March 30, 2016 Contents 1 Bayesian estimation of parameters in state space models 2 Computational methods for parameter estimation 3 Practical parameter estimation in state space
More information1 Geometry of high dimensional probability distributions
Hamiltonian Monte Carlo October 20, 2018 Debdeep Pati References: Neal, Radford M. MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo 2.11 (2011): 2. Betancourt, Michael. A conceptual
More informationInference in state-space models with multiple paths from conditional SMC
Inference in state-space models with multiple paths from conditional SMC Sinan Yıldırım (Sabancı) joint work with Christophe Andrieu (Bristol), Arnaud Doucet (Oxford) and Nicolas Chopin (ENSAE) September
More informationRetail Planning in Future Cities A Stochastic Dynamical Singly Constrained Spatial Interaction Model
Retail Planning in Future Cities A Stochastic Dynamical Singly Constrained Spatial Interaction Model Mark Girolami Department of Mathematics, Imperial College London The Alan Turing Institute Lloyds Register
More informationStochastic modelling of urban structure
Stochastic modelling of urban structure Louis Ellam Department of Mathematics, Imperial College London The Alan Turing Institute https://iconicmath.org/ IPAM, UCLA Uncertainty quantification for stochastic
More informationSampling Methods (11/30/04)
CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with
More informationST 740: Markov Chain Monte Carlo
ST 740: Markov Chain Monte Carlo Alyson Wilson Department of Statistics North Carolina State University October 14, 2012 A. Wilson (NCSU Stsatistics) MCMC October 14, 2012 1 / 20 Convergence Diagnostics:
More informationarxiv: v1 [stat.co] 23 Nov 2016
Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo Paul Fearnhead 1,, Joris Bierkens 2, Murray Pollock 2 and Gareth O Roberts 2 1 Department of Mathematics and Statistics, Lancaster
More informationPhysics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester
Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation
More informationSurveying the Characteristics of Population Monte Carlo
International Research Journal of Applied and Basic Sciences 2013 Available online at www.irjabs.com ISSN 2251-838X / Vol, 7 (9): 522-527 Science Explorer Publications Surveying the Characteristics of
More informationMarkov Chain Monte Carlo
1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior
More informationOn Bayesian Computation
On Bayesian Computation Michael I. Jordan with Elaine Angelino, Maxim Rabinovich, Martin Wainwright and Yun Yang Previous Work: Information Constraints on Inference Minimize the minimax risk under constraints
More informationHastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model
UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced
More informationCS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling
CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy
More informationMarkov Chain Monte Carlo, Numerical Integration
Markov Chain Monte Carlo, Numerical Integration (See Statistics) Trevor Gallen Fall 2015 1 / 1 Agenda Numerical Integration: MCMC methods Estimating Markov Chains Estimating latent variables 2 / 1 Numerical
More informationMCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17
MCMC for big data Geir Storvik BigInsight lunch - May 2 2018 Geir Storvik MCMC for big data BigInsight lunch - May 2 2018 1 / 17 Outline Why ordinary MCMC is not scalable Different approaches for making
More informationGraphical Models and Kernel Methods
Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.
More informationVariational Inference via Stochastic Backpropagation
Variational Inference via Stochastic Backpropagation Kai Fan February 27, 2016 Preliminaries Stochastic Backpropagation Variational Auto-Encoding Related Work Summary Outline Preliminaries Stochastic Backpropagation
More informationBayesian parameter estimation in predictive engineering
Bayesian parameter estimation in predictive engineering Damon McDougall Institute for Computational Engineering and Sciences, UT Austin 14th August 2014 1/27 Motivation Understand physical phenomena Observations
More informationWeak convergence of Markov chain Monte Carlo II
Weak convergence of Markov chain Monte Carlo II KAMATANI, Kengo Mar 2011 at Le Mans Background Markov chain Monte Carlo (MCMC) method is widely used in Statistical Science. It is easy to use, but difficult
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationMachine Learning. Probabilistic KNN.
Machine Learning. Mark Girolami girolami@dcs.gla.ac.uk Department of Computing Science University of Glasgow June 21, 2007 p. 1/3 KNN is a remarkably simple algorithm with proven error-rates June 21, 2007
More informationBayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems
Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems John Bardsley, University of Montana Collaborators: H. Haario, J. Kaipio, M. Laine, Y. Marzouk, A. Seppänen, A. Solonen, Z.
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationA stochastic formulation of a dynamical singly constrained spatial interaction model
A stochastic formulation of a dynamical singly constrained spatial interaction model Mark Girolami Department of Mathematics, Imperial College London The Alan Turing Institute, British Library Lloyds Register
More informationPatterns of Scalable Bayesian Inference Background (Session 1)
Patterns of Scalable Bayesian Inference Background (Session 1) Jerónimo Arenas-García Universidad Carlos III de Madrid jeronimo.arenas@gmail.com June 14, 2017 1 / 15 Motivation. Bayesian Learning principles
More informationDAG models and Markov Chain Monte Carlo methods a short overview
DAG models and Markov Chain Monte Carlo methods a short overview Søren Højsgaard Institute of Genetics and Biotechnology University of Aarhus August 18, 2008 Printed: August 18, 2008 File: DAGMC-Lecture.tex
More informationExample: physical systems. If the state space. Example: speech recognition. Context can be. Example: epidemics. Suppose each infected
4. Markov Chains A discrete time process {X n,n = 0,1,2,...} with discrete state space X n {0,1,2,...} is a Markov chain if it has the Markov property: P[X n+1 =j X n =i,x n 1 =i n 1,...,X 0 =i 0 ] = P[X
More informationKernel adaptive Sequential Monte Carlo
Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline
More informationMarkov Chains and MCMC
Markov Chains and MCMC CompSci 590.02 Instructor: AshwinMachanavajjhala Lecture 4 : 590.02 Spring 13 1 Recap: Monte Carlo Method If U is a universe of items, and G is a subset satisfying some property,
More informationIntroduction to Bayesian methods in inverse problems
Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction
More informationMCMC and Gibbs Sampling. Kayhan Batmanghelich
MCMC and Gibbs Sampling Kayhan Batmanghelich 1 Approaches to inference l Exact inference algorithms l l l The elimination algorithm Message-passing algorithm (sum-product, belief propagation) The junction
More informationEvidence estimation for Markov random fields: a triply intractable problem
Evidence estimation for Markov random fields: a triply intractable problem January 7th, 2014 Markov random fields Interacting objects Markov random fields (MRFs) are used for modelling (often large numbers
More information