Nonparametric Drift Estimation for Stochastic Differential Equations

1 Nonparametric Drift Estimation for Stochastic Differential Equations. Gareth Roberts, Department of Statistics, University of Warwick. Brazilian Bayesian Meeting, March 2010. Joint work with O. Papaspiliopoulos, Y. Pokern and A. Stuart.

2 Centre for Research in Statistical Methodology. Conferences and workshops. Academic visitor programme. Currently preparing for a pre-Valencia meeting on Model Uncertainty, to take place between 30th May and 1st June.

3 Fitting SDEs to Molecular Dynamics. MD data X(mΔt) ∈ ℝ^d. Multiple timescales. High-frequency data. High dimension, but only a few dimensions of chemical interest. Diffusion is a good description at some timescales only.

4 Plan Start from the SDE dX_t = b(X_t) dt + dB_t, X(0) = x_0, and high-frequency discrete-time observations x_i. Write down the likelihood for b(·). Manipulate the likelihood to make the local time L an (almost) sufficient statistic. Specify a prior on a function space H, compute the posterior. Make the Bayesian framework rigorous. Make the numerics robust. Application: a toy example from molecular dynamics.

10 SDE properties - Girsanov dX_t = b(X_t) dt + dB_t generates a measure P on path space C([0, T], [0, 2π)). P is absolutely continuous w.r.t. the measure W generated by Brownian motion. The likelihood (Radon-Nikodym derivative) is dP/dW = exp(I[b]), with I[b] viewed as a functional of the drift: I[b] = −½ ∫_0^T ( b²(X_t) dt − 2 b(X_t) dX_t ).
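A discretised version of this likelihood is easy to evaluate from high-frequency data by replacing the integrals with Riemann/Itô sums. The sketch below is an illustration only; it assumes a unit diffusion coefficient, and the function name and candidate drifts are hypothetical, not from the talk:

```python
import numpy as np

# I[b] ~ sum_i b(x_i)*(x_{i+1} - x_i)  -  (1/2) * sum_i b(x_i)^2 * dt,
# for observations x_0, ..., x_M at spacing dt and unit diffusion coefficient.
def girsanov_loglik(b, x, dt):
    dx = np.diff(x)
    bx = b(x[:-1])
    return np.sum(bx * dx) - 0.5 * np.sum(bx ** 2) * dt

# Example: compare two candidate drifts on the same observed path x, e.g.
# girsanov_loglik(np.sin, x, dt) versus girsanov_loglik(lambda a: -a, x, dt).
```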

13 A hint from the discrete problem n-state Markov chain, discrete time, with transition matrix (p_ij), i, j = 1, ..., n. Data X_0, X_1, ..., X_T. The likelihood is ∏_{t=1}^T p_{X_{t−1}, X_t}. Information about the i-th row is only available from visits to i. So, let L_i = ∑_{t=1}^T 1{X_{t−1} = i} be the local time at i, and factorise: ∏_{i=1}^n ∏_{t: X_{t−1}=i} p_{i, X_t}. Conditional on L, we have n independent inference problems.
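In code, the discrete-time analogue amounts to visit counting; a minimal sketch with illustrative names, not code from the talk:

```python
import numpy as np

def chain_local_times_and_loglik(states, P):
    """states: observed path X_0, ..., X_T as integers in {0, ..., n-1}; P: n x n transition matrix.
    Returns the local times L_i (visits to state i before time T) and the log-likelihood
    sum_t log p_{X_{t-1}, X_t}, which factorises over rows conditionally on L."""
    states = np.asarray(states)
    L = np.bincount(states[:-1], minlength=P.shape[0])
    loglik = np.sum(np.log(P[states[:-1], states[1:]]))
    return L, loglik
```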

17 Diffusion Local Time Local time is defined (at least in one dimension) as the occupation density at a location: L(a) = lim_{ɛ→0} (1/2ɛ) ∫_0^T 1(X_s ∈ (a − ɛ, a + ɛ)) ds. This gives a natural way to replace time averages by space averages: ∫_0^T f(X_t) dt = ∫_ℝ f(a) L(a) da.
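The occupation-density identity can be checked numerically by binning a simulated path to estimate L(a) and comparing the time average with the space average. The sketch below uses a plain Brownian path and an arbitrary test function, both purely illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n = 1e-3, 200_000
x = np.cumsum(np.sqrt(dt) * rng.standard_normal(n))      # Brownian path as stand-in data
f = lambda a: np.exp(-a ** 2)                             # any bounded test function

time_avg = np.sum(f(x)) * dt                              # approximates int_0^T f(X_t) dt
counts, edges = np.histogram(x, bins=200)
centres = 0.5 * (edges[:-1] + edges[1:])
L_hat = counts * dt / np.diff(edges)                      # binned occupation density
space_avg = np.sum(f(centres) * L_hat * np.diff(edges))   # approximates int f(a) L(a) da
# time_avg and space_avg agree up to binning and discretisation error.
```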

18 Inference for SDEs - parametric dX_t = b(X_t) dt + σ(X_t) dB_t. Estimate the drift function b(·) from observations {x_i}_{i=1}^M, x_i = X(iΔt). High-frequency setup (Δt → 0), discretely observed case (Δt = O(1)), or more generally... Drift functions parametrised by θ ∈ ℝ^N. Could just choose a basis {b_j}. Substantial theory and methodology are available for these cases, and huge numbers of successful applications. BUT the choice of basis functions can be far from obvious.

19 Inference for SDEs - parametric dX_t = b(X_t, θ) dt + σ(X_t) dB_t. Estimate the drift function b(·) from observations {x_i}_{i=1}^M, x_i = X(iΔt). High-frequency setup (Δt → 0), discretely observed case (Δt = O(1)), or more generally... Drift functions parametrised by θ ∈ ℝ^N. Could just choose a basis {b_j}. Substantial theory and methodology are available for these cases, and huge numbers of successful applications. BUT the choice of basis functions can be far from obvious.

20 Inference for SDEs - parametric dX_t = ∑_{j=1}^m θ_j b_j(X_t) dt + σ(X_t) dB_t. Estimate the drift function b(·) from observations {x_i}_{i=1}^M, x_i = X(iΔt). High-frequency setup (Δt → 0), discretely observed case (Δt = O(1)), or more generally... Drift functions parametrised by θ ∈ ℝ^N. Could just choose a basis {b_j}. Substantial theory and methodology are available for these cases, and huge numbers of successful applications. BUT the choice of basis functions can be far from obvious.

23 The Likelihood Functional I[b] = log dP/dW = −½ ∫_0^T ( b²(X_t) dt − 2 b(X_t) dX_t ). Start from the log-likelihood. We write V(x) = ∫_0^x b(u) du. Apply Itô's formula to V(X_t) to rewrite the stochastic integral as boundary terms plus a correction. Replace integrals along the trajectory by integrals against local time, L(a) da.

25 The Likelihood Functional I[b] = W (V(2π) − V(0)) + (V(X_T) − V(X_0)) − ½ ∫_0^T ( b²(X_t) + b′(X_t) ) dt, where W is the number of windings of the path around the circle. Start from the log-likelihood. We write V(x) = ∫_0^x b(u) du. Apply Itô's formula to V(X_t) to rewrite the stochastic integral as boundary terms plus a correction. Replace integrals along the trajectory by integrals against local time, L(a) da.

26 The Likelihood Functional I[b] = W (V(2π) − V(0)) + (V(X_T) − V(X_0)) − ½ ∫_0^{2π} ( b²(a) + b′(a) ) L(a) da. Start from the log-likelihood. We write V(x) = ∫_0^x b(u) du. Apply Itô's formula to V(X_t) to rewrite the stochastic integral as boundary terms plus a correction. Replace integrals along the trajectory by integrals against local time, L(a) da.

27 The Likelihood Functional I[b] = W (V(2π) − V(0)) − ½ ∫_0^{2π} [ ( b²(a) + b′(a) ) L(a) − 2 χ_{x_0,x_T}(a) b(a) ] da. Start from the log-likelihood. We write V(x) = ∫_0^x b(u) du. Apply Itô's formula to V(X_t) to rewrite the stochastic integral as boundary terms plus a correction. Replace integrals along the trajectory by integrals against local time, L(a) da.

28 If local time was smooth... the log-likelihood I[b] would be bounded above on b ∈ L²(0, 2π). Taking the functional derivative yields the MLE b̂ = L′/(2L).

29 Infinite Dimensional Trouble Up to boundary terms, −2 I[b] = ∫_0^{2π} ( b(a)² L(a) + b′(a) L(a) ) da. For smooth L this functional is positive definite and bounded below. BUT L is not differentiable! The likelihood is easily shown to be almost surely unbounded.

30 Various Options I[b] = boundary terms − ½ ∫_0^{2π} ( b(a)² + b′(a) ) L(a) da. 1 Use a non-likelihood-based approach. 2 Assume a parametric form b(x, θ). 3 Adopt some kind of penalised-likelihood approach. 4 Introduce a prior measure on drift functions b(·) and perform Bayesian estimation.

31 Gaussian Prior on drift functions Specify a prior Gaussian measure for (zero-mean) drift functions by its mean, b_0 ∈ H²_per([0, 2π]), and its precision operator, A ≥ 0 on [0, 2π] with periodic boundary conditions. Thus a continuum Gaussian Markov random field. Smoothness imposed through A persists in the posterior.

32 Choice of A A simple choice would take A = −Δ (= −d²/da²). This is (approximately) assuming independent increments of db(a) for different a, and would lead to continuous, non-differentiable sample paths with probability 1. We mostly use instead A = Δ², whereby sample paths have the smoothness of once-integrated diffusions. Intuitively: π(b) ∝ exp( −½ ∫_0^{2π} |b″(a) − b_0″(a)|² da ).
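On the circle such a prior is convenient to sample in the Fourier basis, where a precision of the form ηΔ² + ε acts diagonally with eigenvalue ηk⁴ + ε on frequency k. The following sketch is my own illustration; η, ε, the truncation level and the zero prior mean are assumed values, not settings from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_prior(a, eta=1.0, eps=1e-2, k_max=100):
    """Draw b ~ N(0, A^{-1}) with A = eta * Delta^2 + eps, truncated at frequency k_max."""
    b = rng.standard_normal() / np.sqrt(eps * 2 * np.pi)          # constant Fourier mode
    for k in range(1, k_max + 1):
        sd = 1.0 / np.sqrt((eta * k ** 4 + eps) * np.pi)          # std of each cos/sin coefficient
        b = b + sd * (rng.standard_normal() * np.cos(k * a)
                      + rng.standard_normal() * np.sin(k * a))
    return b

a = np.linspace(0.0, 2 * np.pi, 512, endpoint=False)
draws = np.array([sample_prior(a) for _ in range(5)])             # smooth periodic prior draws
```

With A = −Δ instead (eigenvalue ηk² + ε), the same construction produces the rougher, Brownian-like sample paths mentioned above.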

33 Finding the Posterior Multiply the prior density by the likelihood: exp( −½ ∫_0^{2π} b″(a)² da ) × exp( −½ ∫_0^{2π} [ ( b(a)² + b′(a) ) L(a) − 2 χ_{X_0,X_T}(a) b(a) ] da + W term ). Complete the square to find that the posterior is Gaussian with mean b̂ = (Δ² + L)⁻¹ ( ½ L′ + χ_{X_0,X_T} + W ) and posterior covariance (Δ² + L)⁻¹.

35 Towards the Posterior rigorously Standard PDE theory shows that the posterior mean is the weak solution of a PDE. Theorem: Let L ∈ C([0, 2π]) be continuous, periodic and not identically zero. Then the PDE Δ²u + Lu = ½ L′ + W + χ_{x_0,x_T} has a unique weak solution u ∈ H²_per([0, 2π]).
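A crude way to see this equation in action is to discretise the operator on a periodic grid and solve the resulting linear system. The talk uses finite elements; the finite-difference sketch below, with a made-up smooth stand-in for L and for the right-hand side, is only an illustration:

```python
import numpy as np

N = 256
a = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
h = a[1] - a[0]

# Periodic second-difference matrix; its square discretises Delta^2.
I = np.eye(N)
D2 = (np.roll(I, 1, axis=1) - 2 * I + np.roll(I, -1, axis=1)) / h ** 2
A = D2 @ D2

L = 5.0 + np.cos(a)                        # smooth, positive stand-in for the local time
rhs = 0.5 * np.gradient(L, h) + 0.1        # stands in for L'/2 + W + chi terms

u = np.linalg.solve(A + np.diag(L), rhs)               # discretised posterior mean
post_var = np.diag(np.linalg.inv(A + np.diag(L)))      # pointwise posterior variances
```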

36 Robustness of the Posterior Mean Theorem: There exists a constant C(W, L) > 0 such that for all admissible perturbed local times L̃ the deviation of the perturbed posterior mean ũ from the unperturbed posterior mean u is bounded in the H²-norm: ∃ C > 0 ∀ L̃ ∈ Λ : ‖ũ − u‖_{H²} ≤ C(W, L) ‖L̃ − L‖_{L²}.

37 Cleanup Observe absolute continuity of the posterior and prior measures: Δ² and Δ² + L differ only in lower-order differential parts. Compute the Radon-Nikodym derivative and identify it with the likelihood. Simple estimate of local time from pointwise estimates, combined with Hölder continuity of local time, so that ‖L̂ − L‖_{L²} is small.

38 Estimating local time Choose N equal-sized bins. Count realisations falling in each bin. Width of bins adapts to M (i.e. Δt) (rather like bandwidth selection in kernel density estimation). Pointwise estimates converge to local time in probability. Use Hölder continuity of local time to obtain an L² error bound, using point values to construct piecewise constant approximants of L.
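A minimal version of this estimator for circle-valued, high-frequency data; the bin count and the wrap to [0, 2π) are illustrative choices:

```python
import numpy as np

def local_time_estimate(x, dt, n_bins=50):
    """Piecewise-constant estimate of L(a) on [0, 2*pi) from samples x_i = X(i*dt):
    time spent near a is approximated by (observations in the bin around a) * dt / bin width."""
    edges = np.linspace(0.0, 2 * np.pi, n_bins + 1)
    counts, _ = np.histogram(np.mod(x, 2 * np.pi), bins=edges)
    width = edges[1] - edges[0]
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, counts * dt / width
```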

43 Numerical Analysis Fourth-order elliptic PDE with non-regular right-hand side. Use piecewise cubic polynomial basis functions on each finite element: b(a) = ∑_{e=1}^{N} ∑_{f=1}^{4} B_{e,f} φ_{e,f}(a). These span the approximation space H²_h ⊂ H².

44 Finite Elements 2 The finite element representation turns the weak PDE into a collection of linear equations for the B_{e,f}: ∑_{(i,j),(e,f)} Ψ_{(i,j)} M_{(i,j),(e,f)} B_{e,f} = ∑_{(i,j)} Ψ_{(i,j)} F_{(i,j)}. Standard numerical analysis guarantees the accuracy of these methods.

45 Numerics: Samples from Posterior dX_t = ( sin(X_t) + 3 cos²(X_t) sin(X_t) ) dt + dB_t. Second-order prior covariance: A = Δ².
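Data of this kind can be generated with an Euler-Maruyama scheme wrapped to the circle. The sketch below (step size, horizon and starting point are arbitrary, and this need not be how the talk's figures were produced) pairs naturally with the local-time and posterior-mean sketches above:

```python
import numpy as np

rng = np.random.default_rng(2)

def b_true(x):
    # toy drift from the slide: sin(x) + 3*cos(x)^2*sin(x)
    return np.sin(x) + 3.0 * np.cos(x) ** 2 * np.sin(x)

T, dt = 50.0, 1e-3
n = int(T / dt)
x = np.empty(n + 1)
x[0] = 0.0
for i in range(n):
    x[i + 1] = np.mod(x[i] + b_true(x[i]) * dt + np.sqrt(dt) * rng.standard_normal(),
                      2.0 * np.pi)
```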

46 Numerics: Samples from Posterior dX_t = ( sin(X_t) + 3 cos²(X_t) sin(X_t) ) dt + dB_t. First-order prior covariance: A = −Δ.

47 Convergence as T → ∞ Gaussian boundary conditions with second-order covariance operator. T = 0.02

48 Convergence as T → ∞ Gaussian boundary conditions with second-order covariance operator. T = 50

49 Convergence as T → ∞ Gaussian boundary conditions with second-order covariance operator. T = 5000

50 Rates of posterior contraction For Z_1 = b̂(0.38π) − b(0.38π) and Z_2 = ∫_0^{2π} ( b̂(a) − b(a) ) sin(a) da. Questions: Do we have a law of large numbers, Z_i → 0 as T → ∞? Do we get CLT-like convergence, Var(Z_i) = O(1/T)? (Numerical) Answers: Numerically, lim_{T→∞} Z_i = 0 is observed. Decay of variance: the answer depends on i! High-frequency components of L can dominate the convergence.

52 Rate of Posterior Contraction Smooth Functional

53 Rate of Posterior Contraction Point Evaluation

54 Marginal likelihood More general prior precision operator: A(η) = η (−Δ)^k + ε. How to choose the hyper-parameter η? A fully Bayesian approach is clearly possible. We choose to maximise the marginal likelihood: ∫ P({x_t}_{t=0}^T | b) P_0(db) = ( |A(η)| / |A(η) + L_T| )^{1/2} exp( −½ ∫_0^{2π} [ b_0 A b_0 − (A b_0 + f) (A(η) + L_T)^{−1} (A b_0 + f) ] da ).

56 Optimising Smoothness η

57 Molecular Dynamics M Ẍ(t) = −∇V(X(t)) − γ M Ẋ(t) + √(2 γ k_B T M) Ḃ. Observe a low-dimensional reaction coordinate X ↦ ω(X).
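For intuition about the kind of trajectories this produces, here is an Euler-Maruyama discretisation of the underdamped Langevin equation for a one-dimensional toy potential; the potential, parameters and step size are purely illustrative and not the molecular-dynamics system of the talk:

```python
import numpy as np

rng = np.random.default_rng(3)

# M x'' = -V'(x) - gamma * M * x' + sqrt(2 * gamma * kB * T * M) * white noise
M, gamma, kBT = 1.0, 1.0, 0.5
V_prime = lambda x: np.sin(x)                  # toy periodic potential V(x) = -cos(x)

dt, n = 1e-3, 200_000
x, v = np.empty(n + 1), np.empty(n + 1)
x[0], v[0] = 0.0, 0.0
for i in range(n):
    v[i + 1] = v[i] + (-V_prime(x[i]) / M - gamma * v[i]) * dt \
               + np.sqrt(2.0 * gamma * kBT / M * dt) * rng.standard_normal()
    x[i + 1] = x[i] + v[i + 1] * dt
# Observed at a coarser timescale, x(t) is the kind of series one might try to fit
# with an effective first-order SDE, which is where the timescale caveat below bites.
```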

58 Fitting Result Whether data looks like a diffusion depends on timescale.

59 Fitting Result Posterior mean and standard deviation band for k = 1000:

60 Future Work / In Progress The case where the state space is ℝ. Extension to higher dimensions (2, 3). Consistency issues: the rate of posterior contraction depends on the smoothness of the prior, the truth and the functional to be estimated. Extend to O(1)-spaced data using perfect simulation of Brownian local times with a Metropolis-Hastings correction. A heterogeneous diffusion coefficient is (in principle) straightforward for high-frequency data, though there would be associated numerical issues to resolve. In the discretely-observed case, this requires reparameterisation techniques (see Roberts and Stramer, 2001; Durham and Gallant, 2002).

65 Summary Nonparametric drift estimation for diffusions on the circle can be performed rigorously with a Gaussian (conjugate) prior. A finite element implementation enables error control from discrete-time high-frequency samples all the way to numerically obtained posterior means. Applications in molecular dynamics and many other areas...

66 Samples from the Posterior are usable

69 Nonexistence Local time has the same regularity as Brownian motion: C^α, α < 1/2. Unboundedness of the log-likelihood functional is linked to the regularity of local time. Substituting local time by a Brownian bridge we get the following. Theorem: Let W be a realisation of the Brownian bridge on [0, 1]. Then, with probability one, the functional I[b] = −½ ∫_0^1 ( b²(s) + b′(s) ) W(s) ds is not bounded above on b ∈ H¹([0, 1]).

70 Classic example: Gibbs sampler for drift and diffusivity Algorithm: use a sequential Gibbs sampler to estimate the diffusivity σ and the drift parameters θ_j: 1 θ_j ∼ P(θ_j | {x_i}, σ) 2 σ ∼ P(σ | {x_i}, θ_j). Observations: the algorithm works fine provided good approximations of the conditional densities are available. Frequently, approximations are good only for Δt → 0. Simple fix: augment the data {x_i} by imputed data points x̃_{i,k} at times t_i < t_{i,1} < t_{i,2} < ... < t_{i,K} < t_{i+1}.

72 Classic example: Gibbs sampler for drift and diffusivity - Augmented Data Algorithm: use a sequential Gibbs sampler to estimate the diffusivity σ, the drift parameters θ_j and the imputed data points {x̃_{i,k}}: 1 θ_j ∼ P(θ_j | {x_i}, {x̃_{i,k}}, σ) 2 σ ∼ P(σ | {x_i}, {x̃_{i,k}}, θ_j) 3 x̃_{i,k} ∼ P(x̃_{i,k} | {x_i}, σ, θ_j). Observations: the algorithm grinds to a halt as the augmentation is increased; bad mixing for σ is observed. Reason: the imputed data points determine σ, and σ determines the quadratic variation of the imputed data points. Analysis via continuous-time asymptotically equivalent diffusions (time rescaling).
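A self-contained toy version of this augmented scheme makes the three blocks and the σ-mixing problem easy to experiment with. The sketch below is my own illustration, not code from the talk: an Ornstein-Uhlenbeck model dX = −θX dt + σ dB, Euler transition densities between imputed points, a flat prior on θ and an IG(1, 1) prior on σ²:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic sparse observations from dX = -theta*X dt + sigma dB (Euler-generated).
theta_true, sigma_true, Dt, M = 1.0, 0.5, 0.5, 200
x_obs = np.zeros(M + 1)
for i in range(M):
    x_obs[i + 1] = (x_obs[i] - theta_true * x_obs[i] * Dt
                    + sigma_true * np.sqrt(Dt) * rng.standard_normal())

K = 4                                   # imputed points per inter-observation interval
d = Dt / (K + 1)                        # fine time step
t_fine = np.arange(M * (K + 1) + 1) * d
path = np.interp(t_fine, np.arange(M + 1) * Dt, x_obs)   # initialise imputed points
is_obs = np.zeros(len(path), dtype=bool)
is_obs[::K + 1] = True

theta, sig2 = 0.5, 1.0
for it in range(2000):
    dx, xl = np.diff(path), path[:-1]
    # 1) theta | path, sigma^2  (flat prior -> Gaussian full conditional)
    prec = d * np.sum(xl ** 2) / sig2
    mean = -np.sum(xl * dx) / (d * np.sum(xl ** 2))
    theta = mean + rng.standard_normal() / np.sqrt(prec)
    # 2) sigma^2 | path, theta  (IG(1, 1) prior -> inverse-gamma full conditional)
    resid = dx + theta * xl * d
    sig2 = 1.0 / rng.gamma(1.0 + 0.5 * len(dx),
                           1.0 / (1.0 + 0.5 * np.sum(resid ** 2) / d))
    # 3) imputed points | observations, theta, sigma^2  (single-site Gaussian updates)
    c = 1.0 - theta * d
    for k in range(1, len(path) - 1):
        if is_obs[k]:
            continue
        m = c * (path[k - 1] + path[k + 1]) / (1.0 + c ** 2)
        path[k] = m + np.sqrt(sig2 * d / (1.0 + c ** 2)) * rng.standard_normal()
# Increasing K lets the imputed path pin down sigma^2 through its quadratic variation,
# and the sigma^2 chain then mixes ever more slowly -- the pathology described above.
```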
