Bayesian inference for stochastic differential mixed effects models - initial steps
Gavin Whitaker
2nd May 2012
Supervisors: RJB and AG

Outline
Mixed Effects Stochastic Differential Equations (SDEs)
Bayesian inference for SDEs
Toy model (Roberts and Stramer, 2001)

SDE Models
Consider an Itô process $\{X_t, t \ge 0\}$ satisfying
$$dX_t = \alpha(X_t, \theta)\,dt + \sqrt{\beta(X_t, \theta)}\,dW_t$$
$X_t$ is the value of the process at time $t$
$\alpha(X_t, \theta)$ is the drift
$\beta(X_t, \theta)$ is the diffusion coefficient
$W_t$ is standard Brownian motion
$X_0$ is the vector of initial conditions
Seek a numerical solution via (for example) the Euler-Maruyama approximation
$$\Delta X_t \equiv X_{t+\Delta t} - X_t = \alpha(X_t, \theta)\,\Delta t + \sqrt{\beta(X_t, \theta)}\,\Delta W_t$$
where $\Delta W_t \sim N(0, I\,\Delta t)$

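
The Euler-Maruyama recursion above can be sketched for a scalar process as follows. This is a minimal illustration only: the function name `euler_maruyama`, its argument order and the Brownian-motion-with-drift example are assumptions, not part of the talk.

```python
import math
import random


def euler_maruyama(alpha, beta, x0, theta, t_end, n_steps, rng):
    """Simulate dX = alpha(X, theta) dt + sqrt(beta(X, theta)) dW on [0, t_end]."""
    dt = t_end / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Delta W ~ N(0, dt)
        # max(., 0) guards the square root if the discretised path makes beta negative
        step = alpha(x, theta) * dt + math.sqrt(max(beta(x, theta), 0.0)) * dw
        path.append(x + step)
    return path


# Hypothetical example: Brownian motion with drift theta = 2 and unit diffusion
rng = random.Random(1)
path = euler_maruyama(lambda x, th: th, lambda x, th: 1.0, 0.0, 2.0, 1.0, 100, rng)
```

Passing the drift and diffusion as functions keeps the same loop usable for any scalar model in this deck.
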
SDE Models
CIR Model
$$dX_t = (\theta_1 - \theta_2 X_t)\,dt + \theta_3 \sqrt{X_t}\,dW_t$$
Used to model short-term interest rates
The process is mean reverting

SDE Models
CIR Model
Simulation of CIR Model
Figure: Numerical solution for CIR model, $\theta_1 = 1$, $\theta_2 = 0.1$, $X_0 = 15$ (curves for $\theta_3 = 0.5$ and $\theta_3 = 0.2$; $X_t$ against time)

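
A numerical solution like the one in the figure could be produced along these lines. The parameter values come from the figure caption; the step count, the random seed and the name `simulate_cir` are assumptions made for this sketch.

```python
import math
import random


def simulate_cir(theta1, theta2, theta3, x0, t_end, n_steps, rng):
    """Euler-Maruyama path for dX = (theta1 - theta2 X) dt + theta3 sqrt(X) dW."""
    dt = t_end / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))
        # max(x, 0) keeps the square root defined if the discretised path dips below zero
        path.append(x + (theta1 - theta2 * x) * dt + theta3 * math.sqrt(max(x, 0.0)) * dw)
    return path


rng = random.Random(2012)  # arbitrary seed
paths = {th3: simulate_cir(1.0, 0.1, th3, 15.0, 20.0, 2000, rng) for th3 in (0.5, 0.2)}
# The drift pulls each path towards the long-run level theta1 / theta2 = 10
```
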
SDE Models
Aphid Growth Model
Also known as plant lice, or greenfly
They are small sap-sucking insects
Some species of ants farm aphids for the honeydew they release. These dairying ants milk the aphids by stroking the

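SDE Models
Aphid Growth Model
$$\begin{pmatrix} dN_t \\ dC_t \end{pmatrix} = \begin{pmatrix} \lambda N_t - \mu N_t C_t \\ \lambda N_t \end{pmatrix} dt + \begin{pmatrix} \lambda N_t + \mu N_t C_t & \lambda N_t \\ \lambda N_t & \lambda N_t \end{pmatrix}^{1/2} dW_t$$
$N_t$ is the aphid population size at time $t$
$C_t$ is the cumulative population at time $t$
This model is an SDE approximation to an underlying stochastic kinetic model
Birth rate of $\lambda N_t$ and a death rate of $\mu N_t C_t$
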
SDE Models
Aphid Growth Model
Simulation of Aphid Growth Model
Figure: Numerical solution for Aphid model, $\lambda = 1.75$, $\mu = 0.001$ (curves for $N_t$ and $C_t$; population size against time)

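
A rough sketch of how such a numerical solution might be generated is given below. The initial values $N_0 = C_0 = 5$, the step count and the function names are assumptions (the slides do not state them). A Cholesky factor is swapped in for the matrix square root; any factor $L$ with $LL^T$ equal to the diffusion matrix gives Euler increments with the same distribution.

```python
import math
import random


def diffusion_cholesky(birth, death):
    """Lower-triangular L with L L^T = [[birth + death, birth], [birth, birth]]."""
    l11 = math.sqrt(birth + death)
    l21 = birth / l11
    l22 = math.sqrt(max(birth - l21 * l21, 0.0))
    return l11, l21, l22


def simulate_aphid(lam, mu, n0, c0, t_end, n_steps, rng):
    """Euler-Maruyama path for the aphid SDE; returns the lists of N_t and C_t."""
    dt = t_end / n_steps
    n, c = float(n0), float(c0)
    ns, cs = [n], [c]
    for _ in range(n_steps):
        if n > 0.0:  # once N hits zero the population stays extinct
            birth = lam * n
            death = mu * n * max(c, 0.0)
            l11, l21, l22 = diffusion_cholesky(birth, death)
            z1 = rng.gauss(0.0, math.sqrt(dt))
            z2 = rng.gauss(0.0, math.sqrt(dt))
            n_next = n + (birth - death) * dt + l11 * z1
            c_next = c + birth * dt + l21 * z1 + l22 * z2
            n, c = max(n_next, 0.0), c_next
        ns.append(n)
        cs.append(c)
    return ns, cs


rng = random.Random(3)  # arbitrary seed
ns, cs = simulate_aphid(1.75, 0.001, 5.0, 5.0, 10.0, 1000, rng)  # N_0 = C_0 = 5 is assumed
```
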
Mixed Effects SDE Models
What if experimental units are not identical?
Suppose the units have common parameters $\theta$ but different parameters $b^i$
We treat the $b^i$ as random effects with a population profile
This gives us a stochastic differential mixed-effects model for the experimental units:
$$dX^i_t = \alpha(X^i_t, \theta, b^i)\,dt + \sqrt{\beta(X^i_t, \theta, b^i)}\,dW^i_t, \quad i = 1, \ldots, M$$
Differences between units are down to different realisations of the Brownian motion paths $W^i_t$ and the random effects $b^i$
Allows us to split the total variation between within- and between-individual components

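
To make the two sources of variation concrete, here is a hypothetical mixed-effects version of the CIR model in which the first drift parameter is replaced by a unit-specific random effect $b^i$. The log-normal population profile, all numerical values and the function names are assumptions chosen purely for illustration.

```python
import math
import random


def simulate_unit(b_i, theta2, theta3, x0, t_end, n_steps, rng):
    """Euler-Maruyama path for one unit: dX = (b_i - theta2 X) dt + theta3 sqrt(X) dW."""
    dt = t_end / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(x + (b_i - theta2 * x) * dt + theta3 * math.sqrt(max(x, 0.0)) * dw)
    return path


def simulate_population(n_units, mu_b, sd_b, theta2, theta3, x0, t_end, n_steps, rng):
    """Draw a random effect per unit, then an independent Brownian path per unit."""
    units = []
    for _ in range(n_units):
        b_i = math.exp(rng.gauss(mu_b, sd_b))  # assumed log-normal population profile
        units.append((b_i, simulate_unit(b_i, theta2, theta3, x0, t_end, n_steps, rng)))
    return units


rng = random.Random(7)  # arbitrary seed
units = simulate_population(5, 0.0, 0.3, 0.1, 0.2, 15.0, 20.0, 400, rng)
```

Setting `sd_b` to zero removes the between-unit component, leaving only the variation due to the separate Brownian paths.
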
Bayesian inference for SDEs
Problematic due to the intractability of the transition density characterising the process
In other words, we typically can't solve an SDE analytically
So we could just work with the Euler approximation
Given data $d$ at equidistant times $t_0, t_1, \ldots, t_n$, the Euler approximation may be unsatisfactory when the inter-observation time $\Delta t = t_{i+1} - t_i$ is not small
We therefore adopt a data augmentation scheme

Bayesian inference for SDEs
Introduce a partition of $[t_i, t_{i+1}]$ as
$$t_i = \tau_{im} < \tau_{im+1} < \ldots < \tau_{(i+1)m} = t_{i+1}$$
where $\Delta\tau \equiv \tau_{i+1} - \tau_i = (t_{i+1} - t_i)/m$
Apply the Euler approximation over each interval of width $\Delta\tau$
Introduces $m - 1$ latent values between every pair of observations

Bayesian inference for SDEs
Formulate joint posterior for parameters and latent values
$d = (x_{t_0}, x_{t_1}, \ldots, x_{t_n})$
$x = (x_{\tau_1}, x_{\tau_2}, \ldots, x_{\tau_{m-1}}, x_{\tau_{m+1}}, \ldots, x_{\tau_{nm-1}})$ = latent path
$(x, d) = (x_{\tau_0}, x_{\tau_1}, \ldots, x_{\tau_m}, x_{\tau_{m+1}}, \ldots, x_{\tau_{nm}})$ = augmented path

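
The bookkeeping for the augmented path can be sketched as below: observations sit at grid indices that are multiples of $m$, and every other index is latent. The function name `augmented_grid` and the example observation times are assumptions.

```python
def augmented_grid(obs_times, m):
    """Split each inter-observation interval into m equal sub-intervals."""
    grid = []
    for t_lo, t_hi in zip(obs_times[:-1], obs_times[1:]):
        dtau = (t_hi - t_lo) / m
        grid.extend(t_lo + j * dtau for j in range(m))
    grid.append(obs_times[-1])
    return grid


obs_times = [0.0, 1.0, 2.0, 3.0]  # hypothetical t_0, ..., t_n with n = 3
m = 4
grid = augmented_grid(obs_times, m)
observed_idx = list(range(0, len(grid), m))                # tau_0, tau_m, ..., tau_{nm}
latent_idx = [k for k in range(len(grid)) if k % m != 0]   # n * (m - 1) latent values
```
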
Bayesian inference for SDEs
Formulate joint posterior for parameters and latent data as
$$\pi(\theta, x \mid d) \propto \pi(\theta)\,\pi(x, d \mid \theta) \propto \pi(\theta) \prod_{i=0}^{nm-1} \pi(x_{\tau_{i+1}} \mid x_{\tau_i}, \theta)$$
where
$$\pi(x_{\tau_{i+1}} \mid x_{\tau_i}, \theta) = \phi\big(x_{\tau_{i+1}};\ x_{\tau_i} + \alpha(x_{\tau_i}, \theta)\,\Delta\tau,\ \beta(x_{\tau_i}, \theta)\,\Delta\tau\big)$$
and $\phi(\cdot\,; \mu, \Sigma)$ denotes the Gaussian density with mean $\mu$ and variance $\Sigma$
The posterior distribution is typically analytically intractable

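
For a scalar process, the product of Euler transition densities can be evaluated on the log scale roughly as follows. The name `euler_log_density` and the example path are assumptions; the drift and diffusion functions play the roles of $\alpha$ and $\beta$ above.

```python
import math


def euler_log_density(path, dtau, alpha, beta, theta):
    """Sum over i of log phi(x_{i+1}; x_i + alpha * dtau, beta * dtau) for a scalar path."""
    total = 0.0
    for x_now, x_next in zip(path[:-1], path[1:]):
        mean = x_now + alpha(x_now, theta) * dtau
        var = beta(x_now, theta) * dtau
        total += -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x_next - mean) ** 2 / var
    return total


# Hypothetical augmented path on [0, 1] with m = 4, for a driftless model with beta = 1 / theta
example_path = [0.0, 0.2, 0.1, 0.5, 0.6947]
log_lik = euler_log_density(example_path, 0.25, lambda x, th: 0.0, lambda x, th: 1.0 / th, 1.0)
```

Adding the log prior of $\theta$ to `log_lik` gives the log joint posterior up to an additive constant.
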
A Gibbs sampling approach
We therefore sample via an MCMC scheme
E.g. a Gibbs sampler, alternating between draws of
$\theta \mid x, d$
$x \mid \theta, d$
The last step can be done (for example) in blocks of length $m - 1$ between observations
Metropolis-within-Gibbs updates may be needed
Problem: the mixing is poor for large $m$

Toy model
Consider the SDE
$$dX_t = \frac{1}{\sqrt{\theta}}\,dW_t$$
Suppose that we have observations $X_0 = x_0 = 0$ and $X_1 = x_1$
Set $\tau_i = i/m$ for $i = 0, 1, \ldots, m$ so that
$$(x, d) = \big(\underbrace{x_0}_{\text{obs}},\ \underbrace{x_{1/m}, x_{2/m}, \ldots, x_{(m-1)/m}}_{\text{path (bridge)}},\ \underbrace{x_1}_{\text{obs}}\big)$$
Under the Euler approximation
$$X_{i/m} \mid x_{(i-1)/m}, \theta \sim N\Big(x_{(i-1)/m},\ \frac{1}{m\theta}\Big)$$

Toy model
Hence
$$\pi(x, d \mid \theta) \propto \prod_{i=1}^{m} \sqrt{\frac{m\theta}{2\pi}} \exp\Big\{-\frac{m\theta}{2}\big(x_{i/m} - x_{(i-1)/m}\big)^2\Big\} \propto \theta^{m/2} \exp\Big\{-\frac{m\theta}{2} \sum_{i=1}^{m} \big(x_{i/m} - x_{(i-1)/m}\big)^2\Big\}$$
Take prior $\theta \sim \text{Exp}(1)$
The full conditional for $\theta$ is
$$\pi(\theta \mid x, d) \propto \pi(\theta)\,\pi(x, d \mid \theta)$$

Toy model
$$\pi(\theta \mid x, d) \propto \theta^{m/2} \exp\Big\{-\frac{m\theta}{2} \sum_{i=1}^{m} \big(x_{i/m} - x_{(i-1)/m}\big)^2 - \theta\Big\} \propto \theta^{m/2} \exp\Big\{-\theta\Big(\frac{m\Sigma_X}{2} + 1\Big)\Big\}$$
where $\Sigma_X = \sum_{i=1}^{m} \big(x_{i/m} - x_{(i-1)/m}\big)^2$
Therefore
$$\theta \mid x, d \sim \Gamma\Big(\frac{m}{2} + 1,\ \frac{m\Sigma_X}{2} + 1\Big)$$

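
Drawing from this Gamma full conditional might look like the sketch below. The second Gamma parameter on the slide is a rate, whereas Python's `random.gammavariate(alpha, beta)` takes a scale, so the rate is inverted. The name `draw_theta` and the straight-line example path are assumptions.

```python
import random


def draw_theta(path, rng):
    """Draw theta from Gamma(m/2 + 1, rate = m * Sigma_X / 2 + 1) given the augmented path."""
    m = len(path) - 1
    sigma_x = sum((b - a) ** 2 for a, b in zip(path[:-1], path[1:]))
    shape = m / 2 + 1
    rate = m * sigma_x / 2 + 1
    return rng.gammavariate(shape, 1.0 / rate)  # gammavariate expects a scale, not a rate


# Hypothetical path: straight line from x_0 = 0 to x_1 = 0.6947 with m = 10
m, x1 = 10, 0.6947
line_path = [i * x1 / m for i in range(m + 1)]
rng = random.Random(0)  # arbitrary seed
draws = [draw_theta(line_path, rng) for _ in range(20000)]
```
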
Toy model
Under the linear Gaussian structure of the simple SDE, the full conditional $x \mid \theta, d$ can be sampled using
$$X_{i/m} \mid x_1, \theta = \frac{i}{m}\,x_1 + \frac{1}{\sqrt{\theta}}\,Z_{i/m}, \quad i = 1, 2, \ldots, m$$
where $\{Z_t, 0 \le t \le 1\}$ is a standard Brownian bridge, that is, a standard Brownian motion (started at 0 at time 0) conditioned to hit 0 at time 1

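
One way to sample the path on the grid is sketched here. It builds the bridge from a discretised Brownian motion $W$ via $Z_{i/m} = W_{i/m} - (i/m) W_1$, a standard construction that the slides do not spell out; the name `sample_bridge_path` is an assumption.

```python
import math
import random


def sample_bridge_path(x1, theta, m, rng):
    """Draw (x_0, x_{1/m}, ..., x_1) from x | theta, d, with x_0 = 0."""
    w = [0.0]
    for _ in range(m):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(1.0 / m)))  # Brownian motion on the grid
    z = [w[i] - (i / m) * w[m] for i in range(m + 1)]         # standard Brownian bridge
    return [(i / m) * x1 + z[i] / math.sqrt(theta) for i in range(m + 1)]


rng = random.Random(1)  # arbitrary seed
bridge_path = sample_bridge_path(0.6947, 1.0, 10, rng)
```
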
Toy model
Simulated data
Take $x_0 = 0$, $\theta = 1$
Simulate $x_1$ using the Euler scheme
Get $x_1 = 0.6947$, so $d = (0, 0.6947)$
MCMC scheme
We perform a run of 1000 iterations of the scheme with no thinning.
Initialise with $\theta^{(0)} = 1$, the prior mean
Step 1: Update the discretised Brownian bridge which hits $x_1$ at $t = 1$
Step 2: Draw $\theta$ from its full conditional distribution,
$$\theta \mid x, d \sim \Gamma\Big(\frac{m}{2} + 1,\ \frac{m\Sigma_X}{2} + 1\Big)$$

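
The two-step scheme can be sketched end to end as follows, using $x_1 = 0.6947$, 1000 iterations and $\theta^{(0)} = 1$ from the slides. The function names, the seed and the lag-1 autocorrelation summary are assumptions added for illustration; the run is not a reproduction of the results in the talk.

```python
import math
import random


def sample_bridge_path(x1, theta, m, rng):
    """Step 1: draw the discretised bridge x | theta, d."""
    w = [0.0]
    for _ in range(m):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(1.0 / m)))
    return [(i / m) * x1 + (w[i] - (i / m) * w[m]) / math.sqrt(theta) for i in range(m + 1)]


def draw_theta(path, rng):
    """Step 2: draw theta | x, d from its Gamma full conditional."""
    m = len(path) - 1
    sigma_x = sum((b - a) ** 2 for a, b in zip(path[:-1], path[1:]))
    return rng.gammavariate(m / 2 + 1, 1.0 / (m * sigma_x / 2 + 1))


def gibbs(x1, m, n_iter, rng, theta0=1.0):
    """Alternate the two steps and record log(1 / theta) at each iteration."""
    theta, trace = theta0, []
    for _ in range(n_iter):
        path = sample_bridge_path(x1, theta, m, rng)
        theta = draw_theta(path, rng)
        trace.append(math.log(1.0 / theta))
    return trace


def lag1_autocorrelation(xs):
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs[:-1], xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den


rng = random.Random(42)  # arbitrary seed
acf1 = {m: lag1_autocorrelation(gibbs(0.6947, m, 1000, rng)) for m in (10, 100)}
```

Comparing `acf1[10]` with `acf1[100]` gives a quick numerical view of how the dependence between successive draws changes with $m$.
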
Toy model
Results
Figure: Trace plots for $\log(1/\theta)$ (panels for $m = 10$, $m = 100$ and $m = 1000$; 1000 iterations each)

Toy model
Results
Figure: Auto-correlation plots for $\log(1/\theta)$ (ACF against lag, 0 to 150, for $m = 10$, $m = 100$ and $m = 1000$)

Toy model
Results
Mixing gets even worse for larger $m$
Why is this happening?
Try to quantify the mixing time by considering the parameter update
The parameter update can be rewritten in a clever way...

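Toy model
What's going wrong?
Recall that
$$X_{i/m} \mid x_1, \theta = \frac{i}{m}\,x_1 + \frac{1}{\sqrt{\theta}}\,Z_{i/m}, \quad i = 1, 2, \ldots, m$$
Now
$$\Sigma_X = \sum_{i=1}^{m} \Big(\frac{i}{m}\,x_1 + \frac{1}{\sqrt{\theta}}\,Z_{i/m} - \Big[\frac{i-1}{m}\,x_1 + \frac{1}{\sqrt{\theta}}\,Z_{(i-1)/m}\Big]\Big)^2$$
$$= \sum_{i=1}^{m} \Big(\frac{1}{\sqrt{\theta}}\big[Z_{i/m} - Z_{(i-1)/m}\big] + \frac{i}{m}\,x_1 - \frac{i-1}{m}\,x_1\Big)^2$$
$$= \sum_{i=1}^{m} \Big(\frac{1}{\sqrt{\theta}}\big[Z_{i/m} - Z_{(i-1)/m}\big] + \frac{x_1}{m}\Big)^2$$
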
Toy model
What's going wrong?
Expanding out
$$\Sigma_X = \sum_{i=1}^{m} \Big(\frac{1}{\theta}\big[Z_{i/m} - Z_{(i-1)/m}\big]^2 + \frac{x_1^2}{m^2} + \frac{2}{m\sqrt{\theta}}\,x_1 \big[Z_{i/m} - Z_{(i-1)/m}\big]\Big)$$
$$= \frac{1}{\theta}\,\Sigma_Z + \frac{x_1^2}{m} + \frac{2 x_1}{m\sqrt{\theta}} \sum_{i=1}^{m} \big[Z_{i/m} - Z_{(i-1)/m}\big]$$
Now
$$\sum_{i=1}^{m} \big[Z_{i/m} - Z_{(i-1)/m}\big] = \big(Z_{1/m} - Z_0\big) + \big(Z_{2/m} - Z_{1/m}\big) + \ldots + \big(Z_{(m-1)/m} - Z_{(m-2)/m}\big) + \big(Z_1 - Z_{(m-1)/m}\big)$$

Toy model
What's going wrong?
So
$$\Sigma_X = \frac{1}{\theta}\,\Sigma_Z + \frac{x_1^2}{m} + \frac{2}{m\sqrt{\theta}}\,x_1 \big[Z_1 - Z_0\big]$$
But $Z_0 = Z_1 = 0$ since $Z$ is a standard Brownian bridge
Thus
$$\Sigma_X = \frac{1}{\theta}\,\Sigma_Z + \frac{x_1^2}{m}$$
where $\Sigma_Z = \sum_{i=1}^{m} \big(Z_{i/m} - Z_{(i-1)/m}\big)^2$
Using properties of Gaussian random variables we have $m\Sigma_Z \sim \chi^2_{m-1}$

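
The identity for $\Sigma_X$ and the chi-squared claim can be checked numerically with a short sketch like this one. The values of $m$ and $\theta$, the seed and the helper name `bridge` are assumptions; the Monte Carlo mean of $m\Sigma_Z$ is compared with $m - 1$, the mean of a $\chi^2_{m-1}$ variable.

```python
import math
import random

rng = random.Random(3)  # arbitrary seed
m, x1, theta = 50, 0.6947, 1.3  # m and theta are hypothetical values


def bridge(m, rng):
    """Standard Brownian bridge on the grid i/m, built from a Brownian motion."""
    w = [0.0]
    for _ in range(m):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(1.0 / m)))
    return [w[i] - (i / m) * w[m] for i in range(m + 1)]


z = bridge(m, rng)
x = [(i / m) * x1 + z[i] / math.sqrt(theta) for i in range(m + 1)]
sigma_x = sum((b - a) ** 2 for a, b in zip(x[:-1], x[1:]))
sigma_z = sum((b - a) ** 2 for a, b in zip(z[:-1], z[1:]))
identity_gap = abs(sigma_x - (sigma_z / theta + x1 ** 2 / m))  # zero up to rounding error

# Monte Carlo mean of m * Sigma_Z over fresh bridges
draws = []
for _ in range(2000):
    zz = bridge(m, rng)
    draws.append(m * sum((b - a) ** 2 for a, b in zip(zz[:-1], zz[1:])))
mean_m_sigma_z = sum(draws) / len(draws)
```
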
Toy model
What's going wrong?
Now
$$\theta \mid x, d \sim \Gamma\Big(\frac{m}{2} + 1,\ \frac{m\Sigma_X}{2} + 1\Big)$$
If $H \sim \Gamma\big(\frac{m}{2} + 1, 1\big)$ then
$$\theta_{\text{new}} = \frac{H}{\frac{m\Sigma_X}{2} + 1} = \frac{H}{\frac{x_1^2}{2} + \frac{\chi^2_{m-1}}{2\theta_{\text{old}}} + 1}$$

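
Written this way, the $\theta$ chain can be simulated without storing the path at all, as in the sketch below. The name `theta_update`, the chain length and the seed are assumptions. For this linear model the latent path can be integrated out exactly, so the chain should target $\Gamma(3/2,\ 1 + x_1^2/2)$, which gives a simple check on the long-run average.

```python
import random


def theta_update(theta_old, x1, m, rng):
    """One sweep of the theta chain, with the path replaced by a chi-squared draw."""
    h = rng.gammavariate(m / 2 + 1, 1.0)       # H ~ Gamma(m/2 + 1, 1)
    chi2 = rng.gammavariate((m - 1) / 2, 2.0)  # chi-squared with m - 1 degrees of freedom
    return h / (x1 ** 2 / 2 + chi2 / (2 * theta_old) + 1)


rng = random.Random(11)  # arbitrary seed
x1, m = 0.6947, 10
theta, chain = 1.0, []
for _ in range(20000):
    theta = theta_update(theta, x1, m, rng)
    chain.append(theta)
chain_mean = sum(chain) / len(chain)  # compare with 1.5 / (1 + x1 ** 2 / 2)
```
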
Toy model
What's going wrong?
For large $m$, approximate $H$ and $\chi^2_{m-1}$ with Normal random variables
Roberts and Stramer then use a suitable Taylor expansion of the expression for $\theta_{\text{new}}$ to give
$$\theta_{\text{new}} \approx \theta_{\text{old}} \Big\{1 + (W_1 - W_2)\Big(\frac{2}{m}\Big)^{1/2} + O\Big(\frac{1}{m}\Big)\Big\}$$
where $W_1$ and $W_2$ are independent $N(0, 1)$ random variables and the $O(1/m)$ remainder collects terms in $x_1^2$, $\theta_{\text{old}}$ and $W_2^2$
$$\theta_{\text{new}} = \theta_{\text{old}} \Big\{1 + O\Big(\frac{1}{\sqrt{m}}\Big)\Big\}$$
Each update therefore moves $\log\theta$ by $O(1/\sqrt{m})$, so $O(m)$ iterations are needed for an $O(1)$ move
Mixing time is $O(m)$

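
The $O(1/\sqrt{m})$ step size can be checked by simulation. The leading term $(W_1 - W_2)(2/m)^{1/2}$ has variance $4/m$, so the standard deviation of $\log(\theta_{\text{new}}/\theta_{\text{old}})$ multiplied by $\sqrt{m}$ should sit near 2 for large $m$. The function name `log_step_sd`, the choice $\theta_{\text{old}} = 1$, the values of $m$ and the seed are assumptions.

```python
import math
import random


def log_step_sd(theta_old, x1, m, n_draws, rng):
    """Monte Carlo sd of log(theta_new / theta_old) under the rewritten update."""
    steps = []
    for _ in range(n_draws):
        h = rng.gammavariate(m / 2 + 1, 1.0)
        chi2 = rng.gammavariate((m - 1) / 2, 2.0)
        theta_new = h / (x1 ** 2 / 2 + chi2 / (2 * theta_old) + 1)
        steps.append(math.log(theta_new / theta_old))
    mean = sum(steps) / n_draws
    return math.sqrt(sum((s - mean) ** 2 for s in steps) / (n_draws - 1))


rng = random.Random(5)  # arbitrary seed
scaled = {m: log_step_sd(1.0, 0.6947, m, 5000, rng) * math.sqrt(m) for m in (100, 1000, 10000)}
```
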
Future work
Construct MCMC schemes for arbitrary nonlinear diffusion processes
Naive schemes with a block update for the path
Better schemes that use a reparameterisation
Joint update of path and parameters (pMCMC)
Application to mixed effects SDEs
Aphid model, real data examples

References
Roberts, G. O. and Stramer, O. On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm. Biometrika, 88(3):603-621, 2001
Gillespie, C. S. and Golightly, A. Bayesian inference for generalized stochastic population growth models with application to aphids. JRSS Series C, Applied Statistics, 59(2):341-357, 2010