Mix & Match Hamiltonian Monte Carlo

1 Mix & Match Hamiltonian Monte Carlo
Elena Akhmatskaya (1,2) and Tijana Radivojević (1)
(1) BCAM - Basque Center for Applied Mathematics, Bilbao, Spain
(2) IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
MCMSki V, Lenzerheide, Switzerland, January 5, 2016

2 Preview
We introduce an alternative to Hamiltonian Monte Carlo (HMC) for efficient sampling in statistical simulations. The new method, Mix & Match Hamiltonian Monte Carlo (MMHMC), was inspired by Generalized Shadow Hybrid Monte Carlo (GSHMC) by Akhmatskaya & Reich.
MMHMC:
- is a generalized HMC that samples with modified Hamiltonians
- offers a computationally effective Metropolis test for the momentum update
- reduces potential negative effects of momentum flips
- relies on a method- and system-specific adaptive integration scheme and compatible modified Hamiltonians
- outperforms, in sampling efficiency, advanced sampling techniques for computational statistics such as HMC and Riemann Manifold HMC
- is efficient for sampling in multidimensional space

3 Behind the scenes: GSHMC
- was introduced for sampling in molecular simulation
- published: Akhmatskaya, Reich (2008), JCOMP, 227, 4934; Akhmatskaya, Bou-Rabee, Reich (2009), JCOMP, 228 (6), 2256
- patented: GB patent (2009), US patent (2011) [Fujitsu; authors: Akhmatskaya, Reich]
- proved successful in simulations of complex molecular systems in biology and chemistry (6 publications)
- had no implementation or testing in statistical computation until 2015
- was never implemented in open source software due to patent restrictions
November 2015: Fujitsu issued a license giving permission (i) to use the patented method in open source software and (ii) to EA to implement and use the know-how.
Current status: GSHMC has been modified and adapted to statistical applications, giving rise to MMHMC, which is implemented in BCAM in-house software [Radivojević, Akhmatskaya, preprint].

4 MMHMC features
For the target density $\pi(\theta)$ of the position vector $\theta$, with momenta $p$ conjugate to $\theta$ and a mass matrix $M$ (a preconditioner), construct a potential function
$U(\theta) = -\log \pi(\theta)$
and a Hamiltonian
$H(\theta, p) = U(\theta) + \frac{1}{2} p^T M^{-1} p$.
Sampling is performed with respect to a modified canonical density
$\pi(\theta, p) \propto \exp(-H_h(\theta, p))$.
E.g., the modified Hamiltonian of order $m = 4$ for the Verlet method is
$H_h(\theta, p) = H(\theta, p) + \frac{h^2}{24}\left(2\, p^T M^{-1} U_{\theta\theta}(\theta) M^{-1} p - U_\theta(\theta)^T M^{-1} U_\theta(\theta)\right)$.
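The formulas on this slide can be tried out on a toy problem. The sketch below is a minimal illustration, not the authors' implementation: it assumes a one-dimensional standard normal target (so $U(\theta) = \theta^2/2$ up to a constant), a scalar mass, and a velocity Verlet step, and all function names are hypothetical. It compares how well $H$ and the fourth-order $H_h$ are conserved along one trajectory.

```python
def potential(theta):
    """U(theta) = -log pi(theta) for a standard normal target, up to a constant (assumed example)."""
    return 0.5 * theta * theta

def grad_potential(theta):
    return theta

def hess_potential(theta):
    return 1.0

def hamiltonian(theta, p, mass=1.0):
    """H = U + p^2 / (2 m), the scalar form of U + (1/2) p^T M^{-1} p."""
    return potential(theta) + 0.5 * p * p / mass

def modified_hamiltonian_verlet(theta, p, h, mass=1.0):
    """Fourth-order modified Hamiltonian for Verlet, scalar form of the slide's formula."""
    u_t = grad_potential(theta)
    u_tt = hess_potential(theta)
    correction = (h * h / 24.0) * (2.0 * p * u_tt * p / mass**2 - u_t * u_t / mass)
    return hamiltonian(theta, p, mass) + correction

def verlet_step(theta, p, h, mass=1.0):
    """One velocity Verlet step: half kick, full drift, half kick."""
    p = p - 0.5 * h * grad_potential(theta)
    theta = theta + h * p / mass
    p = p - 0.5 * h * grad_potential(theta)
    return theta, p

def max_energy_drift(energy, h, n_steps=200, theta=1.0, p=0.0):
    """Largest absolute deviation of `energy` from its initial value along a trajectory."""
    e0 = energy(theta, p)
    worst = 0.0
    for _ in range(n_steps):
        theta, p = verlet_step(theta, p, h)
        worst = max(worst, abs(energy(theta, p) - e0))
    return worst

h = 0.1
drift_h = max_energy_drift(hamiltonian, h)
drift_hh = max_energy_drift(lambda t, q: modified_hamiltonian_verlet(t, q, h), h)
print("drift in H:  ", drift_h)
print("drift in H_h:", drift_hh)
```

For this quadratic potential the drift in $H_h$ should come out markedly smaller than the drift in $H$, which is the property that the modified-Hamiltonian acceptance test relies on.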

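5 MMHMC algorithm
[Flow diagram: $(\theta, p)$ → PMMC → $p^*$ → Hamiltonian dynamics → $(\theta', p')$ → Metropolis test → accept (A), or reject (R) followed by flip (F) → $(\theta^{new}, p^{new})$; weights $w$ are stored for re-weighting]
Partial Momentum Monte Carlo (PMMC)
$p^* = \sqrt{1-\varphi}\, p + \sqrt{\varphi}\, u$ with probability $P = \min\{1, \exp(-\Delta\hat{H}_h)\}$, and $p^* = p$ otherwise,
where $u \sim N(0, M)$ is the noise and $\varphi \in [0, 1]$.
$\Delta\hat{H}_h = \frac{h^2 (6b - 1)}{24}\left(\varphi A + 2\sqrt{\varphi(1-\varphi)}\, B\right)$
$A = u^T M^{-1} U_{\theta\theta}(\theta) M^{-1} u - p^T M^{-1} U_{\theta\theta}(\theta) M^{-1} p$
$B = u^T M^{-1} U_{\theta\theta}(\theta) M^{-1} p$
$b$ is the integrator's parameter.
The extended Hamiltonian
$\hat{H}_h(\theta, p, u) = H_h(\theta, p) + \frac{1}{2} u^T M^{-1} u$
defines the extended reference density $\hat{\pi}(\theta, p, u) \propto \exp(-\hat{H}_h(\theta, p, u))$.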
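The PMMC momentum update can be sketched in a few lines. This is a hedged, one-dimensional illustration only: the scalar mass, the scalar Hessian argument `hess`, and the function names are assumptions made for the example, and the closed-form energy difference is the scalar version of the $A$ and $B$ expressions on this slide.

```python
import math
import random

def delta_extended_hamiltonian(p, u, hess, h, b, phi, mass=1.0):
    """Closed-form change of the extended Hamiltonian for the PMMC proposal (scalar case)."""
    q = hess / (mass * mass)          # scalar version of M^{-1} U_theta_theta M^{-1}
    a_term = u * q * u - p * q * p    # A on the slide
    b_term = u * q * p                # B on the slide
    mix = phi * a_term + 2.0 * math.sqrt(phi * (1.0 - phi)) * b_term
    return h * h * (6.0 * b - 1.0) / 24.0 * mix

def pmmc_step(p, hess, h, b, phi, mass=1.0, rng=random):
    """Propose a partially refreshed momentum and accept it with a Metropolis test."""
    u = rng.gauss(0.0, math.sqrt(mass))                        # noise u ~ N(0, M)
    p_proposed = math.sqrt(1.0 - phi) * p + math.sqrt(phi) * u
    delta = delta_extended_hamiltonian(p, u, hess, h, b, phi, mass)
    # Accept with probability min{1, exp(-delta)}.
    if delta <= 0.0 or rng.random() < math.exp(-delta):
        return p_proposed, True
    return p, False

rng = random.Random(0)
p = 0.5
accepted = 0
for _ in range(1000):
    p, ok = pmmc_step(p, hess=1.0, h=0.2, b=0.25, phi=0.5, rng=rng)
    accepted += ok
print("PMMC acceptance rate:", accepted / 1000)
```

The closed form avoids evaluating the modified Hamiltonian twice: the kinetic terms in $p$ and $u$ are preserved by the rotation, so only the $h^2$ correction term contributes to the difference.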

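6 MMHMC algorithm
[Flow diagram: $(\theta, p)$ → PMMC → $p^*$ → Hamiltonian dynamics → $(\theta', p')$ → Metropolis test → accept (A), or reject (R) followed by flip (F) → $(\theta^{new}, p^{new})$; weights $w$ are stored for re-weighting]
Metropolis test
$(\theta^{new}, p^{new}) = (\theta', p')$ (accept) with probability $\alpha$, and $(\theta^{new}, p^{new}) = F(\theta, p^*)$ (reject) otherwise,
where $\alpha = \min\{1, \exp(-\Delta H_h)\}$ and $\Delta H_h = H_h(\theta', p') - H_h(\theta, p^*)$.
Flip
$F(\theta, p) = (\theta, -p)$, or optionally a reduced flip.
Re-weighting (w)
For every $n = 1, \dots, N$ the algorithm stores $w_n = \exp\left(H_h(\theta_n, p_n) - H(\theta_n, p_n)\right)$.
$\langle \Omega \rangle = \frac{\sum_{n=1}^{N} w_n \Omega_n}{\sum_{n=1}^{N} w_n}$
$\Omega_n$, $n = 1, 2, \dots, N$, are the values of an observable along the sequence of states $(\theta_n, p_n)$.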
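The accept/flip rule and the re-weighting step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: states are plain `(theta, p)` tuples with scalar entries, only the full momentum flip is shown (the optional reduced flip is not), and the function names are hypothetical. Weights are shifted by their largest exponent before exponentiation, which leaves the weighted average unchanged.

```python
import math
import random

def metropolis_with_flip(current, proposal, delta_h_mod, rng=random):
    """Accept `proposal` with probability min{1, exp(-delta_h_mod)}; otherwise flip the momentum.

    current  = (theta, p_star)  state before Hamiltonian dynamics
    proposal = (theta', p')     state after Hamiltonian dynamics
    """
    if delta_h_mod <= 0.0 or rng.random() < math.exp(-delta_h_mod):
        return proposal, True
    theta, p = current
    return (theta, -p), False

def importance_weights(h_modified, h_true):
    """w_n proportional to exp(H_h - H), rescaled by the largest exponent for stability."""
    logs = [hm - ht for hm, ht in zip(h_modified, h_true)]
    shift = max(logs)
    return [math.exp(value - shift) for value in logs]

def reweighted_mean(observables, weights):
    """Weighted average sum(w_n * Omega_n) / sum(w_n)."""
    total = sum(weights)
    return sum(w * o for w, o in zip(weights, observables)) / total

# Hypothetical stored energies for three samples and an observable Omega_n.
h_mod = [1.20, 0.95, 1.40]
h_true = [1.19, 0.97, 1.38]
omega = [0.3, -0.1, 0.8]
weights = importance_weights(h_mod, h_true)
print("re-weighted mean:", reweighted_mean(omega, weights))
```

Because samples are drawn with respect to $\exp(-H_h)$ rather than $\exp(-H)$, the weights $\exp(H_h - H)$ are what turn the sample averages back into averages with respect to the target.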

7 What to expect?
Pros (enhanced sampling):
- High acceptance rates: $H_h$ is conserved by symplectic integrators better than $H$
- Access to second-order information about the target distribution
- Extra parameter for performance enhancement
- Faster convergence to the target PDF
Cons (extra computational cost):
- Computation of $H_h$ for each proposal (higher-order $H_h$ are more expensive)
- Extra Metropolis test for the momentum update
- Accurate numerical integrators are required in order to use low-order $H_h$ for systems with highly oscillatory $H$
Our strategy: find the numerical integrator that provides the best conservation of modified energy, searching within the family of 2-stage splitting integration schemes.

8 Modified Hamiltonians for splitting integrators
Fourth-order modified Hamiltonians for 2-stage splitting integrators were derived in terms of quantities available during simulation by applying the Baker-Campbell-Hausdorff (BCH) formula iteratively:
$H_h(\theta, p) = U(\theta) + K(p) + h^2\left(\alpha\, p^T M^{-1} \dot{U}_\theta(\theta) + \beta\, U_\theta(\theta)^T M^{-1} U_\theta(\theta)\right)$
$\dot{U}_\theta(\theta)$ - numerical time derivative of $U_\theta(\theta)$
$\alpha = \frac{6b - 1}{24}$, $\beta = \frac{6b^2 - 6b + 1}{12}$
$b$ - parameter of an integrator
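As a consistency check on the coefficients, one can substitute $b = 1/4$, the value the slides quote for velocity Verlet. The worked example below assumes the exact time derivative $\dot{U}_\theta = U_{\theta\theta} M^{-1} p$ (which follows from $\dot{\theta} = M^{-1} p$) in place of the numerical one. Under that assumption the two-stage formula reduces to the Verlet modified Hamiltonian of slide 4 evaluated at step size $h/2$, as one would expect, since a two-stage step of length $h$ with $b = 1/4$ is two Verlet steps of length $h/2$.

```latex
\begin{align*}
\alpha\left(\tfrac{1}{4}\right) &= \frac{6 \cdot \tfrac{1}{4} - 1}{24} = \frac{1}{48},
\qquad
\beta\left(\tfrac{1}{4}\right) = \frac{6 \cdot \tfrac{1}{16} - 6 \cdot \tfrac{1}{4} + 1}{12} = -\frac{1}{96}, \\
H_h &= H + \frac{h^2}{48}\, p^T M^{-1} U_{\theta\theta} M^{-1} p
         - \frac{h^2}{96}\, U_\theta^T M^{-1} U_\theta \\
    &= H + \frac{(h/2)^2}{24} \left( 2\, p^T M^{-1} U_{\theta\theta} M^{-1} p
         - U_\theta^T M^{-1} U_\theta \right).
\end{align*}
```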

9 Adaptive integrators for optimal conservation of modified energy
Consider the one-parameter 2-stage splitting integrator of a Hamiltonian system with $H(\theta, p) = U(\theta) + \frac{1}{2} p^T M^{-1} p = A + B$:
$\psi_h = \varphi^B_{bh} \circ \varphi^A_{h/2} \circ \varphi^B_{(1-2b)h} \circ \varphi^A_{h/2} \circ \varphi^B_{bh}$
Our Adaptive Integration Approach (AIA): given the step size $h$ and the highest frequency $f_{max}$, find $b(h, f_{max})$ (a system-specific integrator) that minimizes the expectation of the modified energy error
$\Delta = H^{[4]}_h(\Psi_{h,L}(\theta, p)) - H^{[4]}_h(\theta, p)$,
i.e. $0 \le E(\Delta) \le \rho(h, b) \to$ minimal.
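One step of the two-stage scheme can be sketched directly from the composition above. The sketch assumes that $\varphi^B$ is the momentum update (kick) driven by the gradient of $U$ and $\varphi^A$ is the position update (drift); this reading, the scalar state, and the function names are assumptions of the example rather than something the slides spell out.

```python
def two_stage_step(theta, p, h, b, grad_u, mass=1.0):
    """One step of the one-parameter 2-stage splitting integrator (kick-drift-kick-drift-kick)."""
    p = p - b * h * grad_u(theta)                  # kick of length b*h
    theta = theta + 0.5 * h * p / mass             # drift of length h/2
    p = p - (1.0 - 2.0 * b) * h * grad_u(theta)    # kick of length (1-2b)*h
    theta = theta + 0.5 * h * p / mass             # drift of length h/2
    p = p - b * h * grad_u(theta)                  # kick of length b*h
    return theta, p

def verlet_step(theta, p, h, grad_u, mass=1.0):
    """One velocity Verlet step, used here only for comparison."""
    p = p - 0.5 * h * grad_u(theta)
    theta = theta + h * p / mass
    p = p - 0.5 * h * grad_u(theta)
    return theta, p

def grad_u(theta):
    """Gradient of an assumed anharmonic potential U = theta^2/2 + theta^4/4."""
    return theta + theta ** 3

h = 0.2
two_stage = two_stage_step(1.0, 0.3, h, 0.25, grad_u)
two_verlet = verlet_step(*verlet_step(1.0, 0.3, h / 2.0, grad_u), h / 2.0, grad_u)
print(two_stage)
print(two_verlet)
```

With $b = 0.25$ the two printed states agree up to rounding, because the two adjacent half kicks of consecutive Verlet steps merge into the middle kick of length $(1 - 2b)h = h/2$. Other values of $b$ give different members of the family at the same cost of two gradient evaluations per step when the trailing and leading kicks of consecutive steps are merged.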

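10 Adaptive integrators for optimal conservation of modified energy
$\rho(h, b) = \frac{\left( (\frac{1}{2} + h^2 \beta) B_h + (\frac{1}{2} + h^2 \alpha) C_h \right) (B_h + C_h)}{1 - A_h^2}$
$A_h = \frac{h^4 b (1 - 2b)}{4} - \frac{h^2}{2} + 1$
$B_h = -\frac{h^3 (1 - 2b)}{4} + h$
$C_h = -\frac{h^5 b^2 (1 - 2b)}{4} + h^3 b (1 - b) - h$
Then:
$b(h, f_{max}) = \arg\min_{b \in B} \max_{0 < h < \bar{h}} \rho(h, b)$,
where $\bar{h}$ (reduced step size) is a function of $f_{max}$.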
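The min-max problem can be approximated by a brute-force grid search. The sketch below is only an illustration of the formulas on this slide: the grid ranges, the grid resolution, the treatment of points with $1 - A_h^2 \le 0$ as infinitely bad, and all function names are assumptions, and how $\bar{h}$ is obtained from $f_{max}$ is not covered.

```python
import math

def modified_coefficients(b):
    """alpha(b) and beta(b) of the fourth-order modified Hamiltonian."""
    alpha = (6.0 * b - 1.0) / 24.0
    beta = (6.0 * b * b - 6.0 * b + 1.0) / 12.0
    return alpha, beta

def abc_coefficients(h, b):
    """The quantities A_h, B_h and C_h for the two-stage family."""
    a_h = h**4 * b * (1.0 - 2.0 * b) / 4.0 - h**2 / 2.0 + 1.0
    b_h = -h**3 * (1.0 - 2.0 * b) / 4.0 + h
    c_h = -h**5 * b * b * (1.0 - 2.0 * b) / 4.0 + h**3 * b * (1.0 - b) - h
    return a_h, b_h, c_h

def rho(h, b):
    """Bound rho(h, b) on the expected modified energy error."""
    alpha, beta = modified_coefficients(b)
    a_h, b_h, c_h = abc_coefficients(h, b)
    denom = 1.0 - a_h * a_h
    if denom <= 0.0:
        return math.inf  # assumption: treat such points as unusable
    numer = ((0.5 + h * h * beta) * b_h + (0.5 + h * h * alpha) * c_h) * (b_h + c_h)
    return numer / denom

def worst_case_rho(h_bar, b, n_h=200):
    """Maximum of rho over a uniform grid of step sizes in (0, h_bar)."""
    return max(rho(h_bar * k / n_h, b) for k in range(1, n_h))

def aia_parameter(h_bar, b_grid, n_h=200):
    """Grid value of b minimizing the worst-case rho."""
    return min(b_grid, key=lambda b: worst_case_rho(h_bar, b, n_h))

b_grid = [0.15 + 0.001 * k for k in range(151)]   # assumed search range 0.15 .. 0.30
b_opt = aia_parameter(1.0, b_grid)
print("b chosen for h_bar = 1:", b_opt)
print("worst-case rho at that b:", worst_case_rho(1.0, b_opt))
print("worst-case rho at b = 0.25:", worst_case_rho(1.0, 0.25))
```

A grid search is the simplest choice for a one-dimensional parameter; a bounded scalar minimizer could replace it if the search had to be repeated for many values of $\bar{h}$.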

11 AIA in action
[Figure: the AIA parameter $b$ as a function of $h$, and medESS/t as a function of $h$ for AIA, VV, BCSS and HMC with VV; one panel is labelled D = 1000. Caption: Efficiency in sampling a multivariate Gaussian distribution]
VV - velocity Verlet, b = 0.25
BCSS - a fixed-parameter integrator derived for MMHMC using ideas of Blanes et al. (2014)
S. Blanes, F. Casas, J.M. Sanz-Serna (2014), Numerical integrators for the Hybrid Monte Carlo method, SIAM J. Sci. Comput., 36(4), A1556-A1580

13 Benchmark Models
- Banana-shaped distribution: D = 2
- Bayesian Logistic Regression: datasets german, sonar, musk and secom [table: number of parameters (D) and number of observations (K) per dataset; the values did not survive extraction]
- Stochastic Volatility: sampling from $p(x \mid y, \theta)$ and $p(\theta \mid y, x)$; simulated data with d = 2000, d = 5000 and d = 10000
Lichman, M. (2013), UCI Machine Learning Repository, Irvine, CA: University of California, School of Information and Computer Science

14 Testing
Methods: MMHMC, HMC, RMHMC
Criteria: space exploration, sampling efficiency, convergence
Metrics:
- AR - acceptance rate
- ESS* - time-normalized effective sample size
- EF - efficiency factor (relative ESS* w.r.t. HMC)
- $\hat{R}$ - potential scale reduction factor
Results are averaged over 10 runs.
Choice of simulation parameters: each method was tuned for the best performance.
M. Girolami, B. Calderhead (2011), Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods, Journal of the Royal Statistical Society: Series B, 73(2):123-214
S. P. Brooks, A. Gelman (1998), General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics, 7
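For readers unfamiliar with the convergence metric, the sketch below computes a basic potential scale reduction factor from several chains of equal length, together with the efficiency factor as a plain ratio. It uses the textbook between/within variance form and is an assumption of this example; the slides do not say which variant of $\hat{R}$ or which ESS estimator was used, and the function names are hypothetical.

```python
import math
import statistics

def potential_scale_reduction(chains):
    """Basic R-hat for a scalar quantity from m chains of equal length n (n >= 2, m >= 2)."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean(statistics.variance(chain) for chain in chains)   # W
    between_over_n = statistics.variance(chain_means)                           # B / n
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)

def efficiency_factor(ess_per_time_method, ess_per_time_hmc):
    """EF: time-normalized ESS of a method relative to that of HMC."""
    return ess_per_time_method / ess_per_time_hmc

# Hypothetical chains: the first pair agrees, the second pair sits at different levels.
mixed = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
stuck = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
print("R-hat, agreeing chains:   ", potential_scale_reduction(mixed))
print("R-hat, disagreeing chains:", potential_scale_reduction(stuck))
```

Values of $\hat{R}$ close to 1 indicate that the chains are indistinguishable at the level of means and variances; values well above 1 indicate that they have not yet converged to a common distribution.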

15 Banana distribution: 5 sampling paths
[Figure: panels RWMH, HMC, MMHMC, RMHMC]

16 Banana distribution: 5000 samples
[Figure: panels RWMH, HMC, MMHMC, RMHMC]

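17 Bayesian Logistic Regression (BLR)
[Figure: acceptance rate (AR) and efficiency factor (EF; minESS, medESS, maxESS) of HMC and MMHMC for german (D = 25), sonar (D = 61), musk (D = 167) and secom (D = 444)]

18 Stochastic volatility (SV): d=
[Figure: AR for θ and x, and EF (minESS, medESS, maxESS) for β, σ and φ; methods HMC, RMHMC, MMHMC]

19 SV: d=
[Figure: AR for θ and x, and EF (minESS, medESS, maxESS) for β, σ and φ; methods HMC, MMHMC]

20 SV: d=
[Figure: AR for θ and x, and EF (minESS, medESS, maxESS) for β, σ and φ; methods HMC, MMHMC]

21 SV Convergence
[Figure: $\hat{R}$ for β, σ and φ versus MC iterations, for d = 2000, d = 5000 and d = 10000; methods HMC, RMHMC, MMHMC]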

22 Summary
MMHMC vs HMC
- MMHMC demonstrates higher AR, bigger ESS* and faster convergence
MMHMC vs RMHMC
- SV model: MMHMC and RMHMC demonstrate comparable sampling performance, with slight dominance of MMHMC
- BLR model: RMHMC does not improve on HMC for the considered dimensions, whereas MMHMC outperforms HMC by a factor of up to 8 for D ≥ 25
In contrast to RMHMC, MMHMC
- does not require matrix inversion (computationally less expensive)
- relies on separable Hamiltonians, which allows the use of new, more efficient numerical integrators
- is efficient for high-dimensional problems

23 Guggenheim, Bilbao Modeling & Simulation in Life & Materials Sciences: MSLMS BCAM
