A random walk from molecular dynamics to data science

Similar documents
Thermostatic Controls for Noisy Gradient Systems and Applications to Machine Learning

From Physical Modelling to Big Data Analytics: Examples and Challenges

Gaussian Process Approximations of Stochastic Differential Equations

Effective dynamics for the (overdamped) Langevin equation

Monte Carlo methods for sampling-based Stochastic Optimization

Heat bath models for nonlinear partial differential equations Controlling statistical properties in extended dynamical systems

Manifold Monte Carlo Methods

Introduction to Machine Learning CMU-10701

Random Vibrations & Failure Analysis Sayan Gupta Indian Institute of Technology Madras

Gaussian processes for inference in stochastic differential equations

Stochastic differential equation models in biology Susanne Ditlevsen

Development of Stochastic Artificial Neural Networks for Hydrological Prediction

Numerical methods in molecular dynamics and multiscale problems

Introduction to Computational Stochastic Differential Equations

A variational radial basis function approximation for diffusion processes

LECTURE 15 Markov chain Monte Carlo

Introduction to Computer Simulations of Soft Matter Methodologies and Applications Boulder July, 19-20, 2012

Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks

Markov Chains and MCMC

Applied Mathematics Methods Stream

M4A42 APPLIED STOCHASTIC PROCESSES

Kurume University Faculty of Economics Monograph Collection 18. Theoretical Advances and Applications in. Operations Research. Kyushu University Press

Computing ergodic limits for SDEs

Handbook of Stochastic Methods

Control Theory in Physics and other Fields of Science

Deep Relaxation: PDEs for optimizing Deep Neural Networks

Riemann Manifold Methods in Bayesian Statistics

Monte Carlo Methods. Handbook of. University ofqueensland. Thomas Taimre. Zdravko I. Botev. Dirk P. Kroese. Universite de Montreal

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

The parallel replica method for Markov chains

Markov Chain Monte Carlo Method

Recent Advances in Bayesian Inference Techniques

STA 4273H: Statistical Machine Learning

A tutorial on numerical transfer operator methods. numerical analysis of dynamical systems

The Molecular Dynamics Method

V e h i c l e I C T A r e n a I n n o v a t i o n B a z a a r L i n d h o l m e n S c i e n c e P a r k

Advanced Molecular Dynamics

Everyday NMR. Innovation with Integrity. Why infer when you can be sure? NMR

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo

Computational Physics. J. M. Thijssen

SMSTC: Probability and Statistics

Second Year Chemistry 2014

Probabilistic Graphical Models

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

Gillespie s Algorithm and its Approximations. Des Higham Department of Mathematics and Statistics University of Strathclyde

Second Year Chemistry 2012

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

From cell biology to Petri nets. Rainer Breitling, Groningen, NL David Gilbert, London, UK Monika Heiner, Cottbus, DE

Stat 535 C - Statistical Computing & Monte Carlo Methods. Lecture February Arnaud Doucet

Titles and Abstracts

Introduction to asymptotic techniques for stochastic systems with multiple time-scales

Lecture 7 and 8: Markov Chain Monte Carlo

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo

Homework 3 due Friday, April 26. A couple of examples where the averaging principle from last time can be done analytically.

Molecular dynamics simulation. CS/CME/BioE/Biophys/BMI 279 Oct. 5 and 10, 2017 Ron Dror

CONTENTS. Preface List of Symbols and Notation

The Kramers problem and first passage times.

DUBLIN CITY UNIVERSITY

Numerical Solutions of ODEs by Gaussian (Kalman) Filtering

THE UNIVERSITY OF MANCHESTER PARTICULARS OF APPOINTMENT FACULTY OF ENGINEERING AND PHYSICAL SCIENCES SCHOOL OF PHYSICS AND ASTRONOMY

Algorithms for Ensemble Control. B Leimkuhler University of Edinburgh

Unravelling the nuclear pore complex

Random Walks A&T and F&S 3.1.2

Basics of Statistical Mechanics

2.3 Modeling Interatomic Interactions Pairwise Potentials Many-Body Potentials Studying Biomolecules: The Force

Numerical Integration of SDEs: A Short Tutorial

Approximation of Top Lyapunov Exponent of Stochastic Delayed Turning Model Using Fokker-Planck Approach

Speeding up Convergence to Equilibrium for Diffusion Processes

CSC 2541: Bayesian Methods for Machine Learning

Introduction to multiscale modeling and simulation. Explicit methods for ODEs : forward Euler. y n+1 = y n + tf(y n ) dy dt = f(y), y(0) = y 0

Pattern Recognition and Machine Learning

Handbook of Stochastic Methods

Stat 516, Homework 1

What Can Physics Say About Life Itself?

Syllabus BINF Computational Biology Core Course

Systematic Closure Approximations for Multiscale Simulations

Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści

Monte Carlo in Bayesian Statistics

Different approaches to model wind speed based on stochastic differential equations

A Degree In Chemistry

MSc Dissertation topics:

Chapter 2. Brownian dynamics. 2.1 Introduction

Molecular Dynamics Lecture 3

J. Sadeghi E. Patelli M. de Angelis

Density estimation. Computing, and avoiding, partition functions. Iain Murray

A Statistical Input Pruning Method for Artificial Neural Networks Used in Environmental Modelling

Lecture 12: Detailed balance and Eigenfunction methods

Convex Optimization CMU-10725

Recommended Courses by ECE Topic Area Graduate Students

STOCHASTIC PROCESSES IN PHYSICS AND CHEMISTRY

Lecture 4: Introduction to stochastic processes and stochastic calculus

Introduction to Machine Learning CMU-10701

Path Integrals. Andreas Wipf Theoretisch-Physikalisches-Institut Friedrich-Schiller-Universität, Max Wien Platz Jena

Sampling-based probabilistic inference through neural and synaptic dynamics

A mathematical framework for Exact Milestoning

Elementary Applications of Probability Theory

Afternoon Meeting on Bayesian Computation 2018 University of Reading

Seminar: Data Assimilation

String method for the Cahn-Hilliard dynamics

Markov Chain Monte Carlo

Transcription:

A random walk from molecular dynamics to data science Benedict Leimkuhler University of Edinburgh The Maxwell Institute for Mathematical Sciences and The Alan Turing Institute

Molecular Dynamics (MD) Biomolecules Materials cytochrome C folding: Elber and Cardenas Quasi-continuum method: molecular dynamics with coarse-graining for indentation studies. Ortiz, Phillips and Tadmor

Molecular Dynamics 2017: article in Drug Discovery Today Molecular dynamics-driven drug discovery: leaping forward with confidence developing a market-approved drug costs a staggering $2.6 billion Given that the dynamics of the covalent bonds involving hydrogen atoms are not crucial in biological problems, they are usually constrained using integration algorithms, such as SHAKE, RATTLE, Hence, a time-step value in the range of 1.5 2 fs is possible and has been shown to be suitable for MD simulations of many biological systems." Examples of some 2017 simulations out of around 20000 published Ligand binding in the HIV-integrase enzyme Druggability of membrane bound Rab5 Fission fragment damage in nuclear fuel Patchy particle systems Nanoparticle cholestoral metabolism therapeutics Multiscale protein dynamics in antigen presentation Bottlebrush copolymers Modification of membrane structure by lithium Smectic C semiflexible polymers Graphene reinforced polymer nanocomposites Unsaturated Zwitterionic lipids Identifying the warfarin binding site

HIV-1 Virus Capsid (protective shield around virus) Molecular dynamics 64M atoms (including surroundings) Described by a potential energy function U(x) Chaotic, nonlinear dynamics, ever expanding scale Key questions are of a probabilistic nature

<latexit sha1_base64="rlexfyzr8cyk87qy3eae4nayum0=">aaacbnicbvdlssnafl2pr1pfuzcidbahbkoigi6lblxwsa9oqplmpu3qmstmtioldoxgx3hjqhg3fom7/8zpm4w2hhg4c8693htpkhcmton8w4wv1bx1jejmawt7z3fp3j9oqjivhdzizgpzdrcinew0oznmtj1iikxaassy3kz91ohkxeloxo8t6gvcj1ipeayn1lwpprzp5i2wtaas8ncgmk8kfe48kzpf1y47vwcgtezcnjqhr71rf3lhtfjbi004vqrjoon2myw1i5xosl6qailjepdpx9aic6r8bhbgbj0ajus9wjpnlpqpvzsyljqai8bucqwhatgbiv95nvt3rvymrumqautmg3oprzpg00xqycqlmo8nwuqysysiaywx0sa5kgnbxtx5mttpq65tde8uyrxrpi4ihmejvmcfs6jbldshaqqe4rle4c16sl6sd+tjxlqw8p5d+apr8wfywzgh</latexit> <latexit sha1_base64="rlexfyzr8cyk87qy3eae4nayum0=">aaacbnicbvdlssnafl2pr1pfuzcidbahbkoigi6lblxwsa9oqplmpu3qmstmtioldoxgx3hjqhg3fom7/8zpm4w2hhg4c8693htpkhcmton8w4wv1bx1jejmawt7z3fp3j9oqjivhdzizgpzdrcinew0oznmtj1iikxaassy3kz91ohkxeloxo8t6gvcj1ipeayn1lwpprzp5i2wtaas8ncgmk8kfe48kzpf1y47vwcgtezcnjqhr71rf3lhtfjbi004vqrjoon2myw1i5xosl6qailjepdpx9aic6r8bhbgbj0ajus9wjpnlpqpvzsyljqai8bucqwhatgbiv95nvt3rvymrumqautmg3oprzpg00xqycqlmo8nwuqysysiaywx0sa5kgnbxtx5mttpq65tde8uyrxrpi4ihmejvmcfs6jbldshaqqe4rle4c16sl6sd+tjxlqw8p5d+apr8wfywzgh</latexit> <latexit sha1_base64="rlexfyzr8cyk87qy3eae4nayum0=">aaacbnicbvdlssnafl2pr1pfuzcidbahbkoigi6lblxwsa9oqplmpu3qmstmtioldoxgx3hjqhg3fom7/8zpm4w2hhg4c8693htpkhcmton8w4wv1bx1jejmawt7z3fp3j9oqjivhdzizgpzdrcinew0oznmtj1iikxaassy3kz91ohkxeloxo8t6gvcj1ipeayn1lwpprzp5i2wtaas8ncgmk8kfe48kzpf1y47vwcgtezcnjqhr71rf3lhtfjbi004vqrjoon2myw1i5xosl6qailjepdpx9aic6r8bhbgbj0ajus9wjpnlpqpvzsyljqai8bucqwhatgbiv95nvt3rvymrumqautmg3oprzpg00xqycqlmo8nwuqysysiaywx0sa5kgnbxtx5mttpq65tde8uyrxrpi4ihmejvmcfs6jbldshaqqe4rle4c16sl6sd+tjxlqw8p5d+apr8wfywzgh</latexit> <latexit sha1_base64="rlexfyzr8cyk87qy3eae4nayum0=">aaacbnicbvdlssnafl2pr1pfuzcidbahbkoigi6lblxwsa9oqplmpu3qmstmtioldoxgx3hjqhg3fom7/8zpm4w2hhg4c8693htpkhcmton8w4wv1bx1jejmawt7z3fp3j9oqjivhdzizgpzdrcinew0oznmtj1iikxaassy3kz91ohkxeloxo8t6gvcj1ipeayn1lwpprzp5i2wtaas8ncgmk8kfe48kzpf1y47vwcgtezcnjqhr71rf3lhtfjbi004vqrjoon2myw1i5xosl6qailjepdpx9aic6r8bhbgbj0ajus9wjpnlpqpvzsyljqai8bucqwhatgbiv95nvt3rvymrumqautmg3oprzpg00xqycqlmo8nwuqysysiaywx0sa5kgnbxtx5mttpq65tde8uyrxrpi4ihmejvmcfs6jbldshaqqe4rle4c16sl6sd+tjxlqw8p5d+apr8wfywzgh</latexit> <latexit sha1_base64="5zgcieh/9te+m1b+d1lp/bb74sm=">aaacenicbvc9tsmwghtkxyl/auywiwqphagsharjbqtjkuhbqqmv4zitvtujbae1ivimllwkcwmistkx8ta4bqzoocns6e4+ff7otxivyrk+jdlk6tr6rnmzsrw9s7tn7h+0zzwktbwcs1h0fsqjoxfxffwmdbnbepcz6fij66nfesbc0ji6u5oeebwnihpsjjsw+my9cwwhqe7ytdauq+gmik5udml9dupojydfynw3q1bdmgeue7sgvvcg1te/3cdgkserwgxj2botrhkzeopirvkkm0qsidxca9ltnekcsc+bnztde60emiyffpgcm/x3ria4lbpu6yrhaigxvan4n9dlvxjpztrkukuipf8upgzqm6f9wiakghwbaikwopqvea+rqfjpfiu6bhvx5gxspmvyvso+pa82r4o6yuaihimasmefaiib0aiowoarpinx8gy8gs/gu/exj5amyuyq/ihx+qn+vpy4</latexit> <latexit sha1_base64="zohxt+42lb03snodu1wswi5h9ec=">aaab9xicbvbnswmxem3wr1q/qh69bitql2vxbd0wvxisyd+gu5zsntugzpmlma2u0v/hxymixv0v3vw3pu0etpxbwoo9gwbmhangblz32ymsrw9sbhw3szu7e/sh5cojllgzpqxjlvc6exldbjescrwe66saksqurb0ob2d+e8s04uo+wdhlqul6ksecerdsoz8ioh3wqk8jbee9cswtuxpgvellpijynhrllz9sneuybcqimv3ptsgyea2ccjyt+zlhkafd0mddsyvjmakm86un+mwqey6vtiubz9xfexosgdnoqtuzebiyzw8m/ud1m4ivgwmxaqzm0swiobmyfj5fgcougquxtorqze2tma6ijhrsucubgrf88ippxdq8t+bdx1bqn3kcrxsctleveegk1dedaqamokijz/sk3pwn58v5dz4wrqunnzlgf+b8/galo5jd</latexit> <latexit sha1_base64="zohxt+42lb03snodu1wswi5h9ec=">aaab9xicbvbnswmxem3wr1q/qh69bitql2vxbd0wvxisyd+gu5zsntugzpmlma2u0v/hxymixv0v3vw3pu0etpxbwoo9gwbmhangblz32ymsrw9sbhw3szu7e/sh5cojllgzpqxjlvc6exldbjescrwe66saksqurb0ob2d+e8s04uo+wdhlqul6ksecerdsoz8ioh3wqk8jbee9cswtuxpgvellpijynhrllz9sneuybcqimv3ptsgyea2ccjyt+zlhkafd0mddsyvjmakm86un+mwqey6vtiubz9xfexosgdnoqtuzebiyzw8m/ud1m4ivgwmxaqzm0swiobmyfj5fgcougquxtorqze2tma6ijhrsucubgrf88ippxdq8t+bdx1bqn3kcrxsctleveegk1dedaqamokijz/sk3pwn58v5dz4wrqunnzlgf+b8/galo5jd</latexit> <latexit sha1_base64="zohxt+42lb03snodu1wswi5h9ec=">aaab9xicbvbnswmxem3wr1q/qh69bitql2vxbd0wvxisyd+gu5zsntugzpmlma2u0v/hxymixv0v3vw3pu0etpxbwoo9gwbmhangblz32ymsrw9sbhw3szu7e/sh5cojllgzpqxjlvc6exldbjescrwe66saksqurb0ob2d+e8s04uo+wdhlqul6ksecerdsoz8ioh3wqk8jbee9cswtuxpgvellpijynhrllz9sneuybcqimv3ptsgyea2ccjyt+zlhkafd0mddsyvjmakm86un+mwqey6vtiubz9xfexosgdnoqtuzebiyzw8m/ud1m4ivgwmxaqzm0swiobmyfj5fgcougquxtorqze2tma6ijhrsucubgrf88ippxdq8t+bdx1bqn3kcrxsctleveegk1dedaqamokijz/sk3pwn58v5dz4wrqunnzlgf+b8/galo5jd</latexit> <latexit sha1_base64="zohxt+42lb03snodu1wswi5h9ec=">aaab9xicbvbnswmxem3wr1q/qh69bitql2vxbd0wvxisyd+gu5zsntugzpmlma2u0v/hxymixv0v3vw3pu0etpxbwoo9gwbmhangblz32ymsrw9sbhw3szu7e/sh5cojllgzpqxjlvc6exldbjescrwe66saksqurb0ob2d+e8s04uo+wdhlqul6ksecerdsoz8ioh3wqk8jbee9cswtuxpgvellpijynhrllz9sneuybcqimv3ptsgyea2ccjyt+zlhkafd0mddsyvjmakm86un+mwqey6vtiubz9xfexosgdnoqtuzebiyzw8m/ud1m4ivgwmxaqzm0swiobmyfj5fgcougquxtorqze2tma6ijhrsucubgrf88ippxdq8t+bdx1bqn3kcrxsctleveegk1dedaqamokijz/sk3pwn58v5dz4wrqunnzlgf+b8/galo5jd</latexit> Turning it into maths Physicists tell us that what they need to do is calculate Z '(x)dµ(x) observable function '( ) Gibbs measure just integration but with 180M variables!

<latexit sha1_base64="pohpw9humefyccb8iv+v21wdfp8=">aaack3icbvblswmxgmzwv62vqkcvwslus9kvqs9cqqoek9ghdjclm6ztajjdkmyxlpt/vphxpojbb179h2bbhmrrb4hjzdckm0heqnk2/wnlvlbx1jfym4wt7z3dvel+qvofscskgumwynaafgfukiammpf2janiasotyhid6a0rkyqg4kgpi+jx1be0rzhshvklnzcjpqic5db1xr5dd4rknkdlx1n4bv0qtj+4gdf4k85liss57kbgyg5+swrx7mnazedmqanmpu4xx91uignohmymkdvx7eh7czkaykbsghsreie8rh3smvagtpsxtlkm8mqwxdglptlcwwk770gqv2rma7ozjvolwkb+p3vi3bv0eiqiwbobpw/1ygz1clpiyjdkgjubg4cwpoavea+qrfibegumbgcx8jjonlucu+lcn5eqtvkdexaejkezooacvmedqimgwoajvib38ge9w2/wl/u9xc1zm88h+dpwzy/cwkbk</latexit> invariant measure <latexit sha1_base64="ufox4zita877+ikq+m6ienxmvh0=">aaab6nicbvdlsgnbeoynrxhfuy9ebopgkeykomegf48rzqosjcxozpmhm7pltk8qqj7biwdfvppf3vwbj8kenlggoajqprsrsqww6pvfxmftfwnzq7hd2tnd2z8ohx41bzizxhsskylpr9ryktrvoedj26nhvewst6lr7cxvpxfjraifczzyungbfrfgfj300fvzr1zxq/4czjueoalajnqv/nxtjyxtxcot1npo4kcytqhbwssflrqz5sllizrghuc1vdygk/mpu3lmld6je+nki5mrvycmvfk7vphrvbshdtmbif95nqzj63aidjoh12yxkm4kwytm/iz9ythdoxaemipcryqnqaemxtolf0kw/piqav5ua78a3f9wajd5heu4gvm4hwcuoaz3uicgmbjam7zcmye9f+/d+1i0frx85hj+wpv8av1ejdy=</latexit> <latexit sha1_base64="ypu/zz/rjuesmglf2kjx3atf4h4=">aaab+3icbvbns8naen3ur1q/yj16crahxkoigl6eohepfewhtdfstpt26wytdifsepjxvhhqxkt/xjv/xm2bg7y+ghi8n8pmpd/mtiftfxultfwnza3ydmvnd2//wdysdlsuselbjokr7plyuc4ebqmdtnuxpdj0oe36k9uz332iurfipeaauzfei8ecrjboytor9aknzrxmxnv544cjaflprnknew5rltgfqaeclc/8ggwjkoruaofyqb5jx+bmwaijnoavqajojmkej2hfu4fdqtxsfntunwplaawr1cxamqu/jzickpwgvu4mmyzvsjct//p6cqrxbszenaavzleoslgfktulwhoysqnwvbnmjno3wmsmjsag46roejzll1dj57zh2a3n/qlwvcnikknjdilqyegxqinuuau1euft9ixe0zurgy/gu/gxac0zxcwr+gpj8wdvk5ql</latexit> <latexit sha1_base64="zhzqntoszix3l2pm3wlwblnlzai=">aaacinicbvdlsgmxfm3uv62vqks3wsjuhdijgroofn24rgaf0kldjs20oulmsdlfmsy3upfx3lhq1jxgx5i2s9dwa4hdoedyc48fmaq0bx9zuaxlldw1/hphy3nre6e4u9duyswxaecqhblti0uyfashqwakhumcum9iyx9et/zwiehfq3gnxxhpctqxnkayasn5xut3ghtijpcmbjt1okxcn5aij06a8bmnha6kuzeiqp3ec5jlyg+eopakjbtitwexizoreshq94ofbi/emsdcy4au6jh2plsjkppirtkcgyssitxefdixvcbovdeznpjci6p0ybbk84sgu/x3rik4umpumyrheqdmvyn4n9ejdxdrtaiiyk0eni0kygz1ccd9wr6vbgs2ngrhsc1fir4g05a2rrzmcc78yyukevpx7ipze1aqxwv15meboarl4ibzuam3oa4aainh8axewzv1zl1y79bnljqzspl98afw9w8ndapx</latexit> <latexit sha1_base64="piajc1yww/awe6ss5xb0fgy0lp0=">aaacgxicbvbps8mwhe39o+e/qkcvwshmg6mvqs/c0ivhcxybrrwkabafjw1jutko/rpe/cpepcjiuu9+g7otb918epj47/1ifi9igjxksr6nhcwl5zxv0lp5fwnza9vc2w3kobwyodhmswghsbjgi+ioqhhpj4ighjdscgzxy7/1qiskcxsrrgnxoopftesxulrytstzbydh7vk0ojycf/dozztp77njo4dex9drrg6l3na3k1bnmgdoe7sgfvcg4zufbhjjljniyyak7nhworwmcuuxi3nztsvjeb6ghulogifopjdnnsvhovzc2i2fppgce/x3ria4lcme6crhqi9nvbh4n9djvffcy2iupipeeppqn2vqxxbcewypifixksyic6r/cnefcysvlross7bnv54nzzoabdxsm9nk/bkoowt2wqgoahucgtq4bg3gaawewtn4bw/gk/fivbsf0+icuczsgt8wvn4a71ue/g==</latexit> Sampling and MCMC Typical approach: using a Markov Chain with prescribed Example method: Metropolis Monte Carlo

<latexit sha1_base64="ipbhajzhpzzplun7nsgjggxlzim=">aaacdxicbvdltsjafj3ic/fvdelmipq0iwdbmojghojgjsywskcq6tdahokjm1mdifyag3/fjquncevenx/jaf0oejkbnjxzb+69x48zfdkyvrxmyura+kz2m7e1vbo7p+8fvewucexchlgi130kckmhcswvjnrjtldgm1lzbzdtv/zaukbrec9hmfec1atpl2ikldtwt1xjamiraphfdoayxrdlfg2z5cacnozcwtbbet4qwtpazwknja9svnr6v7mt4sqgocqmcdgwrvh6y8qlxyxmcs1ekbjhaeqrhqihcojwxrnvjvbukr3yjbiqumkz+ntijaihrogvogmk+2lrm4r/ey1edi+9mq3jrjiqzxd1ewzlbkfrwa7lbes2ugrhttwtepcrr1iqahmqbhvx5wvsduq2vblvzvpl6zsoldgcx8aanrgazxalksafgdycz/ak3rqn7uv71z7mrrktntkef6b9/gcqezy6</latexit> <latexit sha1_base64="u++gakoc+pkxbqglatnj+qg4cfq=">aaab8hicdvdltsjafj3ic/gfunqzkzjgwqblzy7oxiumfjbqyxqyymj02sxmjathk9y40bi3fo47/8ypykjgt3ktk3puzb33ecgjulnwh5fawv1b30hvzra2d3b3svshlrleahmhbywqhq9jwignjqkkku4ocpi9rtre5clx23desbrwazunieujeaddiphs0g25jc+c/p3prj/nwwalwq4xi9ay7vq5vk9iovsu2da0twuohfii2c++9wybjnzcfwziyq5thcqnkvaumzll9cjjqoqnaes6mnlke+ng84nn8eqrazgmhc6u4fz9phejx8qp7+loh6mx/o0l4l9en1ldmhtthkakclxyniwyvafmvocdkghwbkojwolqwyeei4gw0hlldahfn8l/satg2jqiq1kucb6miw2owdhiaxtuqqncgizwaay+eabp4nkqxqpxyrwuwlpgcuyq/idx9gl2hjaw</latexit> Example: Metropolis Monte Carlo for the Uneven Double Well Gaussian prior: rare events slow convergence too much measure in the right basin

<latexit sha1_base64="9tto/n8ofer8lgngebhgmxsljd8=">aaacjnicbzdnsgmxfiuz/lv/qi7dbitqectmexqjig5cvndaqmcomtrtg5nmmnyrlmgexo2v4safiulorzftr9dwcygh79xlck8qc67btj+tufmfxaxlldxc2vrg5lzxe6euo0rr5tjirkozem0el8wfdoi1y8vigajwco6urn7jgsnni3klw5j5ielj3uwugeht4nnqqrb3sge+x8eejieg2c0pdnhoar9ht98rskvzd5vcjawn7wljrtjjwrpcyuuj5vvrf1+9tkstkemggmjdcuwy/jqo4fswroalmswe3peeaxkpsci0n47xzpcbir3cjzq5evcy/p5isaj1maxmz0igr6e9efzpayxqpfntlumemksth7qjwbdhuwa4wxwjiizgekq4+sumfaiibznswytgtk88k+rvimnxnjut0svlhsck2kp7qiwcdiou0dwqirdr9iie0st6s56sf+vd+pi0zln5zc76u9bxn3xho/g=</latexit> Sampling using SDEs Ito-type Stochastic Differential Equation Brownian Dynamics Wiener measure describes a particle diffusing in a potential U at fixed temperature

Fokker-Planck Equation The stochastic paths of an Itō SDE sample the evolving distribution with probability density satisfying @ @t = L L is the L 2 adjoint of L defined by (L g)(x) = lim t!0 E[g(x( t)) x(0) = x] g(x) t

<latexit sha1_base64="lnwzustj1+34cpnlf+gcgyowa1o=">aaacrxicbvblaxsxgnsmetovtz3mimiczifmnwtasyc0bhp0ohyclrnoza0timkx6vsnruyf6yx33pipeumhjesaah0h8hoqjgbmq58myawweia3wdyh+yxfpewv2ura+szm/eonrs0kw3ihztizzwm1xarnoyba8rpcckosyu+ti++vfzrmxopm/4rjzvukdrvibapgpbhoibqqdkcmgi6agpndyij0cpmsw7nbi8rqcnf4dpimqclhonnvhn1d7ihrefacpsreurglitsuy0duut4l43ojbivt4lckmpegmqed12/iigof4hqypnb2ojchvqmgbjo8rjhc8pyyczrkpu81vdz23bsfeu94zydtzpijau/v5xookmsnkvhjal/72qve97xeaenxvhm6l4br9vhqwkgmga4qxqnhoam58yqyi/yumi2ooqx88tvfqvt6y29jd78vha3o5kbx9g1wxzlaqtuoisl0br2hh6inooihx+g3+ov+bdfbn+auuh+mzgwzmc/obykh/9cqsmu=</latexit> <latexit sha1_base64="ur9aqlkrsnvuaj45yzb/1dknvmy=">aaab/hicbzdnssnafiun/tb6f+3szwar6sksikaboejgzqxtftpyjtpbdugke2ymygj1vdy4umstd+lot3hazqgtbwy+zr2xe+cemwdko863tbs8srq2xtgobm5t7+zae/snjrjjwaocc9kkialoiva00xxasqqsbhyaweh6um8+gfrmrhc6jcepysbifuajnlbxlnxkufqej/elhvvsxdm47tplp+pmhrfbzagmctw79lenj2gsqqqpj0q1xsfwfkakzptdunhjfmsejsga2gyjeolys+nxy3xknb7uc2lepphu/t2rkvcpnaxmz0j0um3xjuz/txai+xd+xqi40rdr2aj+wrewejie7jejvppuakgsmvsxhrjjqdz5fu0i7vyxf6fxwnwdqnt7vq5d5xeu0ae6rbxkonnuqzeojjxeuyqe0st6s56sf+vd+pi1lln5tan9kfx5a3o/k1o=</latexit> Brownian Dynamics so is a stationary density & geometric ergodicity under mild conditions on U Ergodic limit Just need a reliable way of computing numerical solutions of SDEs (on long intervals)!

SDE Numerics What is a suitable definition for error for an SDE? Examples: weak, strong, ergodic How to get high accuracy in SDE discretisation? How does discretisation affect convergence to the invariant distribution?

Euler-Maruyama Method Numerical Method discrete Brownian path Leimkuhler-Matthews Method [L. & Matthews, AMRX, 2013] [L., Matthews & Stoltz, IMA J. Num. Anal., 2015] [L., Matthews & Tretyakov, Proc Roy Soc A, 2014]

Uneven Double Well ergodic error small stepsize bimodal distribution large stepsize L-M E-M

Invariant measures of numerical methods Sampling Applications Charlie Matthews (UoE PhD 2012-2016) 5 papers + a book and over 300 citations

Charlie Matthews (PhD 2015, now @ Chicago) Langevin (BAOAB) Matthias Sachs (PhD 2017, now @ Duke) Invariant measures Overdamped Limit Constrained LD Xiaocheng Shang (PhD 2015, now @ETH) Langevin/AdL Ergodicity Adaptive Langevin Gabriel Stoltz (ENPC Paris) Michael Tretyakov (Nottingham) Vincent Danos (ENS Paris)

the most recent citations for our 2016 geodesic integrators paper: physical chemistry biomolecules data science biomolecules molecular software data science physics stats & data science stats & data science Why are statisticians reading molecular dynamics papers?

Wind Turbines (with industry partner) Zofia Trstanova (postdoc) Anton Martinsson (PhD student)

Financial giant Goldman Sachs announces that it plans to invest $150 billion in clean energy projects and technology like solar and wind farms, energy efficiency upgrades for buildings, and power grid infrastructure. Problem: Goldman Sachs analysts lack of turbine knowledge Solution: Employ a specialist energy management firm Problem: Each windfarm may have hundreds of turbines each turbine produces data at 10s to 10m intervals on dozens of properties Large data problem [10K turbine years!] Solution 1: Solution 2: Hire engineers to stare at data Get a team of mathematicians to develop a data-driven model to analyse the data.

Ideal power output to windspeed diagram Example data set Learn: to categorise conditions: derating, icing, Goal: Automatic analysis of data streams for maintenance, performance analysis, planning

<latexit sha1_base64="6st+2uni49vtkzhpfrx8lltbni8=">aaab6hicbvbns8naej3ur1q/qh69lbbbu0le0gpri8cw7ae0owy2k3btzhn3n0ij/qvepcji1z/kzx/jts1bwx8mpn6bywzekaiujet+o4w19y3nrej2awd3b/+gfhju0ngqgdzzlglvcahgwsu2dtcco4lcgguc28h4dua3n1bphst7m0nqj+hq8pazaqzueoyxk27vnyosei8nfchr75e/eooyprfkwwtvuuu5ifezqgxnaqelxqoxowxmh9i1vniitz/nd52sm6smsbgrw9kqufp7iqor1pmosj0rnso97m3e/7xuasjrp+mysq1ktlgupokymmy+jgoukbkxsyqyxe2thi2ooszybeo2bg/55vxsuqh6btvrxfzqn3kcrtibuzghd66gbndqhyywqhigv3hzhpwx5935wlqwnhzmgp7a+fwb252m9q==</latexit> <latexit sha1_base64="do2tvhsgwpjem6ngq7vwez4dw5a=">aaacj3icbzdps8mwfmft+wvwx1wpxojd8dtaiuwpytclxwludtzs0jtbwtkkjkkwyv4bl/4rxgqv0ap/idlwudcfhhzzee/x8r5ryqjsrvtplzawv1bxyuv2xubw9o6zu9dwipoytlbgqnyipaijnlq01yx0uklqejfyfw2vjvm7eyivffxwj1isjkjpay9ipa0knytmmpsygey9hufqrz7vfd8uycz6m+jwfqdqayil7ivoxvztgivck0qffnemnrc/fjhlcneyiaw6npvqiedsu8zi2pyzrvkeh6hpukzylbav5nm9x/dikbj2hdshazilvztylcg1sijtmsa9upo5cfwv18107ztiku8zttiedepldgobj6bbmeqcnrszgbck5q8qd5bewbtrbwocn7/yomjxqp5b9w5oko3lwo4yoach4bh4oa4a4bo0qqtg8acewct4sx6tz+vd+pivlqyizx/8cevrg3jrofk=</latexit> Model Models may be artificial or not, but the distinction is often artificial! Ex: convolutional neural network

<latexit sha1_base64="rzlbq3g0+drgaqrply9ijnsmmqu=">aaacc3icbvc9tsmwght4leuvwmhitujqlypbsdbwsdawibarmihyxke16isu7sbvaxcwxowfayryeqe23ganzqatj1k63x2fpt8fnfgplovbwfvf2nzalu2ud/f2dw7no+ootfkbsrsnlbfogcrhncztrrujdhcerqej3wb0k/vdbyiktej7nehei9agpihfsgnjnysup7xx1kldl4ueqwtmgjmd5wl1rdq47ptvq2hnavejxzaqkndyzs+3n+a0irhcdenzsy2uvawjrtejs7kbssirhqeb6wkao4hil5tnmcezrfrhmaj9ygxn6u+ndevstqjat0zidewyl4v/eb1uhvdermoekhljxaewzvbhzoubfsoivmyicckc6r9cpeqcyaxrk+ss7oxiq6rz3rcthn13uw1ef3wuwcmogbqwwsvoglvqam2awsn4bq/gzxgyxox342mxumyuoyfgd4zph56emxs=</latexit> <latexit sha1_base64="qmjny7d3nwzcz0iuuxl57xkwnik=">aaaccnicbvdlssnafj34rpuvdelmtah1uxirdfmounelfewdmhgm00k7ddkjmxoxxk7d+ctuxcji1i9w5984abpq1gmxdufcy733+dgjulnwtze3v7c4tfxyka6urw9smlvbtrklapmgjlgk2j6shffogooqrtqxicj0gwn5g/pmb90riwner9uwjm6iepwgfcoljc/cc2jabj/chsiqdorus2nvht1cwky+96g2plnkvawx4cyxc1icoeqe+ev0i5yehcvmkjqd24qvmykhkgzkvhqsswkeb6hhoppyfblppunxrvbak10yreixv3cs/p5iusjlmpr1z4hux057mfif10lucoqmlmejihxpfgujgyqcws6wswxbig01qvhqfsvefsqqvjq9og7bnn55ljspkrzvsa+os7wzpi4c2ax7oaxscajq4aluqqng8aiewst4m56mf+pd+ji0zhn5za74a+pzb7p9mp8=</latexit> DATA PARAMETERS prior model LIKELIHOOD Bayes Theorem minimise U(q) to determine optimal parameters I m not really me Data Scientist Thomas Bayes, U of Edinburgh, Class of 1721

Wind Turbine Benchmarking With enough data, it is straightforward to replace the engineer with a convolutional neural network for detecting a specific condition, e.g. icing Problems: 1. Mixed data sets 2. Lack of transferability Unsupervised learning to learn classes from data Generalisation error Generative networks 3. Multimodality all require sampling!

Multimodality in NNs With Zofia Trstanova and Frederik Heber (Turing) P(red) isolated local min 9 parameter model, multimodal loss landscape with pretty severe trapping in upper well.

Sampling Metastable Systems Ways forward: 1. EQN. New types of diffusions that more efficiently explore the complex geometry; in particular nonreversible diffusions. w. J. Weare & C. Matthews, Chicago. 2. ISST. Use of temperature and other collective variables to enhance and guide sampling. w. A. Martinsson, J. Lu (Duke) & E. Vanden-Eijnden (NYU). 3. DM. Data-analytic tools based on diffusion maps to perform unsupervised manifold learning from simulated data in order to enhance exploration. w. Z. Trstanova, T. Lelievre (Paris) and R. Banisch (Berlin)

(building on work of Clementi, Kevrekides )

Thermodynamic Analytics Toolkit (TATi) Frederik Heber (Rutherford-Turing Fellow) Benedict Leimkuhler Zofia Trstanova (EPSRC PDRA - Edinburgh) Vision: Exploration tool for loss landscape in ANNs Sampling methods in TensorFlow purposes Develop understanding of NN properties Assist in analysis/redesign of NNs

The Edinburgh Environment Maxwell Institute Union of mathematicians at Edinburgh and Heriot-Watt Universities. Currently unifying graduate programmes under the banner Maxwell Institute Graduate School. International Centre for the Mathematical Sciences Runs workshops, supports small group research and coordinates knowledge exchange and public events. Centres for Doctoral Training e.g. MIGSAA Main mechanism for funding PhD students and coordinating training across the Maxwell Institute 3 proposed in 2018: MAC-MIGS (Modelling, analysis and computation), GLAMS (Glasgow-Maxwell School in algebraic methods), and Data Science & AI (a joint venture with Informatics)

Thomas Bayes Centre

Why Edinburgh? amazing skyline: access to the highlands fringe festival folk music still part of the EU but.

The real reason Scottish Boletus Harvest (Porcini/Ceps)

Thank you for listening!