Molecular dynamics simulations and drug discovery

Todd Underwood
5 years ago
2 Molecular dynamics simulations and drug discovery. Jacob D. Durrant, J. Andrew McCammon. BMC Biology :71 DOI: /
"With constant improvements in both computer power and algorithm design, the future of computer-aided drug design is promising; molecular dynamics simulations are likely to play an increasingly important role."
First principles computational materials design for energy storage materials in lithium ion batteries. Ying Shirley Meng, M. Elena Arroyo-de Dompablo. Energy Environ. Sci. 2009, 2, DOI: /b901825e
"Specifically, we show how each relevant property can be related to the structural component in the material and can be computed from first principles."

3 Map for Atomistic Simulations
[Figure: sampling time (fs, ps, ns, μs, ms) against system size, with regions labeled AB-INITIO METHODS and MOLECULAR DYNAMICS.]

4 Map for Atomistic Simulations
[Figure: sampling time (fs, ps, ns, μs, ms) against system size, with regions labeled AB-INITIO METHODS, QM/MM METHODS and MOLECULAR DYNAMICS.]

5 Nobel Prize 2013 in Chemistry: Development of Multiscale Models
Further details: Electron dynamics in complex environments with real-time time dependent density functional theory in a QM-MM framework. U. N. Morzan et al. The Journal of Chemical Physics 140, (2014). DOI: /

6 THE LIO PROJECT
Implementing novel methods of atomistic quantum simulation that efficiently use the current state-of-the-art hardware tools.
[Diagram: LIO placed between Computer Science and Theoretical Chemistry.]
github.com/nanolebrero/lio

7 THE LIO PROJECT
Quantum Calculations (QM/MM with Amber): DFT + Gaussian Basis
- Single Point Calculation
- Electronic Propagation
- Born-Oppenheimer Dynamics
- TD-DFT Electron Dynamics
- Ehrenfest Dynamics
github.com/nanolebrero/lio

8 DFT: The way of the density
Hohenberg-Kohn Theorem: $\rho(x, y, z) \leftrightarrow \Psi(r_1, \dots, r_n)$
"Motions are governed not by deterministic laws, but by probability functions; chemical bonds are formed not mechanically, but by shifting clouds of electrons that are simultaneously waves and particles." (Molecular dynamics simulations and drug discovery)

9 DFT: The way of the density
Hohenberg-Kohn Theorem: $\rho(x, y, z) \leftrightarrow \Psi(r_1, \dots, r_n)$
$$\rho(x, y, z) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(x, y, z)\, \chi_j(x, y, z)$$
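The expansion of the density in basis functions can be illustrated with a minimal Python sketch. The two s-type Gaussians, their exponents and the density matrix values below are hypothetical and chosen only for illustration; this is not LIO's implementation.

```python
import math

def s_gaussian(alpha, center, r):
    """Unnormalized s-type Gaussian exp(-alpha * |r - center|^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(r, center))
    return math.exp(-alpha * d2)

def density_at_point(P, chi):
    """rho = sum_i sum_j P[i][j] * chi[i] * chi[j] for basis values chi at one point."""
    M = len(chi)
    return sum(P[i][j] * chi[i] * chi[j] for i in range(M) for j in range(M))

# Hypothetical two-function basis (exponent, center) and density matrix.
basis = [(1.0, (0.0, 0.0, 0.0)), (0.5, (1.0, 0.0, 0.0))]
P = [[2.0, 0.5], [0.5, 1.0]]

r = (0.0, 0.0, 0.0)
chi = [s_gaussian(alpha, center, r) for alpha, center in basis]
rho = density_at_point(P, chi)
print(rho)
```

The cost per point grows as M squared, which is why the later slides reorganize this double sum.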

10 $$E[\rho] = T[\rho] + V_{ne}[\rho] + V_{ee}[\rho] + E_{XC}[\rho]$$

11 $$E[\rho] = T[\rho] + V_{ne}[\rho] + V_{ee}[\rho] + E_{XC}[\rho]$$
Analytical Resolution: $\rho(r) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r)\, \chi_j(r)$ gets distributed:
$$T[\rho] + V_{ne}[\rho] + V_{ee}[\rho] = T[P] + V_{ne}[P] + V_{ee}[P]$$

12 $$E[\rho] = T[\rho] + V_{ne}[\rho] + V_{ee}[\rho] + E_{XC}[\rho]$$
Single Point (1 Step)

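13 $$E_{XC}[\rho] = \int \epsilon_{XC}[\rho(x, y, z), \nabla\rho(x, y, z)]\, dv$$
$$E_{XC}[\rho] \approx \sum_{k} \epsilon_{XC}[\rho(r_k), \nabla\rho(r_k)]\, V_k$$
$$\rho(r_k) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r_k)\, \chi_j(r_k)$$
$$\frac{\partial \rho}{\partial x} = \sum_{i=1}^{M} \frac{\partial \chi_i}{\partial x} \sum_{j=1}^{M} P_{ij}\, \chi_j + \sum_{i=1}^{M} \chi_i \sum_{j=1}^{M} P_{ij}\, \frac{\partial \chi_j}{\partial x}$$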
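The grid approximation of the exchange-correlation integral can be sketched as a weighted sum. The slide does not say which functional is used, so the example below swaps in Slater (LDA) exchange as a stand-in for $\epsilon_{XC}$; it ignores the gradient argument, which a gradient-corrected functional would use. The grid values are hypothetical.

```python
import math

def lda_exchange_density(rho, grad_rho):
    """Stand-in epsilon_XC: Slater (LDA) exchange energy per unit volume.
    grad_rho is accepted but unused; a GGA functional would depend on it."""
    c_x = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)
    return -c_x * rho ** (4.0 / 3.0)

def exc_quadrature(rho, grad_rho, volumes, eps_xc=lda_exchange_density):
    """E_XC ~= sum_k eps_xc(rho_k, grad_rho_k) * V_k."""
    return sum(eps_xc(r, g) * v for r, g, v in zip(rho, grad_rho, volumes))

# Hypothetical grid: four points with uniform density and equal volumes.
rho = [1.0, 1.0, 1.0, 1.0]
grad_rho = [(0.0, 0.0, 0.0)] * 4
volumes = [0.25] * 4
e_xc = exc_quadrature(rho, grad_rho, volumes)
print(e_xc)
```

Passing the functional as a callable keeps the quadrature loop independent of the physical model.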

14 SINGLE POINT CALCULATION
INPUT: Nuclear Geometry, Starting Guess $P_{ij}$
$E_{XC}[\rho]$: (1) Compute Basis Values. (2) Compute Density Values. (3) Numeric Integral.
OUTPUT: Ground State Density

15 SINGLE POINT CALCULATION
INPUT: Nuclear Geometry, Starting Guess $P_{ij}$
$E_{XC}[\rho]$: (1) Compute Basis Values. (2) Compute Density Values. (3) Numeric Integral.
$$\begin{pmatrix} \chi_1(r_1) & \cdots & \chi_1(r_K) \\ \vdots & & \vdots \\ \chi_M(r_1) & \cdots & \chi_M(r_K) \end{pmatrix}$$
OUTPUT: Ground State Density

16 SINGLE POINT CALCULATION
INPUT: Nuclear Geometry, Starting Guess $P_{ij}$
$E_{XC}[\rho]$: (1) Compute Basis Values. (2) Compute Density Values. (3) Numeric Integral.
$$\rho(r_k) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r_k)\, \chi_j(r_k)$$
OUTPUT: Ground State Density

17 SINGLE POINT CALCULATION
INPUT: Nuclear Geometry, Starting Guess $P_{ij}$
$E_{XC}[\rho]$: (1) Compute Basis Values. (2) Compute Density Values. (3) Numeric Integral.
$$\sum_{k} \epsilon_{XC}[\rho(r_k), \nabla\rho(r_k)]\, V_k$$
OUTPUT: Ground State Density

18 SINGLE POINT CALCULATION
INPUT: Nuclear Geometry, Starting Guess $P_{ij}$
$E_{XC}[\rho]$: (1) Compute Basis Values. (2) Compute Density Values. (3) Numeric Integral.
$$\rho(r_k) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r_k)\, \chi_j(r_k)$$
OUTPUT: Ground State Density
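The three numbered steps can be strung together in a toy Python sketch. It assumes a one-dimensional grid, a single normalized Gaussian and a one-entry density matrix, none of which come from the slides, and it covers one pass over the steps only. With the integrand set to the density itself, step (3) should recover the number of electrons encoded in P.

```python
import math

def basis_values(alphas, centers, grid):
    """Step (1): table chi[i][k] of normalized 1D Gaussians on the grid."""
    return [[(2.0 * a / math.pi) ** 0.25 * math.exp(-a * (x - c) ** 2) for x in grid]
            for a, c in zip(alphas, centers)]

def density_values(P, chi):
    """Step (2): rho[k] = sum_ij P[i][j] * chi[i][k] * chi[j][k]."""
    M, K = len(chi), len(chi[0])
    return [sum(P[i][j] * chi[i][k] * chi[j][k] for i in range(M) for j in range(M))
            for k in range(K)]

def numeric_integral(f, rho, weights):
    """Step (3): sum_k f(rho_k) * V_k."""
    return sum(f(r) * w for r, w in zip(rho, weights))

# Hypothetical uniform grid on [-8, 8] with spacing h as the point volume.
h = 0.01
grid = [-8.0 + h * k for k in range(1601)]
weights = [h] * len(grid)

chi = basis_values([1.0], [0.0], grid)
rho = density_values([[2.0]], chi)
n_electrons = numeric_integral(lambda r: r, rho, weights)
print(n_electrons)
```

Replacing the lambda with an exchange-correlation energy density turns the same loop into the $E_{XC}$ quadrature.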

19 $$\rho(r_k) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r_k)\, \chi_j(r_k)$$
1 Thread = 1 Point + Double Sum

20 $$\rho(r_k) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r_k)\, \chi_j(r_k)$$
$$\rho(r_k) = \sum_{i=1}^{M} \chi_i(r_k) \sum_{j=1}^{M} P_{ij}\, \chi_j(r_k)$$
1 Thread = 1 Point + Only j-sum (+ other subroutine to i-accumulate)

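21 $$\rho(r_k) = \sum_{i=1}^{M} \sum_{j=1}^{M} P_{ij}\, \chi_i(r_k)\, \chi_j(r_k)$$
$$\rho(r_k) = \sum_{i=1}^{M} \chi_i(r_k) \sum_{j=1}^{M} P_{ij}\, \chi_j(r_k)$$
SAME PATTERN, SAME NUMBER:
$$\frac{\partial \rho}{\partial x} = \sum_{i=1}^{M} \frac{\partial \chi_i}{\partial x} \sum_{j=1}^{M} P_{ij}\, \chi_j + \sum_{i=1}^{M} \chi_i \sum_{j=1}^{M} P_{ij}\, \frac{\partial \chi_j}{\partial x}$$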
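The reordering of the double sum into an inner j-sum followed by an i-accumulation can be checked numerically. The following Python sketch uses hypothetical basis values and derivatives at a single point; it only verifies that the two-stage form matches the double sum and that the derivative reuses the same inner-sum pattern.

```python
def density_double_sum(P, chi):
    """rho = sum_i sum_j P[i][j] * chi[i] * chi[j]."""
    M = len(chi)
    return sum(P[i][j] * chi[i] * chi[j] for i in range(M) for j in range(M))

def inner_sums(P, values):
    """w[i] = sum_j P[i][j] * values[j] (the 'only j-sum' stage, one per function i)."""
    return [sum(p_ij * v_j for p_ij, v_j in zip(row, values)) for row in P]

def density_two_stage(P, chi):
    """rho = sum_i chi[i] * w[i] (the i-accumulation stage)."""
    w = inner_sums(P, chi)
    return sum(c_i * w_i for c_i, w_i in zip(chi, w))

def ddx_density(P, chi, dchi):
    """d(rho)/dx = sum_i dchi[i] * w[i] + sum_i chi[i] * u[i],
    with w from chi and u from dchi: the same inner-sum pattern twice."""
    w = inner_sums(P, chi)
    u = inner_sums(P, dchi)
    return (sum(d_i * w_i for d_i, w_i in zip(dchi, w))
            + sum(c_i * u_i for c_i, u_i in zip(chi, u)))

# Hypothetical values at one grid point.
P = [[2.0, 0.5], [0.5, 1.0]]
chi = [1.0, 0.5]
dchi = [0.2, -0.1]

rho_a = density_double_sum(P, chi)
rho_b = density_two_stage(P, chi)
drho = ddx_density(P, chi, dchi)
print(rho_a, rho_b, drho)
```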

22 $\sum_{j} P_{ij}\, \chi_j(r_k)$
[Diagram: table of POINTS OF GRID (k = 1, ..., 8) against FUNCTIONS OF BASIS (i = 1, ..., 8).]

23 BLOCKS OF SIZE B
[Diagram: table of POINTS OF GRID (k = 1, ..., 8) against FUNCTIONS OF BASIS (i = 1, ..., 8), divided into blocks of size B.]

24 BLOCKS OF SIZE B (point k fixed)
- Thread 1 (function i = 1): $\sum_{j} P_{1j}\, \chi_j(r_k)$
- Thread 2 (function i = 2): $\sum_{j} P_{2j}\, \chi_j(r_k)$
- Thread B (function i = B): $\sum_{j} P_{Bj}\, \chi_j(r_k)$

25 BLOCKS OF SIZE B (point k fixed)
- Thread 1 (function i = 1): $\sum_{j} P_{1j}\, \chi_j(r_k)$
- Thread 2 (function i = 2): $\sum_{j} P_{2j}\, \chi_j(r_k)$
- Thread B (function i = B): $\sum_{j} P_{Bj}\, \chi_j(r_k)$
Different sets of j = 1 to M for all threads: $P_{i1}, P_{i2}, \dots, P_{iM}$
The same set of j = 1 to M for all threads: $\chi_1(r_k), \chi_2(r_k), \dots, \chi_M(r_k)$

26 For each thread $i = 1, 2, \dots, B$:
$$\sum_{j=1}^{M} P_{ij}\, \chi_j(r_k) = \sum_{j=1}^{B} P_{ij}\, \chi_j(r_k) + \sum_{j=B+1}^{2B} P_{ij}\, \chi_j(r_k) + \cdots$$

27 For each thread $i = 1, 2, \dots, B$:
$$\sum_{j=1}^{M} P_{ij}\, \chi_j(r_k) = \sum_{j=1}^{B} P_{ij}\, \chi_j(r_k) + \sum_{j=B+1}^{2B} P_{ij}\, \chi_j(r_k) + \cdots$$
(1) Load functions j = 1 to B (one each thread) to shared mem.
(2) __syncthreads()
(3) Each thread performs sums from 1 to B.
Repeat (1) to (3) for j = B+1 to 2B, 2B+1 to 3B, etc.
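The tiling in steps (1) to (3) can be emulated on the CPU to see that it reproduces the full j-sum. The Python sketch below stands in for one thread block; the list `tile` plays the role of shared memory, and the values are hypothetical. It is an illustration of the access pattern, not the CUDA kernel itself.

```python
def tiled_inner_sums(P_rows, chi, B):
    """Emulate one thread block at a fixed grid point.

    Thread t owns basis function i = t and accumulates sum_j P_rows[t][j] * chi[j],
    consuming chi in tiles of B values.
    """
    n_threads = len(P_rows)
    M = len(chi)
    acc = [0.0] * n_threads
    for start in range(0, M, B):
        # (1) Each thread loads one basis value into the shared tile.
        tile = chi[start:start + B]
        # (2) A real kernel would synchronize the block here.
        # (3) Every thread sums over the tile using its own row of P.
        for t in range(n_threads):
            for offset, value in enumerate(tile):
                acc[t] += P_rows[t][start + offset] * value
    return acc

# Hypothetical block of B = 2 threads and M = 5 basis functions.
P_rows = [[1.0, 2.0, 3.0, 4.0, 5.0],
          [0.5, 0.0, -1.0, 2.0, 1.0]]
chi = [1.0, 0.5, 2.0, -1.0, 0.25]
acc = tiled_inner_sums(P_rows, chi, B=2)
direct = [sum(p * c for p, c in zip(row, chi)) for row in P_rows]
print(acc, direct)
```

Each basis value is loaded once per block and reused by every thread, which is the point of staging it in shared memory.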

28 For each thread $i = 1, 2, \dots, B$:
$$\sum_{j=1}^{M} P_{ij}\, \chi_j(r_k) = \sum_{j=1}^{B} P_{ij}\, \chi_j(r_k) + \sum_{j=B+1}^{2B} P_{ij}\, \chi_j(r_k) + \cdots$$
The first tile uses the $B \times B$ block of $P$:
$$\begin{pmatrix} P_{11} & P_{12} & \cdots & P_{1B} \\ \vdots & & & \vdots \\ P_{B1} & P_{B2} & \cdots & P_{BB} \end{pmatrix}$$

29 For each thread $i = 1, 2, \dots, B$:
$$\sum_{j=1}^{M} P_{ij}\, \chi_j(r_k) = \sum_{j=1}^{B} P_{ij}\, \chi_j(r_k) + \sum_{j=B+1}^{2B} P_{ij}\, \chi_j(r_k) + \cdots$$
The first tile uses the $B \times B$ block of $P$:
$$\begin{pmatrix} P_{11} & P_{12} & \cdots & P_{1B} \\ \vdots & & & \vdots \\ P_{B1} & P_{B2} & \cdots & P_{BB} \end{pmatrix}$$
TEXTURE MEMORY: 2D access, local cache

30 From a memory intensive kernel:
- GET GLOBAL VARS, then += $P_{11}\, \chi_1(r_k)$
- GET GLOBAL VARS, then += $P_{12}\, \chi_2(r_k)$

31 From a memory intensive kernel:
- GET GLOBAL VARS, then += $P_{11}\, \chi_1(r_k)$
- GET GLOBAL VARS, then += $P_{12}\, \chi_2(r_k)$
to a more arithmetic intensive kernel:
- GET GLOBAL VARS, then += $P_{11}\, \chi_1(r_k)$, += $P_{12}\, \chi_2(r_k)$, ..., += $P_{1B}\, \chi_B(r_k)$

32 HARDWARE GPU: GeForce GTX 780. CPU: Intel i
[Figure: Exc Time (s) for CPU and GPU on the test systems Heme, Taxol and Valinomycin (no OpenMP). Speedup (1 core): Heme x 29, Taxol x 26, Valinomycin x 23. Labels for the estimated time for the 4 core CPU: x5.7, x6., x7.3.]
Details: M. A. Nitsche, M. Ferreria, E. E. Mocskos, M. C. Gonzalez Lebrero, J. Chem. Theory Comput. 2014, 10, DOI: /ct400308n

33 [Diagram: Gaussian function centered on an atom, with grid points marked by the legend below.]
- Function value is relevant for this point
- Function value is not relevant for this point
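One way to decide whether a function is relevant for a point is a cutoff radius derived from the Gaussian exponent. The slides do not state the criterion actually used, so the tolerance `tol` and the restriction to unnormalized s-type Gaussians in this Python sketch are assumptions.

```python
import math

def cutoff_radius(alpha, tol):
    """Distance beyond which exp(-alpha * d**2) drops below tol."""
    return math.sqrt(-math.log(tol) / alpha)

def relevant_functions(point, basis, tol=1e-10):
    """Indices of Gaussians (alpha, center) whose value at `point` is at least tol."""
    selected = []
    for index, (alpha, center) in enumerate(basis):
        d2 = sum((p - c) ** 2 for p, c in zip(point, center))
        if d2 <= cutoff_radius(alpha, tol) ** 2:
            selected.append(index)
    return selected

# Hypothetical basis: a tight function nearby, a tight one far away, a diffuse one far away.
basis = [(1.0, (0.0, 0.0, 0.0)),
         (1.0, (10.0, 0.0, 0.0)),
         (0.01, (10.0, 0.0, 0.0))]
kept = relevant_functions((0.0, 0.0, 0.0), basis)
print(kept)
```

Diffuse functions (small exponent) stay relevant much farther from their atom than tight ones, so the number of relevant functions varies from point to point.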

34 GROUPS OF POINTS Groups solved separately: reduced version of problem. Inhomogeneous workload: few groups with a lot of points/functions VS a lot of groups with very few. Small Group: bad for GPU. Hybrid Partitioning: Small spheres (lots of points) and big cubes (fewer points)
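A hybrid partitioning of this kind might be sketched as follows in Python. The rule used here (points within `sphere_radius` of their nearest atom go to that atom's sphere, everything else is binned into cubes of side `cube_side`) and all parameter values are assumptions for illustration; the slide does not give the actual criteria.

```python
import math
from collections import defaultdict

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def partition_points(points, atoms, sphere_radius, cube_side):
    """Group grid points into per-atom spheres and space-filling cubes."""
    groups = defaultdict(list)
    for p in points:
        nearest = min(range(len(atoms)), key=lambda a: squared_distance(p, atoms[a]))
        if squared_distance(p, atoms[nearest]) <= sphere_radius ** 2:
            groups[("sphere", nearest)].append(p)
        else:
            cell = tuple(math.floor(c / cube_side) for c in p)
            groups[("cube", cell)].append(p)
    return dict(groups)

# Hypothetical single atom and five grid points.
atoms = [(0.0, 0.0, 0.0)]
points = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.0),
          (3.5, 0.0, 0.0), (3.9, 0.5, 0.1), (-5.0, 0.0, 0.0)]
groups = partition_points(points, atoms, sphere_radius=0.5, cube_side=2.0)
sizes = {key: len(members) for key, members in groups.items()}
print(sizes)
```

Even in this tiny example the groups differ in size, which is the inhomogeneous workload the slide refers to.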

35 HYBRID SCHEME
Compute the smallest groups on the CPU.
Hardware: Tesla K80 (1 device), N x Intel Xeon E v3
Synergistic behavior
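Routing the smallest groups to the CPU could look like the Python sketch below. The cost model (points times functions) and the threshold `cpu_cost_limit` are hypothetical; the slides do not specify how group size is measured or where the cut is placed.

```python
def split_groups(groups, cpu_cost_limit):
    """Send groups whose estimated cost is below the limit to the CPU.

    groups: list of (n_points, n_functions) pairs.
    The cost n_points * n_functions is an illustrative assumption.
    """
    cpu, gpu = [], []
    for n_points, n_functions in groups:
        cost = n_points * n_functions
        (cpu if cost < cpu_cost_limit else gpu).append((n_points, n_functions))
    return cpu, gpu

# Hypothetical group sizes.
groups = [(4, 3), (512, 40), (2, 2), (300, 25)]
cpu_groups, gpu_groups = split_groups(groups, cpu_cost_limit=100)
print(cpu_groups, gpu_groups)
```

Both lists can then be processed concurrently, so the CPU cores absorb the work that would leave the GPU underused.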

36 RESULTS FOR THE HYBRID
Progress: XC TERM KERNEL, HYBRID COMPUTING SCHEME. Limitations: Coulomb, QM/MM.
Hardware and speedup:
- GeForce GTX 980 vs Intel i: x 8.0
- Tesla K80 vs Xeon E: x 1.9
- Hybrid (K80+8c) vs Xeon E: x 2.8

37 CURRENT STATE
Done: XC TERM KERNEL, HYBRID COMPUTING SCHEME, QM/MM, Coulomb, cuBLAS.
Current Limitations: Exc (again!), Matrix Diag/Dcmp?
Single Point Time (sec):
- Before XC Kernel: 76 s
- After XC Kernel: 20 s
- After Hybrid, QM/MM, Coulomb, cuBLAS: 3.3 s

38 CURRENT STATE
Done: XC TERM KERNEL, HYBRID COMPUTING SCHEME, QM/MM, Coulomb, cuBLAS.
Current Limitations: Exc (again!), Matrix Diag/Dcmp?
Single Point Time (sec):
- Before XC Kernel: 76 s
- After XC Kernel: 20 s
- After Hybrid, QM/MM, Coulomb, cuBLAS: 3.3 s
Free Energy Calculation: days 30 days
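The speedups implied by the three single point times follow from simple ratios. This short Python sketch only restates the arithmetic on the numbers from the slide; the dictionary keys are shortened labels.

```python
# Single point times in seconds, as quoted on the slide.
times = {
    "before XC kernel": 76.0,
    "after XC kernel": 20.0,
    "after hybrid, QM/MM, Coulomb, cuBLAS": 3.3,
}

baseline = times["before XC kernel"]
speedups = {stage: baseline / seconds for stage, seconds in times.items()}

for stage, factor in speedups.items():
    print(f"{stage}: x{factor:.1f}")
```

The XC kernel alone accounts for a factor of 3.8, and the full set of changes for roughly a factor of 23 relative to the starting point.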

39 PROBLEM
- Math expression specific to the physical model.
- Very inhomogeneous workload.
- Memory intensive calculations.
- Direct manipulation of registers and memory influences algorithmic structure.
- Multiple blocks of threads can run in the same execution unit.
- Re-organizing work to make it more homogeneous.
- Distribution of work that takes into account the different capabilities of CPUs / GPUs.
- Customized solution with performance comparable to that of problems more naturally suited for GPU.

40 Thank you for your attention! And thanks to everyone at the group of computer simulation of the DQIAyQF, FCEN, UBA.
Collaborators:
- Jorge Kohanoff
- Cristián G. Sánchez
- Ivan Girotto
Institutional Support:
- CONICET
- UBA
- ICTP
THE LIO GROUP:
- Mariano González Lebrero
- Dario Estrin
- Damian Scherlis
- Esteban Mocskos
- Uriel Morzan
- William Agudelo
- Nicolás Foglia
- Fernando Boubetta
- Federico Pedron
- Manuel Ferreria
- Matías Nitsche
- Juan Pablo Darago
- Chad Hopkins
Visit our code! github.com/nanolebrero/lio
