Molecular dynamics simulations and drug discovery
|
|
- Todd Underwood
- 5 years ago
- Views:
Transcription
1
2 olecular dynamics simulations and drug discovery Jacob D. Durrant, J. Andrew ccammon BC Biology :71 DOI: / With constant improvements in both computer power and algorithm design, the future of computer-aided drug design is promising; molecular dynamics simulations are likely to play an increasingly important role. First principles computational materials design for energy storage materials in lithium ion batteries Ying Shirley eng,. Elena Arroyo-de Dompablo Energy Environ. Sci. 2009, 2, DOI: /b901825e Specifically, we show how each relevant property can be related to the structural component in the material and can be computed from first principles.
3 SAPLING TIE ap for Atomistic Simulations ms μs OLECULAR DYNAICS ns ps AB-INITIO ETHODS fs SYSTE SIZE
4 SAPLING TIE ap for Atomistic Simulations ms μs OLECULAR DYNAICS ns ps AB-INITIO ETHODS Q/ ETHODS fs SYSTE SIZE
5 Nobel Prize 2013 in Chemistry: Development of ultiscale odels Further details: Electron dynamics in complex environments with real-time time dependent density functional theory in a Q- framework U.N. orzan et al. The Journal of Chemical Physics 140, (2014). DOI: /
6 THE LIO PROJECT Implementing novel methods of atomistic quantum simulation that efficiently use the current state-of-the-art hardware tools Computer Science LIO Theoretical Chemistry github.com/nanolebrero/lio
7 THE LIO PROJECT Quantum Calculations (Q/ with Amber) DFT + Gaussian Basis Single Point Calculation Electronic Propagation Born-Oppenheimer TD-DFT Electron Dynamics Dynamics Ehrenfest Dynamics github.com/nanolebrero/lio
8 DFT: The way of the density Hohenberg - Khon Theorem ρ( x, y, z) Ψ (r 1,, r n ) otions are governed not by deterministic laws, but by probability functions; chemical bonds are formed not mechanically, but by shifting clouds of electrons that are simultaneously waves and particles. olecular dynamics simulations and drug discovery
9 DFT: The way of the density Hohenberg - Khon Theorem ρ( x, y, z) Ψ (r 1,, r n ) ρ( x, y, z) = Pij χ i ( x, y, z ) χ j ( x, y, z) i=1
10 E [ ρ ] = T [ ρ ] + V ne [ ρ ] + V ee [ ρ ] + E XC [ ρ ]
11 E [ ρ ] = T [ ρ ] + V ne [ ρ ] + V ee [ ρ ] + E XC [ ρ ] Analytical Resolution: ρ(r ) = Pij χi (r ) χ j (r ) i=1 gets distributed = T [ P ] +V ne [ P ] +V ee [ P ]
12 E [ ρ ] = T [ ρ ] + V ne [ ρ ] + V ee [ ρ ] + E XC [ ρ ] Single Point ( 1 Step )
13 E XC [ ρ ] ϵ XC [ ρ(x, y, z ), ρ(x, y, z) ] dv E XC [ ρ ] ϵ XC [ ρ(r k ), ρ(r k ) ] V k k ρ(r k ) = Pij χ i (r k ) χ j (r k ) i=1 χi ρ = x i=1 x j =1 i=1 Pij χ j + χ i χj Pij x
14 SINGLE POINT CALCULATION INPUT Nuclear Geometry Starting Guess Pij E XC [ ρ ] (1) Compute Basis Values. (2) Compute Density Values. OUTPUT Ground State Density (3) Numeric Integral
15 SINGLE POINT CALCULATION INPUT Nuclear Geometry Starting Guess Pij χ1 (r 1) χ 1 (r K ) χ (r 1 ) χ (r K ) E XC [ ρ ] (1) Compute Basis Values. (2) Compute Density Values. OUTPUT Ground State Density (3) Numeric Integral
16 SINGLE POINT CALCULATION INPUT Nuclear Geometry Starting Guess ρ(r k ) = Pij χ i (r k ) χ j (r k ) i=1 j =1 Pij E XC [ ρ ] (1) Compute Basis Values. (2) Compute Density Values. OUTPUT Ground State Density (3) Numeric Integral
17 SINGLE POINT CALCULATION INPUT Nuclear Geometry Starting Guess ϵxc [ ρ(r k ), ρ( r k ) ] V k k Pij E XC [ ρ ] (1) Compute Basis Values. (2) Compute Density Values. OUTPUT Ground State Density (3) Numeric Integral
18 SINGLE POINT CALCULATION INPUT Nuclear Geometry Starting Guess ρ(r k ) = Pij χ i (r k ) χ j (r k ) i=1 j =1 Pij E XC [ ρ ] (1) Compute Basis Values. (2) Compute Density Values. OUTPUT Ground State Density (3) Numeric Integral
19 ρ(r k ) = Pij χ i (r k ) χ j (r k ) i=1 j =1 1 Thread = 1 Point + Double Sum
20 ρ(r k ) = Pij χ i (r k ) χ j (r k ) i=1 j =1 i=1 ρ(r k ) = χ i (r k ) P ij χ j (r k ) 1 Thread = 1 Point + Only j-sum (+ other subroutine to i-accumulate)
21 ρ(r k ) = Pij χ i (r k ) χ j (r k ) i=1 j =1 i=1 ρ(r k ) = χ i (r k ) P ij χ j (r k ) SAE PATTERN SAE NUBER χi ρ = x i=1 x i=1 j =1 Pij χ j + χi χj P ij x
22 P ij χ j (r k ) POINTS OF GRID FUNCTIONS OF BASIS k=1 i=1 i=2 i=3 i=4 i=5 i=6 i=7 i=8 k=2 k=3 k=4 k=5 k=6 k=7 k=8
23 BLOCKS OF SIZE B POINTS OF GRID FUNCTIONS OF BASIS k=1 i=1 i=2 i=3 i=4 i=5 i=6 i=7 i=8 k=2 k=3 k=4 k=5 k=6 k=7 k=8
24 Thread 1 ( function i = 1 ) Thread 2 ( function i = 2 ) P1 j χ j (r k ) P2 j χ j (r k ) Point k fixed Thread B ( function i = B ) BLOCKS OF SIZE B P Bj χ j (r k )
25 Thread 1 ( function i = 1 ) Thread 2 ( function i = 2 ) P1 j χ j (r k ) P2 j χ j (r k ) Point k fixed Thread B ( function i = B ) BLOCKS OF SIZE B P Bj χ j (r k ) Different sets of j = 1 to for all threads The same set of j = 1 to for all threads Pi 1, P i 2,, Pi χ 1 (r k ), χ2 (r k ),, χ (r k )
26 B 2B P 1 j χ j (r k ) + j= B +1 B 2B P 2 j χ j (r k ) + j =B +1 P1 j χ j (r k ) + + j= B+1 P1 j χ j (r k ) P2 j χ j ( r k ) + + j = B +1 P2 j χ j ( r k ) B 2B P Bj χ j (r k ) + j= B+1 P Bj χ j (r k ) + + j = B +1 P Bj χ j (r k )
27 B 2B P 1 j χ j (r k ) + j= B +1 B 2B P 2 j χ j (r k ) + j =B +1 P1 j χ j (r k ) + + j= B+1 P1 j χ j (r k ) P2 j χ j ( r k ) + + j = B +1 P2 j χ j ( r k ) B 2B P Bj χ j (r k ) + j= B+1 P Bj χ j (r k ) + + j = B +1 P Bj χ j (r k ) (1) Load functions j = 1 to B (one each thread) to shared mem. (2) syncthreads() (3) Each thread performs sums from 1 to B. Repeat (1) to (3) for j = B+1 to 2B, 2B+1 to 3B, etc.
28 B 2B P 1 j χ j (r k ) + j= B +1 B 2B P 2 j χ j (r k ) + j =B +1 P1 j χ j (r k ) + + j= B+1 P1 j χ j (r k ) P2 j χ j ( r k ) + + j = B +1 P2 j χ j ( r k ) B 2B P Bj χ j (r k ) + { P11 PB 1 j= B+1 P1 2 P1 B PB 2 P B B } P Bj χ j (r k ) + + j = B +1 P Bj χ j (r k )
29 B 2B P 1 j χ j (r k ) + j= B +1 B 2B P 2 j χ j (r k ) + j =B +1 P1 j χ j (r k ) + + j= B+1 P1 j χ j (r k ) P2 j χ j ( r k ) + + j = B +1 P2 j χ j ( r k ) B 2B P Bj χ j (r k ) + { P11 PB 1 j= B+1 P1 2 P1 B PB 2 P B B } P Bj χ j (r k ) + + j = B +1 P Bj χ j (r k ) TEXTURE EORY 2D access Local cache
30 GET GLOBAL VARS =+ P11 GET χ 1 (r k ) GLOBAL VARS From a memory intensive kernel =+ P12 χ 2(r k )
31 GET GLOBAL VARS =+ P11 GET χ 1 (r k ) GLOBAL VARS From a memory intensive kernel GET GLOBAL VARS =+ P11 χ 1 (r k ) =+ P12 χ 2(r k ) to a more arithmetic intensive kernel. =+ P1 B χ B (r k )
32 ( Estimated time for the 4 core CPU ) x GPU: GeForce GTX x HARDWARE CPU: Intel i Exc Time (s) 8 6 x7.3 Test Systems 4 Speedup (1 core) Heme Taxol Valinomycin 2 x 29 x 26 x 23 0 CPU Heme GPU CPU Taxol GPU CPU GPU Valinomycin ( no openp ) Details:. A. Nitsche,. Ferreria, E. E. ocskos,. C. Gonzalez Lebrero, J. Chem Theory and Comput. 2014, 10, DOI: /ct400308n
33 Gaussian function centered in atom Function value is relevant for this point Function value is not relevant for this point
34 GROUPS OF POINTS Groups solved separately: reduced version of problem. Inhomogeneous workload: few groups with a lot of points/functions VS a lot of groups with very few. Small Group: bad for GPU. Hybrid Partitioning: Small spheres (lots of points) and big cubes (fewer points)
35 HYBRID SCHEE Compute the smallest groups in the CPU. Hardware: Tesla K80 (1 device) N x Intel Xeon E v3 synergistic behavior
36 RESULTS FOR THE HYBRID XC TER KERNEL HYBRID COPUTING SCHEE Limitations Coulomb Q/ Hardware GeForce GTX 980 vs Intel i Tesla K80 vs Xeon E Hybrid (K80+8c) vs Xeon E Speedup x 8.0 x 1.9 x 2.8
37 CURRENT STATE HYBRID COPUTING SCHEE Q/ Coulomb cublas Current Limitations Exc (again!) atrix Diag/Dcmp? Single Point Time (sec) XC TER KERNEL 76 s 20 s 3.3 s Before XC Kernel After XC Kernel After: Hybrid, Q, Coulomb, cublas
38 CURRENT STATE HYBRID COPUTING SCHEE Q/ Coulomb cublas Current Limitations Exc (again!) atrix Diag/Dcmp? Single Point Time (sec) XC TER KERNEL 76 s 20 s 3.3 s Before XC Kernel After XC Kernel After: Hybrid, Q, Coulomb, cublas Free Energy Calculation days 30 days
39 PROBLE ath expression specific to the physical model. Very inhomogeneous workload. emory intensive calculations. Direct manipulation of registers and memory influences algorithmic structure. ultiple blocks of threads can run in the same execution unit Re-organizing work to make it more homogeneous. Distribution of work that takes into account the different capabilities of CPUs / GPUs Customized solution with performance comparable to those of problems more naturally suited for GPU.
40 Thank you for your attention! And thanks to Everyone at the group of computer simulation of the DQIAyQF, FCEN, UBA. Collaborators: - Jorge Kohanoff - Cristián G. Sánchez - Ivan Girotto Institutional Support: - CONICET - UBA - ICTP THE LIO GROUP - ariano González Lebrero - Dario Estrin - Damian Scherlis - Esteban osckos - Uriel orzan - William Agudelo - Nicolás Foglia - Fernando Boubetta - Federico Pedron - anuel Ferreria - atías Nitsche - Juan Pablo Darago - Chad Hopkins visit our code! github.com/nanolebrero/lio
Introduction to numerical computations on the GPU
Introduction to numerical computations on the GPU Lucian Covaci http://lucian.covaci.org/cuda.pdf Tuesday 1 November 11 1 2 Outline: NVIDIA Tesla and Geforce video cards: architecture CUDA - C: programming
More informationComputational Methods. Chem 561
Computational Methods Chem 561 Lecture Outline 1. Ab initio methods a) HF SCF b) Post-HF methods 2. Density Functional Theory 3. Semiempirical methods 4. Molecular Mechanics Computational Chemistry " Computational
More informationIntroduction to Density Functional Theory
Introduction to Density Functional Theory S. Sharma Institut für Physik Karl-Franzens-Universität Graz, Austria 19th October 2005 Synopsis Motivation 1 Motivation : where can one use DFT 2 : 1 Elementary
More informationParallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29
Parallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29 Outline A few words on MD applications and the GROMACS package The main work in an MD simulation Parallelization Stream computing
More informationAntti-Pekka Hynninen, 5/10/2017, GTC2017, San Jose CA
S7255: CUTT: A HIGH- PERFORMANCE TENSOR TRANSPOSE LIBRARY FOR GPUS Antti-Pekka Hynninen, 5/10/2017, GTC2017, San Jose CA MOTIVATION Tensor contractions are the most computationally intensive part of quantum
More informationAccelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers
UT College of Engineering Tutorial Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers Stan Tomov 1, George Bosilca 1, and Cédric
More informationarxiv: v1 [hep-lat] 7 Oct 2010
arxiv:.486v [hep-lat] 7 Oct 2 Nuno Cardoso CFTP, Instituto Superior Técnico E-mail: nunocardoso@cftp.ist.utl.pt Pedro Bicudo CFTP, Instituto Superior Técnico E-mail: bicudo@ist.utl.pt We discuss the CUDA
More informationPractical Combustion Kinetics with CUDA
Funded by: U.S. Department of Energy Vehicle Technologies Program Program Manager: Gurpreet Singh & Leo Breton Practical Combustion Kinetics with CUDA GPU Technology Conference March 20, 2015 Russell Whitesides
More informationAccelerating linear algebra computations with hybrid GPU-multicore systems.
Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)
More informationUsing a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics
Using a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics Jorge González-Domínguez Parallel and Distributed Architectures Group Johannes Gutenberg University of Mainz, Germany j.gonzalez@uni-mainz.de
More informationHybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS
Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS Jorge González-Domínguez*, Bertil Schmidt*, Jan C. Kässens**, Lars Wienbrandt** *Parallel and Distributed Architectures
More informationLecture 8: Introduction to Density Functional Theory
Lecture 8: Introduction to Density Functional Theory Marie Curie Tutorial Series: Modeling Biomolecules December 6-11, 2004 Mark Tuckerman Dept. of Chemistry and Courant Institute of Mathematical Science
More informationDirect Self-Consistent Field Computations on GPU Clusters
Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd
More informationEfficient Molecular Dynamics on Heterogeneous Architectures in GROMACS
Efficient Molecular Dynamics on Heterogeneous Architectures in GROMACS Berk Hess, Szilárd Páll KTH Royal Institute of Technology GTC 2012 GROMACS: fast, scalable, free Classical molecular dynamics package
More informationA microsecond a day keeps the doctor away: Efficient GPU Molecular Dynamics with GROMACS
GTC 20130319 A microsecond a day keeps the doctor away: Efficient GPU Molecular Dynamics with GROMACS Erik Lindahl erik.lindahl@scilifelab.se Molecular Dynamics Understand biology We re comfortably on
More informationHECToR CSE technical meeting, Oxford Parallel Algorithms for the Materials Modelling code CRYSTAL
HECToR CSE technical meeting, Oxford 2009 Parallel Algorithms for the Materials Modelling code CRYSTAL Dr Stanko Tomi Computational Science & Engineering Department, STFC Daresbury Laboratory, UK Acknowledgements
More informationPopulation annealing study of the frustrated Ising antiferromagnet on the stacked triangular lattice
Population annealing study of the frustrated Ising antiferromagnet on the stacked triangular lattice Michal Borovský Department of Theoretical Physics and Astrophysics, University of P. J. Šafárik in Košice,
More informationQ-Chem 4.0: Expanding the Frontiers. Jing Kong Q-Chem Inc. Pittsburgh, PA
Q-Chem 4.0: Expanding the Frontiers Jing Kong Q-Chem Inc. Pittsburgh, PA Q-Chem: Profile Q-Chem is a high performance quantum chemistry program; Contributed by best quantum chemists from 40 universities
More informationDensity Functional Theory
Density Functional Theory Iain Bethune EPCC ibethune@epcc.ed.ac.uk Overview Background Classical Atomistic Simulation Essential Quantum Mechanics DFT: Approximations and Theory DFT: Implementation using
More informationA CUDA Solver for Helmholtz Equation
Journal of Computational Information Systems 11: 24 (2015) 7805 7812 Available at http://www.jofcis.com A CUDA Solver for Helmholtz Equation Mingming REN 1,2,, Xiaoguang LIU 1,2, Gang WANG 1,2 1 College
More informationElectron Correlation
Electron Correlation Levels of QM Theory HΨ=EΨ Born-Oppenheimer approximation Nuclear equation: H n Ψ n =E n Ψ n Electronic equation: H e Ψ e =E e Ψ e Single determinant SCF Semi-empirical methods Correlation
More informationReal-time signal detection for pulsars and radio transients using GPUs
Real-time signal detection for pulsars and radio transients using GPUs W. Armour, M. Giles, A. Karastergiou and C. Williams. University of Oxford. 15 th July 2013 1 Background of GPUs Why use GPUs? Influence
More informationExplore Computational Power of GPU in Electromagnetics and Micromagnetics
Explore Computational Power of GPU in Electromagnetics and Micromagnetics Presenter: Sidi Fu, PhD candidate, UC San Diego Advisor: Prof. Vitaliy Lomakin Center of Magnetic Recording Research, Department
More informationUncertainty Propagation and Nonlinear Filtering for Space Navigation using Differential Algebra
Uncertainty Propagation and Nonlinear Filtering for Space Navigation using Differential Algebra M. Valli, R. Arellin, P. Di Lizia and M. R. Lavagna Departent of Aerospace Engineering, Politecnico di Milano
More informationAndré Schleife Department of Materials Science and Engineering
André Schleife Department of Materials Science and Engineering Yesterday you (should have) learned this: http://upload.wikimedia.org/wikipedia/commons/e/ea/ Simple_Harmonic_Motion_Orbit.gif 1. deterministic
More informationExchange Correlation Functional Investigation of RT-TDDFT on a Sodium Chloride. Dimer. Philip Straughn
Exchange Correlation Functional Investigation of RT-TDDFT on a Sodium Chloride Dimer Philip Straughn Abstract Charge transfer between Na and Cl ions is an important problem in physical chemistry. However,
More informationSolving PDEs with CUDA Jonathan Cohen
Solving PDEs with CUDA Jonathan Cohen jocohen@nvidia.com NVIDIA Research PDEs (Partial Differential Equations) Big topic Some common strategies Focus on one type of PDE in this talk Poisson Equation Linear
More informationAPPLICATION OF CUDA TECHNOLOGY FOR CALCULATION OF GROUND STATES OF FEW-BODY NUCLEI BY FEYNMAN'S CONTINUAL INTEGRALS METHOD
APPLICATION OF CUDA TECHNOLOGY FOR CALCULATION OF GROUND STATES OF FEW-BODY NUCLEI BY FEYNMAN'S CONTINUAL INTEGRALS METHOD M.A. Naumenko, V.V. Samarin Joint Institute for Nuclear Research, Dubna, Russia
More informationStochastic Modelling of Electron Transport on different HPC architectures
Stochastic Modelling of Electron Transport on different HPC architectures www.hp-see.eu E. Atanassov, T. Gurov, A. Karaivan ova Institute of Information and Communication Technologies Bulgarian Academy
More informationTracking using CONDENSATION: Conditional Density Propagation
Tracking using CONDENSATION: Conditional Density Propagation Goal Model-based visual tracking in dense clutter at near video frae rates M. Isard and A. Blake, CONDENSATION Conditional density propagation
More informationKlaus Schulten Department of Physics and Theoretical and Computational Biophysics Group University of Illinois at Urbana-Champaign
Klaus Schulten Department of Physics and Theoretical and Computational Biophysics Group University of Illinois at Urbana-Champaign GTC, San Jose Convention Center, CA Sept. 20 23, 2010 GPU and the Computational
More informationPOD-DEIM MODEL ORDER REDUCTION FOR THE MONODOMAIN REACTION-DIFFUSION EQUATION IN NEURO-MUSCULAR SYSTEM
6th European Conference on Coputational Mechanics (ECCM 6) 7th European Conference on Coputational Fluid Dynaics (ECFD 7) 1115 June 2018, Glasgow, UK POD-DEIM MODEL ORDER REDUCTION FOR THE MONODOMAIN REACTION-DIFFUSION
More informationA Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters
A Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters ANTONINO TUMEO, ORESTE VILLA Collaborators: Karol Kowalski, Sriram Krishnamoorthy, Wenjing Ma, Simone Secchi May 15, 2012 1 Outline!
More informationarxiv: v1 [physics.comp-ph] 30 Oct 2017
An efficient GPU algorithm for tetrahedron-based Brillouin-zone integration Daniel Guterding 1, and Harald O. Jeschke 1 Lucht Probst Associates, Große Gallusstraße 9, 011 Frankfurt am Main, Germany, European
More informationMultilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota
Multilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota SIAM CSE Boston - March 1, 2013 First: Joint work with Ruipeng Li Work
More informationCS Lecture 13. More Maximum Likelihood
CS 6347 Lecture 13 More Maxiu Likelihood Recap Last tie: Introduction to axiu likelihood estiation MLE for Bayesian networks Optial CPTs correspond to epirical counts Today: MLE for CRFs 2 Maxiu Likelihood
More informationParallelism of MRT Lattice Boltzmann Method based on Multi-GPUs
Parallelism of MRT Lattice Boltzmann Method based on Multi-GPUs 1 School of Information Engineering, China University of Geosciences (Beijing) Beijing, 100083, China E-mail: Yaolk1119@icloud.com Ailan
More informationClaude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique
Claude Tadonki MINES ParisTech PSL Research University Centre de Recherche Informatique claude.tadonki@mines-paristech.fr Monthly CRI Seminar MINES ParisTech - CRI June 06, 2016, Fontainebleau (France)
More informationComputational Chemistry I
Computational Chemistry I Text book Cramer: Essentials of Quantum Chemistry, Wiley (2 ed.) Chapter 3. Post Hartree-Fock methods (Cramer: chapter 7) There are many ways to improve the HF method. Most of
More informationResearch on GPU-accelerated algorithm in 3D finite difference neutron diffusion calculation method
NUCLEAR SCIENCE AND TECHNIQUES 25, 0501 (14) Research on GPU-accelerated algorithm in 3D finite difference neutron diffusion calculation method XU Qi ( 徐琪 ), 1, YU Gang-Lin ( 余纲林 ), 1 WANG Kan ( 王侃 ),
More informationWeile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun Fu 1, Xuebin Chi 1, Weiguo Gao 2, Lin-Wang Wang 3
A plane wave pseudopotential density functional theory molecular dynamics code on multi-gpu machine - GPU Technology Conference, San Jose, May 17th, 2012 Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun
More informationMulticore Parallelization of Determinant Quantum Monte Carlo Simulations
Multicore Parallelization of Determinant Quantum Monte Carlo Simulations Andrés Tomás, Che-Rung Lee, Zhaojun Bai, Richard Scalettar UC Davis SIAM Conference on Computation Science & Engineering Reno, March
More informationPuReMD-GPU: A Reactive Molecular Dynamic Simulation Package for GPUs
Purdue University Purdue e-pubs Department of Computer Science Technical Reports Department of Computer Science 2012 PuReMD-GPU: A Reactive Molecular Dynamic Simulation Package for GPUs Sudhir B. Kylasa
More informatione-companion ONLY AVAILABLE IN ELECTRONIC FORM
OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer
More informationCRYPTOGRAPHIC COMPUTING
CRYPTOGRAPHIC COMPUTING ON GPU Chen Mou Cheng Dept. Electrical Engineering g National Taiwan University January 16, 2009 COLLABORATORS Daniel Bernstein, UIC, USA Tien Ren Chen, Army Tanja Lange, TU Eindhoven,
More informationMURCIA: Fast parallel solvent accessible surface area calculation on GPUs and application to drug discovery and molecular visualization
MURCIA: Fast parallel solvent accessible surface area calculation on GPUs and application to drug discovery and molecular visualization Eduardo J. Cepas Quiñonero Horacio Pérez-Sánchez Wolfgang Wenzel
More informationBus transit = 20 ns (one way) access, each module cannot be accessed faster than 120 ns. So, the maximum bandwidth is
32 Flynn: Coputer Architecture { The Solutions Chapter 6. Meory Syste Design Proble 6. The eory odule uses 64 4 M b chips for 32MB of data and 8 4 M b chips for ECC. This allows 64 bits + 8 bits ECC to
More informationab initio Electronic Structure Calculations
ab initio Electronic Structure Calculations New scalability frontiers using the BG/L Supercomputer C. Bekas, A. Curioni and W. Andreoni IBM, Zurich Research Laboratory Rueschlikon 8803, Switzerland ab
More informationIntro to ab initio methods
Lecture 2 Part A Intro to ab initio methods Recommended reading: Leach, Chapters 2 & 3 for QM methods For more QM methods: Essentials of Computational Chemistry by C.J. Cramer, Wiley (2002) 1 ab initio
More informationMeasuring freeze-out parameters on the Bielefeld GPU cluster
Measuring freeze-out parameters on the Bielefeld GPU cluster Outline Fluctuations and the QCD phase diagram Fluctuations from Lattice QCD The Bielefeld hybrid GPU cluster Freeze-out conditions from QCD
More informationIn 1830, Auguste Comte wrote in his work
N o v e l A r c h i t e c t u r e s Graphical Processing Units for Quantum Chemistry The authors provide a brief overview of electronic structure theory and detail their experiences implementing quantum
More informationJacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA
Jacobi-Based Eigenvalue Solver on GPU Lung-Sheng Chien, NVIDIA lchien@nvidia.com Outline Symmetric eigenvalue solver Experiment Applications Conclusions Symmetric eigenvalue solver The standard form is
More informationGPU Computing Activities in KISTI
International Advanced Research Workshop on High Performance Computing, Grids and Clouds 2010 June 21~June 25 2010, Cetraro, Italy HPC Infrastructure and GPU Computing Activities in KISTI Hongsuk Yi hsyi@kisti.re.kr
More informationScalable Hybrid Programming and Performance for SuperLU Sparse Direct Solver
Scalable Hybrid Programming and Performance for SuperLU Sparse Direct Solver Sherry Li Lawrence Berkeley National Laboratory Piyush Sao Rich Vuduc Georgia Institute of Technology CUG 14, May 4-8, 14, Lugano,
More informationMultiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU
Multiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU Khramtsov D.P., Nekrasov D.A., Pokusaev B.G. Department of Thermodynamics, Thermal Engineering and Energy Saving Technologies,
More informationParallel Sparse Tensor Decompositions using HiCOO Format
Figure sources: A brief survey of tensors by Berton Earnshaw and NVIDIA Tensor Cores Parallel Sparse Tensor Decompositions using HiCOO Format Jiajia Li, Jee Choi, Richard Vuduc May 8, 8 @ SIAM ALA 8 Outline
More informationAUTHOR QUERY FORM. Fax: For correction or revision of any artwork, please consult
Our reference: YJCPH 52 P-authorquery-v7 AUTHOR QUERY FORM Journal: YJCPH Please e-mail or fax your responses and any corrections to: Article Number: 52 E-mail: corrections.esch@elsevier.vtex.lt Fax: +
More informationGPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic
GPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic Jan Verschelde joint work with Xiangcheng Yu University of Illinois at Chicago
More informationStudy on Markov Alternative Renewal Reward. Process for VLSI Cell Partitioning
Int. Journal of Math. Analysis, Vol. 7, 2013, no. 40, 1949-1960 HIKARI Ltd, www.-hikari.co http://dx.doi.org/10.12988/ia.2013.36142 Study on Markov Alternative Renewal Reward Process for VLSI Cell Partitioning
More informationBlock AIR Methods. For Multicore and GPU. Per Christian Hansen Hans Henrik B. Sørensen. Technical University of Denmark
Block AIR Methods For Multicore and GPU Per Christian Hansen Hans Henrik B. Sørensen Technical University of Denmark Model Problem and Notation Parallel-beam 3D tomography exact solution exact data noise
More informationComputation of Large Sparse Aggregated Areas for Analytic Database Queries
Computation of Large Sparse Aggregated Areas for Analytic Database Queries Steffen Wittmer Tobias Lauer Jedox AG Collaborators: Zurab Khadikov Alexander Haberstroh Peter Strohm Business Intelligence and
More informationHeterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry
Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry and Eugene DePrince Argonne National Laboratory (LCF and CNM) (Eugene moved to Georgia Tech last week)
More informationProtein Structure Prediction on GPU: a Declarative Approach in a Multi-agent Framework
Protein Structure Prediction on GPU: a Declarative Approach in a Multi-agent Framework Federico Campeotto (1),(2) Agostino Dovier (2) Enrico Pontelli (1) campe8@nmsu.edu agostino.dovier@uniud.it epontell@cs.nmsu.edu
More informationMolecular Mechanics: The Ab Initio Foundation
Molecular Mechanics: The Ab Initio Foundation Ju Li GEM4 Summer School 2006 Cell and Molecular Mechanics in BioMedicine August 7 18, 2006, MIT, Cambridge, MA, USA 2 Outline Why are electrons quantum? Born-Oppenheimer
More informationA MESHSIZE BOOSTING ALGORITHM IN KERNEL DENSITY ESTIMATION
A eshsize boosting algorith in kernel density estiation A MESHSIZE BOOSTING ALGORITHM IN KERNEL DENSITY ESTIMATION C.C. Ishiekwene, S.M. Ogbonwan and J.E. Osewenkhae Departent of Matheatics, University
More informationGPU-accelerated Computing at Scale. Dirk Pleiter I GTC Europe 10 October 2018
GPU-accelerated Computing at Scale irk Pleiter I GTC Europe 10 October 2018 Outline Supercomputers at JSC Future science challenges Outlook and conclusions 2 3 Supercomputers at JSC JUQUEEN (until 2018)
More informationCase Study: Quantum Chromodynamics
Case Study: Quantum Chromodynamics Michael Clark Harvard University with R. Babich, K. Barros, R. Brower, J. Chen and C. Rebbi Outline Primer to QCD QCD on a GPU Mixed Precision Solvers Multigrid solver
More informationOn Lotka-Volterra Evolution Law
Advanced Studies in Biology, Vol. 3, 0, no. 4, 6 67 On Lota-Volterra Evolution Law Farruh Muhaedov Faculty of Science, International Islaic University Malaysia P.O. Box, 4, 570, Kuantan, Pahang, Malaysia
More informationAccelerating Model Reduction of Large Linear Systems with Graphics Processors
Accelerating Model Reduction of Large Linear Systems with Graphics Processors P. Benner 1, P. Ezzatti 2, D. Kressner 3, E.S. Quintana-Ortí 4, Alfredo Remón 4 1 Max-Plank-Institute for Dynamics of Complex
More informationOn the Mixed Discretization of the Time Domain Magnetic Field Integral Equation
On the Mixed Discretization of the Tie Doain Magnetic Field Integral Equation H. A. Ülkü 1 I. Bogaert K. Cools 3 F. P. Andriulli 4 H. Bağ 1 Abstract Tie doain agnetic field integral equation (MFIE) is
More informationMassively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem
Massively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem Katharina Kormann 1 Klaus Reuter 2 Markus Rampp 2 Eric Sonnendrücker 1 1 Max Planck Institut für Plasmaphysik 2 Max Planck Computing
More informationA CPU-GPU Hybrid Implementation and Model-Driven Scheduling of the Fast Multipole Method
A CPU-GPU Hybrid Implementation and Model-Driven Scheduling of the Fast Multipole Method Jee Choi 1, Aparna Chandramowlishwaran 3, Kamesh Madduri 4, and Richard Vuduc 2 1 ECE, Georgia Tech 2 CSE, Georgia
More informationThe electronic structure of materials 2 - DFT
Quantum mechanics 2 - Lecture 9 December 19, 2012 1 Density functional theory (DFT) 2 Literature Contents 1 Density functional theory (DFT) 2 Literature Historical background The beginnings: L. de Broglie
More informationReactivity and Organocatalysis. (Adalgisa Sinicropi and Massimo Olivucci)
Reactivity and Organocatalysis (Adalgisa Sinicropi and Massimo Olivucci) The Aldol Reaction - O R 1 O R 1 + - O O OH * * H R 2 R 1 R 2 The (1957) Zimmerman-Traxler Transition State Counterion (e.g. Li
More informationarxiv: v1 [physics.data-an] 19 Feb 2017
arxiv:1703.03284v1 [physics.data-an] 19 Feb 2017 Model-independent partial wave analysis using a massively-parallel fitting framework L Sun 1, R Aoude 2, A C dos Reis 2, M Sokoloff 3 1 School of Physics
More informationIntroduction to Benchmark Test for Multi-scale Computational Materials Software
Introduction to Benchmark Test for Multi-scale Computational Materials Software Shun Xu*, Jian Zhang, Zhong Jin xushun@sccas.cn Computer Network Information Center Chinese Academy of Sciences (IPCC member)
More informationBeam dynamics calculation
September 6 Beam dynamics calculation S.B. Vorozhtsov, Е.Е. Perepelkin and V.L. Smirnov Dubna, JINR http://parallel-compute.com Outline Problem formulation Numerical methods OpenMP and CUDA realization
More informationTopic 5a Introduction to Curve Fitting & Linear Regression
/7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline
More informationAb initio calculations for potential energy surfaces. D. Talbi GRAAL- Montpellier
Ab initio calculations for potential energy surfaces D. Talbi GRAAL- Montpellier A theoretical study of a reaction is a two step process I-Electronic calculations : techniques of quantum chemistry potential
More informationThe Fast Multipole Method in molecular dynamics
The Fast Multipole Method in molecular dynamics Berk Hess KTH Royal Institute of Technology, Stockholm, Sweden ADAC6 workshop Zurich, 20-06-2018 Slide BioExcel Slide Molecular Dynamics of biomolecules
More informationTwo case studies of Monte Carlo simulation on GPU
Two case studies of Monte Carlo simulation on GPU National Institute for Computational Sciences University of Tennessee Seminar series on HPC, Feb. 27, 2014 Outline 1 Introduction 2 Discrete energy lattice
More informationSEISMIC FRAGILITY ANALYSIS
9 th ASCE Specialty Conference on Probabilistic Mechanics and Structural Reliability PMC24 SEISMIC FRAGILITY ANALYSIS C. Kafali, Student M. ASCE Cornell University, Ithaca, NY 483 ck22@cornell.edu M. Grigoriu,
More informationThe linear sampling method and the MUSIC algorithm
INSTITUTE OF PHYSICS PUBLISHING INVERSE PROBLEMS Inverse Probles 17 (2001) 591 595 www.iop.org/journals/ip PII: S0266-5611(01)16989-3 The linear sapling ethod and the MUSIC algorith Margaret Cheney Departent
More informationJim Held, Ph.D., Intel Fellow & Director Emerging Technology Research, Intel Labs. HPC User Forum April 18, 2018
Jim Held, Ph.D., Intel Fellow & Director Emerging Technology Research, Intel Labs HPC User Forum April 18, 2018 Quantum Computing: Key Concepts Superposition Classical Physics Quantum Physics v Entanglement
More information2 Electronic structure theory
Electronic structure theory. Generalities.. Born-Oppenheimer approximation revisited In Sec..3 (lecture 3) the Born-Oppenheimer approximation was introduced (see also, for instance, [Tannor.]). We are
More informationGEM4 Summer School OpenCourseWare
GEM4 Summer School OpenCourseWare http://gem4.educommons.net/ http://www.gem4.org/ Lecture: Molecular Mechanics by Ju Li. Given August 9, 2006 during the GEM4 session at MIT in Cambridge, MA. Please use
More informationTR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems
TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a
More informationWalter Kohn was awarded with the Nobel Prize in Chemistry in 1998 for his development of the density functional theory.
Walter Kohn was awarded with the Nobel Prize in Chemistry in 1998 for his development of the density functional theory. Walter Kohn receiving his Nobel Prize from His Majesty the King at the Stockholm
More informationA method to determine relative stroke detection efficiencies from multiplicity distributions
A ethod to deterine relative stroke detection eiciencies ro ultiplicity distributions Schulz W. and Cuins K. 2. Austrian Lightning Detection and Inoration Syste (ALDIS), Kahlenberger Str.2A, 90 Vienna,
More informationBert de Groot, Udo Schmitt, Helmut Grubmüller
Computergestützte Biophysik I Bert de Groot, Udo Schmitt, Helmut Grubmüller Max Planck-Institut für biophysikalische Chemie Theoretische und Computergestützte Biophysik Am Fassberg 11 37077 Göttingen Tel.:
More informationHigh-Performance Computing, Planet Formation & Searching for Extrasolar Planets
High-Performance Computing, Planet Formation & Searching for Extrasolar Planets Eric B. Ford (UF Astronomy) Research Computing Day September 29, 2011 Postdocs: A. Boley, S. Chatterjee, A. Moorhead, M.
More informationJulian Merten. GPU Computing and Alternative Architecture
Future Directions of Cosmological Simulations / Edinburgh 1 / 16 Julian Merten GPU Computing and Alternative Architecture Institut für Theoretische Astrophysik Zentrum für Astronomie Universität Heidelberg
More informationPlaquette Renormalized Tensor Network States: Application to Frustrated Systems
Workshop on QIS and QMP, Dec 20, 2009 Plaquette Renormalized Tensor Network States: Application to Frustrated Systems Ying-Jer Kao and Center for Quantum Science and Engineering Hsin-Chih Hsiao, Ji-Feng
More informationSoftware optimization for petaflops/s scale Quantum Monte Carlo simulations
Software optimization for petaflops/s scale Quantum Monte Carlo simulations A. Scemama 1, M. Caffarel 1, E. Oseret 2, W. Jalby 2 1 Laboratoire de Chimie et Physique Quantiques / IRSAMC, Toulouse, France
More informationGPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications
GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications Christopher Rodrigues, David J. Hardy, John E. Stone, Klaus Schulten, Wen-Mei W. Hwu University of Illinois at Urbana-Champaign
More informationS0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA
S0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA Date: 16th May 2012 Wed, 3pm to 3.25pm(Adv. Session) Sathyanarayana K., Manish Banga, and Ravi Kumar G. V. V. Engineering Services,
More informationOptimized LU-decomposition with Full Pivot for Small Batched Matrices S3069
Optimized LU-decomposition with Full Pivot for Small Batched Matrices S369 Ian Wainwright High Performance Consulting Sweden ian.wainwright@hpcsweden.se Based on work for GTC 212: 1x speed-up vs multi-threaded
More informationGPU Accelerated Markov Decision Processes in Crowd Simulation
GPU Accelerated Markov Decision Processes in Crowd Simulation Sergio Ruiz Computer Science Department Tecnológico de Monterrey, CCM Mexico City, México sergio.ruiz.loza@itesm.mx Benjamín Hernández National
More informationA model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization)
A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization) Schodinger equation: Hψ = Eψ Choose a basis set of wave functions Two cases: Orthonormal
More informationFirst, a look at using OpenACC on WRF subroutine advance_w dynamics routine
First, a look at using OpenACC on WRF subroutine advance_w dynamics routine Second, an estimate of WRF multi-node performance on Cray XK6 with GPU accelerators Based on performance of WRF kernels, what
More information