Accelerating Three-Body Potentials using GPUs NVIDIA Tesla K20X


1 Using a Hybrid Cray Supercomputer to Model Non-Icing Surfaces for Cold-Climate Wind Turbines
Accelerating Three-Body Potentials using GPUs (NVIDIA Tesla K20X)
Masako Yamada, GE Global Research

2 Opportunity in Cold-Climate Wind
- Wind energy production > 285 GW/year and growing
- Cold regions favorable
  - Lower human population
  - Good wind conditions
- GW opportunity from ~$2 million/MW installed
- Technical need: anti-icing surfaces
  - 3-10% energy losses due to icing
  - Shut-downs
  - Active heating expensive
Source: VTT Technical Research Centre of Finland, 13_wind_energy.jsp?lang=en

3 ALCC Awards million hours
DOE ASCR Leadership Computing Challenge Awards for energy-relevant applications
1. Non-Icing Surfaces for Cold-Climate Wind Turbines
  - Jaguar (Cray XK6) at Oak Ridge National Lab
  - Molecular dynamics using LAMMPS
  - 1-million-molecule mW water droplets on engineered surfaces
  - Completed >300 simulations
  - Achieved >200x speedup from 2011 to 2013
  - >5x from GPU acceleration
2. Accelerated Non-Icing Surfaces for Cold-Climate Wind Turbines
  - Titan (Cray XK7, hybrid) at Oak Ridge National Lab
  - Time parallelization via Parallel Replica method
  - Expected x faster results

4 Titan enables leadership-class study
- Size of simulation ~ 1 million molecules
  - Droplet size >> critical nucleus size
  - Mimic physical dimensions (*somewhat)
- Duration of simulation ~ 1 microsecond
  - Nucleation is an activated process
  - Freezing rarely observed in MD simulations
- Number of simulations ~ 100s
  - Study requires embarrassingly parallel runs
  - Different surfaces, ambient temperatures, conductivity
  - Multiple replicates required due to stochastic nature
*A million-molecule droplet is ~ 50 nm in diameter
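The ~50 nm footnote can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes a bulk liquid-water number density of about 33.4 molecules per cubic nanometre and two idealized shapes (a free sphere and a half sphere sitting on a surface). Neither the density nor the shapes come from the slides, and the function name is illustrative.

```python
import math

# Assumption: bulk liquid water at ~1 g/cm^3 holds ~33.4 molecules per nm^3.
MOLECULES_PER_NM3 = 33.4


def droplet_diameter_nm(n_molecules, shape="sphere"):
    """Diameter (nm) of a droplet holding n_molecules at bulk density.

    shape="sphere":     free spherical droplet, V = (pi / 6) * d**3
    shape="hemisphere": sessile half sphere (90 degree contact angle),
                        V = (pi / 12) * d**3
    """
    volume = n_molecules / MOLECULES_PER_NM3  # nm^3
    if shape == "sphere":
        return (6.0 * volume / math.pi) ** (1.0 / 3.0)
    if shape == "hemisphere":
        return (12.0 * volume / math.pi) ** (1.0 / 3.0)
    raise ValueError(f"unknown shape: {shape}")


if __name__ == "__main__":
    for shape in ("sphere", "hemisphere"):
        print(shape, round(droplet_diameter_nm(1_000_000, shape), 1), "nm")
```

Under these assumptions a free sphere comes out just under 40 nm and a half sphere just under 50 nm, so the slide's figure is consistent with a droplet resting on a surface.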

5 Personal history with MD
Year     Software/Language    # of Molecules  Hardware
1995     Pascal               Few             Desktop Mac
2000     C, Fortran90         Hundreds        IBM SP, SGI O2K
2010     NAMD, LAMMPS         1000s           Linux HPC
Present  GPU-enabled LAMMPS   Millions        Titan

6 >200x overall speedup since 2011
1. Switched to mW water potential
  - 3-body model is more expensive/complex than 2-body, but:
  - Particle reduction at least 3x
  - Timestep increase 10x
  - No long-range forces
2. LAMMPS dynamic load balance: 2-3x
3. GPU acceleration of 3-body model: 5x
2011: 6 femtoseconds/1024 CPU-seconds (SPC/E)
2013: 2 picoseconds/1024 CPU-seconds (mW)
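The two throughput figures can be turned into a speedup and a rough cost per microsecond of simulated time. This is a minimal sketch that reads "1024 CPU-second" as 1024 CPU-seconds of aggregate compute, which is an interpretation rather than something the slide spells out. The function names are illustrative.

```python
FS_PER_PS = 1.0e3
FS_PER_US = 1.0e9

# Simulated time per 1024 CPU-seconds, in femtoseconds (values from the slide).
RATE_2011_FS = 6.0              # SPC/E
RATE_2013_FS = 2.0 * FS_PER_PS  # mW


def speedup(new_rate_fs, old_rate_fs):
    """Ratio of simulated time delivered per unit of compute."""
    return new_rate_fs / old_rate_fs


def cpu_hours(target_fs, rate_fs, cpu_seconds_per_unit=1024.0):
    """CPU-hours needed to simulate target_fs at the given rate.

    Assumption: rate_fs is delivered per cpu_seconds_per_unit CPU-seconds.
    """
    units = target_fs / rate_fs
    return units * cpu_seconds_per_unit / 3600.0


if __name__ == "__main__":
    print(f"speedup: {speedup(RATE_2013_FS, RATE_2011_FS):.0f}x")
    print(f"1 us at 2011 rate: {cpu_hours(FS_PER_US, RATE_2011_FS):.3g} CPU-hours")
    print(f"1 us at 2013 rate: {cpu_hours(FS_PER_US, RATE_2013_FS):.3g} CPU-hours")
```

Under that reading the ratio is about 333x, in line with the ">200x" headline, and one microsecond drops from tens of millions of CPU-hours to roughly 140,000.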

7 1. mW water potential
- Stillinger-Weber 3-body
- One particle = one water molecule
- Introduced in 2009, Nature paper in 2011
- Bulk water properties comparable to or better than existing point-charge models
- Much faster than point-charge models
  - Exemplary test case by authors: 180x faster than SPC/E
  - GE production simulation: 40-50x faster than SPC/E (asymmetric million-molecule droplet on engineered surface, loaded onto 64 nodes)
[Figure panels: SPC/E; mW]

8 2. LAMMPS dynamic load balance
- Introduced in 2012
- Adjusts size of processor sub-domains to equalize number of particles
- 2-3x speedup for 1-million-molecule droplets on 64 nodes (with user-specified processor mapping)
[Figure panels: No load balancing; Default load balancing; User-specified mapping]
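The idea of resizing sub-domains to equalize particle counts can be illustrated in one dimension. The sketch below is not LAMMPS's balancing algorithm. It is a toy that places slab boundaries at particle-count quantiles, with a Gaussian blob standing in for a droplet surrounded by empty space. All names and the test data are hypothetical.

```python
import random


def uniform_cuts(lo, hi, n_domains):
    """Equal-width slab boundaries (no load balancing)."""
    width = (hi - lo) / n_domains
    return [lo + width * d for d in range(1, n_domains)]


def balanced_cuts(xs, n_domains):
    """Slab boundaries that put (nearly) equal particle counts in each slab."""
    ordered = sorted(xs)
    n = len(ordered)
    cuts = []
    for d in range(1, n_domains):
        idx = d * n // n_domains
        # Place the cut midway between the two particles that straddle it.
        cuts.append(0.5 * (ordered[idx - 1] + ordered[idx]))
    return cuts


def counts_per_domain(xs, cuts):
    """Number of particles that fall in each slab defined by the cuts."""
    counts = [0] * (len(cuts) + 1)
    for x in xs:
        counts[sum(1 for c in cuts if x >= c)] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "droplet": particles clustered near the middle of a [-10, 10] box.
    xs = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    print("uniform :", counts_per_domain(xs, uniform_cuts(-10.0, 10.0, 4)))
    print("balanced:", counts_per_domain(xs, balanced_cuts(xs, 4)))
```

With equal-width slabs the outer sub-domains sit nearly empty while the inner ones carry almost all the work; the quantile cuts give every slab about the same particle count.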

9 3. GPU acceleration of 3-body potential
For details, see W. Michael Brown and Masako Yamada, "Implementing Molecular Dynamics on Hybrid High Performance Computers - Three-Body Potentials," Computer Physics Communications (2013).

10 Load 1 million molecules on Host/CPU
- 1 million molecules on 64 nodes
- Processor sub-domains correspond to spatial partitioning of droplet
- MPI tasks/node
- 1 core/paired-unit

11 [Diagram: host and accelerator on one node]
- Per node: ~15,000 molecules
- Host: AMD Opteron 6274 CPU (Core0 to Core15) with host memory
- Accelerator: NVIDIA Tesla K20X GPU (Processor 1 to Processor 14, each with Core 1 to Core 192)
- Accelerator memory levels: private, local, global
- A kernel runs as work groups made up of work items
- Work item = fundamental unit of activity

12 Parallelization in LAMMPS
Host
- Time integration
- Thermostat/barostat
- Bond/angle calculations
- Statistics
Accelerator
- 3-body potential
- Neighbor lists

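13 Generic 3-body potential

$$U = \sum_{i} \sum_{j \neq i} \sum_{k > j} \begin{cases} \phi(p_i, p_j, p_k) & \text{if } r_{ij} < r_c \text{ and } r_{ik} < r_c \\ 0 & \text{otherwise} \end{cases}$$

Good candidate for GPU
1. Occupies majority of computational time
2. Can be decomposed into independent kernels/work-items
Potentials of this form: Stillinger-Weber, MEAM, Tersoff, REBO/AIREBO, Bond-order
[Diagram: atoms i, j, k at positions p_i, p_j, p_k measured from the origin (0,0,0), with separations r_ij and r_ik; r_c = cutoff, r_α = neighbor skin]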
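The triple sum can be written directly as nested loops. The sketch below is a plain CPU reference for the formula, not the GPU implementation. The caller supplies `phi`, and the function name and signature are illustrative.

```python
import math


def three_body_energy(positions, phi, r_c):
    """Evaluate U = sum_i sum_{j != i} sum_{k > j} phi(p_i, p_j, p_k).

    A term contributes only when both r_ij < r_c and r_ik < r_c.
    positions: sequence of (x, y, z) tuples; phi: callable on three positions.
    """
    n = len(positions)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if j == i or math.dist(positions[i], positions[j]) >= r_c:
                continue
            for k in range(j + 1, n):
                if k == i or math.dist(positions[i], positions[k]) >= r_c:
                    continue
                total += phi(positions[i], positions[j], positions[k])
    return total


if __name__ == "__main__":
    triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3) / 2, 0.0)]
    # With phi = 1 the sum simply counts the triplets inside the cutoff.
    print(three_body_energy(triangle, lambda pi, pj, pk: 1.0, r_c=1.5))
```

Setting `phi` to a constant is a quick way to check how many triplets a given cutoff admits before plugging in a real potential.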

14 Redundant Computation Approach
Atom-decomposition
- 1 atom, 1 computational kernel only
- Fewest operations (and effective parallelization), but shared memory access is a bottleneck
Force-decomposition
- 1 atom, 3 computational kernels required
- Redundant computations, but reduced shared memory issues
- Many work-items = more effective use of cores
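The cost of the redundant approach can be made concrete by counting triplet evaluations. This is a hedged counting sketch, not the actual kernels. It assumes a triplet is defined by a central atom and two of its neighbors, and that in the redundant scheme each atom re-evaluates every triplet it belongs to so that it only ever writes its own force. Function names are hypothetical.

```python
import math
from itertools import combinations


def neighbor_lists(positions, cutoff):
    """Brute-force full neighbor lists (fine for a small illustration)."""
    neigh = [[] for _ in positions]
    for a, b in combinations(range(len(positions)), 2):
        if math.dist(positions[a], positions[b]) < cutoff:
            neigh[a].append(b)
            neigh[b].append(a)
    return neigh


def atom_decomposition_evals(neigh):
    """Each triplet is evaluated once, by its central atom, which must then
    also update the forces on the two end atoms (contended writes)."""
    return sum(len(nb) * (len(nb) - 1) // 2 for nb in neigh)


def force_decomposition_evals(neigh):
    """Each atom evaluates every triplet it takes part in and updates only itself."""
    evals = 0
    for nb in neigh:
        # Triplets with this atom at the center.
        evals += len(nb) * (len(nb) - 1) // 2
        # Triplets with this atom at an end: the center j is one of its
        # neighbors, the other end is any other neighbor of j.
        for j in nb:
            evals += len(neigh[j]) - 1
    return evals


if __name__ == "__main__":
    grid = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
    neigh = neighbor_lists(grid, cutoff=1.1)
    print(atom_decomposition_evals(neigh), force_decomposition_evals(neigh))
```

Under this counting the redundant scheme does three times the triplet evaluations, which is the price paid for removing writes to other atoms' forces.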

15 Stillinger-Weber Parallelization

$$U = \sum_{i} \sum_{j < i} \phi_2(r_{ij}) + \sum_{i} \sum_{j \neq i} \sum_{k > j} \phi_3(r_{ij}, r_{ik}, \theta_{jik})$$

Atom i: 3 kernels, no data dependencies
- 2-body operations
- 3-body operations, (r_ij < r_α) .AND. (r_ik < r_α) == .TRUE.: update forces on i only
- 3-body operations, (r_ij < r_α) .AND. (r_ik < r_α) == .FALSE.: neighbor-of-neighbor interactions
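For readers who want to see the two terms evaluated, here is a minimal CPU sketch of the Stillinger-Weber energy in reduced units (epsilon = sigma = 1). The functional forms are the standard Stillinger-Weber ones. The constants are assumptions not taken from the slides: the usual Stillinger-Weber values with the three-body strength 23.15 published for mW. The code is a reference loop, not the three-kernel GPU decomposition.

```python
import math
from itertools import combinations

# Assumed constants (reduced units): standard Stillinger-Weber values,
# with LAMBDA = 23.15 as published for the mW water model.
A, B, P, Q = 7.049556277, 0.6022245584, 4, 0
CUT_A, GAMMA, LAMBDA = 1.8, 1.2, 23.15
COS_THETA0 = -1.0 / 3.0  # tetrahedral angle


def phi2(r):
    """Two-body term; vanishes smoothly at the cutoff CUT_A."""
    if r >= CUT_A:
        return 0.0
    return A * (B * r ** -P - r ** -Q) * math.exp(1.0 / (r - CUT_A))


def phi3(r_ij, r_ik, cos_theta):
    """Three-body term; penalizes deviation from the tetrahedral angle."""
    if r_ij >= CUT_A or r_ik >= CUT_A:
        return 0.0
    return (LAMBDA * (cos_theta - COS_THETA0) ** 2
            * math.exp(GAMMA / (r_ij - CUT_A))
            * math.exp(GAMMA / (r_ik - CUT_A)))


def sw_energy(positions):
    """Return (two-body energy, three-body energy) for a list of (x, y, z)."""
    n = len(positions)
    e2 = sum(phi2(math.dist(positions[i], positions[j]))
             for i in range(n) for j in range(i))
    e3 = 0.0
    for i in range(n):
        others = [m for m in range(n) if m != i]
        for j, k in combinations(others, 2):  # every pair with k > j
            r_ij = math.dist(positions[i], positions[j])
            r_ik = math.dist(positions[i], positions[k])
            dot = sum((pj - pi) * (pk - pi) for pi, pj, pk
                      in zip(positions[i], positions[j], positions[k]))
            e3 += phi3(r_ij, r_ik, dot / (r_ij * r_ik))
    return e2, e3


if __name__ == "__main__":
    d = 1.2 / math.sqrt(3.0)
    tetra = [(0.0, 0.0, 0.0), (d, d, d), (d, -d, -d), (-d, d, -d), (-d, -d, d)]
    print(sw_energy(tetra))
```

A perfectly tetrahedral cluster carries essentially zero three-body energy, so only the pair term contributes; distorting any bond angle makes the three-body term positive.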

16 Neighbor List
- 3-body force-decomposition approach involves neighbor-of-neighbor operations
- Requires additional overhead
  - Increase in border size shared by two processes
  - Neighbor list for ghost atoms straddling across cores
- GPU implementation not necessarily faster than CPU, but less time is spent in host-accelerator data transfer (note: neighbor lists are huge)

17 GPU acceleration benefit
- >5x speedup achieved in production water droplet of 1 million molecules on engineered surface (64 nodes)
- Not limited to Stillinger-Weber: applicable to MEAM, Tersoff, REBO, AIREBO, Bond-order, etc.

18 Implementation

19 6 different surfaces
Interaction potential developed at GE Global Research

20 Freezing front propagation
Visualization of latent heat release

21 Visualizing crystalline regions
- Steinhardt-Nelson order parameter
- Particle mobility
[Figure panels: Bottom View; Side View]
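As a rough illustration of how a Steinhardt-Nelson bond-orientational order parameter separates ordered from disordered environments, the sketch below computes the local q_l for one particle from its bond vectors. It uses the spherical-harmonic addition theorem, so q_l squared becomes the average of the Legendre polynomial P_l over all pairs of bond directions. This is not the shape-matching code cited in the references; neighbor selection is left out, and the function names are hypothetical.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) via the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def steinhardt_q(l, bonds):
    """Local q_l from bond vectors pointing to a particle's neighbors.

    Uses q_l**2 = (1 / N**2) * sum_{a, b} P_l(cos gamma_ab), which follows
    from the addition theorem for spherical harmonics.
    """
    units = []
    for b in bonds:
        norm = math.sqrt(sum(c * c for c in b))
        units.append(tuple(c / norm for c in b))
    total = 0.0
    for u in units:
        for v in units:
            cos_g = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
            total += legendre(l, cos_g)
    return math.sqrt(max(total, 0.0)) / len(units)


if __name__ == "__main__":
    simple_cubic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                    (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    print(round(steinhardt_q(4, simple_cubic), 4),
          round(steinhardt_q(6, simple_cubic), 4))
```

In practice a threshold on such a per-particle value (often combined with neighbor averaging) is used to color particles as crystalline or liquid.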

22 Advanced visualization
Mike Matheson, Oak Ridge National Lab
Will include visuals/movies here

23 Next steps
- Quasi time parallelization using Parallel Replica Method
  - Launch dozens of replicates simultaneously; monitor ensemble behavior
  - Expected outcome: x faster results
- Analysis and application of simulation results
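The statistical idea behind running many replicas at once can be shown with a toy Monte Carlo. The sketch assumes the rare event (here, nucleation) has an exponentially distributed waiting time, so the first event among M independent replicas arrives on average M times sooner in wall-clock terms. It leaves out the dephasing and correlated-event stages of the actual Parallel Replica method and does not attempt to reproduce the expected speedup quoted on the slide. Names and parameters are illustrative.

```python
import random


def mean_first_event_time(n_replicas, rate, n_trials, rng):
    """Average wall-clock time until the first of n_replicas sees an event.

    Assumption: each replica's waiting time is exponential with the given rate.
    """
    total = 0.0
    for _ in range(n_trials):
        total += min(rng.expovariate(rate) for _ in range(n_replicas))
    return total / n_trials


def idealized_speedup(n_replicas, n_trials=20000, seed=0):
    """Ratio of single-replica to multi-replica mean waiting time."""
    rng = random.Random(seed)
    serial = mean_first_event_time(1, 1.0, n_trials, rng)
    parallel = mean_first_event_time(n_replicas, 1.0, n_trials, rng)
    return serial / parallel


if __name__ == "__main__":
    for m in (1, 8, 32):
        print(m, "replicas ->", round(idealized_speedup(m), 1), "x")
```

The idealized speedup grows linearly with the number of replicas; real runs fall short of this because of the per-replica setup stages omitted here.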

24 Credits
- Mike Brown (ORNL): GPU acceleration
- Paul Crozier (Sandia): dynamic load balancing
- Valeria Molinero (Utah): mW potential
- Aaron Keys (UMich, Berkeley): Steinhardt-Nelson order parameters
- Art Voter/Danny Perez (LANL): Parallel Replica method
- Mike Matheson (ORNL): visualization
- Jack Wells, Suzy Tichenor (ORNL): general
- Azar Alizadeh, Branden Moore, Rick Arthur, Margaret Blohm (GE Global Research)
This research was conducted in part under the auspices of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy under Contract No. DE-AC05-00OR22725 with UT-Battelle, LLC. This research was also conducted in part under the auspices of the GE Global Research High Performance Computing program.

25 References
- Brown, W. M. and Yamada, M. Implementing Molecular Dynamics on Hybrid High Performance Computers - Three-Body Potentials. Computer Physics Communications (2013).
- Shi, B. and Dhir, V. K. Molecular dynamics simulation of the contact angle of liquids on solid surfaces. The Journal of Chemical Physics, 130, 3 (2009).
- Sergi, D., Scocchi, G. and Ortona, A. Molecular dynamics simulations of the contact angle between water droplets and graphite surfaces. Fluid Phase Equilibria, 332 (2012).
- Oxtoby, D. W. Homogeneous nucleation: theory and experiment. Journal of Physics: Condensed Matter, 4.
- Plimpton, S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. Journal of Computational Physics, 117, 1 (1995).
- Humphrey, W., Dalke, A. and Schulten, K. VMD: Visual molecular dynamics. Journal of Molecular Graphics, 14, 1 (1996).
- Keys, A. S. Shape Matching Analysis Code. University of Michigan, 2011.
- Keys, A. S., Iacovella, C. R. and Glotzer, S. C. Characterizing Structure Through Shape Matching and Applications to Self-Assembly. Annual Review of Condensed Matter Physics, 2, 1 (2011).
- Steinhardt, P. J., Nelson, D. R. and Ronchetti, M. Bond-orientational order in liquids and glasses. Physical Review B, 28, 2 (1983).
- Stillinger, F. H. and Weber, T. A. Computer simulation of local order in condensed phases of silicon. Physical Review B, 31, 8 (1985).
- Berendsen, H. J. C., Grigera, J. R. and Straatsma, T. P. The missing term in effective pair potentials. The Journal of Physical Chemistry, 91, 24 (1987).
- Molinero, V. and Moore, E. B. Water Modeled As an Intermediate Element between Carbon and Silicon. The Journal of Physical Chemistry B, 113, 13 (2009).
- Moore, E. B. and Molinero, V. Structural transformation in supercooled water controls the crystallization rate of ice. Nature, 479, 7374 (2011).
- Yamada, M., Mossa, S., Stanley, H. E. and Sciortino, F. Interplay between Time-Temperature Transformation and the Liquid-Liquid Phase Transition in Water. Physical Review Letters, 88, 19 (2002).
- Brown, W. M., Wang, P., Plimpton, S. J. and Tharrington, A. N. Implementing molecular dynamics on hybrid high performance computers - short range forces. Computer Physics Communications, 182, 4 (2011).
- Voter, A. F. Parallel replica method for dynamics of infrequent events. Physical Review B, 57, 22 (1998), R13985-R

26 Titan
- 299,008 Opteron cores (18,688 nodes)
- 18,688 NVIDIA Tesla K20X GPU accelerators
- Each node
  - 16 cores: AMD Opteron CPUs
  - 1 Tesla K20X GPU accelerator: 2688 compute cores
- Gemini interconnect (ASIC, MPI messages)
- PCI-Express 2.0 bus
- LAMMPS was one of the acceptance testing applications for Titan
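The machine figures quoted in the deck are mutually consistent, which a few lines of arithmetic confirm. The sketch uses only numbers that appear on the slides (node count, cores per node, the 14 processors of 192 cores each from the architecture diagram, and the 64-node production run). The constant and function names are illustrative.

```python
NODES = 18_688
CPU_CORES_PER_NODE = 16
GPU_PROCESSORS = 14            # "Processor 1 ... Processor 14" in the node diagram
CORES_PER_PROCESSOR = 192      # "Core 1 ... Core 192" in the node diagram
PRODUCTION_NODES = 64
PRODUCTION_MOLECULES = 1_000_000


def titan_totals():
    """Derived totals for the whole machine and for the 64-node production run."""
    gpu_cores_per_gpu = GPU_PROCESSORS * CORES_PER_PROCESSOR
    return {
        "cpu_cores": NODES * CPU_CORES_PER_NODE,
        "gpu_cores_per_gpu": gpu_cores_per_gpu,
        "gpu_cores_total": NODES * gpu_cores_per_gpu,
        "molecules_per_node": PRODUCTION_MOLECULES / PRODUCTION_NODES,
    }


if __name__ == "__main__":
    for name, value in titan_totals().items():
        print(f"{name}: {value:,}")
```

The derived values match the slides: 299,008 CPU cores, 2688 compute cores per K20X, and roughly 15,000 molecules per node in the production run.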

Accelerating Three-Body Molecular Dynamics Potentials Using NVIDIA Tesla K20X GPUs. GE Global Research Masako Yamada

Accelerating Three-Body Molecular Dynamics Potentials Using NVIDIA Tesla K20X GPUs. GE Global Research Masako Yamada Accelerating Three-Body Molecular Dynamics Potentials Using NVIDIA Tesla K20X GPUs GE Global Research Masako Yamada Overview of MD Simulations Non-Icing Surfaces for Wind Turbines Large simulations ~ 1

More information

Parallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29

Parallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29 Parallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29 Outline A few words on MD applications and the GROMACS package The main work in an MD simulation Parallelization Stream computing

More information

LAMMPS Performance Benchmark on VSC-1 and VSC-2

LAMMPS Performance Benchmark on VSC-1 and VSC-2 LAMMPS Performance Benchmark on VSC-1 and VSC-2 Daniel Tunega and Roland Šolc Institute of Soil Research, University of Natural Resources and Life Sciences VSC meeting, Neusiedl am See, February 27-28,

More information

Parallel Multivariate SpatioTemporal Clustering of. Large Ecological Datasets on Hybrid Supercomputers

Parallel Multivariate SpatioTemporal Clustering of. Large Ecological Datasets on Hybrid Supercomputers Parallel Multivariate SpatioTemporal Clustering of Large Ecological Datasets on Hybrid Supercomputers Sarat Sreepathi1, Jitendra Kumar1, Richard T. Mills2, Forrest M. Hoffman1, Vamsi Sripathi3, William

More information

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 1 / 23 Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 Maison de la Simulation Lille 1 University CNRS March 18, 2013

More information

Claude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique

Claude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique Claude Tadonki MINES ParisTech PSL Research University Centre de Recherche Informatique claude.tadonki@mines-paristech.fr Monthly CRI Seminar MINES ParisTech - CRI June 06, 2016, Fontainebleau (France)

More information

Efficient Parallelization of Molecular Dynamics Simulations on Hybrid CPU/GPU Supercoputers

Efficient Parallelization of Molecular Dynamics Simulations on Hybrid CPU/GPU Supercoputers Efficient Parallelization of Molecular Dynamics Simulations on Hybrid CPU/GPU Supercoputers Jaewoon Jung (RIKEN, RIKEN AICS) Yuji Sugita (RIKEN, RIKEN AICS, RIKEN QBiC, RIKEN ithes) Molecular Dynamics

More information

Supporting Information for Solid-liquid Thermal Transport and its Relationship with Wettability and the Interfacial Liquid Structure

Supporting Information for Solid-liquid Thermal Transport and its Relationship with Wettability and the Interfacial Liquid Structure Supporting Information for Solid-liquid Thermal Transport and its Relationship with Wettability and the Interfacial Liquid Structure Bladimir Ramos-Alvarado, Satish Kumar, and G. P. Peterson The George

More information

RWTH Aachen University

RWTH Aachen University IPCC @ RWTH Aachen University Optimization of multibody and long-range solvers in LAMMPS Rodrigo Canales William McDoniel Markus Höhnerbach Ahmed E. Ismail Paolo Bientinesi IPCC Showcase November 2016

More information

The Performance Evolution of the Parallel Ocean Program on the Cray X1

The Performance Evolution of the Parallel Ocean Program on the Cray X1 The Performance Evolution of the Parallel Ocean Program on the Cray X1 Patrick H. Worley Oak Ridge National Laboratory John Levesque Cray Inc. 46th Cray User Group Conference May 18, 2003 Knoxville Marriott

More information

A microsecond a day keeps the doctor away: Efficient GPU Molecular Dynamics with GROMACS

A microsecond a day keeps the doctor away: Efficient GPU Molecular Dynamics with GROMACS GTC 20130319 A microsecond a day keeps the doctor away: Efficient GPU Molecular Dynamics with GROMACS Erik Lindahl erik.lindahl@scilifelab.se Molecular Dynamics Understand biology We re comfortably on

More information

Piz Daint & Piz Kesch : from general purpose supercomputing to an appliance for weather forecasting. Thomas C. Schulthess

Piz Daint & Piz Kesch : from general purpose supercomputing to an appliance for weather forecasting. Thomas C. Schulthess Piz Daint & Piz Kesch : from general purpose supercomputing to an appliance for weather forecasting Thomas C. Schulthess 1 Cray XC30 with 5272 hybrid, GPU accelerated compute nodes Piz Daint Compute node:

More information

Using GPUs for faster LAMMPS particle simulations

Using GPUs for faster LAMMPS particle simulations Using GPUs for faster LAMMPS particle simulations Paul S. Crozier Exploiting New Computer Architectures in Molecular Dynamics Simulations March 23, 2011 Sandia National Laboratories is a multi-program

More information

Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures

Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures 1 Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures Hasan Metin Aktulga, Chris Knight, Paul Coffman, Tzu-Ray Shan, Wei Jiang Abstract Hybrid parallelism

More information

INITIAL INTEGRATION AND EVALUATION

INITIAL INTEGRATION AND EVALUATION INITIAL INTEGRATION AND EVALUATION OF SLATE PARALLEL BLAS IN LATTE Marc Cawkwell, Danny Perez, Arthur Voter Asim YarKhan, Gerald Ragghianti, Jack Dongarra, Introduction The aim of the joint milestone STMS10-52

More information

Molecular dynamics simulation. CS/CME/BioE/Biophys/BMI 279 Oct. 5 and 10, 2017 Ron Dror

Molecular dynamics simulation. CS/CME/BioE/Biophys/BMI 279 Oct. 5 and 10, 2017 Ron Dror Molecular dynamics simulation CS/CME/BioE/Biophys/BMI 279 Oct. 5 and 10, 2017 Ron Dror 1 Outline Molecular dynamics (MD): The basic idea Equations of motion Key properties of MD simulations Sample applications

More information

Performance of the fusion code GYRO on three four generations of Crays. Mark Fahey University of Tennessee, Knoxville

Performance of the fusion code GYRO on three four generations of Crays. Mark Fahey University of Tennessee, Knoxville Performance of the fusion code GYRO on three four generations of Crays Mark Fahey mfahey@utk.edu University of Tennessee, Knoxville Contents Introduction GYRO Overview Benchmark Problem Test Platforms

More information

Investigation of an Unusual Phase Transition Freezing on heating of liquid solution

Investigation of an Unusual Phase Transition Freezing on heating of liquid solution Investigation of an Unusual Phase Transition Freezing on heating of liquid solution Calin Gabriel Floare National Institute for R&D of Isotopic and Molecular Technologies, Cluj-Napoca, Romania Max von

More information

Energetics in Ice VI

Energetics in Ice VI Energetics in Ice VI Nishanth Sasankan 2011 NSF / REU PROJECT Physics Department University of Notre Dame Advisor: Dr. Kathie E. Newman Abstract There are many different phases of Ice, which exist in different

More information

Supercomputing: Why, What, and Where (are we)?

Supercomputing: Why, What, and Where (are we)? Supercomputing: Why, What, and Where (are we)? R. Govindarajan Indian Institute of Science, Bangalore, INDIA govind@serc.iisc.ernet.in (C)RG@SERC,IISc Why Supercomputer? Third and Fourth Legs RG@SERC,IISc

More information

Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun Fu 1, Xuebin Chi 1, Weiguo Gao 2, Lin-Wang Wang 3

Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun Fu 1, Xuebin Chi 1, Weiguo Gao 2, Lin-Wang Wang 3 A plane wave pseudopotential density functional theory molecular dynamics code on multi-gpu machine - GPU Technology Conference, San Jose, May 17th, 2012 Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun

More information

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017 HYCOM and Navy ESPC Future High Performance Computing Needs Alan J. Wallcraft COAPS Short Seminar November 6, 2017 Forecasting Architectural Trends 3 NAVY OPERATIONAL GLOBAL OCEAN PREDICTION Trend is higher

More information

Performance Analysis of Lattice QCD Application with APGAS Programming Model

Performance Analysis of Lattice QCD Application with APGAS Programming Model Performance Analysis of Lattice QCD Application with APGAS Programming Model Koichi Shirahata 1, Jun Doi 2, Mikio Takeuchi 2 1: Tokyo Institute of Technology 2: IBM Research - Tokyo Programming Models

More information

The Fast Multipole Method in molecular dynamics

The Fast Multipole Method in molecular dynamics The Fast Multipole Method in molecular dynamics Berk Hess KTH Royal Institute of Technology, Stockholm, Sweden ADAC6 workshop Zurich, 20-06-2018 Slide BioExcel Slide Molecular Dynamics of biomolecules

More information

Julian Merten. GPU Computing and Alternative Architecture

Julian Merten. GPU Computing and Alternative Architecture Future Directions of Cosmological Simulations / Edinburgh 1 / 16 Julian Merten GPU Computing and Alternative Architecture Institut für Theoretische Astrophysik Zentrum für Astronomie Universität Heidelberg

More information

A Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters

A Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters A Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters ANTONINO TUMEO, ORESTE VILLA Collaborators: Karol Kowalski, Sriram Krishnamoorthy, Wenjing Ma, Simone Secchi May 15, 2012 1 Outline!

More information

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Accelerating linear algebra computations with hybrid GPU-multicore systems. Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)

More information

Molecular Dynamics with LAMMPS on a Hybrid Cray Supercomputer

Molecular Dynamics with LAMMPS on a Hybrid Cray Supercomputer Molecular Dynamics with LAMMPS on a Hybrid Cray Supercomputer W. Michael Brown Na@onal Center for Computa@onal Sciences Oak Ridge Na@onal Laboratory NVIDIA Technology Theater, Supercompu8ng 2012 November

More information

Practical Combustion Kinetics with CUDA

Practical Combustion Kinetics with CUDA Funded by: U.S. Department of Energy Vehicle Technologies Program Program Manager: Gurpreet Singh & Leo Breton Practical Combustion Kinetics with CUDA GPU Technology Conference March 20, 2015 Russell Whitesides

More information

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters Jonathan Lifflander, G. Carl Evans, Anshu Arya, Laxmikant Kale University of Illinois Urbana-Champaign May 25, 2012 Work is overdecomposed

More information

A framework for detailed multiphase cloud modeling on HPC systems

A framework for detailed multiphase cloud modeling on HPC systems Center for Information Services and High Performance Computing (ZIH) A framework for detailed multiphase cloud modeling on HPC systems ParCo 2009, 3. September 2009, ENS Lyon, France Matthias Lieber a,

More information

What is Classical Molecular Dynamics?

What is Classical Molecular Dynamics? What is Classical Molecular Dynamics? Simulation of explicit particles (atoms, ions,... ) Particles interact via relatively simple analytical potential functions Newton s equations of motion are integrated

More information

Processing NOAA Observation Data over Hybrid Computer Systems for Comparative Climate Change Analysis

Processing NOAA Observation Data over Hybrid Computer Systems for Comparative Climate Change Analysis Processing NOAA Observation Data over Hybrid Computer Systems for Comparative Climate Change Analysis Xuan Shi 1,, Dali Wang 2 1 Department of Geosciences, University of Arkansas, Fayetteville, AR 72701,

More information

Acceleration of WRF on the GPU

Acceleration of WRF on the GPU Acceleration of WRF on the GPU Daniel Abdi, Sam Elliott, Iman Gohari Don Berchoff, Gene Pache, John Manobianco TempoQuest 1434 Spruce Street Boulder, CO 80302 720 726 9032 TempoQuest.com THE WORLD S FASTEST

More information

High-Performance Scientific Computing

High-Performance Scientific Computing High-Performance Scientific Computing Instructor: Randy LeVeque TA: Grady Lemoine Applied Mathematics 483/583, Spring 2011 http://www.amath.washington.edu/~rjl/am583 World s fastest computers http://top500.org

More information

Two case studies of Monte Carlo simulation on GPU

Two case studies of Monte Carlo simulation on GPU Two case studies of Monte Carlo simulation on GPU National Institute for Computational Sciences University of Tennessee Seminar series on HPC, Feb. 27, 2014 Outline 1 Introduction 2 Discrete energy lattice

More information

Diffusion of Water and Diatomic Oxygen in Poly(3-hexylthiophene) Melt: A Molecular Dynamics Simulation Study

Diffusion of Water and Diatomic Oxygen in Poly(3-hexylthiophene) Melt: A Molecular Dynamics Simulation Study Diffusion of Water and Diatomic Oxygen in Poly(3-hexylthiophene) Melt: A Molecular Dynamics Simulation Study Julia Deitz, Yeneneh Yimer, and Mesfin Tsige Department of Polymer Science University of Akron

More information

Accelerating Quantum Chromodynamics Calculations with GPUs

Accelerating Quantum Chromodynamics Calculations with GPUs Accelerating Quantum Chromodynamics Calculations with GPUs Guochun Shi, Steven Gottlieb, Aaron Torok, Volodymyr Kindratenko NCSA & Indiana University National Center for Supercomputing Applications University

More information

Molecular Dynamics Simulations

Molecular Dynamics Simulations Molecular Dynamics Simulations Dr. Kasra Momeni www.knanosys.com Outline LAMMPS AdHiMad Lab 2 What is LAMMPS? LMMPS = Large-scale Atomic/Molecular Massively Parallel Simulator Open-Source (http://lammps.sandia.gov)

More information

First, a look at using OpenACC on WRF subroutine advance_w dynamics routine

First, a look at using OpenACC on WRF subroutine advance_w dynamics routine First, a look at using OpenACC on WRF subroutine advance_w dynamics routine Second, an estimate of WRF multi-node performance on Cray XK6 with GPU accelerators Based on performance of WRF kernels, what

More information

Prof. Brant Robertson Department of Astronomy and Astrophysics University of California, Santa

Prof. Brant Robertson Department of Astronomy and Astrophysics University of California, Santa Accelerated Astrophysics: Using NVIDIA GPUs to Simulate and Understand the Universe Prof. Brant Robertson Department of Astronomy and Astrophysics University of California, Santa Cruz brant@ucsc.edu, UC

More information

JASS Modeling and visualization of molecular dynamic processes

JASS Modeling and visualization of molecular dynamic processes JASS 2009 Konstantin Shefov Modeling and visualization of molecular dynamic processes St Petersburg State University, Physics faculty, Department of Computational Physics Supervisor PhD Stepanova Margarita

More information

Graphics Card Computing for Materials Modelling

Graphics Card Computing for Materials Modelling Graphics Card Computing for Materials Modelling Case study: Analytic Bond Order Potentials B. Seiser, T. Hammerschmidt, R. Drautz, D. Pettifor Funded by EPSRC within the collaborative multi-scale project

More information

A Computation- and Communication-Optimal Parallel Direct 3-body Algorithm

A Computation- and Communication-Optimal Parallel Direct 3-body Algorithm A Computation- and Communication-Optimal Parallel Direct 3-body Algorithm Penporn Koanantakool and Katherine Yelick {penpornk, yelick}@cs.berkeley.edu Computer Science Division, University of California,

More information

Xiaoxia Li. Group of HPC & Cheminformatics Institute of Process Engineering Chinese Academy of Sciences, Beijing

Xiaoxia Li. Group of HPC & Cheminformatics Institute of Process Engineering Chinese Academy of Sciences, Beijing GTC 2016 San Jose, Californiae, 7 April, 2016 Xiaoxia Li Group of HPC & Cheminformatics Institute of Process Engineering Chinese Academy of Sciences, Beijing Outline 1 2 3 Reaction mechanisms of coal pyrolysis?

More information

Multiscale simulations of complex fluid rheology

Multiscale simulations of complex fluid rheology Multiscale simulations of complex fluid rheology Michael P. Howard, Athanassios Z. Panagiotopoulos Department of Chemical and Biological Engineering, Princeton University Arash Nikoubashman Institute of

More information

Petascale Quantum Simulations of Nano Systems and Biomolecules

Petascale Quantum Simulations of Nano Systems and Biomolecules Petascale Quantum Simulations of Nano Systems and Biomolecules Emil Briggs North Carolina State University 1. Outline of real-space Multigrid (RMG) 2. Scalability and hybrid/threaded models 3. GPU acceleration

More information

Nucleation rate (m -3 s -1 ) Radius of water nano droplet (Å) 1e+00 1e-64 1e-128 1e-192 1e-256

Nucleation rate (m -3 s -1 ) Radius of water nano droplet (Å) 1e+00 1e-64 1e-128 1e-192 1e-256 Supplementary Figures Nucleation rate (m -3 s -1 ) 1e+00 1e-64 1e-128 1e-192 1e-256 Calculated R in bulk water Calculated R in droplet Modified CNT 20 30 40 50 60 70 Radius of water nano droplet (Å) Supplementary

More information

Performance Evaluation of Scientific Applications on POWER8

Performance Evaluation of Scientific Applications on POWER8 Performance Evaluation of Scientific Applications on POWER8 2014 Nov 16 Andrew V. Adinetz 1, Paul F. Baumeister 1, Hans Böttiger 3, Thorsten Hater 1, Thilo Maurer 3, Dirk Pleiter 1, Wolfram Schenck 4,

More information

Introduction to numerical computations on the GPU

Introduction to numerical computations on the GPU Introduction to numerical computations on the GPU Lucian Covaci http://lucian.covaci.org/cuda.pdf Tuesday 1 November 11 1 2 Outline: NVIDIA Tesla and Geforce video cards: architecture CUDA - C: programming

More information

GPU-Accelerated Analysis and Visualization of Large Structures Solved by Molecular Dynamics Flexible Fitting

GPU-Accelerated Analysis and Visualization of Large Structures Solved by Molecular Dynamics Flexible Fitting GPU-Accelerated Analysis and Visualization of Large Structures Solved by Molecular Dynamics Flexible Fitting John E. Stone, Ryan McGreevy, Barry Isralewitz, and Klaus Schulten Theoretical and Computational

More information

Direct Self-Consistent Field Computations on GPU Clusters

Direct Self-Consistent Field Computations on GPU Clusters Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd

More information

The QMC Petascale Project

The QMC Petascale Project The QMC Petascale Project Richard G. Hennig What will a petascale computer look like? What are the limitations of current QMC algorithms for petascale computers? How can Quantum Monte Carlo algorithms

More information

上海超级计算中心 Shanghai Supercomputer Center. Lei Xu Shanghai Supercomputer Center San Jose

上海超级计算中心 Shanghai Supercomputer Center. Lei Xu Shanghai Supercomputer Center San Jose 上海超级计算中心 Shanghai Supercomputer Center Lei Xu Shanghai Supercomputer Center 03/26/2014 @GTC, San Jose Overview Introduction Fundamentals of the FDTD method Implementation of 3D UPML-FDTD algorithm on GPU

More information
