CP2K. New Frontiers. ab initio Molecular Dynamics

Similar documents
Large Scale Electronic Structure Calculations

CP2K: Past, Present, Future. Jürg Hutter Department of Chemistry, University of Zurich

ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

Supercomputers: instruments for science or dinosaurs that haven t gone extinct yet? Thomas C. Schulthess

DFT with Hybrid Functionals

CRYSTAL in parallel: replicated and distributed (MPP) data. Why parallel?

Ab Initio Molecular Dynamics

Is there a future for quantum chemistry on supercomputers? Jürg Hutter Physical-Chemistry Institute, University of Zurich

CP2K: the gaussian plane wave (GPW) method

CP2K: Selected Developments

ab initio Electronic Structure Calculations

INITIAL INTEGRATION AND EVALUATION

Big Bang, Big Iron: CMB Data Analysis at the Petascale and Beyond

Parallel Eigensolver Performance on High Performance Computers

Software optimization for petaflops/s scale Quantum Monte Carlo simulations

ELECTRONIC STRUCTURE CALCULATIONS FOR THE SOLID STATE PHYSICS

A knowledge-based approach to high-performance computing in ab initio simulations.

Lecture 8: Fast Linear Solvers (Part 7)

Lattice Quantum Chromodynamics on the MIC architectures

MODULE 2: QUANTUM MECHANICS. Practice: Quantum ESPRESSO

Piz Daint & Piz Kesch : from general purpose supercomputing to an appliance for weather forecasting. Thomas C. Schulthess

SPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics

Parallel Eigensolver Performance on the HPCx System

Introduction to numerical computations on the GPU

Density Functional Theory

Solving Systems of Linear Equations Using Matrices

Crossing the Chasm. On the Paths to Exascale: Presented by Mike Rezny, Monash University, Australia

Introduction to Benchmark Test for Multi-scale Computational Materials Software

Quantum Mechanical Simulations

Post Hartree-Fock: MP2 and RPA in CP2K

FEAST eigenvalue algorithm and solver: review and perspectives

Applications available on FERMI

From Physical Modelling to Big Data Analytics: Examples and Challenges

Carlo Cavazzoni, HPC department, CINECA

All-electron density functional theory on Intel MIC: Elk

Cyclops Tensor Framework

Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing

A Survey of HPC Usage in Europe and the PRACE Benchmark Suite

Dense Arithmetic over Finite Fields with CUMODP

HYCOM and Navy ESPC Future High Performance Computing Needs. Alan J. Wallcraft. COAPS Short Seminar November 6, 2017

Consider the following example of a linear system:

STCE. Adjoint Code Design Patterns. Uwe Naumann. RWTH Aachen University, Germany. QuanTech Conference, London, April 2016

Thermostatic Controls for Noisy Gradient Systems and Applications to Machine Learning

A Survey of HPC Systems and Applications in Europe

Minimization of the Kohn-Sham Energy with a Localized, Projected Search Direction

Numerical Modelling in Fortran: day 10. Paul Tackley, 2016

Kitaev honeycomb lattice model: from A to B and beyond

Variational Bayesian Inference Techniques

Localized Optimization: Exploiting non-orthogonality to efficiently minimize the Kohn-Sham Energy

Periodic motions. Periodic motions are known since the beginning of mankind: Motion of planets around the Sun; Pendulum; And many more...

Symmetric Pivoting in ScaLAPACK Craig Lucas University of Manchester Cray User Group 8 May 2006, Lugano

Day 1 : Introduction to AIMD and PIMD

ECS289: Scalable Machine Learning

RWTH Aachen University

On the Paths to Exascale: Will We be Hungry?

Iterative Methods for Solving A x = b

1.1 Variational principle Variational calculations with Gaussian basis functions 5

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Parallel Sparse Tensor Decompositions using HiCOO Format

Zacros. Software Package Development: Pushing the Frontiers of Kinetic Monte Carlo Simulation in Catalysis

Claude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique

Conquest order N ab initio Electronic Structure simulation code for quantum mechanical modelling in large scale

QuTiP: Quantum Toolbox in Python. Circuit-QED

Deutscher Wetterdienst

Eigenvalues and eigenvectors

Linear Algebra Section 2.6 : LU Decomposition Section 2.7 : Permutations and transposes Wednesday, February 13th Math 301 Week #4

Parallelization of the Dirac operator. Pushan Majumdar. Indian Association for the Cultivation of Sciences, Jadavpur, Kolkata

CRYSTAL in parallel: replicated and distributed (MPP) data

Computational Physics. J. M. Thijssen

ALMA: All-scale predictive design of heat management material structures

The Plane-Wave Pseudopotential Method

Case Study: Quantum Chromodynamics

Computing least squares condition numbers on hybrid multicore/gpu systems

Comparing the Efficiency of Iterative Eigenvalue Solvers: the Quantum ESPRESSO experience

Matrix Computations: Direct Methods II. May 5, 2014 Lecture 11

Preliminary Results of GRAPES Helmholtz solver using GCR and PETSc tools

Discretization of PDEs and Tools for the Parallel Solution of the Resulting Systems

Optimisation under Uncertainty with Stochastic PDEs for the History Matching Problem in Reservoir Engineering

Simulating the generation, evolution and fate of electronic excitations in molecular and nanoscale materials with first principles methods.

Parallel Eigensolver Performance on High Performance Computers 1

A practical view on linear algebra tools

VASP: running on HPC resources. University of Vienna, Faculty of Physics and Center for Computational Materials Science, Vienna, Austria

From Piz Daint to Piz Kesch : the making of a GPU-based weather forecasting system. Oliver Fuhrer and Thomas C. Schulthess

Density Functional Theory. Martin Lüders Daresbury Laboratory

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Linear-scaling ab initio study of surface defects in metal oxide and carbon nanostructures

Preconditioned Parallel Block Jacobi SVD Algorithm

Matrix Assembly in FEA

J S Parker (QUB), Martin Plummer (STFC), H W van der Hart (QUB) Version 1.0, September 29, 2015

Basic introduction of NWChem software

Electronic structure calculations with GPAW. Jussi Enkovaara CSC IT Center for Science, Finland

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

Extending Parallel Scalability of LAMMPS and Multiscale Reactive Molecular Simulations

Molecular Science Modelling

Parallel Polynomial Evaluation

André Schleife Department of Materials Science and Engineering

Matrix Eigensystem Tutorial For Parallel Computation

Hard scaling challenges for ab-initio molecular dynamics capabilities in NWChem: Using 100,000 CPUs per second

Fastfood Approximating Kernel Expansions in Loglinear Time. Quoc Le, Tamas Sarlos, and Alex Smola Presenter: Shuai Zheng (Kyle)

Model Order Reduction via Matlab Parallel Computing Toolbox. Istanbul Technical University

Transcription:

CP2K New Frontiers in ab initio Molecular Dynamics Jürg Hutter, Joost VandeVondele, Valery Weber Physical-Chemistry Institute, University of Zurich

Ab Initio Molecular Dynamics Molecular Dynamics Sampling of classical statistical ensemble (alternatively Monte Carlo sampling). Introduces temperature and pressure into simulation, connection to experimental conditions Electronic Structure Theory Interaction potentials are calculated using quantum mechanics for electrons. Ab initio theory - no precomputed or empirically adjusted potential functions needed.

Ab Initio Molecular Dynamics Applications Chemical reactions in solution Complex interfaces (liquid/vapor, solid/liquid) Structural phase transitions Proton and/or electron transfer etc.

Ab Initio Molecular Dynamics Embarrassingly Parallel Sampling Approaches Replica exchange algorithms Multiple walker metadynamics Free energy string methods Time Series Generation (MD, MC) Electronic Structure Calculation

Parallel Time Series Generation Parallel Molecular Dynamics Waveform relaxation; A. Bellen, M. Zennaro, J. Comput. Appl. Math. 25, 341 (1989) Solar system dynamics; P. Saha, J. Stadel, S. Tremaine, Astron. J. 114, 409 (1997) Speculative Monte Carlo Similar to branch prediction in CPUs Effective if generation of states much cheaper than calculation of state energy. Wasteful algorithms: Efficiency 0.2 0.6

Speculative Monte Carlo

Electronic Structure Calculations Kohn-Sham Density Functional Theory Nonlinear Schrödinger equation basis set expansion Nonlinear generalized eigenvalue equation H[C]C = SCE Basis: Atom-centered Gaussian functions ϕ(r) = Y lm (ˆr) exp[ α(r A) 2 ]

Key Algorithms Gaussian to plane wave transform Hαβ c = P γδ I αβ,γδ = (αβ, G) 4π G 2 P γδ (γδ, G) γδ G γδ Analytic Integrals for Hartree-Fock exchange H x αβ = γδ P γδ I αγ,βδ Direct constraint subspace optimization Replace diagonalization by direct optimization with constraints Sparse matrix linear algebra

Gaussian to Plane Wave Transform

Gaussian to Plane Wave Transform Distributed sparse matrices and distributed grids Halo exchange 3d FFTs Complicated load-balancing problems Collaboration with EPCC Edinburgh MPI/OpenMP parallelization Target for co-array Fortran implementation

Hartree-Fock Exchange Large structured matrix vector multiply H x αβ = γδ P γδ I αγ,βδ Current implementation: Distributed matrix (I αγ,βδ ), replicated vectors (P γδ, H x αβ )

Sparse Linear Algebra Optimization (= solving the nonlinear eigenvalue equation) is mapped to matrix multiplications Old code: sparse matrices (operators) and dense matrices (solution vectors) Basic operation: sparse dense matrix multiplication Linear scaling algorithms require Code refactoring: All matrices are of sparse type DBCSR library Distributed Block Compressed Sparse Row matrix format

DBCSR Library ca. 50 000 lines Fortran 95 Extensive functionality (single/double real/complex) Sparse-sparse matrix multiplication filtering, predetermined sparsity pattern Interface to ScaLapack MPI/OpenMP parallelization

DBCSR vs. PDGEMM

DBCSR: Sparse-Dense Matrix Multiplication

State of the Art Application 1319 atoms, 5890 electrons, 15733 bsf Optimization: Function of 46 Mio. variables 4000 derivatives of the energy function 60s/MD step on 512 cores Cray XT-5 5 ps dynamics/week

Target Application Scenarios Small system - Very extensive sampling 200-500 atoms 10 7 10 9 force evaluations 2.1 Mio threads = 64 (replica exchange) 16 (speculative MC) 128 (MPI) 16 (OpenMP) Medium system - Extensive sampling 1000-5000 atoms 10 5 10 6 force evaluations 524 288 threads = 32 (string method) 1024 (MPI) 16 (OpenMP) Large system - Minimal sampling 10 000-100 000 atoms 10 3 10 4 force evaluations 262 144 threads = 16384 (MPI) 16 (OpenMP)

CP2K: Software Engineering 900 000 lines Fortran 95 machine generated 200 000 lines Explicit subroutine interfaces- strong type checking Templates 1600 regression tests all commits are automatically tested External libraries: Lapack/BLAS, ScaLapack/BLACS, MPI, OpenMP, FFTW, libint

Refactoring Separate code into independent libraries (e.g. DBCSR) Use more unit testing (vs. regression testing) Extended OpenMP fine grain parallelization Identify code segments for heterogenous computing (GPGPU) Parallel I/O (MPI) Fault tolerance (MPI)

CP2K Universe GPL code Open CVS development (cp2k.berlios.de) User community (Google group, 250 members) Developers UZH, PNNL, LLNL, PSI, U Bochum, U Minnesota, IBM Research EPCC Edinburgh dcse Grant: OpenMP parallelization Software Sustainability Institute Exascale Technology Centre

CP2K HP2C Team Jürg Hutter Prof. UZH Joost VandeVondele senior postdoc UZH Valery Weber senior postdoc UZH Urban Borstnik postdoc HP2C N.N. postdoc HP2C Mandes Schönherr PhD student HP2C/UZH Manuel Guidon PhD student HP2C/UZH Zelong Yi PhD student HP2C/UZH