A Survey of HPC Systems and Applications in Europe


1 A Survey of HPC Systems and Applications in Europe Dr Mark Bull, Dr Jon Hill and Dr Alan Simpson, EPCC, University of Edinburgh

2 Overview Background; survey design; survey results (HPC systems, HPC applications); selecting a benchmark suite

3 Background to the survey The PRACE project is working towards the installation of Petaflop/s-scale systems in Europe. There is a requirement for a set of benchmark applications to assess the performance of systems before and during the procurement process. These benchmark applications should be representative of HPC usage by PRACE partners. To understand current application usage, we conducted a survey of PRACE partners' current HPC systems.

4 We took the opportunity to gather other interesting data as well. We also devised a method for selecting (and weighting) a set of applications which can be considered representative of current usage: we wanted to do this in a quantifiable way and to avoid political considerations, but it was not entirely successful!

5 Survey Design We asked the PRACE centres to complete: a systems survey for their largest system, and for any other system over 10 Tflop/s Linpack; and an application survey for all applications which consumed more than 5% of the utilised cycles on each system. We collected data for 24 systems and 69 applications. The survey was conducted in April 2008; the data relate to 2007/8.

6 Systems surveyed (System | Centre | Manufacturer | Model | Architecture):
Jugene | FZJ | IBM | Blue Gene/P | MPP
MareNostrum | BSC | IBM | JS21 cluster | TNC
HLRB II | BADW-LRZ | SGI | Altix 4700 | FNC
HECToR | EPSRC | Cray | XT4 | MPP
Neolith | SNIC | HP | Cluster 3000 DL | TNC
Platine | GENCI | Bull | (model lost in transcription) | TNC
Hexagon | SIGMA | Cray | XT4 | MPP
Galera | PSNC | Supermicro | X7DBT-INF | TNC
Jubl | FZJ | IBM | Blue Gene/L | MPP
BCX | CINECA | IBM | BladeCenter Cluster LS21 | TNC
Stallo | SIGMA | HP | BL460c | TNC
Palu | ETHZ | Cray | XT3 | MPP
HPCx | EPSRC | IBM | p575 cluster | FNC
Huygens | NCF | IBM | p575 cluster | FNC
Legion | EPSRC | IBM | Blue Gene/P | MPP
hww SX-8 | USTUTT-HLRS | NEC | SX8 | VEC
Louhi | CSC | Cray | XT4 | MPP
murska.csc.fi | CSC | HP | CP4000 BL ProLiant SuperCluster | TNC
Jump | FZJ | IBM | p690 cluster | FNC
ZAHIR | GENCI | IBM | p690/p690+/p655 cluster | FNC
HERA | GENCI | IBM | p690/p575 cluster | FNC
XC5 | CINECA | HP | HS21 cluster | TNC
Milipeia | UC-LCA | SUN | x4100 cluster | TNC
TNC | PSNC | IBM, Sun | e325/v40z/x4600 cluster | TNC
(Architecture key: MPP = massively parallel processor, TNC = thin-node cluster, FNC = fat-node cluster, VEC = vector. The slide also listed Rpeak, Rmax and core count per system, with totals; those values were lost in transcription.)

7 Compute power by architecture type Fat-node Cluster 4%, Vector 11%, MPP 50%, Thin-node Cluster 35%

8 LEFs The measure of computational power and of consumed cycles we use is the Linpack Equivalent Flop (LEF). A system which has a Linpack Rmax of 50 Tflop/s is said to have a power of 50T LEFs. An application which uses 10% of the time on that system is said to consume 5T LEFs.
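To make the bookkeeping concrete, here is a minimal sketch in Python; the helper functions are illustrative assumptions, not part of the survey:

```python
# Minimal sketch of LEF accounting; function names are hypothetical.

def system_lefs(linpack_rmax_tflops: float) -> float:
    """A system's power in LEFs is simply its Linpack Rmax."""
    return linpack_rmax_tflops

def application_lefs(linpack_rmax_tflops: float, fraction_of_cycles: float) -> float:
    """An application consuming a fraction of a system's utilised cycles
    consumes that fraction of the system's LEFs."""
    return linpack_rmax_tflops * fraction_of_cycles

# The example from the slide: 10% of a 50 Tflop/s (Linpack) system is 5T LEFs.
assert abs(application_lefs(50.0, 0.10) - 5.0) < 1e-12
```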

9 Distribution of LEFs by job size [pie chart: share of LEFs by job-size band, from jobs of fewer than 32 cores up to the largest jobs; percentage labels lost in transcription]

10 Mean job size as % of machine [bar chart per system: Jubl, Jugene, XC5, Legion, Jump, Palu, hww SX-8, HERA, HPCx, HECToR, Louhi, Galera, Neolith, Stallo, BCX, HLRB II, murska.csc.fi, Platine, Milipeia, halo, Huygens, ZAHIR, MareNostrum; values lost in transcription]

11 Job size distribution by system [stacked bar chart, 0-100%: for each system, the percentage of LEFs by job-size band from <32 cores upward; systems: Jugene, Jubl, Louhi, XC5, Legion, HECToR, Palu, MareNostrum, Stallo, Platine, Neolith, HPCx, HLRB II, BCX, Huygens, ZAHIR, murska.csc.fi, hww SX-8, Jump, HERA, Galera, Milipeia, TNC; values lost in transcription]

12 Distribution of LEFs by scientific area Particle Physics 23.5%, Computational Chemistry 22.1%, Condensed Matter Physics 14.2%, CFD 8.6%, Earth & Climate 7.8%, Astronomy & Cosmology 5.8%, Life Sciences 5.3%, Computational Engineering 3.7%, Plasma Physics 3.3%, Other 5.8%

13 No. of users and Rmax per user [chart per system: number of users and Rmax per user for Milipeia, Legion, XC5, Louhi, Galera, Palu, Jump, HPCx, hww SX-8, ZAHIR, HLRB II, Jubl, HERA, murska.csc.fi, Huygens, Neolith, Jugene, HECToR, BCX, TNC; values lost in transcription]

14 Parallelisation techniques Of the 69 applications, all but two use MPI for parallelisation; the exceptions are Gaussian (OpenMP) and BLAST (sequential). Of the 67 MPI applications, six also have standalone OpenMP versions and three have standalone SHMEM versions. 13 applications have hybrid implementations: 10 MPI+OpenMP, 2 MPI+SHMEM, 1 MPI+POSIX threads. Only one application was reported as using MPI-2 single-sided communication.

15 Languages [table: number of applications using each of Fortran90, C90, Fortran77, C++, C99, Python, Perl and Mathematica; counts lost in transcription]. A number of applications mix Fortran with C/C++.

16 Distribution of LEFs by dwarves Structured grids 19.0%, Dense linear algebra 14.4%, Sparse linear algebra 3.4%, Particle methods 7.2%, Unstructured grids 2.4%, Map reduce methods 45.1%, Spectral methods 8.4%

17 Distribution of LEFs by dwarf and area [table cross-tabulating the dwarves (dense linear algebra, spectral methods, structured grids, sparse linear algebra, particle methods, unstructured grids, map reduce methods) against the scientific areas (astronomy and cosmology, computational chemistry, computational engineering, computational fluid dynamics, condensed matter physics, earth and climate science, life science, particle physics, plasma physics, other); cell values lost in transcription]

18 Choosing a benchmark suite We want to choose a set of applications to form a benchmark suite to be used in the procurement process for Petaflop/s systems. Suggested process: find the set of applications that best fits the area/dwarf table, in the sense that it minimises the norm ||Uw - v||, where v is a vector containing the linearised table entries, U is a matrix describing the area/dwarf combinations satisfied by the applications, and w is a vector of weights.
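A minimal sketch of this fit in Python, assuming SciPy is available; the matrix entries are made-up illustrations, and the non-negativity constraint on the weights is an assumption we add (negative application weights would be meaningless), not something stated on the slide:

```python
import numpy as np
from scipy.optimize import nnls  # non-negative least squares

# v: the area/dwarf table flattened into a vector (illustrative values);
# each entry is the fraction of LEFs in one (area, dwarf) cell.
v = np.array([0.25, 0.10, 0.40, 0.25])

# U: one column per candidate application; U[i, j] > 0 means application j
# exercises (area, dwarf) combination i (again, illustrative values).
U = np.array([
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
])

# Weights w >= 0 minimising ||U w - v||; the residual measures how well
# this candidate suite represents the surveyed usage.
w, residual = nnls(U, v)
print("weights:", w, "residual:", residual)
```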

19 In principle, one could search all possible lists of applications up to a certain length and find the list with the smallest residual; in practice, we do a manual search, since we want to include other criteria, such as usage of applications, geographical spread, etc. This gives a quantitative measure of how well a benchmark suite represents current usage. It also gives a weighting for the applications, which could be used to weight benchmark results.
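Under the same assumptions as above, the in-principle exhaustive search could look like the following sketch: brute force over all application subsets up to a given size, reusing the non-negative least-squares fit (the function best_suite is hypothetical):

```python
from itertools import combinations

import numpy as np
from scipy.optimize import nnls

def best_suite(U, v, max_apps):
    """Return (residual, application indices, weights) for the subset of at
    most max_apps columns of U whose weighted combination best matches v."""
    best = (np.inf, (), None)
    for size in range(1, max_apps + 1):
        for subset in combinations(range(U.shape[1]), size):
            w, residual = nnls(U[:, list(subset)], v)
            if residual < best[0]:
                best = (residual, subset, w)
    return best
```

The manual search used in practice trades this optimality for criteria that do not fit into ||Uw - v||, such as application usage and geographical spread.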

20 Problems with this approach Classification of codes into dwarves (and, to some extent, areas) is somewhat arbitrary; some applications use more than one dwarf, in which case we split the LEFs equally between the dwarves. There is a bias towards recently acquired systems (high LEFs), and recently acquired systems may have atypical usage by early users. The method also reflects past, rather than future, usage.

21 Current status We used the above process as a starting point, then swapped some applications to meet some of the concerns: 12 core applications, plus 8 additional applications. Core apps: NAMD, CPMD, VASP, QCD, GADGET, Code_Saturne, TORB, NEMO, ECHAM5, CP2K, GROMACS, N3D. Additional apps: AVBP, HELIUM, TRIPOLI_4, GPAW, ALYA, SIESTA, BSIT, PEPC.

22 We have undertaken work to port these applications to the PRACE prototype systems, and optimise them for sequential performance and scalability. We are currently collecting benchmark data from the prototype systems which have been installed so far. Based on this data, we are reviewing the list of applications to ensure that the final benchmark suite contains scalable codes and avoids licensing problems.

23 Acknowledgements The authors would like to acknowledge all those who contributed by filling in survey forms and taking part in subsequent discussions. A full report is available from:

24

25 Availability and utilisation [bar chart: availability % and utilisation % per system: Neolith, Palu, Jubl, Legion, MareNostrum, Jugene, hww SX-8, BCX, halo, Jump, Platine, Huygens, HERA, HPCx, Milipeia, HLRB II, ZAHIR, Stallo, XC5, HECToR, murska.csc.fi, Galera; values lost in transcription]

26 Top 30 applications by usage [table: LEFs used (Gflop/s) and number of systems per application; values lost in transcription]. Applications, in order of usage: overlap and wilson fermions, vasp, lqcd (twisted mass), lqcd (two flavor), namd, dalton, cpmd, gadget, dynamical fermions, spintronics, materials with strong correlations, dl_poly, casino, quantum-espresso, cactus, trio_u, smmp, tfs/piano, gromacs, pepc, tripoli4, chroma, wien2k, bam, trace, bqcd, cp2k, helium, magnum, pkdgrav-gasoline.
