Research of the new Intel Xeon Phi architecture for solving a wide range of scientific problems at JINR

1 Research of the new Intel Xeon Phi architecture for solving a wide range of scientific problems at JINR
Podgainy D.V., Streltsova O.I., Zuev M.I., on behalf of the Heterogeneous Computations team, HybriLIT
LIT, Joint Institute for Nuclear Research

2 The main objective
Investigation of the new Intel Xeon Phi 7200 (KNL) architecture for solving scientific problems at JINR (Dubna, Moscow region) using parallel programming technologies.
The research includes calculations using application programs:
- Calculation of the parameters of long Josephson junctions;
- Bayesian analysis of realistic models of the equations of state of super dense nuclear matter;
- Solving 2D Sine-Gordon equations on Ivy Bridge and KNL processors.

3 Intel Xeon Phi systems
Heterogeneous cluster HybriLIT:
- Intel Xeon Phi 7250 (1.4 GHz): 16 GB, 68 cores, 4 threads per core, Intel Omni-Path
- Intel Xeon Phi 7290 (1.5 GHz): 16 GB, 72 cores, 4 threads per core, Intel Omni-Path
- 2x Intel Xeon E5-2695v3 (2.3 GHz): RAM 128 GB, 14 cores, 2 threads per core, Ethernet 1 Gbit/s

4 Long Josephson Junctions
Atanasova P.Kh., Bashashin M.V., Rahmonov I.R., Shukrinov Yu.M., Streltsova O.I., Volokhova A.V., Zemlyanaya E.V., Zuev M.I. (JINR, Plovdiv University (Bulgaria))
We consider a model that takes into account the inductive and capacitive coupling between the Josephson junctions (JJs). The long Josephson junction (LJJ) system consists of superconducting layers (S) with intermediate dielectric (insulator, I) layers of length $L$. The phase dynamics of the LJJ system with capacitive and inductive coupling between the contacts is described by a system of equations with respect to the phase difference $\varphi_l$ and the voltage $V_l$ at each $l$-th contact:

$$\frac{\partial \varphi_l}{\partial t} = D_C V_l + s_c V_{l+1} + s_c V_{l-1},$$

$$\frac{\partial V_l}{\partial t} = \sum_{n=1}^{N} \mathbf{L}^{-1}_{l,n} \frac{\partial^2 \varphi_n}{\partial x^2} - \sin\varphi_l - \beta \frac{\partial \varphi_l}{\partial t} + I,$$

where $\mathbf{L}$ is the inductive coupling matrix:

$$\mathbf{L} = \begin{pmatrix} 1 & S & 0 & \cdots & S \\ \vdots & & \vdots & & \vdots \\ \cdots & S & 1 & S & \cdots \\ \vdots & & \vdots & & \vdots \\ S & \cdots & 0 & S & 1 \end{pmatrix}$$

Here $\beta$ is the dissipation parameter; the inductive coupling parameter $S$ takes a value in the range $0 < S < 0.5$; $D_C$ is the effective electrical thickness of the JJ, normalized to the thickness of the dielectric layer; $s_c$ is the capacitive coupling parameter; $V_l$ is the voltage at the $l$-th JJ; $I$ is the external current. All quantities are converted to dimensionless form.
Boundary conditions:

$$\frac{\partial \varphi_l}{\partial x}(0, t) = \frac{\partial \varphi_l}{\partial x}(L, t) = H_e$$

5 Long Josephson junctions. The numerical approach
Numerical solution of the system (1-3). We introduce a uniform discrete grid in the space coordinate $x$ with step $\Delta x$ and in the time coordinate $t$ with step $\Delta t$. To approximate the derivatives with respect to the spatial coordinate, we use the standard three-point finite-difference formulas. The resulting system of ordinary differential equations for the values of the phase differences and voltages at the discrete grid points is solved numerically using the four-stage Runge-Kutta (RK) procedure.
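The approach described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`coupling_matrix`, `rhs`, `rk4_step`) are hypothetical, the coupling between neighbouring junctions is assumed periodic in the stacking index, the boundary condition is assumed to fix the spatial derivative of the phase at both ends to `h_e` and is imposed with a ghost-point formula, and the classical four-stage Runge-Kutta scheme is assumed.

```python
import numpy as np


def coupling_matrix(n, s):
    """Inductive coupling matrix: 1 on the diagonal, s on the neighbouring
    entries, with periodic wrap-around (assumption). Requires n >= 3."""
    m = np.eye(n)
    for l in range(n):
        m[l, (l + 1) % n] = s
        m[l, (l - 1) % n] = s
    return m


def rhs(y, l_inv, dx, d_c, s_c, beta, current, h_e):
    """Right-hand side of the semi-discrete system.

    y has shape (2, N, M): y[0] is the phase difference, y[1] the voltage,
    for N junctions on M spatial grid points.
    """
    phi, v = y
    # d(phi_l)/dt = D_C V_l + s_c V_{l+1} + s_c V_{l-1} (periodic in l, assumption)
    dphi = d_c * v + s_c * (np.roll(v, -1, axis=0) + np.roll(v, 1, axis=0))

    # Three-point second derivative in x for the interior nodes.
    phi_xx = np.empty_like(phi)
    phi_xx[:, 1:-1] = (phi[:, 2:] - 2.0 * phi[:, 1:-1] + phi[:, :-2]) / dx**2
    # Ghost-point treatment of d(phi)/dx = h_e at both ends (assumption).
    phi_xx[:, 0] = 2.0 * (phi[:, 1] - phi[:, 0] - dx * h_e) / dx**2
    phi_xx[:, -1] = 2.0 * (phi[:, -2] - phi[:, -1] + dx * h_e) / dx**2

    # d(V_l)/dt = sum_n Linv[l, n] phi_xx[n] - sin(phi_l) - beta d(phi_l)/dt + I
    dv = l_inv @ phi_xx - np.sin(phi) - beta * dphi + current
    return np.stack([dphi, dv])


def rk4_step(y, dt, *args):
    """One classical four-stage Runge-Kutta step."""
    k1 = rhs(y, *args)
    k2 = rhs(y + 0.5 * dt * k1, *args)
    k3 = rhs(y + 0.5 * dt * k2, *args)
    k4 = rhs(y + dt * k3, *args)
    return y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

A call such as `rk4_step(y, dt, np.linalg.inv(coupling_matrix(10, 0.05)), dx, 1.1, -0.05, 0.2, 0.5, 0.0)` advances the state by one time step; the parameter values here are illustrative only and are not taken from the slides.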

6 Long Josephson junctions. The numerical approach
Calculation of the current-voltage characteristic (CVC). We calculate the dependence of $\langle V \rangle$ on $I$. At each time step, we calculate the integral

$$V_l(t) = \frac{1}{L} \int_0^L V_l(x, t)\, dx \tag{4}$$

using Simpson's formula. Next, the integral

$$\langle V_l \rangle = \frac{1}{T_{\max} - T_{\min}} \int_{T_{\min}}^{T_{\max}} V_l(t)\, dt \tag{5}$$

is calculated by means of the rectangle formula. Finally, the voltage is averaged over the number of contacts:

$$\langle V \rangle = \frac{1}{N} \sum_{l=1}^{N} \langle V_l \rangle \tag{6}$$
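The three averaging steps can be sketched as follows. This is an illustrative NumPy version, not the authors' implementation: the names `simpson` and `average_voltage` and the array layout (time samples, junctions, grid nodes) are assumptions, and the rectangle rule is taken over equally spaced time samples.

```python
import numpy as np


def simpson(y, dx):
    """Composite Simpson rule along the last axis (odd number of nodes required)."""
    n = y.shape[-1]
    if n < 3 or n % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of nodes >= 3")
    return dx / 3.0 * (
        y[..., 0]
        + y[..., -1]
        + 4.0 * y[..., 1:-1:2].sum(axis=-1)
        + 2.0 * y[..., 2:-1:2].sum(axis=-1)
    )


def average_voltage(v, dx, dt):
    """Average voltage from samples v of shape (K, N, M).

    K time samples spaced dt apart, N junctions, M spatial nodes spaced dx apart.
    """
    length = dx * (v.shape[-1] - 1)
    # Spatial average of each junction at each time sample (Simpson's formula).
    v_l_t = simpson(v, dx) / length            # shape (K, N)
    # Time average with the rectangle formula over an interval of length K * dt.
    t_span = dt * v.shape[0]
    v_l = v_l_t.sum(axis=0) * dt / t_span      # shape (N,)
    # Average over the junctions.
    return v_l.mean()
```

Repeating this for a sequence of values of the external current $I$ gives one point of the CVC per value.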

7 Long Josephson junctions. Results
Number of contacts = 10, number of nodes = 1000.
[Bar chart: computation time, sec (less is better), for Intel Xeon E v3 (28 cores), 2x Intel Xeon E v3 (56 cores), and Intel Xeon Phi 7290 (72 cores); compiler options compared: default, -O3, and -O3 -xMIC-AVX512.]

8 Bayesian analysis of realistic models of the equations of state of super dense nuclear matter Ayriyan A., Blaschke D., Grigorian H., Poghosyan G. (JINR, KIT (Germany))

9 Bayesian analysis. Software package for BA of EoS models
1. Construction of hybrid models of EoS
(File Interface)
2. Calculation of neutron star structure (solving of the Tolman-Oppenheimer-Volkoff equations)
(File Interface)
3. Extracting the stable configuration
(File Interface)
4. Determination of the a priori probabilities
5. Calculation of conditional probabilities of the observations
6. Calculation of a posteriori probabilities of the models
7. Creation of the demonstration materials (graphs).
Annotations on the slide: the fourth-order Runge-Kutta method with adaptive step; MPI realization.
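Steps 4 to 6 of the pipeline amount to an application of Bayes' theorem over a finite set of models. The sketch below is a generic illustration only: the function name `posterior`, the assumption that the observations are independent (so that their conditional probabilities multiply), and the numbers in the usage note are not taken from the slides.

```python
import numpy as np


def posterior(prior, likelihoods):
    """A posteriori probabilities of a finite set of models.

    prior:       shape (n_models,), a priori probabilities of the models.
    likelihoods: shape (n_models, n_observations), conditional probability of
                 each observation given each model; the observations are
                 assumed independent, so the probabilities are multiplied.
    """
    prior = np.asarray(prior, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)
    joint = prior * likelihoods.prod(axis=1)
    evidence = joint.sum()
    if evidence <= 0.0:
        raise ValueError("all models have zero probability")
    return joint / evidence
```

For example, two equally probable models with a single observation of conditional probability 0.1 and 0.3 respectively give a posteriori probabilities 0.25 and 0.75.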

10 Laboratory of Information Technologies (JINR) Bayesian analysis. Get the a posteriori probabilities Alvarez, Ayriyan, Benic, Blaschke, Grigorian, Typel. Eur. Phys. J. A (2016) 52, 69

11 Bayesian analysis. Results of MPI realisation of TOV solver
- 4 x Intel Xeon Phi 7250, AVX-512, Intel Omni-Path (100 Gb/s)
- 4 x Intel Xeon E5-2695v3, AVX2

12 Solving 2D Sine-Gordon equations on Ivy Bridge and KNL processors Hristov I., Hristova R. (LIT, JINR)

13 Solving 2D Sine-Gordon equations on Ivy Bridge and KNL processors. Numerical Scheme
We follow the approach to constructing the numerical scheme from [1]. We solve the problem numerically in a rectangular domain, using second-order central finite differences for all derivatives. As a result, at every mesh point of the domain we have to solve a system with the tridiagonal matrix S. Because of the specific tridiagonal structure of S, we need only 9*Neq - 12 floating point operations to solve one system, where Neq is the number of equations.
[1] Kazacha G.S., Serdyukova S.I. Numerical Investigation of the Behaviour of Solutions of the Sine-Gordon Equation with a Singularity for Large t, Zhurnal Vychislitelnoi Matematiki i Matematicheskoi Fiziki, 1993, Vol. 33, Issue 3, P
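For orientation, a tridiagonal system can be solved in a number of operations linear in Neq with the standard Thomas algorithm, sketched below in Python. This is the generic textbook algorithm without pivoting, not the specialised solver that exploits the particular structure of S mentioned above; the function name `thomas` and the argument layout are assumptions.

```python
import numpy as np


def thomas(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm (no pivoting).

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    All arrays have length n.
    """
    n = len(d)
    cp = np.empty(n)
    dp = np.empty(n)
    # Forward elimination.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Back substitution.
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

The sketch is safe for diagonally dominant matrices; a solver tailored to a matrix with known constant entries can precompute parts of the elimination, which is one way an operation count such as the one quoted above may be reached.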

14 Solving 2D Sine-Gordon equations on Ivy Bridge and KNL processors. Parallelization strategy (Thread and SIMD levels of parallelism)
1. To achieve good data locality, we take into account the column-major order of multidimensional arrays in Fortran.
2. We use two levels of parallelism, thread level and SIMD level (automatic vectorization), in our pure OpenMP implementation.
3. The smallest piece of work is the calculation of the right-hand sides of 8 consecutive linear systems and solving them. Both the calculation of the right-hand sides and the solving of 8 linear systems at once are vectorized. These pieces of work are distributed between OpenMP threads.
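The idea of solving several tridiagonal systems at once can be illustrated in NumPy, where each array operation acts on the whole batch in the way a SIMD instruction would. This is a sketch under assumptions, not the authors' Fortran/OpenMP code: the function name `solve_batch` is hypothetical, the arrays are stored in column-major (Fortran) order with the batch index first so that the 8 values touched in each step are contiguous in memory, and the distribution of batches between threads is not reproduced.

```python
import numpy as np


def solve_batch(a, b, c, d):
    """Solve `batch` independent tridiagonal systems simultaneously.

    All arrays have shape (batch, n): a is the sub-diagonal (a[:, 0] unused),
    b the main diagonal, c the super-diagonal (c[:, -1] unused), d the
    right-hand sides. Every statement below updates all systems at once.
    """
    batch, n = d.shape
    # Column-major storage: the `batch` entries of one column are contiguous.
    cp = np.empty((batch, n), order="F")
    dp = np.empty((batch, n), order="F")
    x = np.empty((batch, n), order="F")
    # Forward elimination across the batch.
    cp[:, 0] = c[:, 0] / b[:, 0]
    dp[:, 0] = d[:, 0] / b[:, 0]
    for i in range(1, n):
        denom = b[:, i] - a[:, i] * cp[:, i - 1]
        cp[:, i] = c[:, i] / denom
        dp[:, i] = (d[:, i] - a[:, i] * dp[:, i - 1]) / denom
    # Back substitution across the batch.
    x[:, -1] = dp[:, -1]
    for i in range(n - 2, -1, -1):
        x[:, i] = dp[:, i] - cp[:, i] * x[:, i + 1]
    return x
```

With a batch of 8 double-precision values, one column occupies 64 bytes, which matches the width of a 512-bit vector register; whether this is the reason for the choice of 8 in the original implementation is an assumption.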

15 Solving 2D Sine-Gordon equations on Ivy Bridge and KNL processors. Performance results

16 Solving 2D Sine-Gordon equations on Ivy Bridge and KNL processors. Results of computations

17 Conclusions Long Josephson junctions Bayesian analysis 2D Sine-Gordon equations

18 Acknowledgements

19 Thank you for your attention! Any questions?


More information

phys4.20 Page 1 - the ac Josephson effect relates the voltage V across a Junction to the temporal change of the phase difference

phys4.20 Page 1 - the ac Josephson effect relates the voltage V across a Junction to the temporal change of the phase difference Josephson Effect - the Josephson effect describes tunneling of Cooper pairs through a barrier - a Josephson junction is a contact between two superconductors separated from each other by a thin (< 2 nm)

More information

Advanced Computing Systems for Scientific Research

Advanced Computing Systems for Scientific Research Undergraduate Review Volume 10 Article 13 2014 Advanced Computing Systems for Scientific Research Jared Buckley Jason Covert Talia Martin Recommended Citation Buckley, Jared; Covert, Jason; and Martin,

More information

Laplace Transform Problems

Laplace Transform Problems AP Calculus BC Name: Laplace Transformation Day 3 2 January 206 Laplace Transform Problems Example problems using the Laplace Transform.. Solve the differential equation y! y = e t, with the initial value

More information

Performance of WRF using UPC

Performance of WRF using UPC Performance of WRF using UPC Hee-Sik Kim and Jong-Gwan Do * Cray Korea ABSTRACT: The Weather Research and Forecasting (WRF) model is a next-generation mesoscale numerical weather prediction system. We

More information

Digital Integrated Circuits A Design Perspective

Digital Integrated Circuits A Design Perspective Semiconductor Memories Adapted from Chapter 12 of Digital Integrated Circuits A Design Perspective Jan M. Rabaey et al. Copyright 2003 Prentice Hall/Pearson Outline Memory Classification Memory Architectures

More information

Hybrid modeling of plasmas

Hybrid modeling of plasmas Hybrid modeling of plasmas Mats Holmström Swedish Institute of Space Physics Kiruna, Sweden ENUMATH 2009 Uppsala, June 30 matsh@irf.se www.irf.se/~matsh/ Outline Background and motivation Space plasma

More information

The Role of Annihilation in a Wigner Monte Carlo Approach

The Role of Annihilation in a Wigner Monte Carlo Approach The Role of Annihilation in a Wigner Monte Carlo Approach Jean Michel Sellier 1, Mihail Nedjalkov 2, Ivan Dimov 1(B), and Siegfried Selberherr 2 1 Institute for Parallel Processing, Bulgarian Academy of

More information

Introduction to Benchmark Test for Multi-scale Computational Materials Software

Introduction to Benchmark Test for Multi-scale Computational Materials Software Introduction to Benchmark Test for Multi-scale Computational Materials Software Shun Xu*, Jian Zhang, Zhong Jin xushun@sccas.cn Computer Network Information Center Chinese Academy of Sciences (IPCC member)

More information

Computational Methods in Plasma Physics

Computational Methods in Plasma Physics Computational Methods in Plasma Physics Richard Fitzpatrick Institute for Fusion Studies University of Texas at Austin Purpose of Talk Describe use of numerical methods to solve simple problem in plasma

More information

AIMS Exercise Set # 1

AIMS Exercise Set # 1 AIMS Exercise Set #. Determine the form of the single precision floating point arithmetic used in the computers at AIMS. What is the largest number that can be accurately represented? What is the smallest

More information

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009 Parallel Preconditioning of Linear Systems based on ILUPACK for Multithreaded Architectures J.I. Aliaga M. Bollhöfer 2 A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ.

More information

Parallel Multivariate SpatioTemporal Clustering of. Large Ecological Datasets on Hybrid Supercomputers

Parallel Multivariate SpatioTemporal Clustering of. Large Ecological Datasets on Hybrid Supercomputers Parallel Multivariate SpatioTemporal Clustering of Large Ecological Datasets on Hybrid Supercomputers Sarat Sreepathi1, Jitendra Kumar1, Richard T. Mills2, Forrest M. Hoffman1, Vamsi Sripathi3, William

More information

Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry

Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry and Eugene DePrince Argonne National Laboratory (LCF and CNM) (Eugene moved to Georgia Tech last week)

More information

Distributed computing of simultaneous Diophantine approximation problems

Distributed computing of simultaneous Diophantine approximation problems Stud. Univ. Babeş-Bolyai Math. 59(2014), No. 4, 557 566 Distributed computing of simultaneous Diophantine approximation problems Norbert Tihanyi, Attila Kovács and Ádám Szűcs Abstract. In this paper we

More information

Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method

Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method Ilya B. Labutin A.A. Trofimuk Institute of Petroleum Geology and Geophysics SB RAS, 3, acad. Koptyug Ave., Novosibirsk

More information

Transient Finite Element Analysis of a Spice-Coupled Transformer with COMSOL-Multiphysics

Transient Finite Element Analysis of a Spice-Coupled Transformer with COMSOL-Multiphysics Presented at the COMSOL Conference 2010 Paris Thomas Bödrich 1, Holger Neubert 1, Rolf Disselnkötter 2 Transient Finite Element Analysis of a Spice-Coupled Transformer with COMSOL-Multiphysics 2010-11-17

More information

Semiconductor memories

Semiconductor memories Semiconductor memories Semiconductor Memories Data in Write Memory cell Read Data out Some design issues : How many cells? Function? Power consuption? Access type? How fast are read/write operations? Semiconductor

More information

Conquest order N ab initio Electronic Structure simulation code for quantum mechanical modelling in large scale

Conquest order N ab initio Electronic Structure simulation code for quantum mechanical modelling in large scale Fortran Expo: 15 Jun 2012 Conquest order N ab initio Electronic Structure simulation code for quantum mechanical modelling in large scale Lianheng Tong Overview Overview of Conquest project Brief Introduction

More information

EM Simulations using the PEEC Method - Case Studies in Power Electronics

EM Simulations using the PEEC Method - Case Studies in Power Electronics EM Simulations using the PEEC Method - Case Studies in Power Electronics Andreas Müsing Swiss Federal Institute of Technology (ETH) Zürich Power Electronic Systems www.pes.ee.ethz.ch 1 Outline Motivation:

More information

Medical Physics & Science Applications

Medical Physics & Science Applications Power Conversion & Electromechanical Devices Medical Physics & Science Applications Transportation Power Systems 1-5: Introduction to the Finite Element Method Introduction Finite Element Method is used

More information

REQUEST FOR A SPECIAL PROJECT

REQUEST FOR A SPECIAL PROJECT REQUEST FOR A SPECIAL PROJECT 2015 2017 MEMBER STATE: Germany Principal Investigator: Dr. Martin Losch Affiliation: Address: Am Handelshafen 12 D-27570 Bremerhaven Germany E-mail: Other researchers: Project

More information

Exponential integrators for semilinear parabolic problems

Exponential integrators for semilinear parabolic problems Exponential integrators for semilinear parabolic problems Marlis Hochbruck Heinrich-Heine University Düsseldorf Germany Innsbruck, October 2004 p. Outline Exponential integrators general class of methods

More information

Simple ODE Solvers - Derivation

Simple ODE Solvers - Derivation Simple ODE Solvers - Derivation These notes provide derivations of some simple algorithms for generating, numerically, approximate solutions to the initial value problem y (t =f ( t, y(t y(t 0 =y 0 Here

More information

Electrical Eng. fundamental Lecture 1

Electrical Eng. fundamental Lecture 1 Electrical Eng. fundamental Lecture 1 Contact details: h-elhelw@staffs.ac.uk Introduction Electrical systems pervade our lives; they are found in home, school, workplaces, factories,

More information

Parallel sparse direct solvers for Poisson s equation in streamer discharges

Parallel sparse direct solvers for Poisson s equation in streamer discharges Parallel sparse direct solvers for Poisson s equation in streamer discharges Margreet Nool, Menno Genseberger 2 and Ute Ebert,3 Centrum Wiskunde & Informatica (CWI), P.O.Box 9479, 9 GB Amsterdam, The Netherlands

More information

Electro-Thermal Modelling of High Power Light Emitting Diodes Based on Experimental Device Characterisation

Electro-Thermal Modelling of High Power Light Emitting Diodes Based on Experimental Device Characterisation Presented at the COMSOL Conference 2008 Boston Electro-Thermal Modelling of High Power Light Emitting Diodes Based on Experimental Device Characterisation Toni López Comsol Conference, Boston 2008 Outline

More information

Modeling and Experimentation: Compound Pendulum

Modeling and Experimentation: Compound Pendulum Modeling and Experimentation: Compound Pendulum Prof. R.G. Longoria Department of Mechanical Engineering The University of Texas at Austin Fall 2014 Overview This lab focuses on developing a mathematical

More information

An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints

An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints An Active Set Strategy for Solving Optimization Problems with up to 200,000,000 Nonlinear Constraints Klaus Schittkowski Department of Computer Science, University of Bayreuth 95440 Bayreuth, Germany e-mail:

More information

Two applications of macros in PSTricks

Two applications of macros in PSTricks Two applications of macros in PSTricks Le Phuong Quan (Cantho University, Vietnam) April 5, 00 Contents. Drawing approximations to the area under a graph by rectangles.. Description...........................................

More information