Calculation of ground states of few-body nuclei using NVIDIA CUDA technology

Similar documents
APPLICATION OF CUDA TECHNOLOGY FOR CALCULATION OF GROUND STATES OF FEW-BODY NUCLEI BY FEYNMAN'S CONTINUAL INTEGRALS METHOD

Application of CUDA technology to calculation of ground states of few-body nuclei by Feynman s continual integrals method

Nucleon Transfer within Distorted Wave Born Approximation

Introduction to numerical computations on the GPU

Multiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU

Web Knowledge Base on Low Energy Nuclear Physics

Perm State University Research-Education Center Parallel and Distributed Computing

MOLECULAR DYNAMIC SIMULATION OF WATER VAPOR INTERACTION WITH VARIOUS TYPES OF PORES USING HYBRID COMPUTING STRUCTURES

Superheavy nuclei: Decay and Stability

Sub-barrier fusion enhancement due to neutron transfer

Semi-Classical Model of Neutron Rearrangement Using Quantum Coupled-Channel Approach

Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method

Theoretical basics and modern status of radioactivity studies

arxiv: v1 [hep-lat] 7 Oct 2010

PSEUDORANDOM numbers are very important in practice

Cross Sections of Gadolinium Isotopes in Neutron Transmission Simulated Experiments with Low Energy Neutrons up to 100 ev

Study of heavy ion elastic scattering within quantum optical model

Population annealing study of the frustrated Ising antiferromagnet on the stacked triangular lattice

Review of lattice EFT methods and connections to lattice QCD

Description of the fusion-fission reactions in the framework of dinuclear system conception

Observables predicted by HF theory

GPU Acceleration of BCP Procedure for SAT Algorithms

SPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics

Nuclear and Radiation Physics

Introduction to Neural Networks

Dense Arithmetic over Finite Fields with CUMODP

Welcome to MCS 572. content and organization expectations of the course. definition and classification

GPU Computing Activities in KISTI

TDHF Basic Facts. Advantages. Shortcomings

Radioactivity and energy levels

A Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters

PLATFORM DEPENDENT EFFICIENCY OF A MONTE CARLO CODE FOR TISSUE NEUTRON DOSE ASSESSMENT

arxiv:nucl-ex/ v1 16 May 2006

Cluster Structure of 9 Be from 3 He+ 9 Be Reaction

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters

Alpha Decay of Superheavy Nuclei

Shortest Lattice Vector Enumeration on Graphics Cards

Lecture 4: Nuclear Energy Generation

Thermodynamics of nuclei in thermal contact

Hydra. A library for data analysis in massively parallel platforms. A. Augusto Alves Jr and Michael D. Sokoloff

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Production of new neutron rich heavy and superheavy nuclei

PHYSICS CET-2014 MODEL QUESTIONS AND ANSWERS NUCLEAR PHYSICS

GPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic

PhysicsAndMathsTutor.com 1

A CUDA Solver for Helmholtz Equation

Subbarrier cold fusion reactions leading to superheavy elements( )

Width of a narrow resonance. t = H ˆ ψ expansion of ψ in the basis of H o. = E k

Antti-Pekka Hynninen, 5/10/2017, GTC2017, San Jose CA

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

Nuclear Structure and Reactions using Lattice Effective Field Theory

COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD

Nucleosynthesis of Transuranium Nuclides under Conditions of Mike, Par and Barbel Thermonuclear Explosions

Cosmology with Galaxy Clusters: Observations meet High-Performance-Computing

Nuclear Spectroscopy: Radioactivity and Half Life

Nuclear Physics 3 8 O+ B. always take place and the proton will be emitted with kinetic energy.

Tips Geared Towards R. Adam J. Suarez. Arpil 10, 2015

Photonuclear Reaction Cross Sections for Gallium Isotopes. Serkan Akkoyun 1, Tuncay Bayram 2

NEW COMPLEX OF MODERATORS FOR CONDENSED MATTER RESEARCH AT THE IBR-2M REACTOR *

Dissipative nuclear dynamics

Nuclear contribution into single-event upset in 3D on-board electronics at moderate energy cosmic proton impact

Feynman diagrams in nuclear physics at low and intermediate energies

arxiv: v1 [hep-lat] 10 Jul 2012

GPU-based computation of the Monte Carlo simulation of classical spin systems

Nuclear Reactions. Shape, interaction, and excitation structures of nuclei scattering expt. cf. Experiment by Rutherford (a scatt.

ISOTONIC AND ISOTOPIC DEPENDENCE OF THE RADIATIVE NEUTRON CAPTURE CROSS-SECTION ON THE NEUTRON EXCESS

Nuclear Physics and Astrophysics

Repulsive aspects of pairing correlation in nuclear fusion reaction

Stochastic Modelling of Electron Transport on different HPC architectures

FIRST STEPS IN PHYSICAL TREATING OF THE COLLINEAR CLUSTER TRI-PARTITION MECHANISM

Julian Merten. GPU Computing and Alternative Architecture

Nuclear Binding Energy

Claude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

Accelerating Model Reduction of Large Linear Systems with Graphics Processors

Research of the new Intel Xeon Phi architecture for solving a wide range of scientific problems at JINR

Core evolution for high mass stars after helium-core burning.

S0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA

Research on GPU-accelerated algorithm in 3D finite difference neutron diffusion calculation method

Nuclear Binding Energy

A Massively Parallel Eigenvalue Solver for Small Matrices on Multicore and Manycore Architectures

ACCUMULATION OF ACTIVATION PRODUCTS IN PB-BI, TANTALUM, AND TUNGSTEN TARGETS OF ADS

FUSION AND FISSION DYNAMICS OF HEAVY NUCLEAR SYSTEM

Introduction to Modern Physics Problems from previous Exams 3

Activity 12: Energy from Nuclear Reactions

Measuring freeze-out parameters on the Bielefeld GPU cluster

DOING PHYSICS WITH MATLAB QUANTUM MECHANICS THEORY OF ALPHA DECAY. Ian Cooper. School of Physics, University of Sydney.

ALPHA-DECAY AND SPONTANEOUS FISSION HALF-LIVES OF SUPER-HEAVY NUCLEI AROUND 270Hs

MONTE CARLO NEUTRON TRANSPORT SIMULATING NUCLEAR REACTIONS ONE NEUTRON AT A TIME Tony Scudiero NVIDIA

The number of protons in the nucleus is known as the atomic number Z, and determines the chemical properties of the element.

GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications

Using a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics

CRYPTOGRAPHIC COMPUTING

Quantifying Uncertainties in Nuclear Density Functional Theory

Performance Evaluation of Scientific Applications on POWER8

Lattice Boltzmann simulations on heterogeneous CPU-GPU clusters

S XMP LIBRARY INTERNALS. Niall Emmart University of Massachusetts. Follow on to S6151 XMP: An NVIDIA CUDA Accelerated Big Integer Library

Systematic Study of Survival Probability of Excited Superheavy Nucleus

MURCIA: Fast parallel solvent accessible surface area calculation on GPUs and application to drug discovery and molecular visualization

STATISTICAL MODEL OF DECAY OF EXCITED NUCLEI

Transcription:

Calculation of ground states of few-body nuclei using NVIDIA CUDA technology M. A. Naumenko 1,a, V. V. Samarin 1, 1 Flerov Laboratory of Nuclear Reactions, Joint Institute for Nuclear Research, 6 Joliot-Curie st., Moscow reg., Dubna, 14198, Russian Federation Dubna State University, 19 Universitetskaya st., Moscow reg., Dubna, 14198, Russian Federation -mail: a anaumenko@inr.ru The possibility of application of modern parallel computing solutions to speed up the calculations of ground states of few-body nuclei by Feynman's continual integrals method has been investigated. These calculations may require large computational time, particularly in the case of systems with many degrees of freedom. The results of application of general-purpose computing on graphics processing units (GPGPU) using NVIDIA CUDA technology are presented. The algorithm allowing us to perform calculations directly on GPU was developed and implemented in C++ programming language. Calculations were performed on the NVIDIA Tesla K4 accelerator installed within the heterogeneous cluster of the Laboratory of Information Technologies, Joint Institute for Nuclear Research, Dubna. The energy and the square modulus of the wave function of the ground states of several fewbody nuclei have been calculated. The results show that the use of GPGPU significantly increases the speed of calculations. Keywords: NVIDIA CUDA, Feynman s continual integrals method, few-body nuclei The work was supported by grant 15-7-7673-a of the Russian Foundation for Basic Research (RFBR). 16 Михаил Алексеевич Науменко, Вячеслав Владимирович Самарин 376

1. Introduction In this work an attempt is made to use modern parallel computing solutions to speed up the calculations of ground states of few-body nuclei by Feynman s continual integrals method [Samarin, 15; Samarin, Naumenko, 16; Naumenko, Samarin, 16]. The algorithm allowing us to perform calculations directly on GPU was developed and implemented in C++ programming language. The energy and the square modulus of the wave function of the ground states of several few-body nuclei have been calculated using NVIDIA CUDA technology [NVIDIA; Sanders J., Kandrot., 11]. The results show that the use of GPU is very effective for these calculations.. Theory and computing The energy and the square modulus of the wave function of the ground state of a system of few particles with coordinates q may be calculated by Feynman s continual integrals method using the propagator K q, ; q, in uclidian time [Shuryak, Zhirov, 1984] K q, ; q, ( q) exp ( q) exp g( ) d n n n cont. (1) Here g ( ) is the density of states with the continuous spectrum cont. For the system with a discrete spectrum and finite motion of particles the square modulus of the wave function of the ground state may be found in the limit together with the energy K q, ; q, ( q) exp,. () The theoretical approach is described in detail in [Naumenko, Samarin, 16]. The calculation of K q, ; q, for the fixed was performed by parallel calculation of exponentials F N F exp b V q k 1 for every random traectory q f ( q, k ), where N /. The same nucleon-nucleon interaction potentials k k V q were used for all the studied nuclei. k (3) The Monte Carlo algorithm for numerical calculations was developed and implemented in C++ programming language using NVIDIA CUDA technology. The integration method does not require the use of any additional integration libraries. The calculation included 3 steps: K q, ; q, was calculated in a set of multidimensional points q and the maximum of 1) K q, ; q, ) The (i.e. ) was found. q corresponding to the obtained maximum was fixed, K q, ; q, several increasing values of and the ear region of K q, ; q, was calculated for was found. 3) The time corresponding to the beginning of the obtained ear region was fixed and K q, ; q, (i.e. ) was calculated in all points of the necessary region. The principal scheme of the calculation of the ground state energy is shown in Fig. 1. The calculation of the propagator is performed using L sequential launches of the kernel. ach kernel launch simulates n random traectories in the space evolving from the uclidean time to, where 377

1, L. All traectories with N / time steps start at the same point the moment return to the same point () q in the space and in () q according to the chosen probability distribution. Fig. 1. The scheme of calculation of the ground state energy All threads in a given kernel launch finish at approximately the same time, which makes the scheme quite effective in spite of the possible delays associated with the kernel launch overhead. Besides, the typical number of kernel launches L required for the calculation of the ground state energy usually does not exceed 1. Starting from the certain time b K 1 ln the obtained values of the logarithm of the propagator tend to lie on the straight e, the slope of which gives the value of the ground state energy. The time is then used in the calculation of the square modulus of the wave function. The principal scheme of the calculation of the square modulus of the wave function is shown in Fig.. Similarly, the calculation is performed using M sequential launches of the kernel. ach kernel launch simulates n random traectories in the space evolving from the uclidean time to the time determined in the calculation of the ground state energy. All traectories start at the same point ( s) q in the space and in the moment return back to the same point ( s) q according to the chosen probability distribution. Here s 1, M, where M is the total number of points in the space in which the square modulus of the wave function must be calculated. One of the benefits of the approach is that the calculation may be easily resumed later. For example, initially the square modulus of the wave function may be calculated with a large space step to obtain the general features of the probability distribution, and later new intermediate points are calculated and combined with those calculated previously. This may be very useful because the calculation of the square modulus of the wave function is generally much more time-consuming since it requires calculation in many points in the multidimensional space. An important feature of the algorithm allowing effective use of graphic processors is low consumption of memory during the calculation because it is not necessary to prepare a grid of values and store it in the memory. To obtain normally distributed random numbers the curand random number generator was used. According to the recommendations of the curand developers, each experiment was assigned a unique seed. Within the experiment, each thread of computation was assigned a unique sequence 378

number. All threads between kernel launches were given the same seed, and the sequence numbers were assigned in a monotonically increasing way. Fig.. The scheme of calculation of the square modulus of the wave function Calculations were performed on the NVIDIA Tesla K4 accelerator installed within the heterogeneous cluster [Heterogeneous Cluster] of the Laboratory of Information Technologies, Joint Institute for Nuclear Research, Dubna. The code was compiled with NVIDIA CUDA version 7.5 for architecture version 3.5. Calculations were performed with single precision. The energy of the ground state is negative and therefore only the first term in formula (1) increases with the increase of, whereas the energies of the excited states are positive and hence the other terms in decrease with the increase of. The slope of the ear regression equals the energy of the ground state. The obtained theoretical binding energies b are listed in Tab. 1 together with the experimental values taken from the knowledge base [Zagrebaev, Denikin, Karpov, Alekseev, Naumenko, Rachkov, Samarin, Saiko]. It is clear that the theoretical values are close enough to the experimental ones, though obtaining good agreement was not the goal. The observed difference between the calculated binding energies of 3 H and 3 He is also in agreement with the experimental values. Tab. 1. Comparison of theoretical and experimental binding energies for the ground states of the studied nuclei Atomic nucleus Theoretical value, MeV xperimental value, MeV H 1.17 ± 1.5 3 H 9.9 ± 1 8.48 3 He 6.86 ± 1 7.718 4 He 6.95 ± 1 8.96 The code implementing Feynman's continual integrals method was initially written for CPU. The comparison of the calculation time of the ground state energy for 3 He using Intel Core i5 347 (double precision) and NVIDIA Tesla K4 (single precision) with different statistics is shown in Tab.. ven taking into account that the code for CPU used only one thread, double precision and a different random number generator, the time difference is impressive. This fact allows us to increase the statistics and the accuracy of calculations in the case of using NVIDIA CUDA technology. The comparison of the calculation time of the square modulus of the wave function for the ground state of 3 He using Intel Core i5 347 and NVIDIA Tesla K4 with the statistics 1 6 for every point in the space and the total number of points 43 is shown in Tab. 3. The value ~ 177 days for 379

CPU is an estimation based on the performance gain in the calculation of the ground state energy. It is evident that beside the performance gain the use of NVIDIA CUDA technology in certain cases may enable calculations impossible before. Tab.. Comparison of the calculation time of the ground state energy for 3 He nucleus Statistics, n Intel Core i5 347 (1 thread, double precision), sec NVIDIA Tesla K4, (single precision), sec Performance gain, times 1 5 ~ 1854 ~ 8 ~ 3 1 6 ~ 18377 ~ 47 ~ 391 5 1 6 ~ 1 1 7 ~ 439 Tab. 3. Comparison of the calculation time of the square modulus of the wave function for the ground state of 3 He nucleus 3. Conclusion Statistics, Intel Core i5 347 NVIDIA Tesla K4, n (1 thread, double precision, estimation) (single precision) 1 6 ~ 177 days ~ 11 hours In this work an attempt is made to use modern parallel computing solutions to speed up the calculations of ground states of few-body nuclei by Feynman s continual integrals method. The algorithm allowing us to perform calculations directly on GPU was developed and implemented in C++ programming language. The method was applied to the nuclei consisting of nucleons, but it may also be applied to the calculation of cluster nuclei. The results show that the use of GPGPU significantly increases the speed of calculations. This allows us to increase the statistics and the accuracy of calculations as well as reduce the space step in calculations of wave functions. It also greatly simplifies the process of debugging and testing. In certain cases, the use of NVIDIA CUDA enables calculations impossible before. References Heterogeneous Cluster of LIT, JINR [lectronic resource]: http://hybrilit.inr.ru/. Naumenko M. A., Samarin V. V. Application of CUDA technology to calculation of ground states of few-body nuclei by Feynman s continual integrals method // Supercomputing frontiers and innovations. 16. Vol. 3, No.. P. 8 95. NVIDIA CUDA [lectronic resource]: http://developer.nvidia.com/cuda-zone/. Samarin V. V. Quantum Description of Coupg to Neutron-Rearrangement Channels in Fusion Reactions near the Coulomb Barrier // Phys. At. Nucl. 15. Vol. 78, No. 7 P. 861 87. Samarin V. V., Naumenko M. A. Study of Ground States of 3,4,6 He Nuclides by Feynman s Continual Integrals Method // Bull. Russ. Acad. Sci. Phys. 16. Vol. 8, No. 3 P. 83 89. Sanders J., Kandrot. CUDA by xample: An Introduction to General-Purpose GPU Programming. New York: Addison-Wesley, 11. Shuryak. V., Zhirov O. V. Testing Monte Carlo Methods for Path Integrals in Some Quantum Mechanical Problems // Nucl. Phys. B. 1984. Vol. 4. P. 393 46. Zagrebaev V. I., Denikin A. S., Karpov A. V., Alekseev A. P., Naumenko M. A., Rachkov V. A., Samarin V. V., Saiko V. V. NRV web knowledge base on low-energy nuclear physics [lectronic resource]: http://nrv.inr.ru/ 38