Graphics Card Computing for Materials Modelling
1 Graphics Card Computing for Materials Modelling
Case study: Analytic Bond Order Potentials
B. Seiser, T. Hammerschmidt, R. Drautz, D. Pettifor
Funded by EPSRC within the collaborative multi-scale project "Alloys By Design: Nickel-base superalloys"
2 Alloys by Design
Materials for gas turbine blades
Challenge: the alloy has to be
- creep resistant
- stable
- coatable
- castable
Failure mechanisms to control:
- dislocation creep
- precipitation of detrimental phases
- reaction with coatings
- freckling instabilities
[Figure: turbine materials (titanium, steel, nickel, aluminium) and micrographs at scales of 0.5 μm, 2.5 μm, 25 μm and 25 cm]
Ni-based superalloys: Cr, Co, Mo, W, Al, Ti, Ta, Re, Ru, Hf, C, B (<10 wt%)
Alloy design is still empirical rather than theoretical: expensive, time-consuming, non-optimized alloys
Need multi-scale modelling for alloy design
3 Materials Modelling with GPUs
Hierarchy in Materials Modelling
Molecular dynamics GPU codes:
- AceMD (the biomolecular MD package used by GPUGRID)
- Ascalaph (molecular modelling suite)
- HOOMD (Highly Optimized Object Oriented Molecular Dynamics)
- VMD & NAMD (Visual Molecular Dynamics)
Density functional theory codes:
- TeraChem (GTO, J. Chem. Theory Comput., 2008, 4 (2), pp ). Single precision: x speed up
- BigDFT (WL, see Journal of Chemical Physics 131, , 2009)
Dwarfs are essential for most electronic structure calculation methods
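4 Tight-binding method
Total energy: $E = E_\mathrm{rep} + E_\mathrm{bond}$
Repulsive energy $E_\mathrm{rep}$: summation of pair-wise interactions
Bond energy: $E_\mathrm{bond} = \int^{E_F} n(E)\, E\, dE$, with $n(E)$ the density of states
Bond integral: $H_{ij} = \langle i | H | j \rangle = R^{T}\, \mathrm{diag}\big(pp\sigma(r_{ij}),\ pp\pi(r_{ij}),\ pp\pi(r_{ij})\big)\, R$
Hamiltonian for four atoms i, j, k, l (zero blocks where two atoms are not bonded):
$H = \begin{pmatrix} H_{ii} & H_{ij} & H_{ik} & 0 \\ H_{ji} & H_{jj} & 0 & H_{jl} \\ H_{ki} & 0 & H_{kk} & H_{kl} \\ 0 & H_{lj} & H_{lk} & H_{ll} \end{pmatrix}$
Eigenvalue problem for the periodic crystal: $Hv = Ev$
Matrix dimension depends on the number of orbitals
Solvers: LAPACK, ScaLAPACK, Jacket
[Figure: density of states n(E) versus energy E, filled up to the Fermi level E_F]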
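The bond integral block on this slide can be sketched in a few lines. The sketch below assumes p orbitals only and uses the closed form $H_{ab} = l_a l_b\, pp\sigma + (\delta_{ab} - l_a l_b)\, pp\pi$ with direction cosines $l_a$, which is equivalent to rotating $\mathrm{diag}(pp\sigma, pp\pi, pp\pi)$ onto the bond direction. The function name and the numerical values of ppσ and ppπ are illustrative assumptions, not taken from BOPfox.

```python
import math

def pp_bond_block(r_vec, pp_sigma, pp_pi):
    """Return the 3x3 p-p bond-integral block for a bond along r_vec.

    Closed form: H_ab = l_a*l_b*pp_sigma + (delta_ab - l_a*l_b)*pp_pi,
    where l is the unit vector along the bond (direction cosines).
    """
    length = math.sqrt(sum(c * c for c in r_vec))
    l = [c / length for c in r_vec]
    return [[l[a] * l[b] * pp_sigma + ((a == b) - l[a] * l[b]) * pp_pi
             for b in range(3)]
            for a in range(3)]

# Illustrative values only (assumed, not from the slides).
PP_SIGMA, PP_PI = 1.5, -0.5

# Bond along x: the block is diagonal, diag(pp_sigma, pp_pi, pp_pi).
block_x = pp_bond_block((2.0, 0.0, 0.0), PP_SIGMA, PP_PI)

# Bond along the cube diagonal: the block is full, but its trace is unchanged.
block_diag = pp_bond_block((1.0, 1.0, 1.0), PP_SIGMA, PP_PI)
trace_diag = sum(block_diag[a][a] for a in range(3))
```

The trace stays at ppσ + 2 ppπ for every bond direction, which is a quick sanity check on any rotation code of this kind.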
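5 Analytic Bond Order potentials
Moments of density of states: $\mu^{(n)} = \int E^{n}\, n(E)\, dE$
- $\mu^{(0)}$ = 1
- $\mu^{(1)}$ = centre of gravity
- $\mu^{(2)}$ = RMS width
- $\mu^{(3)}$ = skewness
- $\mu^{(4)}$ = bimodality
Moment theorem, Cyrot-Lackmann (1967): $\mu_i^{(n)} = \langle i | H^{n} | i \rangle = \sum_{j,k,\ldots,l} H_{ij} H_{jk} \cdots H_{li}$
Each factor is a bond integral; each product is an interference path between atoms
Bond order potential (BOP) bond energy, Drautz and Pettifor (2006): [equation not recoverable from the transcription], written in terms of the n-th moments and the Fermi energy $E_F$
[Figure: panels for n = 3, n = 4 and n = 5]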
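As a minimal illustration of the moment theorem, the moments of the local density of states on one atom can be read off the diagonal of powers of the Hamiltonian. The sketch assumes a toy model that is not part of the slides: a ring of atoms with one orbital each, zero on-site energy and a nearest-neighbour bond integral of -1. All function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def ring_hamiltonian(n_atoms, hop=-1.0):
    """One-orbital ring: each atom is bonded to its two neighbours."""
    h = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        j = (i + 1) % n_atoms
        h[i][j] = h[j][i] = hop
    return h

def moments(h, atom, n_max):
    """Return [mu_0, ..., mu_n_max] with mu_n = (H^n)[atom][atom]."""
    n = len(h)
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # H^0
    out = []
    for _ in range(n_max + 1):
        out.append(power[atom][atom])
        power = matmul(power, h)
    return out

mu = moments(ring_hamiltonian(8), atom=0, n_max=4)
print(mu)  # [1.0, 0.0, 2.0, 0.0, 6.0]
```

On the ring the second moment counts the two neighbours and the fourth moment counts the six closed paths of length 4, so the numbers can be checked by hand.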
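6 BOPfox
BOPfox tool (Fortran 90): Tight-binding, EAM, BOP -> Molecular dynamics, kMC
Benchmark for fcc with 864 W atoms, 12 moments
Profiled steps, reported as time [s] and share [%] (values not preserved in the transcription):
- initialization
- neighbour lists
- bond matrix
- evaluate moments
- evaluate ainf, binf
- forces
- EAM
- Fermi level search
- self-consistency
- total
A share of the run time (value not preserved) goes into matrix multiplications; the rest is spent on path finding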
7 Interference paths
Calculation of interference paths: length n = 2
$(H^{2})_{li} = H_{lj} H_{ji} + H_{lk} H_{ki}$
- 2nd moment of atom i = sum of paths (n = 2) that start and end on atom i
- 4th moment of atom i = sum of paths (n = 4) that start and end on atom i
$T_{ii} = \sum_{l \in EP} (H^{2})_{li} (H^{2})_{li}$, with EP the set of end points
[Figure: atoms i, j, k, l and the paths connecting them]
8 Interference paths
Calculation of interference paths: length = 3
$(H^{3})_{li} = (H^{2})_{lj} H_{ji} + (H^{2})_{lk} H_{ki} + \ldots$
[Figure: paths starting on atom i and passing through atoms j and k]
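The path picture on slides 7 and 8 can be checked on a small cluster: the sum over closed paths of length 4 on atom i equals the sum, over the set of end points, of the squared length-2 path sums. The sketch below assumes one orbital per atom and symmetric bond integrals; the cluster, its bond integrals and every function name are made up for illustration.

```python
def build_cluster(edges):
    """Turn {(a, b): bond_integral} into neighbour lists and a hop table."""
    neigh, hop = {}, {}
    for (a, b), h in edges.items():
        neigh.setdefault(a, []).append(b)
        neigh.setdefault(b, []).append(a)
        hop[(a, b)] = hop[(b, a)] = h
    return neigh, hop

def closed_path_sum(neigh, hop, i, n):
    """Sum over all paths of length n that start and end on atom i."""
    def walk(site, steps_left):
        if steps_left == 0:
            return 1.0 if site == i else 0.0
        return sum(hop[(site, nxt)] * walk(nxt, steps_left - 1)
                   for nxt in neigh[site])
    return walk(i, n)

def half_paths(neigh, hop, i, m):
    """Map each end point l to the sum over paths of length m from i to l."""
    paths = {i: 1.0}
    for _ in range(m):
        longer = {}
        for site, weight in paths.items():
            for nxt in neigh[site]:
                longer[nxt] = longer.get(nxt, 0.0) + weight * hop[(site, nxt)]
        paths = longer
    return paths

def moment_from_half_paths(neigh, hop, i, m):
    """Moment 2m of atom i: join two length-m paths at every end point."""
    return sum(w * w for w in half_paths(neigh, hop, i, m).values())

# Hypothetical four-atom cluster with assumed bond integrals.
neigh, hop = build_cluster({(0, 1): -1.0, (0, 2): -0.5, (1, 2): -0.8, (2, 3): -0.3})
mu2 = closed_path_sum(neigh, hop, 0, 2)
mu4_direct = closed_path_sum(neigh, hop, 0, 4)
mu4_joined = moment_from_half_paths(neigh, hop, 0, 2)
```

Joining half-length paths at their end points means only paths up to half the moment order have to be generated, which is presumably why the end-point set appears on the slide.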
9 Density of states
[Figure: density of states versus energy for an increasing number of moments, with an accuracy axis running from EAM/PP to TB]
[Figure: number of matrix multiplications per atom (axis in units of 10^4) versus number of moments]
Number of matrix multiplications scales linearly with number of atoms!
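The linear scaling claim follows from locality: the work per atom depends only on its neighbourhood, not on the system size. A toy count makes this concrete. The sketch assumes a ring with one orbital per atom, so each "matrix multiplication" is a single scalar product, and it counts the products needed to grow all paths from one atom up to a given length. The counting scheme and names are assumptions for illustration, not the BOPfox bookkeeping.

```python
def ring_neighbours(n_atoms):
    """Neighbour lists for a ring of atoms."""
    return {i: [(i - 1) % n_atoms, (i + 1) % n_atoms] for i in range(n_atoms)}

def products_for_atom(neigh, i, m):
    """Count hopping products needed to grow all paths from atom i to length m."""
    end_points, count = {i}, 0
    for _ in range(m):
        # One product per (current end point, neighbour) pair.
        count += sum(len(neigh[site]) for site in end_points)
        end_points = {nxt for site in end_points for nxt in neigh[site]}
    return count

def total_products(n_atoms, m):
    """Total count for the whole ring."""
    neigh = ring_neighbours(n_atoms)
    return sum(products_for_atom(neigh, i, m) for i in range(n_atoms))

per_atom = products_for_atom(ring_neighbours(20), 0, 3)
total_20 = total_products(20, 3)
total_40 = total_products(40, 3)
```

Doubling the number of atoms doubles the total count while the per-atom count stays fixed, as long as the ring is larger than the longest path.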
10 BOPfox goes GPU
BOPfox tool (Fortran 90): Tight-binding, EAM, BOP -> Molecular dynamics, kMC
Benchmark for fcc with 864 W atoms, 12 moments (same profile table as on slide 6)
GPU host loop:
    hosttogpu_uploadatomicpositions();
    hosttogpu_uploadneighbourlist();
    gpu_gettodolist();             // get list of matrix calculations
    gpu_calculatebondintegrals();  // r_ik -> H_ik
    for (i = 2; i <= ninterferencemax; i++) {
        gpu_matrixmultiplication();
        gpu_matrixaddition();
        gpu_momentcalculation();
        gputohost_moments();
    }
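A serial CPU reference helps when validating a GPU loop of this shape. The sketch below follows the same stages (positions, neighbour list, bond integrals from distances, then repeated multiply, add and moment read-out), but it is not the BOPC code. It assumes one orbital per atom, a hypothetical exponential distance dependence for the bond integral, and it starts the loop at path length 1 rather than 2. All names are illustrative.

```python
import math

def neighbour_list(positions, cutoff):
    """Atoms within the cutoff distance of each atom (excluding itself)."""
    n = len(positions)
    return {i: [j for j in range(n)
                if j != i and math.dist(positions[i], positions[j]) <= cutoff]
            for i in range(n)}

def bond_integrals(positions, neigh, decay=1.0):
    """r_ij -> H_ij, using an assumed exponential distance dependence."""
    return {(i, j): -math.exp(-decay * math.dist(positions[i], positions[j]))
            for i in neigh for j in neigh[i]}

def moments_pipeline(positions, cutoff, n_max):
    """Return moments[i] = [mu_0, ..., mu_n_max] for every atom i."""
    neigh = neighbour_list(positions, cutoff)
    hop = bond_integrals(positions, neigh)
    n = len(positions)
    paths = [{i: 1.0} for i in range(n)]   # path sums of the current length
    moments = [[1.0] for _ in range(n)]    # mu_0 = 1
    for _ in range(n_max):
        new_paths = []
        for i in range(n):
            longer = {}
            for site, weight in paths[i].items():
                for nxt in neigh[site]:
                    # multiply along the bond, then add into the end point
                    longer[nxt] = longer.get(nxt, 0.0) + weight * hop[(site, nxt)]
            new_paths.append(longer)
            moments[i].append(longer.get(i, 0.0))  # closed paths give the moment
        paths = new_paths
    return moments

# Three atoms on a line, spacing 1.0; only nearest neighbours are bonded.
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
chain_moments = moments_pipeline(chain, cutoff=1.1, n_max=2)
```

In this chain the middle atom has two bonds and each end atom has one, so its second moment is twice as large.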
11 Graphics Card Computing for Materials Modelling: BOPfox and BOPC
BOPfox (CPU)
- Hardware: Intel Core2 Dual CPU E GHz, 4 GB memory
- Compiler options: gfortran, release mode (-O3)
BOPC (GPU)
- Hardware: nvidia GeForce GTX, multiprocessors, 216 cores, 1.5 GHz
- Compiler options: nvcc, release mode (-O3), CUDA 2.0
Benchmark of BOPfox vs BOPC (the timings in ms for BOPfox (CPU) and BOPC (GPU) are not preserved in the transcription)
Task                       Factor (speed up)
Calculation of matrices    ~22
Path finding               ~44
Matrix multiplication      ~19
24x overall speed up
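The overall factor is not the average of the per-task factors: it is the ratio of total CPU time to total GPU time, so each task is weighted by its share of the CPU run time. The sketch below shows that arithmetic. The per-task factors are the ones on the slide, but the CPU time shares are hypothetical because the timings did not survive the transcription.

```python
def overall_speedup(cpu_fractions, factors):
    """Combine per-task speed-ups into one overall factor.

    cpu_fractions[i] is the share of CPU run time spent in task i (sums to 1);
    factors[i] is the speed-up of that task on the GPU.
    """
    if abs(sum(cpu_fractions) - 1.0) > 1e-9:
        raise ValueError("CPU time fractions must sum to 1")
    # GPU time in units of the total CPU time.
    gpu_time = sum(f / s for f, s in zip(cpu_fractions, factors))
    return 1.0 / gpu_time

factors = [22.0, 44.0, 19.0]   # matrices, path finding, matrix multiplication
fractions = [0.1, 0.3, 0.6]    # hypothetical CPU time shares, not from the slides
speedup = overall_speedup(fractions, factors)
```

Whatever the shares are, the result lies between the smallest and largest per-task factor, which is consistent with an overall 24x sitting between ~19 and ~44.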
12 Conclusions
- Materials modelling can benefit significantly from GPU parallelization
- Linear algebra and FFT are essential for most electronic structure calculation methods
- Models like analytic bond order potentials try to avoid expensive LA/FFT routines
- Significant speed up possible
More information