A quick introduction to MPI (Message Passing Interface)
1 A quick introduction to MPI (Message Passing Interface)
M1IF - APPD
Oguz Kaya, Pierre Pradic
École Normale Supérieure de Lyon, France
2 Introduction
3 What is MPI?
- Standardized and portable message-passing system.
- Started in the '90s, still used today in research and industry.
- Good theoretical model.
- Good performance on HPC networks (InfiniBand, ...).
4 What is MPI?
De facto standard for communications in HPC applications.
5 Programming model
APIs:
- C and Fortran APIs; the C++ API was deprecated (MPI-2.2, 2009) and removed in MPI-3 (2012).
Environment:
- Many implementations of the standard (mainly OpenMPI and MPICH)
- Compiler wrappers (around gcc)
- Runtime (mpirun)
6 Programming model
Compiling:
- gcc -> mpicc
- g++ -> mpic++ / mpicxx
- gfortran -> mpifort
Executing:
- mpirun -n <nb procs> <executable> <args>
- ex: mpirun -n 10 ./a.out
Note: mpiexec and orterun are synonyms of mpirun; see man mpirun for more details.
7 MPI context
Context limits: all MPI calls must be nested in the MPI context delimited by MPI_Init and MPI_Finalize.

    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);

        // ...

        MPI_Finalize();
        return 0;
    }
8 MPI context
Hello World

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);

        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        printf("Hello from proc %d / %d\n", rank, size);

        MPI_Finalize();
        return 0;
    }
9 Synchronization
Code:

    printf("[%d] step 1\n", rank);
    MPI_Barrier(MPI_COMM_WORLD);
    printf("[%d] step 2\n", rank);

Output:

    [0] step 1
    [1] step 1
    [2] step 1
    [3] step 1
    [3] step 2
    [0] step 2
    [2] step 2
    [1] step 2
10 Point-to-point communication
11 Send and Receive
Sending data:

    int MPI_Send(const void *data, int count, MPI_Datatype datatype,
                 int destination, int tag, MPI_Comm communicator);

Receiving data:

    int MPI_Recv(void *data, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm communicator,
                 MPI_Status *status);
12 Example

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int number;
    switch (rank)
    {
    case 0:
        number = 1;
        MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        break;
    case 1:
        MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("received number: %d\n", number);
        break;
    }
13 Asynchronous communications
Sending data:

    int MPI_Isend(const void *data, int count, MPI_Datatype datatype,
                  int destination, int tag, MPI_Comm communicator,
                  MPI_Request *request);

Receiving data:

    int MPI_Irecv(void *data, int count, MPI_Datatype datatype,
                  int source, int tag, MPI_Comm communicator,
                  MPI_Request *request);
14 Other functions
- MPI_Probe, MPI_Iprobe
- MPI_Test, MPI_Testany, MPI_Testall
- MPI_Cancel
- MPI_Wtime, MPI_Wtick
15 Simple datatypes

    MPI datatype             C type
    MPI_SHORT                short int
    MPI_INT                  int
    MPI_LONG                 long int
    MPI_LONG_LONG            long long int
    MPI_UNSIGNED_CHAR        unsigned char
    MPI_UNSIGNED_SHORT       unsigned short int
    MPI_UNSIGNED             unsigned int
    MPI_UNSIGNED_LONG        unsigned long int
    MPI_UNSIGNED_LONG_LONG   unsigned long long int
    MPI_FLOAT                float
    MPI_DOUBLE               double
    MPI_LONG_DOUBLE          long double
    MPI_BYTE                 char
16 Composed datatypes
Composed structures: structs, arrays.
Possibilities are almost limitless but sometimes difficult to set up.
17 Collective communications
18 One-to-all communication: Broadcast
19 One-to-all communication: Broadcast

    int MPI_Bcast(void *data, int count, MPI_Datatype datatype,
                  int root, MPI_Comm communicator);
20 All-to-one communication: Reduce
21 All-to-one communication: Reduce

    int MPI_Reduce(const void *sendbuf, void *recvbuf, int count,
                   MPI_Datatype datatype, MPI_Op operator,
                   int root, MPI_Comm communicator);
22 One-to-all communication: Scatter
23 One-to-all communication: Scatter

    int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                    void *recvbuf, int recvcount, MPI_Datatype recvtype,
                    int root, MPI_Comm communicator);
24 All-to-one communication: Gather
25 All-to-one communication: Gather

    int MPI_Gather(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                   void *recvbuf, int recvcount, MPI_Datatype recvtype,
                   int root, MPI_Comm communicator);
26 All-to-all communication: Allreduce
27 All-to-all communication: Allreduce

    int MPI_Allreduce(const void *sendbuf, void *recvbuf, int count,
                      MPI_Datatype datatype, MPI_Op operator,
                      MPI_Comm communicator);
28 All-to-all communication: Allgather
29 All-to-all communication: Allgather

    int MPI_Allgather(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                      void *recvbuf, int recvcount, MPI_Datatype recvtype,
                      MPI_Comm communicator);
30 All-to-all communication: Alltoall
31 All-to-all communication: Alltoall

    int MPI_Alltoall(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                     void *recvbuf, int recvcount, MPI_Datatype recvtype,
                     MPI_Comm communicator);
32 Custom communicators
33 Custom communicators
MPI_COMM_WORLD can be split into smaller, more appropriate communicators.

    int MPI_Comm_split(MPI_Comm communicator, int color, int key,
                       MPI_Comm *newcommunicator);
34 Example

    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int hrank, vrank;
    int hsize, vsize;
    MPI_Comm hcomm, vcomm;
    // p is the width of the process grid: ranks in the same column
    // (same rank % p) share vcomm, ranks in the same row share hcomm.
    MPI_Comm_split(MPI_COMM_WORLD, rank % p, rank, &vcomm);
    MPI_Comm_split(MPI_COMM_WORLD, rank / p, rank, &hcomm);
    MPI_Comm_rank(hcomm, &hrank);
    MPI_Comm_size(hcomm, &hsize);
    MPI_Comm_rank(vcomm, &vrank);
    MPI_Comm_size(vcomm, &vsize);
More informationarxiv:astro-ph/ v1 12 Apr 1995
arxiv:astro-ph/9504040v1 12 Apr 1995 Parallel Linear General Relativity and CMB Anisotropies A Technical Paper submitted to Supercomputing 95 in HTML format Paul W. Bode bode@alcor.mit.edu Edmund Bertschinger
More informationAPPENDIX A FASTFLOW A.1 DESIGN PRINCIPLES
APPENDIX A FASTFLOW A.1 DESIGN PRINCIPLES FastFlow 1 has been designed to provide programmers with efficient parallelism exploitation patterns suitable to implement (fine grain) stream parallel applications.
More informationComp 11 Lectures. Mike Shah. June 20, Tufts University. Mike Shah (Tufts University) Comp 11 Lectures June 20, / 38
Comp 11 Lectures Mike Shah Tufts University June 20, 2017 Mike Shah (Tufts University) Comp 11 Lectures June 20, 2017 1 / 38 Please do not distribute or host these slides without prior permission. Mike
More informationACCELERATING WEATHER PREDICTION WITH NVIDIA GPUS
ACCELERATING WEATHER PREDICTION WITH NVIDIA GPUS Alan Gray, Developer Technology Engineer, NVIDIA ECMWF 18th Workshop on high performance computing in meteorology, 28 th September 2018 ESCAPE NVIDIA s
More informationConvergence Models and Surprising Results for the Asynchronous Jacobi Method
Convergence Models and Surprising Results for the Asynchronous Jacobi Method Jordi Wolfson-Pou School of Computational Science and Engineering Georgia Institute of Technology Atlanta, Georgia, United States
More informationEntropy of Parallel Execution and Communication
International Journal of Networking and Computing www.ijnc.org ISSN 2185-2839 (print) ISSN 2185-2847 (online) Volume 7, Number 1, pages 2 28, January 2017 Ernesto Gomez School of Engineering and Computer
More informationImplementation of a preconditioned eigensolver using Hypre
Implementation of a preconditioned eigensolver using Hypre Andrew V. Knyazev 1, and Merico E. Argentati 1 1 Department of Mathematics, University of Colorado at Denver, USA SUMMARY This paper describes
More information3D Parallel FEM (I) Kengo Nakajima. Programming for Parallel Computing ( ) Seminar on Advanced Computing ( )
3D Parallel FEM (I) Kengo Nakajima Programming for Parallel Computing (616-2057) Seminar on Advanced Computing (616-4009) pfem3d-1 2 Target Application Parallel version of heat3d Using MPI pfem3d-1 3 Installation
More informationThree-Dimensional Explicit Parallel Finite Element Analysis of Functionally Graded Solids under Impact Loading. Ganesh Anandakumar and Jeong-Ho Kim
Three-Dimensional Eplicit Parallel Finite Element Analsis of Functionall Graded Solids under Impact Loading Ganesh Anandaumar and Jeong-Ho Kim Department of Civil and Environmental Engineering, Universit
More information