Hydra. A library for data analysis in massively parallel platforms. A. Augusto Alves Jr and Michael D. Sokoloff
1 Hydra: A library for data analysis in massively parallel platforms
A. Augusto Alves Jr and Michael D. Sokoloff, University of Cincinnati, aalvesju@cern.ch
Presented at NVIDIA's GPU Technology Conference, May 8-11, Silicon Valley, US
A. Augusto Alves Jr. Hydra May 7, / 23
2 Outline
- Design and goals of Hydra
- Basic functionalities and main algorithms
- Performance
- Multidimensional numerical integration
- Phase-space Monte Carlo generation
- Interface to ROOT::Minuit2 and fitting
- Summary
3 Motivation
The Large Hadron Collider (LHC) and other facilities acquire tens of petabytes of data annually. The collective effort to analyze this amount of data requires state-of-the-art software tools that:
- Scale efficiently to face the increasing statistics from the experiments.
- Meet the high precision requirements typically necessary to address High Energy Physics (HEP) problems.
- Are efficient and flexible enough to face the different conditions of specific HEP experiments.
- Are portable, scalable and compatible with existing software and hardware standards.
4 Hydra
- Hydra is a header-only templated C++ library designed to perform common HEP data analyses on massively parallel platforms.
- It is implemented on top of the C++11 Standard Library and a variadic version of the Thrust library.
- Hydra is designed to run on Linux systems and to use OpenMP, CUDA and TBB enabled devices.
- It is focused on portability, usability, performance and precision.
5 Design and features
The main design features are:
- The library is structured using static polymorphism.
- There is no need to write explicit back-end oriented code.
- Clean and concise semantics. Interfaces are easy to use correctly and hard to use incorrectly.
- The same source files written using Hydra and standard C++ compile for GPU or CPU, just by changing the extension from .cu to .cpp and one or two compiler flags.
6 Features
- Generation of phase-space Monte Carlo samples.
- Sampling of multidimensional probability density functions.
- Data fitting using binned and unbinned multidimensional datasets.
- Evaluation of multidimensional functions over heterogeneous datasets.
- Numerical integration of multidimensional functions.
7 Functors
Hydra adds features and type information to generic functors using the CRTP idiom. A generic functor with N parameters is represented like this:

    struct MyFunctor : public hydra::BaseFunctor<MyFunctor, double, N>
    {
        // MyFunctor constructor and other implementation details...

        // Users always need to implement the Evaluate() method
        template<typename T>
        __host__ __device__ inline
        double Evaluate(T* x)
        {
            // actual calculation
        }
    };

All functors deriving from hydra::BaseFunctor<Func, ReturnType, NPars> can be cached, used to perform fits and to compose more complex mathematical expressions.
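The CRTP idiom mentioned on this slide can be illustrated without Hydra. The following is a minimal host-only sketch; `FunctorBase` and `Gauss` are hypothetical names chosen for illustration and are not Hydra's classes. The base class forwards each call to the derived `Evaluate()` through a `static_cast`, so no virtual dispatch is involved.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a CRTP base class: it stores NPARS parameters
// and forwards calls to Derived::Evaluate() without virtual dispatch.
template <typename Derived, typename ReturnType, std::size_t NPARS>
struct FunctorBase {
    std::array<double, NPARS> params{};

    template <typename T>
    ReturnType operator()(T* x) const {
        // Static polymorphism: the derived type is known at compile time.
        return static_cast<const Derived&>(*this).Evaluate(x);
    }
};

// A one-dimensional, unnormalized Gaussian with two parameters.
struct Gauss : FunctorBase<Gauss, double, 2> {
    Gauss(double mean, double sigma) { params = {mean, sigma}; }

    template <typename T>
    double Evaluate(T* x) const {
        const double z = (x[0] - params[0]) / params[1];
        return std::exp(-0.5 * z * z);
    }
};
```

Calling `Gauss g(1.0, 2.0); double x[1] = {3.0}; g(x);` evaluates `exp(-0.5)` through the base-class call operator.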
8 Arithmetic operations and composition with functors
All the basic arithmetic operators are overloaded. Composition is also possible. If A, B and C are Hydra functors, the code below is completely legal.

    ...
    // basic arithmetic operations
    auto A_plus_B  = A + B;
    auto A_minus_B = A - B;
    auto A_times_B = A * B;
    auto A_per_B   = A / B;

    // any composition of basic operations
    auto any_functor = (A - B) * (A + B) * (A / C);

    // C(A,B) is represented by:
    auto compose_functor = hydra::compose(C, A, B);
    ...

- The functors resulting from arithmetic operations and composition can be cached as well.
- There is no intrinsic limit on the number of functors participating in arithmetic or composition expressions.
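One common way to obtain this kind of functor arithmetic is with expression templates: each operator returns a small object that stores its operands and evaluates them lazily. The sketch below assumes this technique for illustration only; it is not taken from Hydra's sources, all names in the `sketch` namespace are hypothetical, and composition is limited to a single inner functor.

```cpp
#include <cmath>

namespace sketch {

// Leaf functors of one real variable.
struct Identity { double operator()(double x) const { return x; } };
struct Sine     { double operator()(double x) const { return std::sin(x); } };
struct Constant {
    double value;
    double operator()(double) const { return value; }
};

// Expression nodes: operands are stored by value and evaluated on call.
template <typename L, typename R>
struct Sum {
    L lhs; R rhs;
    double operator()(double x) const { return lhs(x) + rhs(x); }
};

template <typename L, typename R>
struct Product {
    L lhs; R rhs;
    double operator()(double x) const { return lhs(x) * rhs(x); }
};

template <typename Outer, typename Inner>
struct Composite {
    Outer outer; Inner inner;
    double operator()(double x) const { return outer(inner(x)); }
};

// Operators are found by argument-dependent lookup for types in `sketch`.
template <typename L, typename R>
Sum<L, R> operator+(L lhs, R rhs) { return {lhs, rhs}; }

template <typename L, typename R>
Product<L, R> operator*(L lhs, R rhs) { return {lhs, rhs}; }

template <typename Outer, typename Inner>
Composite<Outer, Inner> compose(Outer outer, Inner inner) { return {outer, inner}; }

}  // namespace sketch
```

For example, `(sketch::Identity{} + sketch::Constant{1.0}) * sketch::Sine{}` builds a functor computing `(x + 1) * sin(x)`; the nesting depth is bounded only by the compiler.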
9 Support for C++11 lambdas
Lambda functions are fully supported in Hydra. The user can define a C++11 lambda function and convert it into a Hydra functor using hydra::wrap_lambda():

    ...
    double two = 2.0;

    // define a simple lambda and capture "two"
    auto my_lambda = [=] __host__ __device__ (double* x)
    { return two * sin(x[0]); };

    // convert it into a Hydra functor
    auto my_lambda_wrapped = hydra::wrap_lambda(my_lambda);
    ...

CUDA 8.0 supports lambda functions in device and host code.
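The idea of wrapping a lambda into a functor type can be sketched in plain C++11. The `LambdaWrapper` and `wrap` names below are hypothetical and only illustrate the general pattern (store the callable by value, forward the call); they do not reproduce `hydra::wrap_lambda()`.

```cpp
#include <cmath>
#include <utility>

// Hypothetical wrapper: stores any callable by value and exposes a
// uniform call operator that forwards its arguments.
template <typename Callable>
struct LambdaWrapper {
    Callable callable;

    template <typename... Args>
    auto operator()(Args&&... args) const
        -> decltype(callable(std::forward<Args>(args)...)) {
        return callable(std::forward<Args>(args)...);
    }
};

// Helper that deduces the (unnameable) lambda type.
template <typename Callable>
LambdaWrapper<Callable> wrap(Callable callable) {
    return LambdaWrapper<Callable>{std::move(callable)};
}
```

A capturing lambda such as `[=](const double* x) { return two * std::sin(x[0]); }` can then be passed to `wrap()` and called like any other functor.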
10 Data containers
- hydra::Point represents a multidimensional data point, including its coordinates, value and errors.
- hydra::PointVector looks like an array of structs, but the data is stored as a structure of arrays.

    // two-dimensional point
    typedef hydra::Point<GReal_t, 2> point_t;

    // two-dimensional dataset on the device
    hydra::PointVector<point_t, device> data_d(1e6);
    ...
    // get data from device
    hydra::PointVector<point_t, host> data_h(data_d);

    // fill a ROOT 2D histogram
    // (TH2D needs binning for both axes; the same binning is assumed for y)
    TH2D hist("hist", "my histogram", 100, min, max, 100, min, max);

    for (auto row : data_h) {
        auto point(row);
        hist.Fill(point.GetCoordinate(0), point.GetCoordinate(1));
    }
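The "array of structs interface over structure-of-arrays storage" idea can be sketched as follows. `PointColumns` is a hypothetical host-only class written for illustration; it is not Hydra's `PointVector` and omits values, errors and device storage.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays container for DIM-dimensional points:
// each coordinate lives in its own contiguous column.
template <std::size_t DIM>
class PointColumns {
public:
    explicit PointColumns(std::size_t n) {
        for (auto& column : columns_) column.resize(n);
    }

    std::size_t size() const { return columns_[0].size(); }

    // Array-of-structs style read: gather one row from the columns.
    std::array<double, DIM> get(std::size_t i) const {
        assert(i < size());
        std::array<double, DIM> row{};
        for (std::size_t d = 0; d < DIM; ++d) row[d] = columns_[d][i];
        return row;
    }

    // Array-of-structs style write: scatter one row into the columns.
    void set(std::size_t i, const std::array<double, DIM>& row) {
        assert(i < size());
        for (std::size_t d = 0; d < DIM; ++d) columns_[d][i] = row[d];
    }

    // Contiguous storage of a single coordinate.
    const std::vector<double>& column(std::size_t d) const { return columns_[d]; }

private:
    std::array<std::vector<double>, DIM> columns_;
};
```

Keeping each coordinate contiguous is what allows neighbouring threads to read neighbouring memory locations, while `get()` and `set()` preserve the per-point view for user code.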
11 Functionalities
Data fitting and Monte Carlo generation:
- Interface to the ROOT::Minuit2 minimization package.
- Phase-space generator.
- Multidimensional p.d.f. sampling.
- Parallel function evaluation over multidimensional datasets.
Numerical integration:
- Flat Monte Carlo sampling.
- Vegas-like self-adaptive importance sampling (Monte Carlo).
- Gauss-Kronrod one-dimensional quadrature.
- Genz-Malik multidimensional quadrature.
12 Vegas-like multidimensional numerical integration
- The VEGAS algorithm is based on importance sampling. It samples the integrand and adapts itself, so that the points are concentrated in the regions that make the largest contribution to the integral.
- The Hydra implementation follows the corresponding GSL algorithm.
- There is no limit on the number of dimensions.

    // VegasState holds resources and configurations
    VegasState<N, device> State_d(_min, _max);
    State_d.SetIterations(iterations);
    State_d.SetMaxError(max_error);
    State_d.SetCalls(calls);
    State_d.SetTrainingCalls(tcalls);
    State_d.SetTrainingIterations(1);

    // Vegas integrator object
    Vegas<N, device> Vegas_d(State_d);

    // integrate a Gaussian
    Vegas_d.Integrate(Gaussian);
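The importance-sampling principle behind VEGAS can be shown with a much simpler estimator. The sketch below uses a fixed, user-supplied proposal density instead of the adaptive grid of VEGAS, so it is not the VEGAS algorithm itself; `Estimate` and `integrate` are hypothetical names. Each sample x drawn from the proposal contributes the weight f(x)/p(x), and the mean of the weights estimates the integral.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

struct Estimate {
    double value;  // estimate of the integral
    double error;  // estimated standard error of `value`
};

// Importance-sampling estimate of the integral of f. Points are drawn with
// `sample(rng)` from a proposal density `pdf`, which is assumed to be
// positive wherever f is non-zero. A flat proposal gives plain Monte Carlo.
template <typename F, typename Pdf, typename Sampler>
Estimate integrate(F f, Pdf pdf, Sampler sample, std::size_t n, unsigned seed) {
    assert(n > 1);
    std::mt19937 rng(seed);
    double sum = 0.0, sum2 = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = sample(rng);
        const double w = f(x) / pdf(x);  // weight of this sample
        sum += w;
        sum2 += w * w;
    }
    const double mean = sum / n;
    // Variance of the mean: sample variance of the weights divided by n.
    const double variance = (sum2 / n - mean * mean) / (n - 1);
    return {mean, std::sqrt(variance > 0.0 ? variance : 0.0)};
}
```

For a narrow Gaussian peak on [0, 1], a proposal concentrated around the peak gives a noticeably smaller error than a flat proposal at the same number of samples, which is the effect the adaptive grid of VEGAS aims to achieve automatically.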
13 Vegas-like multidimensional numerical integration
Processing a Gaussian distribution in 10 dimensions.
[Figure: integral result per iteration (iteration result and cumulative result) versus iteration.]
[Figure: duration [ms] and GPU vs CPU speed-up versus number of samples.]
System configuration: GPU model: Tesla K40c. CPU: Intel(R) Xeon(R) (one thread).
14 Phase-Space Monte Carlo
- Describes the kinematics of a particle with a given four-momentum decaying to an N-particle final state.
- No limitation on the number of particles in the final state.
- Supports the generation of sequential decays.
- Generation of weighted and unweighted samples.

    // Masses of the particles
    hydra::Vector4R Mother(mother_mass, 0.0, 0.0, 0.0);
    double Daughter_Masses[3]{daughter1_mass, daughter2_mass, daughter3_mass};

    // Create PhaseSpace object
    hydra::PhaseSpace<3> phsp(mother_mass, Daughter_Masses);

    // Allocate the container for the events
    hydra::Events<3, device> events(ndecays);

    // Generate
    phsp.Generate(Mother, events.begin(), events.end());
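The simplest building block of phase-space generation is the two-body decay of a particle at rest, where the daughters are emitted back to back with a momentum fixed by the masses and an isotropic direction. The sketch below covers only this special case; it is not Hydra's generator, which handles N bodies and a moving mother, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

struct FourVector { double e, px, py, pz; };

double invariant_mass(const FourVector& p) {
    const double m2 = p.e * p.e - (p.px * p.px + p.py * p.py + p.pz * p.pz);
    return std::sqrt(std::max(0.0, m2));
}

// Momentum of either daughter in the rest frame of a mother of mass M.
double breakup_momentum(double M, double m1, double m2) {
    assert(M > 0.0 && M >= m1 + m2);
    const double a = M * M - (m1 + m2) * (m1 + m2);
    const double b = M * M - (m1 - m2) * (m1 - m2);
    return std::sqrt(a * b) / (2.0 * M);
}

struct TwoBodyDecay { FourVector d1, d2; };

// Generate one isotropic two-body decay of a mother at rest.
template <typename Rng>
TwoBodyDecay decay_at_rest(double M, double m1, double m2, Rng& rng) {
    const double pi = 3.14159265358979323846;
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    const double p = breakup_momentum(M, m1, m2);
    const double cos_theta = 2.0 * uniform(rng) - 1.0;  // flat in cos(theta)
    const double sin_theta = std::sqrt(1.0 - cos_theta * cos_theta);
    const double phi = 2.0 * pi * uniform(rng);         // flat in phi
    const double px = p * sin_theta * std::cos(phi);
    const double py = p * sin_theta * std::sin(phi);
    const double pz = p * cos_theta;
    return {{std::sqrt(m1 * m1 + p * p), px, py, pz},
            {std::sqrt(m2 * m2 + p * p), -px, -py, -pz}};
}
```

Energy and momentum are conserved by construction: the daughter energies sum to M and their three-momenta cancel. Multi-body generators typically chain such two-body steps through intermediate invariant masses.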
15 Phase-Space Monte Carlo
[Figure: Dalitz plot, M(J/Ψπ) versus M(Kπ).]
[Figure: duration [ms] and GPU vs CPU speed-up versus number of events.]
System configuration: GPU model: Tesla K40c. CPU: Intel(R) Xeon(R) (one thread).
16 Interface to Minuit2
- ROOT::Minuit2 is widely used in particle physics to find the minimum value of a multi-parameter function (FCN) and to analyze the shape of the function around the minimum, and thereby to compute the model's best-fit parameter values and uncertainties.
- Hydra implements an interface to ROOT::Minuit2 that parallelizes the FCN calculation. This dramatically accelerates the calculation over large datasets.
- The PDFs are normalized on-the-fly using analytical or numerical integration algorithms provided by Hydra.
- Data is passed using hydra::PointVector.
17 Interface to Minuit2
Model = N_g × Gaussian + N_e × Exponential

    GaussAnalyticIntegral GaussIntegral(min, max);
    ExpAnalyticIntegral ExpIntegral(min, max);

    auto Gaussian_PDF = hydra::make_pdf(Gaussian, GaussIntegral);
    auto Exponential_PDF = hydra::make_pdf(Exponential, ExpIntegral);

    // add the pdfs to make an extended pdf model
    std::array<hydra::Parameter, 2> yields{NGaussian, NExponential};
    auto Model = hydra::add_pdfs(yields, Gaussian_PDF, Exponential_PDF);
    Model.SetExtended(1);

    // get the FCN
    auto Model_FCN = hydra::make_loglikehood_fcn(Model, data_d);

    // pass the FCN to Minuit2...
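The quantity a minimizer sees for such a model is the extended unbinned negative log-likelihood, -ln L = (N_g + N_e) - Σ_i ln[N_g g(x_i) + N_e e(x_i)], with both p.d.f.s normalized on the fit range. The sketch below evaluates it serially on the host; the `Params` layout and function names are hypothetical, and the loop over events is the part a parallel back-end would turn into a reduction.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Gaussian p.d.f. normalized analytically on [lo, hi].
double gauss_pdf(double x, double mean, double sigma, double lo, double hi) {
    assert(sigma > 0.0 && hi > lo);
    const double pi = 3.14159265358979323846;
    const double s = sigma * std::sqrt(2.0);
    const double norm = sigma * std::sqrt(pi / 2.0) *
                        (std::erf((hi - mean) / s) - std::erf((lo - mean) / s));
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / norm;
}

// Exponential p.d.f. exp(-tau * x) normalized analytically on [lo, hi].
double expo_pdf(double x, double tau, double lo, double hi) {
    assert(tau != 0.0 && hi > lo);
    const double norm = (std::exp(-tau * lo) - std::exp(-tau * hi)) / tau;
    return std::exp(-tau * x) / norm;
}

struct Params { double n_gauss, mean, sigma, n_expo, tau; };

// Extended unbinned negative log-likelihood of the two-component model.
double extended_nll(const std::vector<double>& data, const Params& p,
                    double lo, double hi) {
    double sum_log = 0.0;
    for (double x : data) {
        sum_log += std::log(p.n_gauss * gauss_pdf(x, p.mean, p.sigma, lo, hi) +
                            p.n_expo * expo_pdf(x, p.tau, lo, hi));
    }
    return (p.n_gauss + p.n_expo) - sum_log;
}
```

At fixed shape parameters and fixed yield fractions, this function is minimized when the yields sum to the number of observed events, which is the defining property of the extended likelihood.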
18 Interface to Minuit2
20-million-event unbinned maximum likelihood fit.
[Figure: fitted data distribution, 2e+07 entries.]
Timing: fit on GPU versus fit on CPU, speed-up: 62x.
System configuration: GPU model: Tesla K40c. CPU: Intel(R) Xeon(R) (one thread).
19 Summary
- Hydra's development has been supported by the National Science Foundation under the grant number PHY
- The project is hosted on GitHub.
- The package includes a suite of examples.
- It is being used at CERN on analyses aiming to measure the kaon mass using large datasets.
Acknowledgments
- To Karen Tomko and Bradley Hittle from the Ohio Supercomputer Center.
- To the University of Cincinnati LHCb group.
Please visit the project page, try it out, report bugs, make suggestions... Thanks!
20 Backup
21 Phase-Space Monte Carlo
OpenMP: scaling with number of threads.
[Figure: duration [ms] versus number of OpenMP threads.]
System configuration: CPU: Intel(R) Xeon(R).
22 Phase-Space Monte Carlo
CUDA, OpenMP, TBB.
[Figure, two panels: "GPU vs OpenMP" and "GPU vs TBB"; each shows duration [ms] and GPU vs CPU speed-up versus number of events.]
23 Vegas-like multidimensional numerical integration
OpenMP: scaling with number of threads.
[Figure: duration [ms] versus number of OpenMP threads.]
System configuration: CPU: Intel(R) Xeon(R) x 48.
Hydra. A. Augusto Alves Jr and M.D. Sokoloff. University of Cincinnati
Hydra A library for data analysis in massively parallel platforms A. Augusto Alves Jr and M.D. Sokoloff University of Cincinnati aalvesju@cern.ch Presented at the Workshop Perspectives of GPU computing
More informationarxiv: v1 [physics.data-an] 19 Feb 2017
arxiv:1703.03284v1 [physics.data-an] 19 Feb 2017 Model-independent partial wave analysis using a massively-parallel fitting framework L Sun 1, R Aoude 2, A C dos Reis 2, M Sokoloff 3 1 School of Physics
More informationIntroduction to numerical computations on the GPU
Introduction to numerical computations on the GPU Lucian Covaci http://lucian.covaci.org/cuda.pdf Tuesday 1 November 11 1 2 Outline: NVIDIA Tesla and Geforce video cards: architecture CUDA - C: programming
More informationScalable and Power-Efficient Data Mining Kernels
Scalable and Power-Efficient Data Mining Kernels Alok Choudhary, John G. Searle Professor Dept. of Electrical Engineering and Computer Science and Professor, Kellogg School of Management Director of the
More informationGPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic
GPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic Jan Verschelde joint work with Xiangcheng Yu University of Illinois at Chicago
More informationA Massively Parallel Eigenvalue Solver for Small Matrices on Multicore and Manycore Architectures
A Massively Parallel Eigenvalue Solver for Small Matrices on Multicore and Manycore Architectures Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,
More informationTR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems
TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a
More informationA Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters
A Quantum Chemistry Domain-Specific Language for Heterogeneous Clusters ANTONINO TUMEO, ORESTE VILLA Collaborators: Karol Kowalski, Sriram Krishnamoorthy, Wenjing Ma, Simone Secchi May 15, 2012 1 Outline!
More informationIntroduction to Benchmark Test for Multi-scale Computational Materials Software
Introduction to Benchmark Test for Multi-scale Computational Materials Software Shun Xu*, Jian Zhang, Zhong Jin xushun@sccas.cn Computer Network Information Center Chinese Academy of Sciences (IPCC member)
More informationJulian Merten. GPU Computing and Alternative Architecture
Future Directions of Cosmological Simulations / Edinburgh 1 / 16 Julian Merten GPU Computing and Alternative Architecture Institut für Theoretische Astrophysik Zentrum für Astronomie Universität Heidelberg
More informationPerm State University Research-Education Center Parallel and Distributed Computing
Perm State University Research-Education Center Parallel and Distributed Computing A 25-minute Talk (S4493) at the GPU Technology Conference (GTC) 2014 MARCH 24-27, 2014 SAN JOSE, CA GPU-accelerated modeling
More informationOn Portability, Performance and Scalability of a MPI OpenCL Lattice Boltzmann Code
On Portability, Performance and Scalability of a MPI OpenCL Lattice Boltzmann Code E Calore, S F Schifano, R Tripiccione Enrico Calore INFN Ferrara, Italy 7 th Workshop on UnConventional High Performance
More informationGPU Accelerated Markov Decision Processes in Crowd Simulation
GPU Accelerated Markov Decision Processes in Crowd Simulation Sergio Ruiz Computer Science Department Tecnológico de Monterrey, CCM Mexico City, México sergio.ruiz.loza@itesm.mx Benjamín Hernández National
More informationWelcome to MCS 572. content and organization expectations of the course. definition and classification
Welcome to MCS 572 1 About the Course content and organization expectations of the course 2 Supercomputing definition and classification 3 Measuring Performance speedup and efficiency Amdahl s Law Gustafson
More informationDr. Andrea Bocci. Using GPUs to Accelerate Online Event Reconstruction. at the Large Hadron Collider. Applied Physicist
Using GPUs to Accelerate Online Event Reconstruction at the Large Hadron Collider Dr. Andrea Bocci Applied Physicist On behalf of the CMS Collaboration Discover CERN Inside the Large Hadron Collider at
More informationarxiv: v1 [hep-lat] 7 Oct 2010
arxiv:.486v [hep-lat] 7 Oct 2 Nuno Cardoso CFTP, Instituto Superior Técnico E-mail: nunocardoso@cftp.ist.utl.pt Pedro Bicudo CFTP, Instituto Superior Técnico E-mail: bicudo@ist.utl.pt We discuss the CUDA
More informationStudy of Diboson Physics with the ATLAS Detector at LHC
Study of Diboson Physics with the ATLAS Detector at LHC Hai-Jun Yang University of Michigan (for the ATLAS Collaboration) APS April Meeting St. Louis, April 12-15, 2008 The Large Hadron Collider at CERN
More informationCalculation of ground states of few-body nuclei using NVIDIA CUDA technology
Calculation of ground states of few-body nuclei using NVIDIA CUDA technology M. A. Naumenko 1,a, V. V. Samarin 1, 1 Flerov Laboratory of Nuclear Reactions, Joint Institute for Nuclear Research, 6 Joliot-Curie
More informationZ 0 Resonance Analysis Program in ROOT
DESY Summer Student Program 2008 23 July - 16 September 2008 Deutsches Elektronen-Synchrotron, Hamburg GERMANY Z 0 Resonance Analysis Program in ROOT Atchara Punya Chiang Mai University, Chiang Mai THAILAND
More informationA CUDA Solver for Helmholtz Equation
Journal of Computational Information Systems 11: 24 (2015) 7805 7812 Available at http://www.jofcis.com A CUDA Solver for Helmholtz Equation Mingming REN 1,2,, Xiaoguang LIU 1,2, Gang WANG 1,2 1 College
More informationAccelerating linear algebra computations with hybrid GPU-multicore systems.
Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)
More informationParallel Polynomial Evaluation
Parallel Polynomial Evaluation Jan Verschelde joint work with Genady Yoffe University of Illinois at Chicago Department of Mathematics, Statistics, and Computer Science http://www.math.uic.edu/ jan jan@math.uic.edu
More informationImplementing NNLO into MCFM
Implementing NNLO into MCFM Downloadable from mcfm.fnal.gov A Multi-Threaded Version of MCFM, J.M. Campbell, R.K. Ellis, W. Giele, 2015 Higgs boson production in association with a jet at NNLO using jettiness
More informationHistFitter: a flexible framework for statistical data analysis
: a flexible framework for statistical data analysis Fakultät für Physik, LMU München, Am Coulombwall 1, 85748 Garching, Germany, Excellence Cluster Universe, Boltzmannstr. 2, 85748 Garching, Germany E-mail:
More informationAccelerating Model Reduction of Large Linear Systems with Graphics Processors
Accelerating Model Reduction of Large Linear Systems with Graphics Processors P. Benner 1, P. Ezzatti 2, D. Kressner 3, E.S. Quintana-Ortí 4, Alfredo Remón 4 1 Max-Plank-Institute for Dynamics of Complex
More informationEW theoretical uncertainties on the W mass measurement
EW theoretical uncertainties on the W mass measurement Luca Barze 1, Carlo Carloni Calame 2, Homero Martinez 3, Guido Montagna 2, Oreste Nicrosini 3, Fulvio Piccinini 3, Alessandro Vicini 4 1 CERN 2 Universita
More informationDelayed and Higher-Order Transfer Entropy
Delayed and Higher-Order Transfer Entropy Michael Hansen (April 23, 2011) Background Transfer entropy (TE) is an information-theoretic measure of directed information flow introduced by Thomas Schreiber
More informationAlgorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method
Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method Ilya B. Labutin A.A. Trofimuk Institute of Petroleum Geology and Geophysics SB RAS, 3, acad. Koptyug Ave., Novosibirsk
More informationClaude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique
Claude Tadonki MINES ParisTech PSL Research University Centre de Recherche Informatique claude.tadonki@mines-paristech.fr Monthly CRI Seminar MINES ParisTech - CRI June 06, 2016, Fontainebleau (France)
More informationLattice Boltzmann simulations on heterogeneous CPU-GPU clusters
Lattice Boltzmann simulations on heterogeneous CPU-GPU clusters H. Köstler 2nd International Symposium Computer Simulations on GPU Freudenstadt, 29.05.2013 1 Contents Motivation walberla software concepts
More informationAPPLICATION OF CUDA TECHNOLOGY FOR CALCULATION OF GROUND STATES OF FEW-BODY NUCLEI BY FEYNMAN'S CONTINUAL INTEGRALS METHOD
APPLICATION OF CUDA TECHNOLOGY FOR CALCULATION OF GROUND STATES OF FEW-BODY NUCLEI BY FEYNMAN'S CONTINUAL INTEGRALS METHOD M.A. Naumenko, V.V. Samarin Joint Institute for Nuclear Research, Dubna, Russia
More informationA new multiplication algorithm for extended precision using floating-point expansions. Valentina Popescu, Jean-Michel Muller,Ping Tak Peter Tang
A new multiplication algorithm for extended precision using floating-point expansions Valentina Popescu, Jean-Michel Muller,Ping Tak Peter Tang ARITH 23 July 2016 AMPAR CudA Multiple Precision ARithmetic
More informationBlock AIR Methods. For Multicore and GPU. Per Christian Hansen Hans Henrik B. Sørensen. Technical University of Denmark
Block AIR Methods For Multicore and GPU Per Christian Hansen Hans Henrik B. Sørensen Technical University of Denmark Model Problem and Notation Parallel-beam 3D tomography exact solution exact data noise
More informationMassively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem
Massively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem Katharina Kormann 1 Klaus Reuter 2 Markus Rampp 2 Eric Sonnendrücker 1 1 Max Planck Institut für Plasmaphysik 2 Max Planck Computing
More information11 Parallel programming models
237 // Program Design 10.3 Assessing parallel programs 11 Parallel programming models Many different models for expressing parallelism in programming languages Actor model Erlang Scala Coordination languages
More informationCOMBINED EXPLICIT-IMPLICIT TAYLOR SERIES METHODS
COMBINED EXPLICIT-IMPLICIT TAYLOR SERIES METHODS S.N. Dimova 1, I.G. Hristov 1, a, R.D. Hristova 1, I V. Puzynin 2, T.P. Puzynina 2, Z.A. Sharipov 2, b, N.G. Shegunov 1, Z.K. Tukhliev 2 1 Sofia University,
More informationRecent Progress of Parallel SAMCEF with MUMPS MUMPS User Group Meeting 2013
Recent Progress of Parallel SAMCEF with User Group Meeting 213 Jean-Pierre Delsemme Product Development Manager Summary SAMCEF, a brief history Co-simulation, a good candidate for parallel processing MAAXIMUS,
More informationAccelerated Neutrino Oscillation Probability Calculations and Reweighting on GPUs. Richard Calland
Accelerated Neutrino Oscillation Probability Calculations and Reweighting on GPUs Richard Calland University of Liverpool GPU Computing in High Energy Physics University of Pisa, 11th September 2014 Introduction
More informationACCELERATED LEARNING OF GAUSSIAN PROCESS MODELS
ACCELERATED LEARNING OF GAUSSIAN PROCESS MODELS Bojan Musizza, Dejan Petelin, Juš Kocijan, Jožef Stefan Institute Jamova 39, Ljubljana, Slovenia University of Nova Gorica Vipavska 3, Nova Gorica, Slovenia
More informationExplore Computational Power of GPU in Electromagnetics and Micromagnetics
Explore Computational Power of GPU in Electromagnetics and Micromagnetics Presenter: Sidi Fu, PhD candidate, UC San Diego Advisor: Prof. Vitaliy Lomakin Center of Magnetic Recording Research, Department
More informationΣ(1385) production in proton-proton collisions at s =7 TeV
Σ(1385) production in proton-proton collisions at s =7 TeV Enrico Fragiacomo, Massimo Venaruzzo, Giacomo Contin, Ramona Lea July 16, 2012 1 Introduction Objective of this note is to support the Σ(1385)
More informationRecent CMS results on heavy quarks and hadrons. Alice Bean Univ. of Kansas for the CMS Collaboration
Recent CMS results on heavy quarks and hadrons Alice Bean Univ. of Kansas for the CMS Collaboration July 25, 2013 Outline CMS at the Large Hadron Collider Cross section measurements Search for state decaying
More informationLearning Particle Physics by Example:
Learning Particle Physics by Example: Accelerating Science with Generative Adversarial Networks arxiv:1701.05927, arxiv:1705.02355 @lukede0 @lukedeo lukedeo@manifold.ai https://ldo.io Luke de Oliveira
More informationGPU Accelerated Reweighting Calculations for Neutrino Oscillation Analyses with the T2K Experiment
GPU Accelerated Reweighting Calculations for Neutrino Oscillation Analyses with the T2K Experiment Oxford Many Core Seminar - 19th February Richard Calland rcalland@hep.ph.liv.ac.uk Outline of Talk Neutrino
More informationParallel Sparse Tensor Decompositions using HiCOO Format
Figure sources: A brief survey of tensors by Berton Earnshaw and NVIDIA Tensor Cores Parallel Sparse Tensor Decompositions using HiCOO Format Jiajia Li, Jee Choi, Richard Vuduc May 8, 8 @ SIAM ALA 8 Outline
More informationAccelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers
UT College of Engineering Tutorial Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers Stan Tomov 1, George Bosilca 1, and Cédric
More informationFast event generation system using GPU. Junichi Kanzaki (KEK) ACAT 2013 May 16, 2013, IHEP, Beijing
Fast event generation system using GPU Junichi Kanzaki (KEK) ACAT 2013 May 16, 2013, IHEP, Beijing Motivation The mount of LHC data is increasing. -5fb -1 in 2011-22fb -1 in 2012 High statistics data ->
More informationSTCE. Adjoint Code Design Patterns. Uwe Naumann. RWTH Aachen University, Germany. QuanTech Conference, London, April 2016
Adjoint Code Design Patterns Uwe Naumann RWTH Aachen University, Germany QuanTech Conference, London, April 2016 Outline Why Adjoints? What Are Adjoints? Software Tool Support: dco/c++ Adjoint Code Design
More informationTesting Theories in Particle Physics Using Maximum Likelihood and Adaptive Bin Allocation
Testing Theories in Particle Physics Using Maximum Likelihood and Adaptive Bin Allocation Bruce Knuteson 1 and Ricardo Vilalta 2 1 Laboratory for Nuclear Science, Massachusetts Institute of Technology
More informationRelative branching ratio measurements of charmless B ± decays to three hadrons
LHCb-CONF-011-059 November 10, 011 Relative branching ratio measurements of charmless B ± decays to three hadrons The LHCb Collaboration 1 LHCb-CONF-011-059 10/11/011 Abstract With an integrated luminosity
More informationSP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay
SP-CNN: A Scalable and Programmable CNN-based Accelerator Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay Motivation Power is a first-order design constraint, especially for embedded devices. Certain
More informationHigh-performance processing and development with Madagascar. July 24, 2010 Madagascar development team
High-performance processing and development with Madagascar July 24, 2010 Madagascar development team Outline 1 HPC terminology and frameworks 2 Utilizing data parallelism 3 HPC development with Madagascar
More informationModern Methods of Data Analysis - WS 07/08
Modern Methods of Data Analysis Lecture VII (26.11.07) Contents: Maximum Likelihood (II) Exercise: Quality of Estimators Assume hight of students is Gaussian distributed. You measure the size of N students.
More informationHIJING++, a Monte Carlo Jet Event Generator for the Future Collider Experiments
HIJING++, a Monte Carlo Jet Event Generator for the Future Collider Experiments Speaker: Gergely Gábor Barnaföldi, Wigner RCP of the H.A.S. Group: GGB, G. Bíró, Sz.M. Harangozó, W.T. Deng, M. Gyulassy,
More informationHASPECT in action: CLAS12 analysis of
HASPECT in action: CLAS12 analysis of Development of an analysis framework for the MesonEx experiment A. Celentano (INFN Genova and Genova University) Outline The MesonEx experiment at JLab@12 GeV in Hall
More informationTips Geared Towards R. Adam J. Suarez. Arpil 10, 2015
Tips Geared Towards R Departments of Statistics North Carolina State University Arpil 10, 2015 1 / 30 Advantages of R As an interpretive and interactive language, developing an algorithm in R can be done
More informationHybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS
Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS Jorge González-Domínguez*, Bertil Schmidt*, Jan C. Kässens**, Lars Wienbrandt** *Parallel and Distributed Architectures
More informationGeneralized Partial Wave Analysis Software for PANDA
Generalized Partial Wave Analysis Software for PANDA 39. International Workshop on the Gross Properties of Nuclei and Nuclear Excitations The Structure and Dynamics of Hadrons Hirschegg, January 2011 Klaus
More informationStudy of the Electromagnetic Dalitz decay of J/ψ e + e - π 0
Study of the Electromagnetic Dalitz decay of J/ψ e + e - π 0 Vindhyawasini Prasad Email: vindy@ustc.edu.cn Department of Modern Physics University of Science & Technology of China Hefei City, Anhui Province,
More informationRecent results from the LHCb experiment
Recent results from the LHCb experiment University of Cincinnati On behalf of the LHCb collaboration Brief intro to LHCb The Large Hadron Collider (LHC) proton-proton collisions NCTS Wksp. DM 2017, Shoufeng,
More informationarxiv: v1 [hep-lat] 10 Jul 2012
Hybrid Monte Carlo with Wilson Dirac operator on the Fermi GPU Abhijit Chakrabarty Electra Design Automation, SDF Building, SaltLake Sec-V, Kolkata - 700091. Pushan Majumdar Dept. of Theoretical Physics,
More informationMulticore Parallelization of Determinant Quantum Monte Carlo Simulations
Multicore Parallelization of Determinant Quantum Monte Carlo Simulations Andrés Tomás, Che-Rung Lee, Zhaojun Bai, Richard Scalettar UC Davis SIAM Conference on Computation Science & Engineering Reno, March
More informationComputing least squares condition numbers on hybrid multicore/gpu systems
Computing least squares condition numbers on hybrid multicore/gpu systems M. Baboulin and J. Dongarra and R. Lacroix Abstract This paper presents an efficient computation for least squares conditioning
More informationPerformance Analysis of Lattice QCD Application with APGAS Programming Model