Lenstool-HPC. From scratch to supercomputers: building a large-scale strong lensing computational software bottom-up. HPC Advisory Council, April 2018
Transcription
1 Lenstool-HPC. From scratch to supercomputers: building a large-scale strong lensing computational software bottom-up. HPC Advisory Council, April 2018. Christoph Schäfer and Markus Rexroth (LASTRO), Gilles Fourestey (SCITAS)
2 Gravitational lensing. Einstein ring (credit: NASA/Hubble)
4 Gravitational lensing. Deflection of light by a distribution of matter, as described by Albert Einstein's general theory of relativity (1916). Einstein published an article on lensing by a star in 1936 (A. Einstein, Science). Fritz Zwicky posited in 1937 that the effect could allow galaxy clusters to act as lenses. First observed in 1979: the "Twin QSO". Image: SBS Twin QSO (center), credit: ESA/Hubble & NASA
5 Gravitational lensing. Optical artefacts created by dense mass distributions: galaxies, dark matter, black holes. Parametric lens model: the ellipticity of the projected mass distribution; ω, the finite core radius; the normalized surface mass density; (x, y), the lens position; ... Reverse-engineer the lenses: recompose faraway objects by computing the mass of the lenses. Typical search space dimension: >10^10. Using the vanilla version of Lenstool requires months to find the optimal solution!
6 Lenstool-HPC: motivations. Lenstool-HPC was developed: based on Lenstool (Pr. Kneib et al., from 1996 onwards); in 6 man-months FTE; by two field scientists and one application expert; bottom-up, from scratch. No separation of concerns: field scientists define algorithmic constraints at every step; computer scientists provide the most optimized implementation on specific hardware; extreme(ish) programming. Performance is scaled bottom-up: focus on algorithms/kernels and data structures; performance scaling from core to full machine.
7 Formalism. Equations shown on this slide (not preserved in the transcript): the source's position on the source plane; the 2D lensing potential; an example of gradient (SIS); the lens equation.
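Since the slide's formulas did not survive, the block below gives the standard textbook forms of the lens equation and of the gradient of a singular isothermal sphere (SIS) potential. The notation (β, θ, φ, θ_E, θ_c) is an assumption chosen for illustration and may differ from the symbols used on the original slide.

```latex
% Lens equation: source-plane position \beta of an image seen at \theta,
% for a (scaled) 2D lensing potential \varphi.
\beta = \theta - \nabla\varphi(\theta)

% SIS potential centred at \theta_c with Einstein radius \theta_E,
% and its gradient (a vector of constant length \theta_E).
\varphi_{\mathrm{SIS}}(\theta) = \theta_E \,\lvert \theta - \theta_c \rvert ,
\qquad
\nabla\varphi_{\mathrm{SIS}}(\theta) = \theta_E \,
  \frac{\theta - \theta_c}{\lvert \theta - \theta_c \rvert}
```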
8 Strong Lensing Algorithm, Step 0. Given a parametric model for all the lens types (position, ellipticity of the projected mass distribution, finite core radius, normalized surface mass density). Step 0: compute all the gradients (~90% of the time to solution, TTS), using data-level parallelism (DLP) for each pixel of the image. Mapping algorithms to the hardware: high-performance data structures (structure of arrays, SOA); implicit and explicit (hand-coded) vectorization.
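As a rough illustration of Step 0, the sketch below sums the gradient of several SIS lenses at every pixel, with the lens parameters stored as a structure of arrays (one array per parameter). It is not the Lenstool-HPC kernel: the function name `sis_gradient_soa`, the restriction to SIS lenses, and the use of NumPy array operations as a stand-in for SIMD vectorization are all assumptions made for this example.

```python
import numpy as np


def sis_gradient_soa(px, py, lens_x, lens_y, theta_e):
    """Sum the SIS potential gradient of every lens at every pixel.

    px, py: 1-D arrays of pixel coordinates.
    lens_x, lens_y, theta_e: 1-D arrays with one entry per lens (SOA layout).
    Returns the two gradient components, one value per pixel.
    """
    gx = np.zeros_like(px, dtype=float)
    gy = np.zeros_like(py, dtype=float)
    # Outer loop over lenses; each statement inside operates on all pixels
    # at once, which is the data-level parallelism of this sketch.
    for lx, ly, te in zip(lens_x, lens_y, theta_e):
        dx = px - lx
        dy = py - ly
        r = np.hypot(dx, dy)
        # The SIS gradient is undefined at the lens centre; treat it as zero.
        r = np.where(r == 0.0, np.inf, r)
        gx += te * dx / r
        gy += te * dy / r
    return gx, gy
```

Keeping one contiguous array per parameter is what lets each arithmetic step run over a whole batch of values, which is the motivation the slide gives for preferring SOA over AOS.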
9 Strong Lensing Algorithm, Step 1. Given a parametric model for all the lens types. Step 0: compute all the gradients. Step 1a: unlensing (linear transformation), with thread-level parallelism (TLP): lens the green dots (images) to the source plane (yellow dots); compute the barycenter of the yellow dots. Step 1b: relensing (nonlinear transformation), with TLP: decompose the image plane into triangles; lens the triangles to the source plane; if a lensed triangle includes the barycenter, a predicted image is found (red triangles in the image plane).
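Steps 1a and 1b can be sketched in a few lines, assuming the gradients from Step 0 are already available at the points of interest. The helper names (`unlens`, `barycenter`, `triangle_contains`) are hypothetical and the code only illustrates the geometry; it says nothing about how Lenstool-HPC organizes or parallelizes these operations.

```python
import numpy as np


def unlens(theta, grad):
    """Map image-plane points to the source plane: beta = theta - grad."""
    return np.asarray(theta, dtype=float) - np.asarray(grad, dtype=float)


def barycenter(points):
    """Mean position of a set of 2-D points (array of shape (n, 2))."""
    return np.asarray(points, dtype=float).mean(axis=0)


def triangle_contains(tri, p):
    """True if point p lies inside or on the edge of triangle tri (3 x 2)."""
    a, b, c = tri

    def cross(o, u, v):
        # z-component of (u - o) x (v - o): its sign tells on which side
        # of the directed edge o -> u the point v lies.
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    # Inside when the point is on the same side of all three edges.
    return not (min(d) < 0 and max(d) > 0)
```

In Step 1b the same `unlens` mapping would be applied to the three vertices of each image-plane triangle, and `triangle_contains` would then be called with the lensed triangle and the barycenter from Step 1a.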
10 Strong Lensing Algorithm, Steps 2 & 3. Given a parametric model for all the lens types. Step 0: compute all the gradients. Step 1a: unlensing (linear transformation): lens the green dots (images) to the source plane (yellow dots); compute the barycenter of the yellow dots. Step 1b: relensing (nonlinear transformation): decompose the image plane into triangles; lens the triangles to the source plane; if a lensed triangle includes the barycenter, a predicted image is found (red triangles in the image plane). Step 2 (MPI): compute the Chi2. Step 3: pass the Chi2 to a Bayesian MCMC code (MPI). Restart with a new set of parameters until close to reality.
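The outer loop of Step 3 can be pictured with a toy random-walk Metropolis sampler. The slides only mention "a Bayesian MCMC code" without naming the algorithm, so the Metropolis rule, the Gaussian proposal, and the function name `metropolis` are assumptions used purely for illustration; `chi2` stands for whatever Steps 0 to 2 compute for a given parameter set.

```python
import math
import random


def metropolis(chi2, start, step, n_iter, seed=0):
    """Random-walk Metropolis search driven by a chi2 function.

    chi2: callable taking a list of parameters and returning a float.
    start: initial parameter list; step: Gaussian proposal width.
    Returns the best parameter set visited and its chi2.
    """
    rng = random.Random(seed)
    current = list(start)
    current_chi2 = chi2(current)
    best, best_chi2 = list(current), current_chi2
    for _ in range(n_iter):
        proposal = [p + rng.gauss(0.0, step) for p in current]
        proposal_chi2 = chi2(proposal)
        delta = proposal_chi2 - current_chi2
        # Accept with probability min(1, exp(-delta / 2)), i.e. the ratio of
        # Gaussian likelihoods exp(-chi2 / 2) of the two parameter sets.
        if delta <= 0.0 or rng.random() < math.exp(-0.5 * delta):
            current, current_chi2 = proposal, proposal_chi2
            if current_chi2 < best_chi2:
                best, best_chi2 = list(current), current_chi2
    return best, best_chi2
```

Each call to `chi2` corresponds to one full pass through Steps 0 to 2, which is why the cost of those steps dominates the total runtime.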
11 Performance scaling Strong Lensing Algorithm
12 Gradient Benchmark Results (Step 0)
Gradient benchmark computation: 5000x5000 pixels image, 69 sources, 203 constraints.
Code                    AVX2* TTS    AVX2* Factor    AVX512F* TTS    AVX512F* Factor
Lenstool                (missing)    1X              4.8s            1X
Lenstool-HPC AOS        0.8s         1.3X            5.6s            0.9X
Lenstool-HPC SOA        0.5s         2.0X            3.3s            1.4X
Lenstool-HPC SOA + DLP  0.2s         4.5X            0.4s            11.4X
Performance on Broadwell: IACA: ~6 Flops/cycle; Intel Advisor: ~25% of peak.
*AVX2: Broadwell Intel Xeon CPU E GHz, Intel compilers 17
*AVX512F: Intel Xeon Phi CPU 1.30GHz, Intel compilers 17
13 Distributed Grid Gradient. Grid Gradient computation distribution (step 1): images are split into regular subdomains with MPI; each subdomain is handled using OpenMP/CUDA.
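A regular decomposition of this kind can be sketched without any MPI call: each rank only needs to know which block of the image it owns. The function `split_rows` and the choice of a row-wise (1-D) split are assumptions for this example; the slides do not say whether the actual subdomains are strips or 2-D tiles.

```python
def split_rows(n_rows, n_ranks):
    """Return one (start, stop) row range per rank, covering 0..n_rows.

    Rows are distributed as evenly as possible: the first
    n_rows % n_ranks ranks receive one extra row.
    """
    base, extra = divmod(n_rows, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        count = base + (1 if rank < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges
```

With an MPI binding, each process would pick `split_rows(n_rows, size)[rank]` and compute the gradients for its own rows with threads or a GPU kernel.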
14 Grid Gradient Benchmark (Step 1)
Single-node Grid Gradient benchmark (TTS, in s): 6000x6000 pixels image, 69 sources, 203 constraints.
Table header as captured: AVX2, AVX v4, 2695v4 (PizDaint), SKL Plat HT, KNL (greina), P100 (greina), V100.
Table rows: lenstool (TLP); lenstoolhpc (SOA + TLP); lenstoolhpc (SOA + TLP + DLP/SIMT). The timing values were not preserved in the transcript, apart from NA entries for the first two rows.
SIMT TLP gives the best bang for your buck. SOA alone gives a nice boost (and is mandatory for efficient DLP). DLP improves with wider vector sizes (AVX512 is ~2x AVX2). V100 is much faster than P100.
15 Chi2 computation. The Chi2 is computed from the distances between the original images and their computed unlensed/relensed projections from steps 1a and 1b. The blue dots correspond to the same image in the source plane. The distances for the same source (in blue) are reduced to rank 0 using MPI_Pack. The Chi2 is computed on rank 0.
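The slide does not spell out the Chi2 formula, so the sketch below assumes the common image-plane form: the sum of squared distances between observed and predicted image positions, divided by the square of a positional uncertainty `sigma`. The function name and the single shared `sigma` are assumptions for illustration.

```python
import numpy as np


def chi2_positions(observed, predicted, sigma):
    """Positional chi2: sum of squared image distances over sigma squared.

    observed, predicted: arrays of shape (n_images, 2), matched row by row.
    sigma: positional uncertainty, in the same units as the coordinates.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Squared Euclidean distance for each observed/predicted pair.
    d2 = np.sum((observed - predicted) ** 2, axis=1)
    return float(np.sum(d2) / sigma**2)
```

In a distributed run, each rank would evaluate the distances for the images in its own subdomain, and the per-rank contributions would then be gathered on rank 0 for the final sum.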
16 Daint-GPU: Chi2 (Step 2) Strong Scaling. Table columns: Num. nodes; Grid Gradient Comp.; Quadrant unlensing; MPI reduction; TTS (values not preserved in the transcript). Scalability of the Chi2 benchmark using an 8k x 8k image, 69 sources, 203 constraints on Piz Daint multicore, 1 MPI process and 18 threads per socket, in seconds.
17 Daint-MC: Chi2 (Step 2) Strong Scaling. Table columns: Num. nodes; Grid Gradient Comp.; Quadrant unlensing; MPI reduction; TTS (values not preserved in the transcript). Scalability of the Chi2 benchmark using an 8k x 8k image, 69 sources, 203 constraints on Piz Daint multicore, 1 MPI process and 18 threads per socket, in seconds. This represents a 50X speedup compared to Lenstool, obtained in 6 months FTE.
18 Current Status and Next Steps. Development: code on c4science, with unit tests for each kernel (lensing, unlensing, Chi2); large development project on CSCS Piz Daint; Aries network tuning; GPU tuning: lensing, unlensing and Chi2 computation are (very) regular; development of a parallel MCMC framework, which could lead to a 500X speedup, e.g. Pi4u: (P. E. Hadjidoukas et al., ETHZ). Papers: "High Performance Computing for gravitational lens modeling: single vs double precision on GPUs and CPUs", Markus Rexroth, Christoph Schäfer, Gilles Fourestey, Jean-Paul Kneib, to be submitted; "High Performance Strong Lensing Map Generation for Lenstool", Christoph Schäfer, Gilles Fourestey, Jean-Paul Kneib, in preparation.
19 Lensing Map Generation. Maps based on the second derivatives of the lensing potentials (mass, amplification, shear). Used to calculate the statistical errors of the MCMC method: sample the parameter space; compute the average and standard deviation for every pixel; added to the best prediction, this gives asymmetric error bars. Fast map generation is crucial: the current process takes months.
Grid Gradient 2 benchmark (TTS, in s), single node: 4200x4200 pixels image, 201 individual lenses. lenstool: (value not preserved); lenstoolhpc: 1.3; speedup: x567. Lenstool: Intel(R) Xeon(R) CPU E GHz; Lenstool-HPC: P100.
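The per-pixel statistics step can be sketched as follows, assuming one map has already been generated for each sampled parameter set and that all maps share the same shape. The function name `map_statistics` is hypothetical, and the sketch covers only the average and standard deviation, not how they are combined with the best prediction.

```python
import numpy as np


def map_statistics(sample_maps):
    """Per-pixel mean and standard deviation over a stack of maps.

    sample_maps: array-like of shape (n_samples, height, width), one map
    per MCMC sample. Returns two (height, width) arrays.
    """
    maps = np.asarray(sample_maps, dtype=float)
    # axis=0 runs over the samples, so each pixel is reduced independently.
    return maps.mean(axis=0), maps.std(axis=0)
```

Because every sample needs its own full map, the cost of this step scales with the number of samples times the map generation time, which is why the slide stresses fast map generation.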
20 Brownie points. Thanks to Pr. Jean-Paul Kneib (LASTRO, EPFL), Pr. Jan Hesthaven and Dr. Vittoria Rezzonico (SCITAS, EPFL). Thanks to Colin McMurtrie and Hussein ElHarake from CSCS for their support using the CSCS test cluster.
21 Questions?
More informationGRAVITATIONAL WAVES. Eanna E. Flanagan Cornell University. Presentation to CAA, 30 April 2003 [Some slides provided by Kip Thorne]
GRAVITATIONAL WAVES Eanna E. Flanagan Cornell University Presentation to CAA, 30 April 2003 [Some slides provided by Kip Thorne] Summary of talk Review of observational upper limits and current and planned
More informationWeak Lensing. Alan Heavens University of Edinburgh UK
Weak Lensing Alan Heavens University of Edinburgh UK Outline History Theory Observational status Systematics Prospects Weak Gravitational Lensing Coherent distortion of background images Shear, Magnification,
More informationJohn C. Linford. ParaTools, Inc. EMiT 15, Manchester UK 1 July 2015
John C. Linford jlinford@paratools.com ParaTools, Inc. EMiT 15, Manchester UK 1 July 2015 CLIMATE & ATMOSPHERE Air and water quality Climate change Wildfire tracking Volcanic eruptions EMiT'15, Copyright
More informationReflecting on the Goal and Baseline of Exascale Computing
Reflecting on the Goal and Baseline of Exascale Computing Thomas C. Schulthess!1 Tracking supercomputer performance over time? Linpack benchmark solves: Ax = b!2 Tracking supercomputer performance over
More informationTR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems
TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a
More informationSunrise: Patrik Jonsson. Panchromatic SED Models of Simulated Galaxies. Lecture 2: Working with Sunrise. Harvard-Smithsonian Center for Astrophysics
Sunrise: Panchromatic SED Models of Simulated Galaxies Lecture 2: Working with Sunrise Patrik Jonsson Harvard-Smithsonian Center for Astrophysics Lecture outline Lecture 1: Why Sunrise? What does it do?
More informationWeile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun Fu 1, Xuebin Chi 1, Weiguo Gao 2, Lin-Wang Wang 3
A plane wave pseudopotential density functional theory molecular dynamics code on multi-gpu machine - GPU Technology Conference, San Jose, May 17th, 2012 Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun
More informationChapter 16 Dark Matter, Dark Energy, & The Fate of the Universe
16.1 Unseen Influences Chapter 16 Dark Matter, Dark Energy, & The Fate of the Universe Dark Matter: An undetected form of mass that emits little or no light but whose existence we infer from its gravitational
More informationDomain specific libraries. Material science codes on innovative HPC architectures Anton Kozhevnikov, CSCS December 5, 2016
Domain specific libraries Material science codes on innovative HPC architectures Anton Kozhevnikov, CSCS December 5, 2016 Part 1: Introduction Kohn-Shame equations 1 2 Eigen-value problem + v eff (r) j(r)
More informationDirect Self-Consistent Field Computations on GPU Clusters
Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd
More information