Lenstool-HPC. From scratch to supercomputers: building a large-scale strong lensing computational software bottom-up. HPC Advisory Council, April 2018


1 Lenstool-HPC. From scratch to supercomputers: building a large-scale strong lensing computational software bottom-up. HPC Advisory Council, April 2018. Christoph Schäfer and Markus Rexroth (LASTRO), Gilles Fourestey (SCITAS).

2 Gravitational lensing. Einstein ring (credit: NASA/Hubble).

4 Gravitational lensing. Deflection of light by a distribution of matter, as described by Albert Einstein's general theory of relativity (1916). Einstein published an article on lensing by a star in 1936 (A. Einstein, Science). Fritz Zwicky posited in 1937 that the effect could allow galaxy clusters to act as lenses. First observed in 1979: the "Twin QSO". Image: Twin QSO SBS (center), credit: ESA/Hubble & NASA.

5 Gravitational lensing. Optical artefacts created by dense mass distributions: galaxies, dark matter, black holes. Parametric lens model: the ellipticity of the projected mass distribution; ω, the finite core radius; the normalized surface mass density; (x, y), the lens position; ... Reverse-engineer the lenses: reconstruct faraway objects by computing the mass of the lenses. Typical search space dimension: >1010. With the vanilla version of Lenstool, finding the optimal solution takes months!

6 Lenstool-HPC motivations. Lenstool-HPC was developed: based on Lenstool (Prof. Kneib et al., from 1996 onwards); in 6 man-months FTE; by two field scientists and one application expert; bottom-up, from scratch. No separation of concerns: field scientists define algorithmic constraints at every step; computer scientists provide the most optimized implementation on specific hardware; extreme(ish) programming. Performance is scaled bottom-up: focus on algorithms/kernels and data structures; performance scaling from a single core to the full machine.

7 Formalism. Source's position on the source plane; 2D lensing potential; example of a gradient (SIS, singular isothermal sphere); the lens equation.
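For reference, the quantities named on this slide are commonly written as follows in textbook notation. This is a sketch under the assumption of standard scaled angular coordinates; the symbols β (source position), θ (image position), ψ (2D lensing potential) and θ_E (Einstein radius) are not taken from the slide and may differ from the notation used there.

```latex
% Lens equation: source position = image position minus the gradient of the potential
\beta = \theta - \nabla\psi(\theta)

% Singular isothermal sphere (SIS): potential and its gradient
\psi_{\mathrm{SIS}}(\theta) = \theta_E\,|\theta|,
\qquad
\nabla\psi_{\mathrm{SIS}}(\theta) = \theta_E\,\frac{\theta}{|\theta|}
```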

8 Strong Lensing Algorithm, Step 0. Given a parametric model for all the lens types (finite core radius, normalized surface mass density, ellipticity of the projected mass distribution, position). Step 0: compute all the gradients (~90% of the time to solution, TTS), using data-level parallelism (DLP) for each pixel of the image. Mapping algorithms to the hardware: high-performance data structures (structure of arrays, SOA); implicit and explicit (hand-coded) vectorization.
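The difference between an array-of-structures (AOS) and a structure-of-arrays (SOA) layout can be sketched as follows. This is only an illustration of the data layout, not the Lenstool-HPC implementation: the field names, the helper functions and the choice of SIS lenses with an Einstein radius `theta_e` are assumptions made for the example, and plain Python does not vectorize the loop the way a compiled SOA kernel would.

```python
import math

# AOS: one record per lens (hypothetical field names).
lenses_aos = [
    {"x": 0.0, "y": 0.0, "theta_e": 1.0},
    {"x": 2.0, "y": 0.0, "theta_e": 0.5},
]


def aos_to_soa(lenses):
    """Regroup the records into one contiguous list per field (SOA)."""
    return {
        "x": [lens["x"] for lens in lenses],
        "y": [lens["y"] for lens in lenses],
        "theta_e": [lens["theta_e"] for lens in lenses],
    }


def sis_gradient_soa(soa, px, py):
    """Sum the SIS potential gradients of all lenses at pixel (px, py)."""
    gx = gy = 0.0
    for lx, ly, te in zip(soa["x"], soa["y"], soa["theta_e"]):
        dx, dy = px - lx, py - ly
        r = math.hypot(dx, dy)
        if r == 0.0:
            continue  # the SIS gradient is undefined at the lens centre
        gx += te * dx / r
        gy += te * dy / r
    return gx, gy


soa = aos_to_soa(lenses_aos)
grad = sis_gradient_soa(soa, 1.0, 0.0)
```

In a compiled language, the SOA form lets each field be loaded into vector registers with unit stride, which is why the slide calls it a prerequisite for efficient DLP.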

9 Strong Lensing Algorithm, Step 1. Given a parametric model for all the lens types. Step 0: compute all the gradients. Step 1a, unlensing (linear transformation, thread-level parallelism, TLP): lens the green dots (images) to the source plane (yellow dots), then compute the barycenter of the yellow dots. Step 1b, relensing (nonlinear transformation, TLP): decompose the image plane into triangles and lens the triangles to the source plane; if a lensed triangle includes the barycenter, a predicted image is found (red triangles in the image plane).
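Steps 1a and 1b can be sketched for a single lens as follows. This is a minimal illustration under stated assumptions, not the Lenstool-HPC code: the lens is a centred SIS with Einstein radius `theta_e`, and the function names `unlens`, `barycenter` and `triangle_contains` are hypothetical.

```python
import math


def unlens(theta, theta_e=1.0):
    """Map an image-plane point to the source plane for a centred SIS lens.

    theta must not be the lens centre, where the SIS gradient is undefined.
    """
    x, y = theta
    r = math.hypot(x, y)
    return (x - theta_e * x / r, y - theta_e * y / r)


def barycenter(points):
    """Arithmetic mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def triangle_contains(tri, p):
    """True if p lies inside or on the edge of triangle tri."""
    a, b, c = tri
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


# Step 1a: unlens the observed images and take the barycenter.
images = [(1.5, 0.0), (-0.5, 0.0)]
source = barycenter([unlens(img) for img in images])

# Step 1b: lens image-plane triangles to the source plane and test them.
near = [(1.4, -0.1), (1.6, -0.1), (1.5, 0.1)]
far = [(3.0, 3.0), (3.2, 3.0), (3.1, 3.2)]
near_hit = triangle_contains([unlens(v) for v in near], source)
far_hit = triangle_contains([unlens(v) for v in far], source)
```

The triangle around the first image maps onto the source position and is flagged as a predicted image, while the distant triangle is not.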

10 Strong Lensing Algorithm, Steps 2 & 3. Given a parametric model for all the lens types. Step 0: compute all the gradients. Step 1a, unlensing (linear transformation): lens the green dots (images) to the source plane (yellow dots), then compute the barycenter of the yellow dots. Step 1b, relensing (nonlinear transformation): decompose the image plane into triangles and lens the triangles to the source plane; if a lensed triangle includes the barycenter, a predicted image is found (red triangles in the image plane). Step 2 (MPI): compute the Chi2. Step 3 (MPI): pass the Chi2 to a Bayesian MCMC code. Restart with a new set of parameters until the model is close to reality.

11 Performance scaling Strong Lensing Algorithm

12 Gradient Benchmark Results (Step 0). Gradient benchmark computation: 5000x5000 pixels image, 69 sources, 203 constraints.

Code                      AVX2* TTS   Factor   AVX512F* TTS   Factor
Lenstool                  ? s         1X       4.8s           1X
Lenstool-HPC AOS          0.8s        1.3X     5.6s           0.9X
Lenstool-HPC SOA          0.5s        2.0X     3.3s           1.4X
Lenstool-HPC SOA + DLP    0.2s        4.5X     0.4s           11.4X

Performance on Broadwell: IACA: ~6 Flops/cycle; Intel Advisor: ~25% of peak.
*AVX2: Broadwell Intel Xeon CPU E GHz, Intel compilers 17
*AVX512F: Intel Xeon Phi CPU 1.30GHz, Intel compilers 17

13 Distributed Grid Gradient. Distribution of the Grid Gradient computation (step 1): images are split into regular subdomains with MPI; subdomains are handled using OpenMP/CUDA.
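A regular decomposition of this kind can be sketched as follows. The slide does not say how the subdomains are shaped, so the split into horizontal row strips, one per MPI rank, is an assumption, and `split_rows` is a hypothetical helper rather than a Lenstool-HPC function.

```python
def split_rows(n_rows, n_ranks):
    """Return one (start, stop) row range per rank, as evenly as possible."""
    base, extra = divmod(n_rows, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional row each.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


even = split_rows(6000, 4)
uneven = split_rows(10, 3)
```

Each rank would then process only its own row range, with OpenMP threads or a CUDA kernel covering the pixels inside it.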

14 Grid Gradient Benchmark (Step 1). Single-node Grid Gradient benchmark (TTS, in s): 6000x6000 pixels image, 69 sources, 203 constraints.
[Table: TTS per platform; the numeric values are not recoverable. Columns as transcribed: AVX2, AVX v4, 2695v4 (PizDaint), SKL Plat HT, KNL (greina), P100 (greina), V100. Rows: lenstool (TLP); lenstoolhpc (SOA + TLP); lenstoolhpc (SOA + TLP + DLP/SIMT). Each of the first two rows contains two NA entries.]
SIMT TLP gives the best bang for the buck. SOA alone gives a nice boost (and is mandatory for efficient DLP). DLP improves with wider vectors (AVX-512 is ~2x AVX2). V100 is much faster than P100.

15 Chi2 computation. The Chi2 is computed from the distances between the original images and their computed unlensed/relensed projections from steps 1a and 1b. The blue dots correspond to the same image in the source plane. The distances for the same source (in blue) are reduced to rank 0 using MPI_Pack. The Chi2 is computed on rank 0.
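The distance-based Chi2 can be sketched as a sum of squared position residuals. This is an illustrative form only: the slide does not give the exact expression, so the per-image positional uncertainty `sigma` and the function name `chi2` are assumptions, and the MPI gathering onto rank 0 is left out.

```python
def chi2(observed, predicted, sigma=1.0):
    """Sum of squared distances between matched image positions, over sigma^2."""
    total = 0.0
    for (ox, oy), (px, py) in zip(observed, predicted):
        total += ((ox - px) ** 2 + (oy - py) ** 2) / sigma ** 2
    return total


observed = [(1.0, 0.0), (0.0, 2.0)]
predicted = [(1.0, 1.0), (0.0, 0.0)]
value = chi2(observed, predicted, sigma=2.0)
```

In a distributed run, each rank would compute the residuals for its own images and only the partial results would be sent to rank 0.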

16 DaintGPU: Chi2 (Step 2) Strong Scaling. [Table: columns Num. nodes, Grid Gradient Comp., Quadrant unlensing, MPI reduction, TTS; the numeric values are not recoverable.] Scalability of the Chi2 benchmark using an 8k x 8k image, 69 sources, 203 constraints on Piz Daint multicore, 1 MPI process and 18 threads per socket, in seconds.

17 DaintMC: Chi2 (Step 2) Strong Scaling. [Table: columns Num. nodes, Grid Gradient Comp., Quadrant unlensing, MPI reduction, TTS; the numeric values are not recoverable.] Scalability of the Chi2 benchmark using an 8k x 8k image, 69 sources, 203 constraints on Piz Daint multicore, 1 MPI process and 18 threads per socket, in seconds. This represents a 50X speedup over Lenstool, achieved in 6 man-months FTE.

18 Current Status and Next Steps. Development: code on c4science, with unit tests for each kernel (lensing, unlensing, Chi2). Large development project on CSCS Piz Daint: Aries network tuning; GPU tuning (lensing, unlensing and Chi2 computation are (very) regular). Development of a parallel MCMC framework, which could lead to a 500X speedup, e.g. Pi4u (P. E. Hadjidoukas et al., ETHZ). Papers: "High Performance Computing for gravitational lens modeling: single vs double precision on GPUs and CPUs", Markus Rexroth, Christoph Schäfer, Gilles Fourestey, Jean-Paul Kneib (to be submitted); "High Performance Strong Lensing Map Generation for Lenstool", Christoph Schäfer, Gilles Fourestey, Jean-Paul Kneib (in preparation).

19 Lensing Map Generation. Maps based on second derivatives of the lensing potentials (mass, amplification, shear). Used to calculate the statistical errors of the MCMC method: sample the parameter space, then compute the average and standard deviation for every pixel; added to the best prediction, this gives asymmetric error bars. Fast map generation is crucial: the current process takes months.
Single-node Grid Gradient 2 benchmark: 4200x4200 pixels image, 201 individual lenses.

             lenstool   lenstoolhpc
TTS (s)      ?          1.3
Speedup                 x567

Lenstool: Intel(R) Xeon(R) CPU E GHz. Lenstool-HPC: P100.
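The per-pixel statistics over sampled maps can be sketched as follows. This is a minimal illustration, assuming each MCMC sample yields one map stored as a list of rows and that the population standard deviation is wanted; `pixel_stats` is a hypothetical name, and the construction of asymmetric error bars is not shown.

```python
import math


def pixel_stats(maps):
    """Return per-pixel mean and population standard deviation over all maps."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    mean = [
        [sum(m[i][j] for m in maps) / n for j in range(cols)]
        for i in range(rows)
    ]
    std = [
        [
            math.sqrt(sum((m[i][j] - mean[i][j]) ** 2 for m in maps) / n)
            for j in range(cols)
        ]
        for i in range(rows)
    ]
    return mean, std


# Two 1x2 maps standing in for maps generated from two parameter samples.
sample_maps = [[[1.0, 2.0]], [[3.0, 2.0]]]
mean_map, std_map = pixel_stats(sample_maps)
```

Because every sampled parameter set needs its own full map, the cost of this step scales with the number of samples, which is why fast map generation matters.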

20 Brownie points. Thanks to Prof. Jean-Paul Kneib (LASTRO, EPFL), Prof. Jan Hesthaven and Dr. Vittoria Rezzonico (SCITAS, EPFL). Thanks to Colin McMurtrie and Hussein El-Harake from CSCS for their support in using the CSCS test cluster.

21 Questions?


More information

Software optimization for petaflops/s scale Quantum Monte Carlo simulations

Software optimization for petaflops/s scale Quantum Monte Carlo simulations Software optimization for petaflops/s scale Quantum Monte Carlo simulations A. Scemama 1, M. Caffarel 1, E. Oseret 2, W. Jalby 2 1 Laboratoire de Chimie et Physique Quantiques / IRSAMC, Toulouse, France

More information

Performance of Met Office Weather and Climate Codes on Cavium ThunderX2 Processors. Adam Voysey, Maff Glover HPC Optimisation Team

Performance of Met Office Weather and Climate Codes on Cavium ThunderX2 Processors. Adam Voysey, Maff Glover HPC Optimisation Team Performance of Met Office Weather and Climate Codes on Cavium ThunderX2 Processors Adam Voysey, Maff Glover HPC Optimisation Team Contents Introduction The Met Office and why we use HPC UM and NEMO Results

More information

Parallel Polynomial Evaluation

Parallel Polynomial Evaluation Parallel Polynomial Evaluation Jan Verschelde joint work with Genady Yoffe University of Illinois at Chicago Department of Mathematics, Statistics, and Computer Science http://www.math.uic.edu/ jan jan@math.uic.edu

More information

Towards a highly-parallel PDE-Solver using Adaptive Sparse Grids on Compute Clusters

Towards a highly-parallel PDE-Solver using Adaptive Sparse Grids on Compute Clusters Towards a highly-parallel PDE-Solver using Adaptive Sparse Grids on Compute Clusters HIM - Workshop on Sparse Grids and Applications Alexander Heinecke Chair of Scientific Computing May 18 th 2011 HIM

More information

Recent Progress of Parallel SAMCEF with MUMPS MUMPS User Group Meeting 2013

Recent Progress of Parallel SAMCEF with MUMPS MUMPS User Group Meeting 2013 Recent Progress of Parallel SAMCEF with User Group Meeting 213 Jean-Pierre Delsemme Product Development Manager Summary SAMCEF, a brief history Co-simulation, a good candidate for parallel processing MAAXIMUS,

More information

The Age of the Universe If the entire age of the Universe were 1 calendar year, then 1 month would be equivalent to roughly 1 billion years

The Age of the Universe If the entire age of the Universe were 1 calendar year, then 1 month would be equivalent to roughly 1 billion years Astro.101 Sept. 9, 2008 Announcements A few slots are still open in the class; see prof. to sign up Web-page computer has been down; o.k. now Turn in your student contract Don t forget to do the OWL tutorial

More information

HPMPC - A new software package with efficient solvers for Model Predictive Control

HPMPC - A new software package with efficient solvers for Model Predictive Control - A new software package with efficient solvers for Model Predictive Control Technical University of Denmark CITIES Second General Consortium Meeting, DTU, Lyngby Campus, 26-27 May 2015 Introduction Model

More information

BROCK UNIVERSITY. Test 2, March 2018 Number of pages: 9 Course: ASTR 1P02, Section 1 Number of Students: 465 Date of Examination: March 12, 2018

BROCK UNIVERSITY. Test 2, March 2018 Number of pages: 9 Course: ASTR 1P02, Section 1 Number of Students: 465 Date of Examination: March 12, 2018 BROCK UNIVERSITY Page 1 of 9 Test 2, March 2018 Number of pages: 9 Course: ASTR 1P02, Section 1 Number of Students: 465 Date of Examination: March 12, 2018 Number of hours: 50 min Time of Examination:

More information

Particle Dynamics with MBD and FEA Using CUDA

Particle Dynamics with MBD and FEA Using CUDA Particle Dynamics with MBD and FEA Using CUDA Graham Sanborn, PhD Senior Research Engineer Solver 2 (MFBD) Team FunctionBay, Inc., S. Korea Overview MFBD: Multi-Flexible-Body Dynamics Rigid & flexible

More information

Introduction to (Strong) Gravitational Lensing: Basics and History. Joachim Wambsganss Zentrum für Astronomie der Universität Heidelberg (ZAH/ARI)

Introduction to (Strong) Gravitational Lensing: Basics and History. Joachim Wambsganss Zentrum für Astronomie der Universität Heidelberg (ZAH/ARI) Introduction to (Strong) Gravitational Lensing: Basics and History Joachim Wambsganss Zentrum für Astronomie der Universität Heidelberg (ZAH/ARI) Introduction to (Strong) Gravitational Lensing: Basics

More information

STUDY OF THE LARGE-SCALE STRUCTURE OF THE UNIVERSE USING GALAXY CLUSTERS

STUDY OF THE LARGE-SCALE STRUCTURE OF THE UNIVERSE USING GALAXY CLUSTERS STUDY OF THE LARGE-SCALE STRUCTURE OF THE UNIVERSE USING GALAXY CLUSTERS BÙI VĂN TUẤN Advisors: Cyrille Rosset, Michel Crézé, James G. Bartlett ASTROPARTICLE AND COSMOLOGY LABORATORY PARIS DIDEROT UNIVERSITY

More information

Source plane reconstruction of the giant gravitational arc in Abell 2667: a condidate Wolf-Rayet galaxy at z 1

Source plane reconstruction of the giant gravitational arc in Abell 2667: a condidate Wolf-Rayet galaxy at z 1 Source plane reconstruction of the giant gravitational arc in Abell 2667: a condidate Wolf-Rayet galaxy at z 1 Speaker: Shuo Cao Department of Astronomy Beijing Normal University Collaborators: Giovanni

More information

Bachelor and MSc thesis with CTAC (Center for Theoretical Astrophysics and Cosmology), Institute for Computational Science (UZH)

Bachelor and MSc thesis with CTAC (Center for Theoretical Astrophysics and Cosmology), Institute for Computational Science (UZH) Bachelor and MSc thesis with CTAC (Center for Theoretical Astrophysics and Cosmology), Institute for Computational Science (UZH) http://www.ics.uzh.ch General topics: (i) Theoretical and computational

More information

Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures

Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures José I. Aliaga Performance and Energy Analysis of the Iterative Solution of Sparse

More information

BROAD SPECTRAL LINE AND CONTINUUM VARIABILITIES IN QSO SPECTRA INDUCED BY MICROLENSING:METHODS OF COMPUTATION

BROAD SPECTRAL LINE AND CONTINUUM VARIABILITIES IN QSO SPECTRA INDUCED BY MICROLENSING:METHODS OF COMPUTATION Proceedings of the IX Bulgarian-Serbian Astronomical Conference: Astroinformatics (IX BSACA) Sofia, Bulgaria, July -,, Editors: M. K. Tsvetkov, M. S. Dimitrijević, O. Kounchev, D. Jevremović andk. Tsvetkova

More information

1 Overview. 2 Adapting to computing system evolution. 11 th European LS-DYNA Conference 2017, Salzburg, Austria

1 Overview. 2 Adapting to computing system evolution. 11 th European LS-DYNA Conference 2017, Salzburg, Austria 1 Overview Improving LSTC s Multifrontal Linear Solver Roger Grimes 3, Robert Lucas 3, Nick Meng 2, Francois-Henry Rouet 3, Clement Weisbecker 3, and Ting-Ting Zhu 1 1 Cray Incorporated 2 Intel Corporation

More information

Fine-grained Parallel Incomplete LU Factorization

Fine-grained Parallel Incomplete LU Factorization Fine-grained Parallel Incomplete LU Factorization Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology Sparse Days Meeting at CERFACS June 5-6, 2014 Contribution

More information

Gravitational Lensing

Gravitational Lensing Gravitational Lensing Gravitational lensing, which is the deflection of light by gravitational fields and the resulting effect on images, is widely useful in cosmology and, at the same time, a source of

More information

Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes and

Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes and Accelerating the Multifrontal Method Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes {rflucas,genew,ddavis}@isi.edu and grimes@lstc.com 3D Finite Element

More information

GRAVITATIONAL WAVES. Eanna E. Flanagan Cornell University. Presentation to CAA, 30 April 2003 [Some slides provided by Kip Thorne]

GRAVITATIONAL WAVES. Eanna E. Flanagan Cornell University. Presentation to CAA, 30 April 2003 [Some slides provided by Kip Thorne] GRAVITATIONAL WAVES Eanna E. Flanagan Cornell University Presentation to CAA, 30 April 2003 [Some slides provided by Kip Thorne] Summary of talk Review of observational upper limits and current and planned

More information

Weak Lensing. Alan Heavens University of Edinburgh UK

Weak Lensing. Alan Heavens University of Edinburgh UK Weak Lensing Alan Heavens University of Edinburgh UK Outline History Theory Observational status Systematics Prospects Weak Gravitational Lensing Coherent distortion of background images Shear, Magnification,

More information

John C. Linford. ParaTools, Inc. EMiT 15, Manchester UK 1 July 2015

John C. Linford. ParaTools, Inc. EMiT 15, Manchester UK 1 July 2015 John C. Linford jlinford@paratools.com ParaTools, Inc. EMiT 15, Manchester UK 1 July 2015 CLIMATE & ATMOSPHERE Air and water quality Climate change Wildfire tracking Volcanic eruptions EMiT'15, Copyright

More information

Reflecting on the Goal and Baseline of Exascale Computing

Reflecting on the Goal and Baseline of Exascale Computing Reflecting on the Goal and Baseline of Exascale Computing Thomas C. Schulthess!1 Tracking supercomputer performance over time? Linpack benchmark solves: Ax = b!2 Tracking supercomputer performance over

More information

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a

More information

Sunrise: Patrik Jonsson. Panchromatic SED Models of Simulated Galaxies. Lecture 2: Working with Sunrise. Harvard-Smithsonian Center for Astrophysics

Sunrise: Patrik Jonsson. Panchromatic SED Models of Simulated Galaxies. Lecture 2: Working with Sunrise. Harvard-Smithsonian Center for Astrophysics Sunrise: Panchromatic SED Models of Simulated Galaxies Lecture 2: Working with Sunrise Patrik Jonsson Harvard-Smithsonian Center for Astrophysics Lecture outline Lecture 1: Why Sunrise? What does it do?

More information

Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun Fu 1, Xuebin Chi 1, Weiguo Gao 2, Lin-Wang Wang 3

Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun Fu 1, Xuebin Chi 1, Weiguo Gao 2, Lin-Wang Wang 3 A plane wave pseudopotential density functional theory molecular dynamics code on multi-gpu machine - GPU Technology Conference, San Jose, May 17th, 2012 Weile Jia 1, Long Wang 1, Zongyan Cao 1, Jiyun

More information

Chapter 16 Dark Matter, Dark Energy, & The Fate of the Universe

Chapter 16 Dark Matter, Dark Energy, & The Fate of the Universe 16.1 Unseen Influences Chapter 16 Dark Matter, Dark Energy, & The Fate of the Universe Dark Matter: An undetected form of mass that emits little or no light but whose existence we infer from its gravitational

More information

Domain specific libraries. Material science codes on innovative HPC architectures Anton Kozhevnikov, CSCS December 5, 2016

Domain specific libraries. Material science codes on innovative HPC architectures Anton Kozhevnikov, CSCS December 5, 2016 Domain specific libraries Material science codes on innovative HPC architectures Anton Kozhevnikov, CSCS December 5, 2016 Part 1: Introduction Kohn-Shame equations 1 2 Eigen-value problem + v eff (r) j(r)

More information

Direct Self-Consistent Field Computations on GPU Clusters

Direct Self-Consistent Field Computations on GPU Clusters Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd

More information