Cartesius Opening. Jean-Marc DENIS, International Business Director, Extreme Computing Business Unit. June 14th, 2013


1 Cartesius Opening
June 14th, 2013
Jean-Marc DENIS, International Business Director, Extreme Computing Business Unit

2 Cartesius (Renatus, 1596-1650)

René Descartes (French: [ʁəne dekaʁt]; Latinized: Renatus Cartesius; adjectival form: "Cartesian"; 31 March 1596 - 11 February 1650) was a French philosopher, mathematician, and writer who spent most of his adult life in the Dutch Republic. He has been dubbed the "Father of Modern Philosophy".

Descartes' influence in mathematics is equally apparent: the Cartesian coordinate system, which allows reference to a point in space as a set of numbers and allows algebraic equations to be expressed as geometric shapes in a two-dimensional coordinate system (and conversely, shapes to be described as equations), was named after him. He is credited as the father of analytical geometry, the bridge between algebra and geometry, crucial to the discovery of infinitesimal calculus and analysis. Descartes was also one of the key figures in the Scientific Revolution and has been described as an example of genius.

Descartes was a major figure in 17th-century continental rationalism, later advocated by Baruch Spinoza and Gottfried Leibniz, and opposed by the empiricist school of thought consisting of Hobbes, Locke, Berkeley, Jean-Jacques Rousseau, and Hume. Leibniz, Spinoza and Descartes were all well versed in mathematics as well as philosophy, and Descartes and Leibniz contributed greatly to science as well.

He is perhaps best known for the philosophical statement "Cogito ergo sum" (French: Je pense, donc je suis; English: I think, therefore I am), found in part IV of Discourse on the Method (1637) and §7 of part I of Principles of Philosophy (1644).

La Haye en Touraine: the town was the birthplace of the philosopher René Descartes (1596-1650), although his family home was in nearby Châtellerault. Descartes left La Haye in approximately 1606 to attend the Collège Henri IV at La Flèche. The town was renamed La Haye-Descartes in 1802 in his honor, and then renamed again to Descartes in 1967.

3 Cartesius (SurfSara, 2013)
Phase 1 (2013): 271 TFlops; 572 compute nodes; … GB memory; 1071 TiB storage; IB FDR

4 Phase 2 (2014): … TFlops (x5); … compute nodes (32 fat and 1620 thin) (x3); … GB memory (x2.5); … TiB storage and 202 GB/s (x7); IB FDR (no change)

5 Why ExaScale Computing?
Oil & Gas: better resource detection. [Figure: industrial challenges in oil and gas, depth imaging roadmap, courtesy IESP. Flops (50 TF to 1 PF and beyond) versus complexity of algorithm: asymptotic approximation imaging; paraxial isotropic/anisotropic imaging; isotropic/anisotropic modeling; isotropic/anisotropic RTM; elastic modeling/RTM; isotropic/anisotropic FWI; visco-elastic modeling; elastic FWI; petro-elastic inversion; visco-elastic FWI. Image quality: non-significant image, unclear image, oil reservoir discovered.]
Aircraft: complete multi-physics simulation. [Figure: courtesy AIRBUS France/IESP. Available computational capacity in Flop/s, from 1 Giga (10^9) through 1 Tera (10^12), 1 Peta (10^15) and 1 Exa (10^18) to 1 Zeta (10^21), versus capability achieved during one night batch and capacity (number of overnight load cases run): RANS low speed, RANS high speed, unsteady RANS, LES; CFD-based noise simulation; real-time CFD-based in-flight simulation. Smart use of HPC power: algorithms, data mining, knowledge.]
Human brain project.

6 (Some) Exascale challenges
[Figure: from 30 PFlops to 1,000 PFlops (x30)]

7 Addressing the Exascale Challenges
Optimize system power consumption (minimize PUE)
Develop new HPC processors
Fix the memory wall: TeraBytes bandwidth
Terabit interconnect (optical links everywhere)
Non-Volatile Memory (NV-RAM): storage and fast memory
SW complexity: manageability, programming models

8 Bull focus for ExaScale Computing
Power consumption: FLOPS x1000 (from 1 PF to exaflops in 2020) for MWatts x20 (from 1 MW to 20 MW)
Exponential increase in number of cores: 100 million cores
In 2011, 50% of CIOs claimed that none of their compute tasks used more than 120 cores
[Figure: average number of cores per supercomputer (Top 20 of Top500), #cores per year up to the 2020 exaflops target]
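The power figures on this slide imply an energy-efficiency target that can be worked out directly. The sketch below is only an illustration, assuming the slide's round numbers (about 1 PFlops at 1 MW today, 1 EFlops within 20 MW in 2020); the function name `flops_per_watt` is hypothetical.

```python
PETA = 1e15
EXA = 1e18
MEGAWATT = 1e6


def flops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency: floating-point operations per second per watt."""
    return flops / watts


# Round figures read from the slide (assumptions, not measurements).
eff_peta = flops_per_watt(1 * PETA, 1 * MEGAWATT)   # 1 PFlops at 1 MW
eff_exa = flops_per_watt(1 * EXA, 20 * MEGAWATT)    # 1 EFlops at 20 MW

# x1000 in performance for x20 in power leaves a factor to be found
# in efficiency alone.
required_gain = eff_exa / eff_peta

print(f"petascale: {eff_peta / 1e9:.0f} GFlops/W")
print(f"exascale:  {eff_exa / 1e9:.0f} GFlops/W")
print(f"required efficiency gain: x{required_gain:.0f}")
```

Under these assumptions, a thousandfold increase in performance within a twentyfold power budget means each watt has to deliver roughly fifty times more work.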

9 Bull research program for ExaScale Computing: Opportunities for Collaborations
Power consumption
  PUE optimization, down to 1 + ε: (very) hot water, adiabatic computer room
  Cogeneration: no wasted energy, any piece of heat is re-used
  Supercomputer management: power monitoring tools, use the right HW for the right app
  Application optimization: save (a lot) on energy consumption with (very) limited performance degradation
Exponential increase in number of cores
  SW stack: OS, communications (MPI but not only), batch, affinity (cpu/mem/node/...), data management (filesystems)
  Overpass current interconnect limitations: topology(ies), RDMA mechanisms, latency at large scale
  Programming model: (many) different programming models (MIMD+SIMD), languages
  Reliability: MTBF close to zero, automatic recovery mechanisms
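For readers unfamiliar with the metric, PUE (Power Usage Effectiveness) is the ratio of total facility power to the power drawn by the IT equipment alone, so "1 + ε" means almost no power is spent on cooling and distribution. The following is a minimal sketch of that ratio; the two example machine rooms and their kW figures are hypothetical and chosen only to show the contrast.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_kw(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power spent on everything except the IT equipment (cooling, losses)."""
    return total_facility_kw - it_equipment_kw


# Hypothetical rooms: (total facility kW, IT equipment kW).
rooms = {
    "conventional air cooling": (1500.0, 1000.0),
    "hot-water cooling": (1050.0, 1000.0),
}

for name, (total, it) in rooms.items():
    print(f"{name}: PUE = {pue(total, it):.2f}, "
          f"overhead = {overhead_kw(total, it):.0f} kW")
```

A PUE of 1.0 is the lower bound, so the ε in "1 + ε" is exactly the overhead per kW of useful IT load.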

10 Manageability at ExaScale
"The processor is the new transistor" (Chris Rowen)
MPI, OpenMP, Threads, Cuda, OpenCL, ...: message passing, shared memory, locality
Raise level of abstraction: set of compute resources; parallelism based compute resources; new high level programming languages
Optimize compute environment: describe key characteristics of applications; elect the most appropriate set of node types
Manage resources with heuristics predicting the future workload: migrate processes; resource fragmentation reduction; hardware failures prediction
Allow dynamic application frameworks: automatic application load balancing; meshes refinement optimization; restart lost processes in case of failure
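Why failure prediction and automatic restart matter at this scale can be seen from a simple reliability estimate. The sketch below assumes independent nodes with identical, exponentially distributed failures, in which case the system MTBF is the node MTBF divided by the node count. It also uses Young's first-order approximation for the checkpoint interval, sqrt(2 x checkpoint cost x MTBF). The node MTBF, node counts and checkpoint cost are hypothetical values, not figures from the slides.

```python
import math

HOURS_PER_YEAR = 24 * 365


def system_mtbf_hours(node_mtbf_hours: float, node_count: int) -> float:
    """System MTBF for independent nodes with identical exponential failures."""
    return node_mtbf_hours / node_count


def young_interval_hours(checkpoint_hours: float, mtbf_hours: float) -> float:
    """Young's first-order estimate of the optimal time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_hours * mtbf_hours)


node_mtbf = 5 * HOURS_PER_YEAR   # hypothetical: one failure per node every 5 years
checkpoint_cost = 0.1            # hypothetical: 6 minutes to write a checkpoint

for nodes in (5_000, 50_000, 200_000):
    mtbf = system_mtbf_hours(node_mtbf, nodes)
    interval = young_interval_hours(checkpoint_cost, mtbf)
    print(f"{nodes:>7} nodes: system MTBF = {mtbf:6.2f} h, "
          f"checkpoint every {interval:5.2f} h")
```

As the node count grows, the time between failures approaches the time needed to write a checkpoint, which is the sense in which recovery has to become automatic rather than operator-driven.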

11 Programmability at ExaScale
Parallelism / concurrency is easy to apprehend but much more complex to express in an application program
Distribute tasks and the data to operate on
Old SMP approaches (bulk parallelism à la OpenMP) are making a comeback (cf. MIC)
Old SIMD approaches (bulk parallelism à la CM2/CM5) are making a comeback (cf. CUDA)
At the highest level: message passing (MPI-3), data decomposition
With an increasing degree of parallelism, a hierarchical approach is necessary
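The hierarchical approach can be illustrated with a two-level block decomposition: data is first split across nodes (the level where message passing would be used), and each node's block is then split across its threads (the shared-memory level). This is a plain-Python sketch of the index arithmetic only, with no MPI or OpenMP involved; the function names are hypothetical.

```python
def block_range(n_items: int, n_parts: int, part: int) -> tuple[int, int]:
    """Half-open [start, stop) range of block `part` when n_items are split
    into n_parts contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def hierarchical_ranges(n_items: int, n_nodes: int, threads_per_node: int):
    """Yield (node, thread, start, stop): split across nodes first, then
    split each node's block across the threads of that node."""
    for node in range(n_nodes):
        node_start, node_stop = block_range(n_items, n_nodes, node)
        node_size = node_stop - node_start
        for thread in range(threads_per_node):
            lo, hi = block_range(node_size, threads_per_node, thread)
            yield node, thread, node_start + lo, node_start + hi


for node, thread, start, stop in hierarchical_ranges(10, 2, 3):
    print(f"node {node}, thread {thread}: items [{start}, {stop})")
```

Keeping each thread's range inside its node's block means neighbouring data stays in the same memory, so only block boundaries between nodes require communication.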

12 2020 exascale downscaled to departmental and embedded computing
SMEs computing: PFlops in a rack, TFlops in a chip

| | PetaFlop system (2012) | ExaFlop / data center (2020) | PetaFlop / departmental (2020) | TeraFlop / embedded (2020) |
|---|---|---|---|---|
| Number of nodes | [3-8],000 | [50-200],000 (10x) | [50-100] | 1 |
| Computation (Flops & Inst.) | 1 PetaFlop | 1 ExaFlop (1000x) | 1 PetaFlop | 1 TeraFlop |
| Memory capacity (B) | [1-2]00 TB | > 100 PB (1000x) | > 10^14 | > 10^11 |
| Global memory BW (B/s) | [2-5]00 TB/s | > 100 PB/s (1000x) | > 100 TB/s | > 100 GB/s |
| Interconnect bisection BW | [5-10]0 TB/s | ~50 PB/s (1000x) | ~10 TB/s | N/A |
| Storage capacity (B) | [1-10] PB | > 1 EB (1000x) | > 1 PB | > 1 TB |
| Storage BW (B/s) | [10-500] GB/s | > 10 TB/s (1000x) | > 10 GB/s | > 10 MB/s |
| IOP/s | 100,000 | > 100 M (1000x) | > 100,000 | > 100 |
| Power cons. (W) | [.5-1.] MW | < 20 MW (20x) | < 20 KW | < 20 W |
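One way to read these figures is to normalise memory capacity and memory bandwidth by compute rate, which gives bytes per flop. The sketch below uses only the lower bounds quoted for the 2012 petaflop system and the 2020 exaflop data center; picking the lower bounds is an assumption made for illustration, and the dictionary layout is hypothetical.

```python
TERA, PETA, EXA = 1e12, 1e15, 1e18

# Lower-bound figures taken from the slide (assumed for illustration).
systems = {
    "PetaFlop system (2012)": {
        "flops": 1 * PETA,
        "memory_bytes": 100 * TERA,
        "memory_bw": 200 * TERA,
    },
    "ExaFlop data center (2020)": {
        "flops": 1 * EXA,
        "memory_bytes": 100 * PETA,
        "memory_bw": 100 * PETA,
    },
}


def bytes_per_flop(system: dict) -> tuple[float, float]:
    """Return (memory capacity per flop/s, memory bandwidth per flop/s)."""
    capacity = system["memory_bytes"] / system["flops"]
    bandwidth = system["memory_bw"] / system["flops"]
    return capacity, bandwidth


for name, system in systems.items():
    capacity, bandwidth = bytes_per_flop(system)
    print(f"{name}: {capacity:.2f} B of memory per flop/s, "
          f"{bandwidth:.2f} B/s of bandwidth per flop/s")
```

With these inputs the capacity ratio is unchanged while the bandwidth ratio drops, which is one way of stating the memory wall named earlier in the deck.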

13 Cogito Ergo Sum
Computa Ergo Sum

14 Cogito Ergo Sum
Computo Ergo Sum


More information

Some Great Breakthrough Ideas in Mathematics. The First Crisis in Math Discovery of Irrational Numbers 6/2/2018.

Some Great Breakthrough Ideas in Mathematics. The First Crisis in Math Discovery of Irrational Numbers 6/2/2018. Some Great Breakthrough Ideas in Mathematics kohkheemeng@gmail. com The First Crisis in Math Discovery of Irrational Numbers The Pythagorean Theorem - One of the very first great theorems in Math The First

More information

Cyclops Tensor Framework

Cyclops Tensor Framework Cyclops Tensor Framework Edgar Solomonik Department of EECS, Computer Science Division, UC Berkeley March 17, 2014 1 / 29 Edgar Solomonik Cyclops Tensor Framework 1/ 29 Definition of a tensor A rank r

More information

Petaflops, Exaflops, and Zettaflops for Science and Defense

Petaflops, Exaflops, and Zettaflops for Science and Defense SAND2005-2690C Petaflops, Exaflops, and Zettaflops for Science and Defense Erik P. DeBenedictis Sandia National Laboratories May 16, 2005 Sandia is a multiprogram laboratory operated by Sandia Corporation,

More information

Computational Numerical Integration for Spherical Quadratures. Verified by the Boltzmann Equation

Computational Numerical Integration for Spherical Quadratures. Verified by the Boltzmann Equation Computational Numerical Integration for Spherical Quadratures Verified by the Boltzmann Equation Huston Rogers. 1 Glenn Brook, Mentor 2 Greg Peterson, Mentor 2 1 The University of Alabama 2 Joint Institute

More information

Core-Collapse Supernova Simulation A Quintessentially Exascale Problem

Core-Collapse Supernova Simulation A Quintessentially Exascale Problem Core-Collapse Supernova Simulation A Quintessentially Exascale Problem Bronson Messer Deputy Director of Science Oak Ridge Leadership Computing Facility Theoretical Astrophysics Group Oak Ridge National

More information

Universität Dortmund UCHPC. Performance. Computing for Finite Element Simulations

Universität Dortmund UCHPC. Performance. Computing for Finite Element Simulations technische universität dortmund Universität Dortmund fakultät für mathematik LS III (IAM) UCHPC UnConventional High Performance Computing for Finite Element Simulations S. Turek, Chr. Becker, S. Buijssen,

More information

Frontiers of Extreme Computing 2007 Applications and Algorithms Working Group

Frontiers of Extreme Computing 2007 Applications and Algorithms Working Group 1 Frontiers of Extreme Computing 2007 Applications and Algorithms Working Group October 25, 2007 Horst Simon (chair), David Bailey, Rupak Biswas, George Carr, Phil Jones, Bob Lucas, David Koester, Nathan

More information

New Mathematics and Computer Science, B.S. Degree.

New Mathematics and Computer Science, B.S. Degree. New Mathematics and Computer Science, B.S. Degree. List of requirements: Math 1041 (4 cr.) Calculus I Math 1042 (4 cr.) Calculus II Math 2043 (4 cr.) Calculus III Math 2101 (3 cr.) Linear Algebra Math

More information

Mathematics Foundation for College. Lesson Number 8a. Lesson Number 8a Page 1

Mathematics Foundation for College. Lesson Number 8a. Lesson Number 8a Page 1 Mathematics Foundation for College Lesson Number 8a Lesson Number 8a Page 1 Lesson Number 8 Topics to be Covered in this Lesson Coordinate graphing, linear equations, conic sections. Lesson Number 8a Page

More information

Applications of Mathematical Economics

Applications of Mathematical Economics Applications of Mathematical Economics Michael Curran Trinity College Dublin Overview Introduction. Data Preparation Filters. Dynamic Stochastic General Equilibrium Models: Sunspots and Blanchard-Kahn

More information

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit 0: Overview of Scientific Computing Lecturer: Dr. David Knezevic Scientific Computing Computation is now recognized as the third pillar of science (along with theory and experiment)

More information

Chapter 1 INTRODUCTION TO CALCULUS

Chapter 1 INTRODUCTION TO CALCULUS Chapter 1 INTRODUCTION TO CALCULUS In the English language, the rules of grammar are used to speak and write effectively. Asking for a cookie at the age of ten was much easier than when you were first

More information

Fault-Tolerant Techniques for HPC

Fault-Tolerant Techniques for HPC Fault-Tolerant Techniques for HPC Yves Robert Laboratoire LIP, ENS Lyon Institut Universitaire de France University Tennessee Knoxville Yves.Robert@inria.fr http://graal.ens-lyon.fr/~yrobert/htdc-flaine.pdf

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Map-Reduce Denis Helic KTI, TU Graz Oct 24, 2013 Denis Helic (KTI, TU Graz) KDDM1 Oct 24, 2013 1 / 82 Big picture: KDDM Probability Theory Linear Algebra

More information

N-body Simulations. On GPU Clusters

N-body Simulations. On GPU Clusters N-body Simulations On GPU Clusters Laxmikant Kale Filippo Gioachin Pritish Jetley Thomas Quinn Celso Mendes Graeme Lufkin Amit Sharma Joachim Stadel Lukasz Wesolowski James Wadsley Edgar Solomonik Fabio

More information

Progress in NWP on Intel HPC architecture at Australian Bureau of Meteorology

Progress in NWP on Intel HPC architecture at Australian Bureau of Meteorology Progress in NWP on Intel HPC architecture at Australian Bureau of Meteorology www.cawcr.gov.au Robin Bowen Senior ITO Earth System Modelling Programme 04 October 2012 ECMWF HPC Presentation outline Weather

More information

Enhancing Multicore Reliability Through Wear Compensation in Online Assignment and Scheduling. Tam Chantem Electrical & Computer Engineering

Enhancing Multicore Reliability Through Wear Compensation in Online Assignment and Scheduling. Tam Chantem Electrical & Computer Engineering Enhancing Multicore Reliability Through Wear Compensation in Online Assignment and Scheduling Tam Chantem Electrical & Computer Engineering High performance Energy efficient Multicore Systems High complexity

More information

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics Lecture 23 Dealing with Interconnect Impact of Interconnect Parasitics Reduce Reliability Affect Performance Classes of Parasitics Capacitive Resistive Inductive 1 INTERCONNECT Dealing with Capacitance

More information

Unsteady CFD for Automotive Aerodynamics

Unsteady CFD for Automotive Aerodynamics Unsteady CFD for Automotive Aerodynamics T. Indinger, B. Schnepf, P. Nathen, M. Peichl, TU München, Institute of Aerodynamics and Fluid Mechanics Prof. Dr.-Ing. N.A. Adams Outline 2 Motivation Applications

More information

More Science per Joule: Bottleneck Computing

More Science per Joule: Bottleneck Computing More Science per Joule: Bottleneck Computing Georg Hager Erlangen Regional Computing Center (RRZE) University of Erlangen-Nuremberg Germany PPAM 2013 September 9, 2013 Warsaw, Poland Motivation (1): Scalability

More information

The roots of computability theory. September 5, 2016

The roots of computability theory. September 5, 2016 The roots of computability theory September 5, 2016 Algorithms An algorithm for a task or problem is a procedure that, if followed step by step and without any ingenuity, leads to the desired result/solution.

More information

Integration. Darboux Sums. Philippe B. Laval. Today KSU. Philippe B. Laval (KSU) Darboux Sums Today 1 / 13

Integration. Darboux Sums. Philippe B. Laval. Today KSU. Philippe B. Laval (KSU) Darboux Sums Today 1 / 13 Integration Darboux Sums Philippe B. Laval KSU Today Philippe B. Laval (KSU) Darboux Sums Today 1 / 13 Introduction The modern approach to integration is due to Cauchy. He was the first to construct a

More information

Chemists in France. 1. How many French-speaking chemists have won Nobel Prizes? a. 4 b. 12 c. 15 d. 6

Chemists in France. 1. How many French-speaking chemists have won Nobel Prizes? a. 4 b. 12 c. 15 d. 6 Chemists in France As in other sciences, France and French-speaking countries have played an important role in the development of chemistry. See how much you know with the following items. 1. How many

More information

Shortest Lattice Vector Enumeration on Graphics Cards

Shortest Lattice Vector Enumeration on Graphics Cards Shortest Lattice Vector Enumeration on Graphics Cards Jens Hermans 1 Michael Schneider 2 Fréderik Vercauteren 1 Johannes Buchmann 2 Bart Preneel 1 1 K.U.Leuven 2 TU Darmstadt SHARCS - 10 September 2009

More information

Road to Calculus: The Work of Pierre de Fermat. On December 1, 1955 Rosa Parks boarded a Montgomery, Alabama city bus and

Road to Calculus: The Work of Pierre de Fermat. On December 1, 1955 Rosa Parks boarded a Montgomery, Alabama city bus and Student: Chris Cahoon Instructor: Daniel Moskowitz Calculus I, Math 181, Spring 2011 Road to Calculus: The Work of Pierre de Fermat On December 1, 1955 Rosa Parks boarded a Montgomery, Alabama city bus

More information

GloMAP Mode on HECToR Phase2b (Cray XT6) Mark Richardson Numerical Algorithms Group

GloMAP Mode on HECToR Phase2b (Cray XT6) Mark Richardson Numerical Algorithms Group GloMAP Mode on HECToR Phase2b (Cray XT6) Mark Richardson Numerical Algorithms Group 1 Acknowledgements NERC, NCAS Research Councils UK, HECToR Resource University of Leeds School of Earth and Environment

More information

Modeling and Tuning Parallel Performance in Dense Linear Algebra

Modeling and Tuning Parallel Performance in Dense Linear Algebra Modeling and Tuning Parallel Performance in Dense Linear Algebra Initial Experiences with the Tile QR Factorization on a Multi Core System CScADS Workshop on Automatic Tuning for Petascale Systems Snowbird,

More information