High-performance computing and the Square Kilometre Array (SKA) Chris Broekema (ASTRON) Compute platform lead SKA Science Data Processor
Radio Astronomy and Computer Science
Both are very young sciences (1950s)
Both matured thanks to technology developed in WW2:
Computer Science: cryptography
Radio Astronomy: radar
Today radio astronomy is a data-intensive science at the forefront of what Jim Gray termed the fourth paradigm. The Square Kilometre Array (2020) is the poster child of this.
Aperture synthesis
History of radio astronomy
First observations in 1932 by Karl Jansky
First purpose-built telescope in 1937 by Grote Reber
21 cm emission line of neutral hydrogen predicted in 1944 by van de Hulst
Detected in 1951 by Ewen and Purcell (MIT); published after confirmation by Muller and Oort
Opening of the Dwingeloo radio telescope in 1956
Doppler effect (redshift) of fast-moving objects reveals the structure of the local galaxy (1950s)
History of radio astronomy
Doppler shift: velocity translated into frequency
Andromeda Nebula
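The velocity-to-frequency translation above can be sketched in a few lines. This is a minimal, non-relativistic Doppler calculation around the 21 cm hydrogen line; the observed frequency used in the example is an illustrative value, not a measurement from the talk.

```python
# Sketch: recover a radial velocity from a Doppler-shifted 21 cm line.
# Non-relativistic approximation; f_obs below is an illustrative value.

C = 299_792.458          # speed of light, km/s
F_REST = 1420.405751     # rest frequency of the neutral-hydrogen line, MHz

def radial_velocity(f_obs_mhz: float) -> float:
    """v = c * (f_rest - f_obs) / f_rest.
    Positive result: receding source (redshift)."""
    return C * (F_REST - f_obs_mhz) / F_REST

# Gas receding at roughly 100 km/s shifts the line down by about 0.48 MHz:
print(round(radial_velocity(1419.93), 1))
```

Mapping observed frequency to line-of-sight velocity in exactly this way is what let 1950s surveys trace the structure of the local galaxy.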
Observations are limited by:
Sensitivity: collecting area
Resolution: diameter relative to λ (R ≈ λ/D)
System noise: low-noise design, correlation
Interference: filters (spatial, spectral, etc.)
Foreground noise: calibration techniques
Processing / compute / data transport capacity
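The resolution limit R ≈ λ/D can be made concrete with a quick calculation. The 25 m dish diameter is an illustrative choice (similar in scale to the Dwingeloo telescope); the point is that single-dish resolution at 21 cm is coarse, which is what motivates aperture synthesis.

```python
# Sketch: diffraction-limited resolution of a single dish, R ~ lambda / D.
# The 25 m diameter is illustrative, not a quoted SKA figure.
import math

def resolution_deg(wavelength_m: float, diameter_m: float) -> float:
    """Angular resolution lambda/D, converted from radians to degrees."""
    return math.degrees(wavelength_m / diameter_m)

# The 21 cm hydrogen line through a 25 m dish gives roughly half a degree:
print(round(resolution_deg(0.21, 25.0), 2))
```

Aperture synthesis sidesteps this limit by correlating widely separated antennas, so the effective D becomes the longest baseline rather than a single dish diameter.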
Aperture synthesis
Square Kilometre Array 197 dishes
Square Kilometre Array 131,072 antennas
                               SKA1_low                     SKA1_mid
Location                       Australia                    South Africa
Number of receivers            131,072 (512 st x 256 el)    197 (133 + 64 MeerKAT)
Receiver diameter              35 m (station)               15 m (13.5 m MeerKAT)
Maximum baseline               80 km                        150 km
Frequency channels             65,536                       65,536
SDP input bandwidth            4.7 Tbps                     5.2 Tbps
Required compute capacity (*)  5.7 PFlops                   24 PFlops
(*) Work in progress
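One way to read the table is to divide the SDP input bandwidth over the frequency channels, which hints at why frequency is such a natural axis for data parallelism. A small sketch using the table's numbers:

```python
# Sketch: per-channel share of the SDP input bandwidth, from the table above.

def per_channel_mbps(total_tbps: float, n_channels: int) -> float:
    """Tbit/s spread over n_channels, expressed in Mbit/s per channel."""
    return total_tbps * 1e6 / n_channels

print(round(per_channel_mbps(4.7, 65_536), 1))  # SKA1_low
print(round(per_channel_mbps(5.2, 65_536), 1))  # SKA1_mid
```

Each channel's share is a modest, independent stream of tens of Mbit/s, which is exactly the kind of unit that can be handed to a single processing element.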
SKA: A Leading Big Data Challenge for 2020
Antennas to DSP (over 10s to 1000s of km): 2020: 20,000 PBytes/day; 2028: 200,000 PBytes/day
DSP to HPC (over 10s to 1000s of km): 2020: 100 PBytes/day; 2028: 10,000 PBytes/day
HPC processing: 2020: 300 PFlops; 2028: 30 EFlops
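Daily volumes like these are easier to compare with network hardware once converted to sustained line rates. A minimal conversion sketch:

```python
# Sketch: convert a daily data volume (PBytes/day) into a sustained rate (Tbit/s).

SECONDS_PER_DAY = 86_400

def pbytes_per_day_to_tbps(pbytes: float) -> float:
    """PBytes/day -> sustained Tbit/s (bytes * 8 bits, spread over a day)."""
    return pbytes * 1e15 * 8 / SECONDS_PER_DAY / 1e12

# The 2020 antennas-to-DSP figure of 20,000 PBytes/day as a line rate:
print(round(pbytes_per_day_to_tbps(20_000), 1))
```

That works out to well over a petabit per second sustained, which is why the antenna-to-DSP transport is counted among the system's leading challenges.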
SKA High level system overview
SKA SDP preliminary roll-out schedule
Architectural Principles
Main principles:
Ensure scalability, affordability and maintainability
Support current state-of-the-art algorithms
Exploit data parallelism, not just in frequency but also in other dimensions
We have only two fundamental/bulk data structures: raster grids and key-value stream records [e.g. (u,v,w) -> visibility]
Emphasis is on the framework to manage the throughput
Hardware platform will be replaced on a short duty cycle, c.f. any HPC facility
Algorithms and workflows will evolve as we learn about the telescopes
Approach: co-design of software and physical layer architectures
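The two bulk data structures named above can be sketched in miniature: a stream record keyed by (u, v, w) carrying a complex visibility, and a dense raster grid that records are accumulated onto. The field names and the nearest-cell gridding are illustrative simplifications, not the SDP's actual data model.

```python
# Sketch of the two bulk data structures: a (u,v,w) -> visibility stream
# record and a raster grid. Names and gridding scheme are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Visibility:
    """One stream record: baseline coordinates (u, v, w) plus a complex value."""
    u: float
    v: float
    w: float
    value: complex

# A raster grid: a dense 2-D array addressed by pixel index.
N = 4
grid = [[0j for _ in range(N)] for _ in range(N)]

def grid_visibility(grid, vis: Visibility, cell: float = 1.0) -> None:
    """Gridding in miniature: accumulate a record onto its nearest grid cell."""
    iu = int(vis.u / cell) % len(grid)
    iv = int(vis.v / cell) % len(grid)
    grid[iu][iv] += vis.value

grid_visibility(grid, Visibility(1.2, 2.7, 0.1, 1 + 1j))
```

Because records are independent until they are accumulated, the stream can be partitioned along frequency or other dimensions, which is the data parallelism the principles above call for.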
SDP context diagram with external interfaces
Signal and Data Transport connects the Science Data Processor to the Telescope Manager and the Central Signal Processor
Infrastructure: power, cooling, floor space, etc.
World outside the SKA observatory
Overview of Science Data Processor
Physical Layer
Top level SDP network
The Compute Island concept
A Compute Island is a self-contained set of processing resources, capable of end-to-end processing of a chunk of UVW-space, including:
Ingest
Buffer
Calibration & imaging
Archive
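A simple way to picture the island concept is to split the frequency channels into contiguous chunks, one per island, so each island can run the whole pipeline on its own slice. The island count below is purely illustrative.

```python
# Sketch: assign contiguous blocks of frequency channels to Compute Islands,
# so each island processes its chunk end-to-end. Island count is illustrative.

def channel_ranges(n_channels: int, n_islands: int):
    """Split [0, n_channels) into n_islands near-equal contiguous ranges."""
    base, extra = divmod(n_channels, n_islands)
    start = 0
    for i in range(n_islands):
        size = base + (1 if i < extra else 0)
        yield (start, start + size)
        start += size

ranges = list(channel_ranges(65_536, 6))   # 6 islands, just for the example
print(ranges[0], ranges[-1])
```

Because each range is independent, scaling the SDP becomes a matter of adding islands and re-dividing the channel space, rather than redesigning the pipeline.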
SDP scaling using Compute Islands
Compute Node Candidate Architecture
Heavily inspired by the COBALT node
Used for preliminary costing & scaling
Many alternatives possible: ARM, OpenPOWER + CAPI, FPGA, DSP, custom hardware, combinations of the above, etc.
Required SDP capacity
Required SDP computational capacity depends on:
1. Input bandwidth
2. Computational intensity of the data reduction
3. Computational efficiency as a percentage of R_peak
Figures: Top 500 computational efficiency; Top 500 performance development; projected SDP efficiency
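The three factors combine multiplicatively: sustained load is input bandwidth times computational intensity, and the required peak machine is that load divided by the achieved efficiency. The numbers in this sketch are illustrative placeholders, not SDP estimates.

```python
# Sketch: required peak capacity from the three factors listed above.
# All numeric inputs below are illustrative placeholders, not SDP figures.

def required_rpeak_pflops(input_tbytes_s: float,
                          flops_per_byte: float,
                          efficiency: float) -> float:
    """Sustained load = bandwidth * intensity; R_peak = sustained / efficiency."""
    sustained_pflops = input_tbytes_s * 1e12 * flops_per_byte / 1e15
    return sustained_pflops / efficiency

# e.g. 0.6 TByte/s in, 1000 flop/byte, 10% of R_peak achieved:
print(round(required_rpeak_pflops(0.6, 1000.0, 0.10), 1))
```

The division by efficiency is why the projected SDP efficiency matters so much: halving the achieved fraction of R_peak doubles the machine that must be procured.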
Required SDP capacity