Software, Computing and Data Storage for the

Similar documents
Overview of the Square Kilometre Array. Richard Schilizzi COST Workshop, Rome, 30 March 2010

The Square Kilometre Array Radio Telescope Project : An Overview

International Project Update

Square Kilometre Array Nigel Rix

The Square Kilometre Array. Richard Schilizzi SKA Program Development Office

The Square Kilometre Array & High speed data recording

Welcome and SKA Overview Michiel van Haarlem. Interim Director General SKA Organisation 15 February 2012

SKA Precursors and Pathfinders. Steve Torchinsky

An Introduction to ASKAP Bringing Radio Interferometers Into the Multi-pixel Era

Large Field of View Radio Astronomy; relevant for many KSP s

The international SKA project

Science Operations with the Square Kilometre Array

Square Kilometre Array: status. Philip Diamond SKA Director-General 16 th September 2013

The Square Kilometre Array and the radio/gamma-ray connection toward the SKA era

The Australia Telescope. The Australia Telescope National Facility. Why is it a National Facility? Who uses the AT? Ray Norris CSIRO ATNF

Radio Astronomy module

e-vlbi Radio interferometers How does VLBI work? What science do we do with VLBI? How has the technology changed? Advantages of e-vlbi e-astronomy

Square Kilometre Array. Steve Torchinsky

Future radio galaxy surveys

SKA Industry Update. Matthew Johnson Head of the UK SKA Project Office STFC

The Square Kilometre Array

Radio Telescopes of the Future

New Zealand Impacts on CSP and SDP Designs

From LOFAR to SKA, challenges in distributed computing. Soobash Daiboo Paris Observatory -LESIA

Apertif. Paolo Serra. Tom Oosterloo (PI) Marc Verheijen (PI) Laurens Bakker George Heald Wim van Cappellen Marianna Ivashina

The MeerKAT-SKA schools competition Enter and stand a chance to WIN!

Transient Cosmic Phenomena and their Influence on the Design of the SKA Radio Telescope

The Future of Radio Astronomy. Karen O Neil

The Square Kilometer Array. Presented by Simon Worster

The Search for Extraterrestrial Intelligence (SETI) Director Jodrell Bank Centre for Astrophysics

The Square Kilometre Array

An Introduction to Radio Astronomy

International Facilities

The high angular resolution component of the SKA

Max-Planck-Institut für Radioastronomie, Bonn Istituto di Radioastronomia - INAF, Bologna

An Introduction to Radio Astronomy

From the VLT to ALMA and to the E-ELT

ALMA/ASTE/Mopra Report

ASKAP. and phased array feeds in astronomy. David McConnell CASS: ASKAP Commissioning and Early Science 16 November 2017

E-MERLIN and EVN/e-VLBI Capabilities, Issues & Requirements

ALMA Development Program

TOWARDS A NEW GENERATION OF RADIO ASTRONOMICAL INSTRUMENTS: signal processing for large distributed arrays

Future Radio Interferometers

The South African SKA/meerKAT Project. Kobus Cloete

Light and Telescope 10/24/2018. PHYS 1403 Introduction to Astronomy. Reminder/Announcement. Chapter Outline. Chapter Outline (continued)

e-merlin MERLIN background e-merlin Project Science

High-performance computing and the Square Kilometre Array (SKA) Chris Broekema (ASTRON) Compute platform lead SKA Science Data Processor

Radio astronomy in Africa: Opportunities for cooperation with Europe within the context of the African-European Radio Astronomy Platform (AERAP)

RadioNet: Advanced Radio Astronomy in Europe

Square Kilometer Array

The Square Kilometre Array: An international project to realize the world s

The Square Kilometre Array and Data Intensive Radio Astronomy 1

The VLBI Space Observatory Programme VSOP-2

FIVE FUNDED* RESEARCH POSITIONS

TECHNICAL REPORT NO. 86 fewer points to average out the noise. The Keck interferometry uses a single snapshot" mode of operation. This presents a furt

SKA Science Data Processing or MAKING SENSE OF SENSOR BIG DATA

Memo 100 Preliminary Specifications for the Square Kilometre Array

Design Reference Mission for SKA1 P. Dewdney System Delta CoDR

Science at ESO. Mark Casali

Determining the Specification of an Aperture Array for Cosmological Surveys

Perspectives for Future Groundbased Telescopes. Astronomy

The Square Kilometre Array (SKA) Radio Telescope Progress and Technical Directions. P. J. Hall, R. T. Schilizzi, P. E. F. Dewdney and T. J. W.

PoS(PRA2009)015. Exploring the HI Universe with ASKAP. Martin Meyer. The DINGO team

Data Challenges for Next-generation Radio Telescopes

Cecilia Fariña - ING Support Astronomer

Square Kilometre Array: World s Largest Radio Telescope Design and Science drivers

Cosmological Galaxy Surveys: Future Directions at cm/m Wavelengths

Chapter 6 Light and Telescopes

Centimeter Wave Star Formation Studies in the Galaxy from Radio Sky Surveys

Square Kilometer Array:

Astronomy of the Next Decade: From Photons to Petabytes. R. Chris Smith AURA Observatory in Chile CTIO/Gemini/SOAR/LSST

MAJOR SCIENTIFIC INSTRUMENTATION

Geodetic Very Long Baseline Interferometry (VLBI)

Astronomy. Optics and Telescopes

Science at Very High Angular Resolution with SKA. Sergei Gulyaev Auckland, 14 February 2018

Astro 32 - Galactic and Extragalactic Astrophysics/Spring 2016

ASKAP Array Configurations: Options and Recommendations. Ilana Feain, Simon Johnston & Neeraj Gupta Australia Telescope National Facility, CSIRO

ATHENA in the Context of the Next Decade. R. Kennicutt, IoA Cambridge

Exascale IT Requirements in Astronomy Joel Primack, University of California, Santa Cruz

Probing Dark Energy with the Square Kilometre Array

The SKA Molonglo Prototype (SKAMP) progress & first results. Anne Green University of Sydney

Mario Santos (on behalf of the Cosmology SWG) Stockholm, August 24, 2015

Polarimetry with Phased-Array Feeds

Project Engineer SKA Science Data Processor Consortium

Todays Topics 3/19/2018. Light and Telescope. PHYS 1403 Introduction to Astronomy. CCD Camera Makes Digital Images. Astronomical Detectors

Radio Observation of Milky Way at MHz. Amateur Radio Astronomy Observation of the Milky Way at MHz from the Northern Hemisphere

Astrophysics Advisory Committee

Polarimetry. Dave McConnell, CASS Radio Astronomy School, Narrabri 30 September kpc. 8.5 GHz B-vectors Perley & Carilli (1996)

Transition Observing and Science

New physics is learnt from extreme or fundamental things

SKA Science Data Processing

Radio Transient Surveys with The Allen Telescope Array & the SKA. Geoffrey C Bower (UC Berkeley)

Lecture Outlines. Chapter 5. Astronomy Today 8th Edition Chaisson/McMillan Pearson Education, Inc.

How do you make an image of an object?

evlbi Networks Bringing Radio Telescopes Together

Building LOFAR status update

Cycle 1 & 2 Status & Cycle 3 results

SETI with SKA1 and SKA2

Radio Interferometry and VLBI. Aletha de Witt AVN Training 2016

Planning an interferometer observation

Transcription:

Software, Computing and Data Storage for the Square Kilometre Array Duncan Hall XLDB 2009 2009 August 28

Outline SKA: What are the drivers? How does radio astronomy work? What are the SKA s prime characteristics? Project Phases: Where are we at? What are the possible operating modes? Can large software really be that hard? Real-time data: pushing the HPC envelope How much data do we need to store? Summary

A global project 55 institutes in 19 countries Similarities to CERN?

Science ce drivers Origins Cosmology and galaxy evolution Galaxies, dark matter and dark energy Probing the Dark Ages Formation of the first stars Cradle of life Search for signs of life Fundamental Forces Strong-field tests of general relativity Was Einstein correct? Origin and evolution of cosmic magnetism Where does magnetism come from? Exploration of the Unknown Adapted from R. Schilizzi

Answering the questions: s ALMA ELT SKA JWST IXO

The journalists view?

How does radio astronomy work?

76 m: 3,000+ tonnes

How telescopes escopes work Conventional telescopes reflect the light of a distant object from a parabolic surface to a focus M. Longair via R. Schilizzi

A partially a filled aperture e... But the reflecting surfaces do not need to be part of the same surface. Suppose we cover up most of the surface of the mirror M. Longair via R. Schilizzi

... can produce images We can still combine the radiation from the uncovered sections to create an image of the distant object, if we arrange the path lengths to the focus to be the same. M. Longair via R. Schilizzi

Time delays correct path lengths To make sure that the waves travel the same distance when they are combined, we need to add this path difference to the waves arriving at the other telescope. The incoming waves are in phase. M. Longair via R. Schilizzi

Radio interferometry e et Each pair of antennas is called a baseline More different baselines more detailed the image Short baselines - antennas are close to each other - provide coarse structure Long baselines provide the fine detail: the longer the finer the detail t s M. Longair via R. Schilizzi Correlator A 6 antenna interferometer has 15 baselines

1974: removing artefacts Högbom 1974 Aperture Synthesis with a Non-Regular Distribution of Interferometer Baselines, Astronomy and Astrophysics Supplement, Vol. 15, p.417

Resultant point spread function Högbom 1974 Aperture Synthesis with a Non-Regular Distribution of Interferometer Baselines, Astronomy and Astrophysics Supplement, Vol. 15, p.417

CLEAN iterations: results Högbom 1974 Aperture Synthesis with a Non-Regular Distribution of Interferometer Baselines, Astronomy and Astrophysics Supplement, Vol. 15, p.417

What are the SKA s prime characteristics?

EVLA 25 m ds dish

Inside sdean EVLA dish ds

SKA: prime characteristics acte stcs 1. More collecting area: ~1km 2 Detect and image hydrogen in the early universe Sensitivity ~ 50 x EVLA, LOFAR 2. Bigger field of view Fast surveying capability over the whole sky Survey speed ~ 1,000,000 x EVLA EVLA 3. Wide ranges of frequencies Low : 70-300 MHz Mid: 300 MHz-10 GHz High: 10-25+ GHz 4. Large physical extent : ~3,000+ km Detailed imaging of compact objects Astrometry with ~0.001 arc second angular resolution Adapted from R. Schilizzi

Artist s tstsimpression pesso 1,500 dishes (~15m diameter) in a5km core Additional 1,500 dishes from 5 km to ~3,000+ km Aperture arrays (AA) in a core Signal processing: (1) beam forming Optical fibre connection to (2) correlator Optical fibre to remote (3) High Performance Computer

One possible configuration o Comms links + power Correlator 40 stations 10-200 km Dishes in stations along spiral arms 40 remote stations 200 to >3,000 km Dishes Dense AA Sparse AA Max. Distance for Dense AAs 200 km Station Adapted from A. Faulkner

Signal Sg Path

Where are we at?

Phased construction o Phase 1: 2013 2018 construction 10-20% of the collecting area Phase 2: 2018 2022 construction Full array at low and mid frequencies Phase 3: 2023+ construction High frequencies

Current development e e PrepSKA 2008-20112011 The Preparatory Phase for the SKA is being funded by the European Commission s 7 th Framework Program 5.5M EC funding for 3 years + 17M contributed funding from partners (still growing) 150M SKA-related R&D around the world Coordinated by the Science and Technology Facilities Council (UK)

WP2: Design + Cost Coordinated by the SKA Program Development Office in Manchester System Definition iti Dishes, feeds, receivers Aperture arrays Signal transport Signal processing Software High performance computers Data storage Power requirements

What are the possible operating modes?

One possible operational mode: http://www.spectrum.ieee.org/jun08/6247 van Rossum and Drake The Python Language Reference Release 3.0 2008Dec04

https://safe.nrao.edu/wiki/pub/software/calim09program/meqtrees-calibration.pdf Another possible mode:

Yet another possible mode Data Archive Researcher Real-Time Pipeline Real-Time M&C Scheduling RoW Researchers 1 4 3 2 Stored Proposal Create Proposal Schedule observation Initiate observation 4 Meta Data Receptors initialised Correlator initialised Buffered Raw Data 5 Calibration Buffered Calibrated Data Imaging Science Results 6: Non-imaging Analyse & Visualise Results Terminate observation Reset schedule Results released to RoW 7 Data & Results Available 8: HPC

Can large software really be that hard?

Yes! First-order model for estimating effort Diseconomies of scale Confirmation from the literature

Sound familiar? a Over-commitment Frequent difficulty in making commitments that staff can meet with an orderly engineering process Often resulting in a series of crises During crises projects typically abandon planned procedures and revert to coding and testing In spite of ad hoc or chaotic processes, can develop products that work However typically cost and time budgets are not met Success depends on individual competencies and/or death march heroics Can t be repeated unless the same individuals work on the next project Capability is a characteristic of individuals, not the organisation Mark Paulk et alia: The Capability Maturity Model for Software; in Software Engineering M. Dorfman and R. H. Thayer, 1997

An ill-conditioned non-linear problem: where, 1 Change requests Requirements??

First-order model for estimating effort

McConnell s data on log-log axes 1,000 Estimated Effort for Scientific Systems & Engineering Research Projects Source: "Software Estimation"; S. McConnell: 2006 Staff Years 100 Staff Years = 2.0( 5) x (SLOC)^1.33 Staff Years = 4.7( 6) x (SLOC)^1.36 McConnell diseconomy 10 +1 Standard Deviation Avg. Staff Years (=12 months) Source Lines of Code (SLOC) 1 10,000 100,000 1,000,000

2008Nov11 Case study c.f. McConnell data 1,000 Estimated Effort for Scientific Systems & Engineering Research Projects Source: "Software Estimation"; S. McConnell: 2006 Staff Years 100 Staff Years = 2.0( 5) x (SLOC)^1.33 Staff Years = 4.7( 6) x (SLOC)^1.36 McConnell diseconomy 10 Staff Years= 4.5( 11) x SLOC^2.4 +1 Standard Deviation Case study 1 diseconomy Avg. Staff Years (=12 months) Source Lines of Code (SLOC) 1 10,000 100,000 1,000,000 http://www.skatelescope.org/pages/wp2_meeting_10nov08/day2-3_hall.pdf

IEEE Software, September/October 2009 Astronomers developed legacy codes

How big are the legacy codes? MSLOC: Debian 4.0 283 Mac OS X 10.4 86 Vista 50 Linux kernel 2.6.29 29 11 OpenSolaris 10 Kemball, R. M. Crutcher, R. Hasan: A component-based framework for radio-astronomical imaging software systems Software Practice and Experience: 2008; 38: pp 493 507 http://en.wikipedia.org/wiki/source_lines_of_code

Legacy: ~20 to ~700+ staff years effort? 1,000 Estimated Effort for Scientific Systems & Engineering Research Projects Source: "Software Estimation"; S. McConnell: 2006 Staff Years 100 Staff Years = 2.0( 5) x (SLOC)^1.33 Staff Years = 4.7( 6) x (SLOC)^1.36 10 Staff Years= 4.5( 11) x SLOC^2.4 +1 Standard Deviation Avg. Staff Years (=12 months) Source Lines of Code (SLOC) 1 10,000 100,000 1,000,000 http://www.skatelescope.org/pages/wp2_meeting_10nov08/day2-3_hall.pdf;

A software intensive system product is much more than the initial algorithm: Algorithm PoC program x3 Software intensive system x3 Product: Generalisation Testing Documentation Maintenance etc. Software intensive system product x~10 Fred Brooks; The Mythical Man-Month Essays on Software Engineering Anniversary Edition : 1995

SDSS: Gray and Szalay Where the Rubber Meets the Sky: Bridging the Gap between Databases and Science MSR-TR-2004-110: 2004 October One problem the large science experiments face is that software is an out-of-control expense They budget 25% or so for software and end up paying a lot more The extra software costs are often hidden in other parts of the project the instrument control system software may be hidden in the instrument budget

SKA real-time data: pushing the HPC envelope...

Φ2 real-time data from dishes SKA Conceptual SKA Conceptual High Level Block Diagram Block Diagram Outlying Station Outlying Station 80 Gbs -1 80 Gbs -1 ~40 80 Gbs -1 Outlying Station Outlying stations on spiral arm (only one arm is shown) Outlying Station On Site 0.1 ~ 1 ExaFlops; 0.1 ~ 1 ExaByte storage ~2,280 80 Gbs -1 per dish Wide band single pixel feeds (WBSPF) Dish Array Dense Aperture Array Phased array feeds (PAF) Outlying Station Signal Processing Facility Operations and Maintenance Centre Science Computing Facility SKA HQ (Off Site) Regional Science Centre(s) Global Sparse Aperture Array Low High Digital Signal Processing Beamforming Correlation High Performance Computing Data Storage Regional Engineering Centre(s) 186 Tbs -1 44 TBs -1 P. Dewdney et al. The Square Kilometre Array ; Proceedings of the IEEE; Vol. 97, No. 8; pp 1482 1496; August 2009 J. Cordes The Square Kilometre Array Astro2010 RFI #2 Ground Response 27 July 2009; Table 1, pp 9-10 Drawing number : TBD Date : 2009-02-13 Revision : E

Pushing the HPC envelope Performance [TFlops] = 0.055e 0.622(year-1993) ~1 EFlop ~10 PFlop SKAΦ2 2022 ~100 TFlop SKAΦ1 ASKAP Cornwell and van Diepen Scaling Mount Exaflop: from the pathfinders to the Square Kilometre Array http://www.atnf.csiro.au/people/tim.cornwell/mountexaflop.pdf

Orders of magnitude are required: 1,000,000,000 (Exaflop) 100,000,000 Gigaflops World's 500 Most Powerful and Efficient Computers [2007 2009 Green500]: Gigaflops vs kwatts ( Mflops vs Watts) >20 times greater Gigaflops / kwatt required 10,000,000 10,000 Mflops/Watt 500 Mflops/Watt ~100 times more Gigaflops required for SKA 1,000,000 (Petaflop) 2008 & 2009 World's most powerful computer 100,000 10Mfl Mflops/Watt 2009Jun 2008Nov 2008Jun 2008Feb 10,000000 2007Nov Source: http://www.green500.org/ accessed 2009Jul10 kwatts 1,000 (Teraflop) 10 100 1,000 10,000 100,000 http://www.green500.org accessed July 2009

How much data do we need to store?

Communications of the ACM August 2009 vol. 52 no. 8 Large datasets challenge applications

10 PetaByte tape robot at CERN 500-GB tapes switched to 1-TB models an upgrade that took a year of continuous load/read/load/write/discard operations, running in the interstices between the data centre s higher-priority tasks http://www.flickr.com/photos/doctorow/2711081044/sizes/o/in/set-72157606675048531/ C. Doctorow Welcome to the Petacentre ; Nature, Vol. 455, $ September 2008, pp. 17-21

Disk storage: annual 50% cost reduction 1 EB = $1~$10 million P. Kogge et alia ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems ; TR-2008-13, DARPA ExaScale Computing Study, 2008 Sep 28, page 125 Note: neither RAID, controllers, nor interconnect cables are included in these estimates

Power for EB-size disk looks reasonable P. Kogge et alia ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems ; TR-2008-13, DARPA ExaScale Computing Study, 2008 Sep 28, page 124 Note: power numbers here are for the drives only; any electronics associated with drive controllers [e.g. ECC, RAID] needs to be counted separately

Su Summary ay http://en.wikipedia.org/wiki/file:pogo_-_earth_day_1971_poster.jpg