Large Synoptic Survey Telescope, Computational-science Requirements of George Beckett, 2 nd September 2016
Overview
Large Synoptic Survey Telescope (LSST) New optical telescope being constructed in northern Chile (Cerro Pachón) High altitude and very dry climate make it ideal site for (ground-based) optical astronomy Large significant width, depth (sensitivity), and speed at which it can image sky Synoptic imaging in 6 colour-bands to form overall picture of sky Survey creating an atlas of entire visible sky
Primary Science Drivers Understand Dark Energy and Dark Matter Create inventory of solar system Catalogue transient sky Map Milky Way
Large Synoptic Survey Telescope (LSST) Costs Construction budget of $640M Operational budget of $370M Funding US National Science Foundation; US Department of Energy (camera technology) International investment (for $100M of operational costs)
Large Synoptic Survey Telescope (LSST) Will run 10-year survey, starting in 2022 Will image 30k sq. deg. of sky every three nights Observe each patch more than 800 times Stacked images help identify very faint images Expect 24Bn new galaxies and 14Bn new stars Frequent, regular visits allow creation of timeseries for dynamic objects Near-earth objects Supernovae (explosive death of star)
Data
Data Management Data is proprietary for two years restricted to LSST PIs Survey data hosted in Data Access Centres in Chile hosting and local computing at NSCA, USA processing, hosting, and local computing at IN2P3, in France processing and hosting UK will host regional Data Access Centre Hosting and local computing possibly supporting more than UK astronomy
Data Classification Four classifications of data Raw data images taken by camera Level 1 Products difference imaging data w.r.t. reference Released nightly, detail objects that have unexpectedly changed brightness or position ~10 6 events published each night (metadata and postage-stamp image) Level 2 Products image products (reduced and calibrated) plus catalogues Released annually, as incremental Data Releases D/R grows from 1.6PB in Yr 1 to 31PB in Yr 10 Level 3 Products datasets derived from Levels 1 and 2 products Not part of survey output: created by community Estimate 10% of computing and storage requirements for Level 3 products
10
UK involvement
LSST:UK Consortium
LSST:UK Science Centre Programme Aug 14: Start of construction Oct 19: First light Oct 22: Start of main survey Sep 32: End of main survey UK Phase A: Development Phase B: Commissioning Phase C: Early Ops. Phase D: Standard Operations Four-phase programme Forecast budget of 32M (not including capital for infrastructure) STFC funding of 17.7M awarded 2.7M for Phase A 15M for LSST operations contributions data rights for 100 UK-based astronomers
LSST:UK Phase A 2.7M for Phase A programme (July 2015 Mar 2019) Edinburgh is coordinator Bob Mann is PI; MGB is project manager and technical lead LUSC-DAC: Data Access Centre (6 staff years, Edinburgh) DAC design DAC testbed, Data Challenges, support for LUSC-DEV LUSC-DEV: (16 staff years, Man, Cam, QUB, Soton, UCL, Oxf) Weak lensing analysis of galaxy intrinsic alignment, shape classification Milky Way star/galaxy separation, tidal stream detection Transients alert handling, classification Solar System NEO, light-curve analysis Sensor characterization image analysis systematics
Current highlights
Characterisation of Dark Energy Weak lensing Statistical technique to determine dark-energy distribution based on observed distortion to distant galaxies Dark Energy Survey 1 pilot (Joe Zuntz, Manch.) Shape determination of 100M galaxy images over 3 wave bands 30TB image data at NERSC and/ or BNL Image classifier im3shape (Python plus C/ C++, FORTRAN kernels). 10 20 secs per galaxy image LSST will have 1,000 data to analyse 1 DES is precursor to LSST Credit: Joe Zuntz/ Manchester
Characterisation of Dark Energy Starting point 30,000 image sets held at NERSC and/ or BNL Image classifier im3shape (Python, C/ C++, FORTRAN) 10 20 secs to analyse each galaxy image, w/ O(10kB) output Computer time at NERSC (Cori), though progress impeded by demand DES analysis undertaken with GridPP (Alessandra Forti, Marcus Ebert, Rob Currie, Daniela Bauer) Credit: Joe Zuntz/ Manchester
Characterisation of Dark Energy Successful pilot GridPP now used for production analysis Turn-around for analysis much faster than previously Migration to GridPP 4 months to complete (and 25 days of effort) BNL input transferred in two steps (better for NERSC) CernVM client better than test node (little local disk) Software staged w/ input, work around lag w/ CVMFS Ganga/ Dirac hybrid (w/ Ganga Tasks) for workflow Final report [MGB, Juntz, Forti, Ebert, Jun 16] Credit: Joe Zuntz/ Manchester
Classification of Supernovae Significant interest in Type 1a SNe Standard Candles w/ known luminosity E.g. measure expansion of Universe Separating Type Ia from others (Ib, II, ) not straightforward For LSST, classification based on light-curve alone Soton/ UCL team running survey simulation testing how schedule and cadence of survey affects ability to classify SNe Use machine learning to enable large-scale SNe classification Credit: Natasha Karpenka, Mark Sullivan/ Soton, Michelle Lochner, Hiranya Peiris, Jason McEwen, Ofer Lahav/ UCL
Classification of Supernovae Optimise LSST survey strategy for SNe classification Generate (OpSim) observing strategy (coordinates and band) Seed sky map with SNe (based on analytical models of light curves) Simulate LSST observations (CatSim) and try to recover SNe Many 12-month observing strategies to be analysed Candidate for GridPP? Cluster-scale CatSim runs Input from (preferably external) observing strategy and SNe model d/b Output to catalogue d/b Credit: Natasha Karpenka, Mark Sullivan/ Soton, Michelle Lochner, Hiranya Peiris, Jason McEwen, Ofer Lahav/ UCL
Dark Energy Science Collaboration (DESC) US team applying to Department of Energy for operational support for science analysis (during Oct 17 Sep 21) Opportunity for UK (and French) to contribute key expertise and secure influence in important tasks UK interest focused on weak lensing (Bridle, Man) and computationalscience (Edin), plus computing time Candidate is PhoSim (generate sky images by simulating path of photons through atmosphere, optics, and camera) Requires: 2M core hrs (2017); 10M core hrs (2018); 15M core hrs (2019) Modest storage requirements, plus supports Dirac interface (used at NERSC)
Survey Catalogue Technology Candidate database is Qserv Distributed relational database Spatially-sharded with overlaps Built on Xrootd LSST:UK evaluation underway Deployment of UKIDSS and PAN- STARRS surveys into test-bed instance (Ebert and Sutorius, Edin) Estimation of query performance/ query optimisation for dark-energy analysis (Tseng, Oxford)
Sensor technologies LSST needs special camera to achieve fast, high-resolution, wide view 3.2 Gigapixels largest astronomy camera ever made Procured through e2v (UK) and ITL (US) e2v CCD250 is custom CCD designed in UK with non-flatness of <5 µm Designed for maximum resolution High quantum effects Ian Shipsey (Oxford) is liaison between e2v and LSST Credit: Ian Shipsey/ University of Oxford
Longer term
Data Access Centre Key output for Phase A activity is proposal for Data Access Centre Host UK copy of LSST sky survey Consume and analyse data from transient-alert stream Provide compute and storage capacity for creation/ analysis of derived data products (UK-specific analysis work) possibly in conjunction with data from other telescopes and surveys Also, host and support commissioning data (if available to LSST:UK) Working with peer PPAN facilities/ experiments through UK-Tier 0 DAC Sizing, Version 0.2 produced (June 2016) Compute (2019 2033) 180M core hrs/ yr; Expect ~4 further iterations during Phase A Storage (2019 2033) 4 150PB
Summary
Summary LSST is most ambitious optical sky survey ever conceived Unprecedented opportunity to Study transient universe Discover and catalogue solar-system Explore galaxy expansion and dark universe Significant collaboration between astronomy and particle physics Dark-energy analysis Camera-technology development
Summary (cont d) UK well-placed to make significant contribution leadership in Transient detection and solar-system science Weak-lensing analysis Supernovae science Milky Way science Significant computational-science and computing challenge Expertise and facilities of GridPP vital for UK to secure significant influence in project Longer term, concerted effort required to ensure significant facilities can fulfil expertise and infrastructure requirements