Unveiling the mysteries of our Galaxy with InterSystems Caché

Unveiling the mysteries of our Galaxy with InterSystems Caché. Dr. Jordi Portell i de Mora, on behalf of the Gaia team at the Institute for Space Studies of Catalonia (IEEC-UB) and the Gaia Data Processing and Analysis Consortium throughout Europe. InterSystems Spain Summit 2016, Barcelona, 26 October 2016

It's all about big numbers. After all, our areas are not that different. Human genome: ~3 billion base pairs. Company customers: up to ~7 billion people (potentially). Our Galaxy: ~200 billion stars.

So you want a census of our Galaxy?
1. Get the list of stars
2. Locate them
3. Get their information
   a) CV: Where are they coming from? (and where are they going?)
   b) Education: How bright are they?
   c) Personal interests: What is their favourite color?
   d) Food habits: What are they composed of?
4. Do this for a representative fraction of the Galaxy
Gentle reminder: we're talking about ~200 billion stars (2 x 10^11)

The answer: ESA's Gaia mission. Global astrometry from space: positions, distances and motions of ~1 billion stars. Photometry: brightness and colors. Spectroscopy: fingerprint (chemical composition). Most complete and accurate Milky Way census to date. Accuracy: ~0.000000004 degrees (~15 microarcsecs).
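As a quick check of the figures on this slide, the sketch below converts the quoted accuracy from microarcseconds to degrees and works out what fraction of the ~200 billion stars a ~1 billion star census covers. The function and variable names are illustrative only and do not come from the talk.

```python
ARCSEC_PER_DEGREE = 3600.0

def microarcsec_to_degrees(uas: float) -> float:
    """Convert an angle in microarcseconds to degrees."""
    return uas * 1e-6 / ARCSEC_PER_DEGREE

# Quoted accuracy: ~15 microarcseconds
accuracy_deg = microarcsec_to_degrees(15.0)

# Census size versus the estimated number of stars in the Galaxy
census_fraction = 1e9 / 2e11

print(f"15 microarcsec = {accuracy_deg:.1e} degrees")
print(f"Census covers about {census_fraction:.1%} of the Galaxy's stars")
```

The result, about 4 x 10^-9 degrees, matches the ~0.000000004 degrees on the slide, and the census amounts to roughly half a percent of the Galaxy.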

The Gaia spacecraft in a nutshell. Orbit around the L2 point, 1.5 million km from Earth. 5 years (nominally). 2 telescopes. Gigapixel camera: 106 CCDs, 9 Mpix each. Autonomous operation: object detection onboard.

Launch! Watch video here: https://www.youtube.com/watch?v=gidvvgtefjg and more Gaia videos here: http://www.cosmos.esa.int/web/gaia/media-gallery/videos

Data, data, data! Downlink: 7 Mbps, 8 h/day. 25 GB/day (65 GB uncompressed). 115 TB in 5 years. 50 million measurements/day (1 measurement = 12-15 tiny photos). 100 billion measurements in 5 years.
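The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes decimal units (1 GB = 10^9 bytes), a 365-day year, and that the 5-year total refers to uncompressed data; none of these assumptions is stated in the talk.

```python
DOWNLINK_BITS_PER_S = 7e6        # 7 Mbps
CONTACT_HOURS_PER_DAY = 8        # 8 h/day of downlink
MISSION_DAYS = 5 * 365           # nominal 5-year mission (assumed 365-day years)

# Daily compressed volume implied by the downlink rate
daily_bytes = DOWNLINK_BITS_PER_S * CONTACT_HOURS_PER_DAY * 3600 / 8
daily_gb = daily_bytes / 1e9                       # about 25.2 GB

# Compression ratio implied by 65 GB uncompressed vs 25 GB downlinked
compression_ratio = 65 / 25                        # 2.6

# Five-year totals
total_uncompressed_tb = 65 * MISSION_DAYS / 1000   # about 119 TB
total_measurements = 50e6 * MISSION_DAYS           # about 9.1e10

print(f"Daily downlink: {daily_gb:.1f} GB")
print(f"Compression ratio: {compression_ratio:.1f}")
print(f"5-year uncompressed volume: {total_uncompressed_tb:.0f} TB")
print(f"5-year measurements: {total_measurements:.2e}")
```

Under these assumptions the results agree with the rounded slide values: ~25 GB/day, on the order of 115 TB, and close to 100 billion measurements over five years.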

And now what do we do with the data? All data must be promptly received and handled: continuous data accumulation and arrangement. Spacecraft health must be monitored daily, and many issues can only be detected after data processing. Big processing only at the end? NO! Incremental data processing. VERY complex algorithms. HUGE system of equations, solved iteratively.
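To illustrate what solving a system of equations iteratively means, here is a minimal sketch using the Gauss-Seidel method on a tiny 3x3 system. This is a textbook technique chosen for illustration; the talk does not say which solver the Gaia processing uses, and the matrix and right-hand side are made up.

```python
def gauss_seidel(a, b, iterations=50):
    """Solve a x = b by repeatedly refining an initial guess of zeros.

    Converges for diagonally dominant matrices such as the example below.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Use the most recent values of the other unknowns
            others = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - others) / a[i][i]
    return x

# Hypothetical system whose exact solution is x = [1, 2, 3]
A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
B = [6.0, 17.0, 22.0]

solution = gauss_seidel(A, B)
print([round(v, 6) for v in solution])
```

Each pass improves the estimate using the previous one, which is why an iterative scheme fits incremental processing: new data refine the existing solution rather than forcing a fresh start.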

The Data Processing and Analysis Consortium

DPAC keystone: the Daily Pipeline. Reception and decoding of raw data packets. Decompression. Initial Data Treatment (IDT). First Look diagnostics and calibrations.

Some features of the Gaia daily pipeline. Complex software system written in Java (as is all DPAC software): portability, scalability (multi-thread operation), development tools. ~1 million lines of code; many different algorithms. Data-driven approach: triggering depends on the data type received. High efficiency required. Data train approach: prepare data and pass it to the algorithms. Avoid random data access if possible!
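The data-driven and data-train ideas can be sketched in a few lines. The example below is written in Python for brevity (the actual pipeline is in Java), and every name in it (the registry, the data type labels, the two algorithms) is hypothetical. It groups incoming records by type in one sequential pass, then hands each prepared batch to the algorithms registered for that type, so no algorithm has to fetch records at random.

```python
from collections import defaultdict

# Hypothetical registry: data type -> algorithms triggered by that type
HANDLERS = defaultdict(list)

def triggered_by(data_type):
    """Decorator registering an algorithm for a given data type."""
    def register(func):
        HANDLERS[data_type].append(func)
        return func
    return register

@triggered_by("astrometric")
def image_parameters(batch):
    return f"image parameters for {len(batch)} records"

@triggered_by("housekeeping")
def health_check(batch):
    return f"health check on {len(batch)} records"

def run_train(records):
    """Prepare batches in a single sequential pass, then feed the algorithms."""
    batches = defaultdict(list)
    for record in records:
        batches[record["type"]].append(record)
    results = []
    for data_type, batch in batches.items():
        for algorithm in HANDLERS.get(data_type, []):
            results.append(algorithm(batch))
    return results

incoming = [
    {"type": "astrometric", "id": 1},
    {"type": "housekeeping", "id": 2},
    {"type": "astrometric", "id": 3},
]
results = run_train(incoming)
print(results)
```

Which algorithms run is decided purely by the data that arrive, and each algorithm receives its input already assembled.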

Products from the daily pipeline. Raw measurements reconstruction. Flagging of peculiar objects and conditions. Monitoring of the basic angle between telescopes. Interferometry: accuracy of 3.6 picometres.

Products from the daily pipeline. First determination of satellite attitude: 100 milliarcsec accuracy (0.00003 degrees). Determination of astrophysical background: needed for accurate photometry.

Products from the daily pipeline. Image parameters determination: main output for downstream systems (position and brightness). Preliminary cross-matching: identification of observations and link to catalogue sources. Most demanding element in terms of DB performance: continuous queries on a table with ~3 billion records.
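Cross-matching links each new observation to a known catalogue source. The sketch below shows the basic idea with a brute-force nearest-neighbour search inside a match radius; it is not the consortium's algorithm, and the catalogue, coordinates and radius are invented for illustration. A production system would replace the loop with indexed database queries, which is where the load on the ~3 billion record table comes from.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); all angles in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))

def cross_match(observation, catalogue, radius_deg):
    """Return the id of the nearest source within radius_deg, or None."""
    best_id, best_sep = None, radius_deg
    for source_id, (ra, dec) in catalogue.items():
        sep = angular_separation_deg(observation[0], observation[1], ra, dec)
        if sep <= best_sep:
            best_id, best_sep = source_id, sep
    return best_id

# Hypothetical catalogue: source id -> (right ascension, declination) in degrees
catalogue = {1: (10.0, 20.0), 2: (10.001, 20.0), 3: (50.0, -5.0)}
one_arcsec = 1.0 / 3600.0

matched = cross_match((10.0009, 20.0), catalogue, one_arcsec)
unmatched = cross_match((30.0, 30.0), catalogue, one_arcsec)
print(matched, unmatched)
```

An observation with no source inside the radius returns None, which in a real pipeline would be a candidate for a new catalogue entry.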

But is this technically feasible? YES! Pipeline typically idle a few hours/day, even on dense days. Peaks of ~150 million measurements/day handled without problems. ~10 tables with typically ~1-2 billion records. Pipeline is stable.

How? 30 IBM iDataPlex computing nodes: CPU typically underused, RAM quite busy. Very big DB node: 32 Xeon cores (E5-4640, 2.4 GHz), avg. ~50% usage on dense days; 1.25 TB RAM; 7.5 TB of local SSD for the most frequently accessed tables. NetApp NAS storage: FAS 8060 with ~600 disks and 10 TB SSD. 10G network (Cisco 5548, <1 ms latency). Typical DB disk occupation: ~20 TB (regular cleanup: only most recent data kept). InterSystems Caché ;) Over 10 billion records served daily (incl. complex queries). Also CESCA/CSUC (pre-deployment tests): 5 computing nodes + DB node (just 64 GB RAM).
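To put the daily figures into per-second terms, the sketch below averages them over a full 24-hour day. That even spread is an assumption made here for simplicity; since the pipeline is reported to be idle a few hours a day, the sustained rates while it is running would be somewhat higher.

```python
SECONDS_PER_DAY = 24 * 3600

# Database load: over 10 billion records served per day
records_per_second = 10e9 / SECONDS_PER_DAY

# Measurement rates: nominal 50 million/day, peaks of ~150 million/day
nominal_per_second = 50e6 / SECONDS_PER_DAY
peak_per_second = 150e6 / SECONDS_PER_DAY

print(f"DB records served: ~{records_per_second:,.0f} per second")
print(f"Measurements: ~{nominal_per_second:,.0f}/s nominal, "
      f"~{peak_per_second:,.0f}/s at peak")
```

Averaged this way, the database serves on the order of 100,000 records per second, while measurements arrive at a few hundred to a couple of thousand per second.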

2 years of nominal operations: 50 billion measurements processed so far.

Main achievement (so far): Gaia DR1

Just a tiny (daily) detail...

What is all this useful for? Science and knowledge: better understanding of the Galaxy we live in. Astronomers are so happy with our data! Technological advances: numerical algorithms and techniques, massive data handling systems, new data handling and compression algorithms. A clear example: the FAPEC compressor, so it gives a direct return to you! Efficient data handling and compression of medical images, genomic information, professional imaging, massive data transfers, etc.

More info: http://gaia.ub.edu, @GaiaUB, GaiaApp. Dr. Jordi Portell i de Mora, jordi.portell@dapcom.es