Rick Ebert & Joseph Mazzarella For the NED Team. Big Data Task Force NASA, Ames Research Center 2016 September 28-30

Similar documents
Real Astronomy from Virtual Observatories

Astronomy. Catherine Turon. for the Astronomy Working Group

Department of Astrophysical Sciences Peyton Hall Princeton, New Jersey Telephone: (609)

Science in the Virtual Observatory

NASA/IPAC EXTRAGALACTIC DATABASE

ESASky, ESA s new open-science portal for ESA space astronomy missions

Machine Learning Applications in Astronomy

Cosmology The Road Map

A Library of the X-ray Universe: Generating the XMM-Newton Source Catalogues

Astro 101 Slide Set: Multiple Views of an Extremely Distant Galaxy

Chandra: the Next Decade

D4.2. First release of on-line science-oriented tutorials

RLW paper titles:

Gaia data in the CDS portal. Sébastien Derriere, Mark Allen, Thomas Boch & CDS team Observatoire de Strasbourg, France

Lecture 11: SDSS Sources at Other Wavelengths: From X rays to radio. Astr 598: Astronomy with SDSS

Extragalactic Astronomy

Lesson 4 - Telescopes

Time Domain Astronomy in the 2020s:

PDF hosted at the Radboud Repository of the Radboud University Nijmegen

Scientific Data Flood. Large Science Project. Pipeline

An intelligent client application for on-line astronomical information

Active Galaxies and Galactic Structure Lecture 22 April 18th

Doing astronomy with SDSS from your armchair

LSST, Euclid, and WFIRST

Beyond the Visible -- Exploring the Infrared Universe

2019 Astronomy Team Selection Test

Exploring the Depths of the Universe

The Herschel Multi-tiered Extragalactic Survey (HerMES) The Evolution of the FIR/SMM Luminosity Function and of the Cosmic SFRD

STScI at the AAS 231: January 2018, Washington, D.C.

ISO and AKARI: paving the road to Herschel

Princeton Astrophysics Community Meeting

Data challenges of the Virtual Observatory in Time Domain Astronomy

Michael L. Norman Chief Scientific Officer, SDSC Distinguished Professor of Physics, UCSD

JWST and the ELTs Bruno Leibundgut

The NASA/IPAC Infrared Science Archive (IRSA): The Demo

Galaxies. Galaxy Diversity. Galaxies, AGN and Quasars. Physics 113 Goderya

Table of Contents and Executive Summary Final Report, ReSTAR Committee Renewing Small Telescopes for Astronomical Research (ReSTAR)

Part 3: The Dark Energy

Dark Energy: Measuring the Invisible with X-Ray Telescopes

MATISSE. Multi-purpose Advanced Tool for Instruments for the Solar System Exploration. Angelo Zinzi 1,2, Maria Teresa Capria 3,1, Angelo Antonelli 1,2

Lessons Learned! from Past & Current ESA-NASA Collaborations

Current Status of Chinese Virtual Observatory

Multi-wavelength Astronomy

New Worlds, New Horizons in Astronomy and Astrophysics

Detecting New Sources of High-Energy Gamma Rays

Clusters of Galaxies " High Energy Objects - most of the baryons are in a hot (kt~ k) gas." The x-ray luminosity is ergs/sec"

GALAXY EVOLUTION STUDIES AND HIGH PERFORMANCE COMPUTING

Skyalert: Real-time Astronomy for You and Your Robots

Rest-frame properties of gamma-ray bursts observed by the Fermi Gamma-Ray Burst Monitor

ASI - Italian Space Agency The Space Science Data Center is a Research Infrastructure of the Italian Space Agency

The PRIsm MUlti-object Survey (PRIMUS)

Astronomy across the spectrum: telescopes and where we put them. Martha Haynes Discovering Dusty Galaxies July 7, 2016

Quasars and Active Galactic Nuclei (AGN)

Diffuse Gamma-Ray Emission

ESASky. Bruno Merín Sara Nieto Elena Racero Jesús Salgado María Henar Sarmiento Pilar de Teodoro

arxiv: v1 [astro-ph.im] 11 May 2010 Alberto Accomazzi Harvard-Smithsonian Center for Astrophysics

Active Galaxies & Quasars

PLANCK SZ CLUSTERS. M. Douspis 1, 2

*** IAU Division D Bulletin N. 2: July 1st, 2014 ***

Hubble s Law and the Cosmic Distance Scale

The Millennium Simulation: cosmic evolution in a supercomputer. Simon White Max Planck Institute for Astrophysics

The Big Bang Theory. Rachel Fludd and Matthijs Hoekstra

Photometric Redshifts with DAME

The Contents of the Universe (or/ what do we mean by dark matter and dark energy?)

Astronomy across the spectrum: telescopes and where we put them. Martha Haynes Exploring Early Galaxies with the CCAT June 28, 2012

Modern Image Processing Techniques in Astronomical Sky Surveys

New Astronomy With a Virtual Observatory

Nearby Galaxy Clusters with UVIT/Astrosat

Fermi: Highlights of GeV Gamma-ray Astronomy

Feeding the Beast. Chris Impey (University of Arizona)

IRS Spectroscopy of z~2 Galaxies

Virtual Observatory: Observational and Theoretical data

10 Years. of TeV Extragalactic Science. with VERITAS. Amy Furniss California State University East Bay

PERSPECTIVES of HIGH ENERGY NEUTRINO ASTRONOMY. Paolo Lipari Vulcano 27 may 2006

Holdings and distribution

International Olympiad on Astronomy and Astrophysics (IOAA)

WISE - the Wide-field Infrared Survey Explorer

GLAST. Exploring the Extreme Universe. Kennedy Space Center. The Gamma-ray Large Area Space Telescope

ESASky. Bruno Merín Sara Nieto Elena Racero Jesús Salgado María Henar Sarmiento Pilar de Teodoro

Type II Supernovae as Standardized Candles

These notes may contain copyrighted material! They are for your own use only during this course.

The Bright Side of the X-ray Sky The XMM-Newton Bright Survey. R. Della Ceca. INAF Osservatorio Astronomico di Brera,Milan

Gamma-ray Astronomy Missions, and their Use of a Global Telescope Network

Learning Objectives: Chapter 13, Part 1: Lower Main Sequence Stars. AST 2010: Chapter 13. AST 2010 Descriptive Astronomy

Standards for Planetarium Audiovisuals

Introduction to the Sloan Survey

GOODS/VIMOS Spectroscopy: Data Release Version 2.0.1

Part two of a year-long introduction to astrophysics:

Astr Resources

Chapter 5. Telescopes. Copyright (c) The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

VOSA. A VO Spectral Energy Distribution Analyzer. Carlos Rodrigo Blanco 1,2 Amelia Bayo Arán 3 Enrique Solano 1,2 David Barrado y Navascués 1

Multi-wavelength Surveys for AGN & AGN Variability. Vicki Sarajedini University of Florida

Building a 4-D Weather Data Cube for the NextGen Initial Operating Capability (IOC)

Science with the New Hubble Instruments. Ken Sembach STScI Hubble Project Scientist

6. Star Colors and the Hertzsprung-Russell Diagram

Investigating the Infrared Properties of Candidate Blazars. Jessica Hall

CHEMICAL ABUNDANCE ANALYSIS OF RC CANDIDATE STAR HD (46 LMi) : PRELIMINARY RESULTS

arxiv:astro-ph/ v1 6 Dec 1999

GLAST LAT Multiwavelength Studies Needs and Resources

Galaxies: Structure, formation and evolution

Transcription:

NED Mission: Provide a comprehensive, reliable and easy-to-use synthesis of multi-wavelength data from NASA missions, published catalogs, and the refereed literature, to enhance and enable astrophysical research beyond the Milky Way. Rick Ebert & Joseph Mazzarella For the NED Team Big Data Task Force NASA, Ames Research Center 2016 September 28-30 2016 California Institute of Technology

NED Overview Published NED is more than an archive, it is a comprehensive and current census of the universe. NED plays a key role in maximizing the science return from NASA missions by synthesizing NASA data with complementary data across the spectrum. Distilling, vetting, and addition of derived information simplifies and accelerates research. Keep current by continuously integrating new information from literature, surveys and mission archives. NAC Big Data Task Force NED synthesizes key measurements from >100k articles and catalogs 22 NASA missions: 2MASS, BLAST, Chandra, Compton, Einstein, Fermi, GALEX, HEAO, Herschel, HST, INTEGRAL, IRAS, NuSTAR, Planck, ROSAT, RXTE, Spitzer, Suzaku, UIT, WISE, WMAP, XMM-Newton Holdings 214 million objects 6.97 billion data records 7 TByte database 700 GBytes of contributed images & spectra ned.ipac.caltech.edu 2 Object name Coordinates Redshifts DMpc Fluxes Sizes Attributes References Notes Contributed Images Spectra Derived Cross-IDs Distances Metric sizes! SEDs Luminosities! Aλ Velocity corrections Cosmological corrections

Improving NED with Technology Our primary task is to run a robust operational system with continuous, distributed client connections, applying new technology where appropriate. User Interface Using web application framework/ CMS Drupal Leverages extensive base of community software and expertise NED Form Engine builds data driven user interface forms Simple Search for common queries Machine Learning Initial objective: - Classify e-journal articles for suitability to NED Challenges: - Data are presented differently - Capturing expertise Current focus: - Evaluate various approaches - Rule-base training Future: - apply Machine Learning to data prospecting by NED users - literature data extraction & object classification Database Classification of data types in articles Ongoing: refactoring and extensions to improve scalability, extensibility, and enable data prospecting, classification & mining Future: Leverage extensions for spatial indexing NAC Big Data Task Force 3

Big Data Implications Volume Over 2016-2019, NED is growing by > 10x to support NASA science with data joined for > 1 billion objects with > 10 billion attributes Response: Refactoring database schema to improve scalability, extensibility, and data driven design Velocity Rising data rate from sky surveys & literature Keeping current is essential Response: Parallel processing of cross- IDs, including new pipeline, and small pilot project in machine learning Variety 22+ NASA mission data sets γ rays through radio frequencies wide variety of data-types 100k+ articles & catalogs each with unique data presentation Response: advancing data capture & fusion techniques Veracity Peer-reviewed literature is a primary data source. Data often not refereed to the same extent results. Response: Scientific expertise to make big data better Vetting, communicating with authors on issues Published document Best Practices for Data Publication by the NED Team for journals Instructions for Authors NAC Big Data Task Force 4

Science with NED-Today The Event Horizon of M87" by Avery E. Broderick et al. 2015, ApJ, 805, 179 Discovery of a new class of super-luminous spiral galaxies This previously unknown class challenges theories of galaxy formation and evolution. The discovery is based entirely on data joined in NED. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 NASA press release 2016 March 17 Patrick Ogle et al. 2016, ApJ, 817, 109 NAC Big Data Task Force 5 This work would not have been practical without extensive use of the NASA/IPAC Extragalactic Database (NED). NED is cited in 20% of Astronomical Telegrams published Jan-Jun 2016 Used extensively at the telescope. E.g., identification of host galaxies of newly discovered supernovae

Science with NED-Future Vision: Evolve the database and user interfaces to facilitate statistical studies of panchromatic data fused for billions of galaxies and AGN observed across cosmic time Enabling analysis of Spectral Energy Distributions with broad coverage for millions of galaxies. Enable powerful data prospecting queries with information joined across missions Build a Spectral Energy Distribution of WFIRST galaxies Find published galaxy clusters with > 30 members Find galaxies with Luminosity > 3.2*10 11 [L ] Find objects with multi-mission flux ratio f(galex FUV)/f(Spitzer 25 um) > 1E-04 Develop visualizations of content and completeness to support statistical studies, and better framed queries. NAC Big Data Task Force 6

Impact Metrics Every day NED serves over 70,000 queries and is acknowledged by 2 peer-reviewed articles. Usage NED Literature citations A record 730 peer-reviewed articles and 264 telegrams acknowledged using NED in 2015 Trend of incorporating NED access in tools, services, and automated processes is reflected in usage. Comparable to a NASA Great Observatory 26 million requests in 2015 994 in 2015 2016 NASA Group Achievement Award for outstanding contributions to NASA s strategic goals in astrophysics and substantial impact on extragalactic research measured by citations in the peer-reviewed literature. NAC Big Data Task Force 7

Question 1: Planning What are the processes for planning for future (5-10 years) capabilities of your service? How and from whom do you gather input for this planning process and where does input typically come from? NASA Astrophysics Archive Senior Review Process Assessment of relevance to NASA Astrophysics Roadmap Recommendations from NED Users Committee (NUC) User Surveys and Outreach at science conferences. What new feature(s) do you really want to implement? Statistical algorithms and Machine Learning analysis integrated and applied directly to the content of NED Multi-dimensional exploration tools and interfaces NAC Big Data Task Force 8

Question 2: Inertia What feature(s) of your service would you like to stop performing? Bibliographic searches by author name and serving journal abstracts How do you gather input for making such decisions and where does input typically come from? Consultation with our User Committee, User Surveys, Science outreach NASA Astrophysics Roadmap What is preventing you from stopping? This is in progress, and requires: Extending current interoperability between NED and Astrophysics Data System (ADS) NED providing additional metadata about journal article data to ADS ADS providing interface filters specific to extragalactic data tagged/categorized by NED NAC Big Data Task Force 9

Question 3: Interoperability (1) What steps you are taking to make your data interoperable with allied data sets from other data sites in and out of NASA? NED provides an interoperability component used by other archives and tools. Object information and cross-identifications Name resolution Links in results to the originating and other allied archive data sets. We maintain this by Participation in NASA Astrophysics Data-Center Executive Committee (ADEC) Participation in NASA Astronomy Virtual Observatory (NAVO) Use of IVOA query and data exchange recommendations where applicable Streamlining connectivity with open, community data analysis tools Development of NED specific APIs: Name Resolver; ObjectLookup; JSON NAC Big Data Task Force 10

Question 3: Interoperability (2) How do you find allied data sets and what criteria make data sets candidates for enabling interoperability? Data sets are selected by scanning e-journal articles and monitoring NASA mission data product publication. Criteria: Contains published and peer reviewed information Contains newly identified extragalactic objects New or improved measured attributes for objects already in NED Contains well formed data and metadata NAC Big Data Task Force 11

NED provides diverse and rich data synthesis on a large scale global connections to the astronomy community key interoperability services to other data centers tools and data to accelerate extragalactic research NED is growing more than 10-fold in the next few years to facilitate panchromatic research on billions of objects. NED is continuously seeking improvements for our next generation services and the technology to achieve them. NAC Big Data Task Force 12