Astroinformatics: massive data research in Astronomy Kirk Borne Dept of Computational & Data Sciences George Mason University

Similar documents
Scientific Data Flood. Large Science Project. Pipeline

Surprise Detection in Science Data Streams Kirk Borne Dept of Computational & Data Sciences George Mason University

Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University

Astronomy of the Next Decade: From Photons to Petabytes. R. Chris Smith AURA Observatory in Chile CTIO/Gemini/SOAR/LSST

Astronomical Notes. Astronomische Nachrichten. A machine learning classification broker for the LSST transient database

The New LSST Informatics and Statistical Sciences Collaboration Team. Kirk Borne Dept of Computational & Data Sciences GMU

Real Astronomy from Virtual Observatories

Machine Learning Applications in Astronomy

Transient Alerts in LSST. 1 Introduction. 2 LSST Transient Science. Jeffrey Kantor Large Synoptic Survey Telescope

How do telescopes work? Simple refracting telescope like Fuertes- uses lenses. Typical telescope used by a serious amateur uses a mirror

Doing astronomy with SDSS from your armchair

13 June Board on Research Data & Information (BRDI)

Astroinformatics in the data-driven Astronomy

The Large Synoptic Survey Telescope

Physics Lab #10: Citizen Science - The Galaxy Zoo

Taking the census of the Milky Way Galaxy. Gerry Gilmore Professor of Experimental Philosophy Institute of Astronomy Cambridge

LARGE SYNOPTIC SURVEY TELESCOPE AZ AAPT March 2011

LSST. Pierre Antilogus LPNHE-IN2P3, Paris. ESO in the 2020s January 19-22, LSST ESO in the 2020 s 1

How do telescopes "see" on Earth and in space?

Astronomy 1. 10/17/17 - NASA JPL field trip 10/17/17 - LA Griffith Observatory field trip

Large Synoptic Survey Telescope

Heidi B. Hammel. AURA Executive Vice President. Presented to the NRC OIR System Committee 13 October 2014

Science advances by a combination of normal science and discovery of anomalies.

THE MILKY WAY GALAXY BACKGROUND READING FOR MIDDLE AND HIGH SCHOOL SCIENCE

An end-to-end simulation framework for the Large Synoptic Survey Telescope Andrew Connolly University of Washington

Optical Synoptic Telescopes: New Science Frontiers *

Introduction: Don t get bogged down in classical (grade six) topics, or number facts, or other excessive low-level knowledge.

Skyalert: Real-time Astronomy for You and Your Robots

Topics and questions for astro presentations

How Did the Universe Begin?

Astronomy. Optics and Telescopes

1 What s Way Out There? The Hubble Ultra Deep Field

Directed Reading. Section: Viewing the Universe THE VALUE OF ASTRONOMY. Skills Worksheet. 1. How did observations of the sky help farmers in the past?

V. Astronomy Section

Kitt Peak Nightly Observing Program

Welcome to AURA Observatory in Chile

9 Exploring GalaxyZoo One

Parametrization and Classification of 20 Billion LSST Objects: Lessons from SDSS

Time Domain Astronomy in the 2020s:

Collecting Light. In a dark-adapted eye, the iris is fully open and the pupil has a diameter of about 7 mm. pupil

GalaxyZoo and the Zooniverse of Astronomy Citizen Science

Foreword. Taken from: Hubble 2009: Science Year in Review. Produced by NASA Goddard Space Flight Center and the Space Telescope Science Institute.

What is astronomy actually? These are good questions and worthy of an answer.

Real-time Variability Studies with the NOAO Mosaic Imagers

A Look Back: Galaxies at Cosmic Dawn Revealed in the First Year of the Hubble Frontier Fields Initiative

RLW paper titles:

Lecture 25: Cosmology: The end of the Universe, Dark Matter, and Dark Energy. Astronomy 111 Wednesday November 29, 2017

The Future of Cosmology

APS Science Curriculum Unit Planner

New Astronomy With a Virtual Observatory

The Galaxy Zoo Project

Telescopes 3 Feb. Purpose

Igor Soszyński. Warsaw University Astronomical Observatory

LSST Science. Željko Ivezić, LSST Project Scientist University of Washington

ANTARES: The Arizona-NOAO Temporal Analysis and Response to Events System

Cosmic acceleration. Questions: What is causing cosmic acceleration? Vacuum energy (Λ) or something else? Dark Energy (DE) or Modification of GR (MG)?

Using Authentic Astronomical Data in Investigations & Activities. Version 1.1, presented at CONASTA 59, UTS, Monday 5 July 2010.

W. M. KECK OBSERVATORY AWARDED NSF GRANT TO DEVELOP NEXT-GENERATION ADAPTIVE OPTICS SYSTEM

Studying the universe

What shall we learn about the Milky Way using Gaia and LSST?

Galaxies and the Universe

Astro-lab at the Landessternwarte Heidelberg. Overview astro-lab & introduction to tasks. Overview astro-lab

Sample file. Solar System. Author: Tina Griep. Understanding Science Series

From DES to LSST. Transient Processing Goes from Hours to Seconds. Eric Morganson, NCSA LSST Time Domain Meeting Tucson, AZ May 22, 2017

Galactic Census: Population of the Galaxy grades 9 12

TMW MEDIA GROUP, INC. RELEASES THE WONDERS OF ASTRONOMY & SPACE

Active Galaxies and Galactic Structure Lecture 22 April 18th

Table of Contents and Executive Summary Final Report, ReSTAR Committee Renewing Small Telescopes for Astronomical Research (ReSTAR)

A Vision for the US Virtual Astronomical Observatory. October D. De Young (NOAO), VAO Project Scientist

1. The symbols below represent the Milky Way galaxy, the solar system, the Sun, and the universe.

From the Big Bang to Big Data. Ofer Lahav (UCL)

Overview Students read about the structure of the universe and then compare the sizes of different objects in the universe.

BAYESIAN CROSS-IDENTIFICATION IN ASTRONOMY

ASTRONOMY 202 Spring 2007: Solar System Exploration

Lecture Outlines. Chapter 25. Astronomy Today 7th Edition Chaisson/McMillan Pearson Education, Inc.

LCO Global Telescope Network: Operations and policies for a time-domain facility. Todd Boroson

The Main Point. Familiar Optics. Some Basics. Lecture #8: Astronomical Instruments. Astronomical Instruments:

LSST, Euclid, and WFIRST

Esri and GIS Education

Large Synoptic Survey Telescope, Computational-science Requirements of. George Beckett, 2 nd September 2016

ASTRONOMY (ASTRON) ASTRON 113 HANDS ON THE UNIVERSE 1 credit.

National Aeronautics and Space Administration. Glos. Glossary. of Astronomy. Terms. Related to Galaxies

Chapter 26. Objectives. Describe characteristics of the universe in terms of time, distance, and organization

Dark Energy Research at SLAC

Contents. Part I Developing Your Skills

The Zadko Telescope: the Australian Node of a Global Network of Fully Robotic Follow-up Telescopes

GraspIT Questions AQA GCSE Physics Space physics

Paolo Giommi Senior Scientist, International Relations Unit Italian Space Agency. 11 Apr 2017

Galaxies & Introduction to Cosmology

Cosmology with Wide Field Astronomy

The Milky Way Galaxy is Heading for a Major Cosmic Collision

TELESCOPES. How do they work?

Astronomy Today. Eighth edition. Eric Chaisson Steve McMillan

arxiv:astro-ph/ v1 3 Aug 2004

AURA MEMBER INSTITUTIONS

Examining Dark Energy With Dark Matter Lenses: The Large Synoptic Survey Telescope. Andrew R. Zentner University of Pittsburgh

Fairfield Public Schools Science Curriculum Science of the Cosmos

A Ramble Through the Night Sky

Picturing the Universe. How Photography Revolutionized Astronomy

Telescopes: Portals of Discovery Pearson Education, Inc.

Transcription:

Astroinformatics: massive data research in Astronomy Kirk Borne Dept of Computational & Data Sciences George Mason University kborne@gmu.edu, http://classweb.gmu.edu/kborne/

Ever since humans first gazed into the heavens

Astronomy has been a Data-Driven Science

From Data-Driven to Data-Intensive Astronomy has always been a data-driven science It is now a data-intensive science It will become even more data-intensive in the coming decade(s). Hence, a new field of data-oriented research and education in Astronomy is emerging: Astroinformatics

Vermeer s Astronomer & Geographer (2 mappers, collaborating on Astroinformatics and Geoinformatics)

Astronomy Data Environment: Sky Surveys To avoid biases caused by limited samples, astronomers now study the sky systematically = Sky Surveys Surveys are used to measure and collect data from all objects that are contained in large regions of the sky, in a systematic, controlled, repeatable fashion. These surveys include (... this is just a subset): MACHO and related surveys for dark matter objects: ~ 1 Terabyte Digitized Palomar Sky Survey: 3 Terabytes 2MASS (2-Micron All-Sky Survey): 10 Terabytes GALEX (ultraviolet all-sky survey): 30 Terabytes Sloan Digital Sky Survey (1/4 of the sky): 40 Terabytes and this one is just starting: Pan-STARRS: 40 Petabytes! Leading up to the big survey next decade: LSST (Large Synoptic Survey Telescope): 100 Petabytes!

The LSST : a better sky survey

LSST = Large Synoptic Survey Telescope http://www.lsst.org/ (mirror funded by private donors) 8.4-meter diameter primary mirror = 10 square degrees! Hello! (design, construction, and operations of telescope, observatory, and data system: NSF) (camera: DOE)

LSST = Large Synoptic Survey Telescope http://www.lsst.org/ (mirror funded by private donors) 8.4-meter diameter primary mirror = 10 square degrees! Hello! (design, construction, and operations of telescope, observatory, and data system: NSF) (camera: DOE)

LSST Key Science Drivers: Mapping the Universe Solar System Map (moving objects, NEOs, asteroids: census & tracking) Nature of Dark Energy (distant supernovae, weak lensing, cosmology) Optical transients (of all kinds, with alert notifications within 60 seconds) Galactic Structure (proper motions, stellar populations, star streams) LSST in time and space: When? 2016-2026 Where? Cerro Pachon, Chile Model of LSST Observatory

Observing Strategy: One pair of images every 40 seconds for each spot on the sky, then continue across the sky continuously every night for 10 years (2016-2026), with time domain sampling in log(time) intervals (to capture dynamic range of transients). LSST (Large Synoptic Survey Telescope): Ten-year time series imaging of the night sky mapping the Universe! 100,000 events each night anything that goes bump in the night! Cosmic Cinematography! The New Sky! @ http://www.lsst.org/ Education and Public Outreach have been an integral and key feature of the project since the beginning the EPO program includes formal Ed, informal Ed, Citizen Science projects, and Science Centers / Planetaria.

The LSST focal plane array Camera Specs: (pending funding from the DOE) 201 CCDs @ 4096x4096 pixels each! = 3 Gigapixels = 6 GB per image, covering 10 sq.degrees = ~3000 times the area of one Hubble Telescope image LSST Data Challenges Obtain one 6-GB sky image in 15 seconds Process that image in 5 seconds Obtain & process another co-located image for science validation within 20 s (= 15-second exposure + 5-second processing & slew) Process the 100 million sources in each image pair, catalog all sources, and generate worldwide alerts within 60 seconds (e.g., incoming killer asteroid) Generate 100,000 alerts per night (VOEvent messages) Obtain 2000 images per night Produce ~30 Terabytes per night Move the data from South America to US daily Repeat this every day for 10 years (2016-2026) Provide rapid DB access to worldwide community: 100-200 Petabyte image archive 20-40 Petabyte database catalog 12

The LSST Data Challenges Massive data stream: ~2 Terabytes of image data per hour that must be mined in real time (for 10 years). Massive 20-Petabyte database: more than 50 billion objects need to be classified, and most will be monitored for important variations in real time. Massive event stream: knowledge extraction in real time for 100,000 events each night.

Informatics-based Science Education Informatics enables transparent reuse and analysis of scientific data in inquiry-based classroom learning (http://serc.carleton.edu/usingdata/). Students are trained: to access large distributed data repositories to conduct meaningful scientific inquiries into the data to mine and analyze the data to make data-driven scientific discoveries The 21 st century workforce demands training and skills in these areas, as all agencies, businesses, and disciplines are becoming flooded with data. Numerous Data Sciences programs now starting at several universities (GMU, Caltech, RPI, Vanderbilt, Michigan, Cornell, ). CODATA ADMIRE initiative: Advanced Data Methods and Information technologies for Research and Education

Citizen Science Exploits the cognitive abilities of Human Computation! Citizen Science = Volunteer Science = Participatory Science Crowdsourcing for science! e.g., Galaxy Zoo @ http://www.galaxyzoo.org/ Citizen science refers to the involvement of volunteer nonprofessionals in the research enterprise. The Citizen Science experience must be engaging, must work with real scientific data/information (all of it), must not be busy-work (all clicks must count), must address authentic science research questions that are beyond the capacity of science teams and enterprises, and must involve the scientists.

The Zooniverse: http://zooniverse.org/ Advancing Science through User-Guided Learning in Massive Data Streams Building a framework for new Citizen Science projects Includes user-based research tools Science domains: Astronomy (Galaxy Merger Zoo) The Moon (Lunar Reconnaissance Orbiter) The Sun (STEREO dual spacecraft) Egyptology (the Papyri Project) and more ( accepting proposals from community)

Concluding comment: Why do this? Ans: to respond to the science data flood X-Informatics (e.g., X = Bio, Geo, Astro, ) (in silico): addresses the scientific data lifecycle challenges in the era of dataintensive science and the data flood defines lightweight ontologies, semantics, taxonomies, concepts, content descriptors for a science domain for the purpose of organizing, accessing, searching, fusing, integrating, mining, and analyzing massive data repositories. Citizen Science (user-guided, informatics-powered): Human computation (e.g., tagging, labeling, classification) characterized by enormous cognitive capacity and pattern recognition efficiency (carbon-based computing) Semantic e-science and Volunteer Citizen Science Tagging everything, everywhere: Analytics in the Cloud