Complete Classification Conundrum

Similar documents
Studies of Peculiar Black Holes with Gemini Telescopes

International Science Development Teams (ISDT) Time-domain Science. Masaomi Tanaka (National Astronomical Observatory of Japan)

Variable stars with LSST

The Gaia mission and the variable objects

arxiv: v1 [astro-ph.sr] 25 Apr 2018

Connec&ng Bayesian Networks and Seman&cs (as a step towards complete classifica&on)

Science Olympiad UW- Milwaukee Regional. Astronomy Test

ISDT Report Time-domain Science

Transient and Variable Star Science with LSST Lucianne M. Walkowicz University of California, Berkeley

Igor Soszyński. Warsaw University Astronomical Observatory

Large Synoptic Survey Telescope

Classification of CoRoT Exofield Light Curves

Astronomy and Astrophysics

S. Basa, Laboratoire d Astrophysique de Marseille, France

Physics Lab #6: Citizen Science - Variable Star Zoo

Intrinsic variable types on the Hertzsprung Russell diagram. Image credit: Wikipedia

arxiv:astro-ph/ v1 3 Aug 2004

Lecture 10. Advanced Variable Star Stuff. March :00 PM BMPS 1420

Towards Real-time Classification of Astronomical Transients

Galaxies. Galaxy Diversity. Galaxies, AGN and Quasars. Physics 113 Goderya

The BRITE satellite and Delta Scuti Stars: The Magnificent Seven

Chapter 9. Stars. The Hertzsprung-Russell Diagram. Topics for Today s Class. Phys1411 Introductory Astronomy Instructor: Dr.

Planets around evolved stellar systems. Tom Marsh, Department of Physics, University of Warwick

The structure and evolution of stars. Learning Outcomes

Compact Pulsators Observed by Kepler

Chapter 14 The Milky Way Galaxy

Automated Variable Source Classification: Methods and Challenges

Results of the OGLE-II and OGLE-III surveys

From the Big Bang to Big Data. Ofer Lahav (UCL)

ASTRONOMY. MINNESOTA REGIONS 2011 by Michael Huberty

Astronomy C. Madison Brady

The Deaths of Stars. The Southern Crab Nebula (He2-104), a planetary nebula (left), and the Crab Nebula (M1; right), a supernova remnant.

Science Alerts from GAIA. Simon Hodgkin Institute of Astronomy, Cambridge

arxiv: v1 [astro-ph] 21 Dec 2007

Chapter 8: The Family of Stars

The Era of Synoptic Surveys. Peter Nugent (LBNL)

Time Domain Astronomy in the 2020s:

Friday, March 21, 2014 Reading for Exam 3: End of Section 6.6 (Type Ia binary evolution), 6.7 (radioactive decay), Chapter 7 (SN 1987A), NOT Chapter

Star systems like our Milky Way. Galaxies

Time-Series Photometric Surveys: Some Musings

The Large Synoptic Survey Telescope

Page # Astronomical Distances. Lecture 2. Astronomical Distances. Cosmic Distance Ladder. Distance Methods. Size of Earth

Evolution of Stars Population III: Population II: Population I:

arxiv: v2 [astro-ph.sr] 19 Jul 2011

High Resolution X-Ray Spectra A DIAGNOSTIC TOOL OF THE HOT UNIVERSE

ASTR STELLAR PULSATION COMPONENT. Peter Wood Research School of Astronomy & Astrophysics

A new method to search for Supernova Progenitors in the PTF Archive. Frank Masci & the PTF Collaboration

Potential Synergies Between MSE and the ELTs A Purely TMT-centric perspective But generally applicable to ALL ELTs

The Science Cases for CSTAR, AST3, and KDUST

Characterization of variable stars using the ASAS and SuperWASP databases

This Week in Astronomy

The 2006 Outburst of RS Oph: What are the questions that need to be discussed --and answered?

List of submitted target proposals

Photometric variability of luminous blue variable stars on different time scales

arxiv: v2 [astro-ph] 8 Sep 2008

Goal - to understand what makes supernovae shine (Section 6.7).

Astronomy 102: Stars and Galaxies Examination 3 April 11, 2003

Machine Learning Applications in Astronomy

The SWARMS Survey: Unveiling the Galactic population of Compact White Dwarf Binaries

DASCH To explore the ~1wk ~100y )mescale and extreme variability of stellar and non stellar objects

2019 Astronomy Team Selection Test

2. Can observe radio waves from the nucleus see a strong radio source there Sagittarius A* or Sgr A*.

Science Olympiad Astronomy C Division Event National Exam

(Slides for Tue start here.)

Dead & Variable Stars

Science Olympiad Astronomy C Division Event Golden Gate Invitational February 11, 2017

Distant Type Ia supernova Refsdal gravitationally lensed, appears multiple places and times.

Astronomy in the news? Government shut down. No astronomy picture of the day

Pulsating stars plethora of variables and observational tasks

CRTS and the VOEvent Networks

Science Olympiad Astronomy C Division Event. National Cathedral School Invitational Washington, DC December 3 rd, 2016

Basics, types Evolution. Novae. Spectra (days after eruption) Nova shells (months to years after eruption) Abundances

The global flow reconstruction of chaotic stellar pulsation

Asterosismologia presente e futuro. Studio della struttura ed evoluzione delle stelle per mezzo delle oscillazioni osservate sulla superficie

Stellar Astrophysics: Stellar Pulsation

Observations of Pulsating Stars

Stellar Astrophysics: Pulsating Stars. Stellar Pulsation

Learning Objectives. distances to objects in our Galaxy and to other galaxies? apparent magnitude key to measuring distances?

Introduction to AGN. General Characteristics History Components of AGN The AGN Zoo

ASTR Midterm 2 Phil Armitage, Bruce Ferguson

Dark Matter ASTR 2120 Sarazin. Bullet Cluster of Galaxies - Dark Matter Lab

Relativity and Astrophysics Lecture 15 Terry Herter. RR Lyrae Variables Cepheids Variables Period-Luminosity Relation. A Stellar Properties 2

Introduction to SDSS -instruments, survey strategy, etc

! p. 1. Observations. 1.1 Parameters

PESSTO The Public ESO Spectroscopic Survey for Transient Objects

PENNSYLVANIA SCIENCE OLYMPIAD STATE FINALS 2012 ASTRONOMY C DIVISION EXAM APRIL 27, 2012

A zoo of transient sources. (c)2017 van Putten 1

Evidence for Stellar Evolution

Observing Stellar Evolution Observing List

Lecture 11 Quiz 2. AGN and You. A Brief History of AGN. This week's topics

Science Olympiad Astronomy C Division Event MIT Invitaitonal

Science Olympiad Astronomy Event Division C. Supervisor: JoDee Baker.

Data challenges of the Virtual Observatory in Time Domain Astronomy

Techniques for measuring astronomical distances generally come in two variates, absolute and relative.

Powering the Universe with Supermassive Black Holes. Steve Ehlert and Paul Simeon

Active galaxies. Some History Classification scheme Building blocks Some important results

Lecture 13: Binary evolution

ASTRONOMY II Spring 1995 FINAL EXAM. Monday May 8th 2:00pm

The Death of Stars. Today s Lecture: Post main-sequence (Chapter 13, pages ) How stars explode: supernovae! White dwarfs Neutron stars

SCIENTIFIC CASES FOR RECEIVERS UNDER DEVELOPMENT (OR UNDER EVALUATION)

Transcription:

Complete Classification Conundrum Ashish Mahabal aam@astro.caltech.edu Center for Data-Driven Discovery (CD^3), Caltech Collaborators: CRTS, PTF, LSST, SAMSI, IUCAA teams SCMA VI, CMU, 20160607

Sky Maps of a few (optical) surveys CRTS PTF Gaia LSST

Stars, Milky Way, and Local Volume Statistics and Informatics Dark energy Strong Lensing Solar System Galaxies Transients and Variable Stars Large Scale Structure/Baryon Oscillation

Stars, Milky Way, and Local Volume Statistics and Informatics Dark energy Strong Lensing Solar System Galaxies Transients and Variable Stars Large Scale Structure/Baryon Oscillation

What is a transient? Fast transient (flaring dm), CSS080118:112149 131310 One that has a large brightness change (delta-magnitude) within a short timespan (small delta-time) light-curve

Challenge 1: Characterize/Classify as much with as little data as possible Variability Tree Extrinsic Intrinsic Asteroids AGN Rotation Eclipse Stars Stars Radio quiet RQQ Seyfert I Seyfert 2 LINER Radio loud RLQ BLRG Blazar NLRG WLRG BL Lac OVV Microlensing Eclipse Rotation Eclipse Eruptive Cataclysmic Pulsation Secular Asteroid occultation EA Eclipsing binary EB EW Planetary transits Credit : L. Eyer & N. Mowlavi (03/2009) (updated 04/2013) ELL FKCOM Single red giants β Per, α Vir SXA SX Arietis MS (B0-A7) with strong B fields ACV RCB BY Dra RS CVn Binary red giants α 2 Canes Venaticorum MS (B8-A7) with strong B fields DY Per UV Ceti Red dwarfs (K-M stars) FU PMS GCAS Be stars ZAND Symbiotic SN Ia SN II, Ib, Ic Supernovae WR LBV S Dor SPBe λ Eri N Novae ACYG α Cygni Hot OB Supergiants β Cephei UG Dwarf novae BCEP SPB PG 1159 (DO,V GW Vir) He/C/O-WDs Slowly pulsating B stars (PG1716+426, Betsy) long period sdb V777 Her (DBV) He-WDs SX Phoenicis PV Tel He star V361 Hya (EC14026) short period sdb V1093 Her SXPHE ZZ Ceti (DAV) H-WDs PMS δ Scuti DST δ Scuti Solar-like roap Photom. RR Lyrae GDOR FG Sge Sakurai, V605 Aql γ Doradus δ Cepheids SR M CW Miras Semiregulars L SARV Irregulars Small ampl. red var. RR CEP RV Period. R Hya (Miras) δ Cep (Cepheid) RV Tau (W Vir) Type II Ceph. Despite the heterogeneity, gaps, heteroskedasticity

Challenge 1: Characterize/Classify as much with as little data as possible Variability Tree Extrinsic Intrinsic Asteroids AGN Rotation Eclipse Stars Stars Radio quiet RQQ Seyfert I Seyfert 2 LINER Radio loud RLQ BLRG Blazar NLRG WLRG BL Lac OVV Microlensing Eclipse Rotation Eclipse Eruptive Cataclysmic Pulsation Secular Asteroid occultation EA Eclipsing binary EB EW Planetary transits Credit : L. Eyer & N. Mowlavi (03/2009) (updated 04/2013) ELL FKCOM Single red giants β Per, α Vir SXA SX Arietis MS (B0-A7) with strong B fields ACV RCB BY Dra RS CVn Binary red giants α 2 Canes Venaticorum MS (B8-A7) with strong B fields DY Per UV Ceti Red dwarfs (K-M stars) FU PMS GCAS Be stars ZAND Symbiotic SN Ia SN II, Ib, Ic Supernovae WR LBV S Dor SPBe λ Eri N Novae ACYG α Cygni Hot OB Supergiants β Cephei UG Dwarf novae BCEP SPB PG 1159 (DO,V GW Vir) He/C/O-WDs Slowly pulsating B stars (PG1716+426, Betsy) long period sdb V777 Her (DBV) He-WDs SX Phoenicis PV Tel He star V361 Hya (EC14026) short period sdb V1093 Her SXPHE ZZ Ceti (DAV) H-WDs PMS δ Scuti DST δ Scuti Solar-like roap Photom. RR Lyrae GDOR FG Sge Sakurai, V605 Aql γ Doradus δ Cepheids SR M CW Miras Semiregulars L SARV Irregulars Small ampl. red var. RR CEP RV Period. R Hya (Miras) δ Cep (Cepheid) RV Tau (W Vir) Type II Ceph. No transient left behind

Example CRTS Transients CSS090429:135125-075714 Flare star CSS090429:101546+033311 Dwarf Nova CSS090426:074240+544425 Blazar, 2EG J0744+5438 Different phenomena look the same!

AGN Variability - different perspectives CRTS Kepler

Truncation and Censoring

19 20 21 0 1000 2000 Day Magnitude AGN 19.0 19.5 20.0 20.5 21.0 0 1000 2000 Day Magnitude SN 14 16 18 20 0 1000 2000 Day Magnitude Flare 16 17 18 19 20 21 0 1000 2000 Day Magnitude Non Transient When regressing base rates should not be forgotten. - What You See Is All There Is (WYSIATI) Faraway, Mahabal et al. 2015

500 Million Light Curves with ~ 10 11 data points RR Lyrae W Uma Flare star (UV Ceti) Eclipsing CV Blazar CRTS PIs Djorgovski, Drake

Challenge 2: Only a small fraction are rare - find/characterize them early CRTS 10+ year status Current Status: Few tens of transients per night Future (LSST): 10^6-10^7 per night; 10^4 per minute That is why we need automatic classification algorithms

Variability on huge range of timescales Class Timescale Amplitude ( mags) WD Pulsations 4-10 min 0.01-0.1 AM CVn (orbital period) 10-65 min 0.1-1 WD spin (int. polars) 20-60 min 0.02-0.4 AM CVn outbursts 1-5 days 2-5 Dwarf Novae outburst 4 days - 30 years 2-8 Symbiotic (outburst) weeks-months 1-3 Novae-like high/low days-years 2-5 Recurrent Novae 10-20 year 6-11 Novae 10 3-10 4 yr 7-15 Slide from Lucianne Walkowicz

Expected Rate of Transients Class Mag t (days) Universal Rate LSST Rate Luminous SNe -19...-23 50-400 10-7 Mpc -3 yr -1 20000 Orphan Afterglows SHB -14...-18 5-15 3 x10-7...-9 Mpc -3 yr -1 ~10-100 Orphan Afterglows LSB -22...-26 2-15 3 x 10-10...-11 Mpc -3 yr -1 1000 On-axis GRB afterglows...-37 1-15 10-11 Mpc -3 yr -1 ~50 Tidal Disruption Flares -15...-19 30-350 10-6 Mpc -3 yr -1 6000 Luminous Red Novae -9...-13 20-60 10-13 yr -1 Lsun -1 80-3400 Fallback SNe -4...-21 0.5-2 <5 x 10-6 Mpc -3 yr -1 < 800 SNe Ia -17...-19.5 30-70 3 x 10-5 Mpc -3 yr -1 200000 SNe II -15...-20 20-300 (3..8) x 10-5 Mpc -3 yr -1 100000 Table adapted from Rau et al. 2009 by Lucianne Walkowicz

NOAO s proposed broker Antares Solar System LSST History Other catalogs Ancillary data 0.1 rare alerts/image Saha et al 1409.0056

NOAO s proposed broker Antares Solar System LSST History Other catalogs Ancillary data 0.1 rare alerts/image 2017 workshop 2016 LSST AHM Saha et al 1409.0056

Bayesian Networks Search space growth hyperexponential Very broadly speaking 5 flavors of BNs possible Naïve Tree Augmented Network (TAN) Constructed (semantics, expert knowledge etc. based) Single winner from several naïve Fully learned from data

SNe/non-SNe BN normalized Based on peaks 80-90% completeness Only archival information

Discriminating features Chengyi Lee You can not step into the same river twice.

Hierarchical approach Archival search Binary Blackholes Graham et al. 2015 CARMA/Wavelets

flux_%_mid20 flux_%_mid35 flux_%_mid50 flux_%_mid65 flux_%_mid80 freq_n_alias freq_varrat Many features - not all are independent scatter_res_raw fold_2p_slope_10% fold_2p_slope_90% QSO p2p_scatter_pfold_over_mad p2p_scatter_2praw non_qso percent_difference_flux_percentile max_slope Adam Miller medperc90_p2_p pair_slope_trend freq_signif freq_rrd skew beyond1std std freq_y_offset stetson_j stetson_k MAD Amplitude small_kurtosis p2p_scatter_over_mad percent_amplitude median_buffer_range_percentage linear_trend freq_model_min_delta_mag freq_model_max_delta_mag freq_model_phi1_phi2 p2p_ssqr_diff_over_var Resort to dimensionality reduction

Challenge 3: A variety of parameters - choose judiciously Discovery; Contextual; Follow-up; Prior Classification Whole curve measures Median magnitude (mag); mean of absolute differences of successive observed magnitude; the maximum difference magnitudes Fitted curve measures Scaled total variation scaled by number of days of observation; range of fitted curve; maximum derivative in the fitted curve Residual from fit measures The maximum studentized residual; SD of residuals; skewness of residuals; Shapiro-Wilk statistic of residuals Cluster measures Fit the means within the groups (up to 4 measurements); and then take the logged SD of the residuals from this fit; the max absolute residuals from this fit; total variation of curve based on group means scaled by range of observation

Challenge 4: real-time computation required - find ways to make that happen Recomputation of features Updating priors Ridgeway et al., arxiv: 1409.3265

Challenge 4: real-time computation required - find ways to make that happen Recomputation of features Updating priors Ridgeway et al., arxiv: 1409.3265

Challenge 5: Metaclassification - combining diverse classifiers optimally As varied classifiers are used for parts of the classification tree combing their outputs in an optimal way becomes crucial Mahabal, Donalek

Sky Maps of a few surveys CRTS PTF Gaia LSST

Why Domain Adaptation? Surveys differ in depth (aperture), filters, cadence Same (type of) objects produce different statistical features (skew, median absolute deviation etc.) Learning tends to be done on each survey separately - leading to unnecessary delays DA helps build on the otherwise untapped intersurvey synergy (think DASCH -> CRTS/ZTF/ Kepler -> LSST) Jingling Li, S Vaijanapurkar, B Bue

Feature Correlations

KIC 8462852 (aka Tabby s star aka WTF star) nir and UV flux Fading or not fading?

50K Variables from CRTS Drake et al. 2014 SMOTE and Sampling with replacement used to take care of unbalancedness

If you had just two features Feature 1 Feature 1 Feature 1 Feature 2 Feature 2 Feature 2

Geodesik Flow Kernel Integrate flow of subspace: S to T Kernel incapsulates incremental changes between subspaces Kernel converts domain specific features into invariant ones (Gong et al. 2012)

Co-Domain Adaptation Slow adaptation from S to T Add best target objects in each round Elect shared S and T subsets from training and Average Accuracy unlabelled data (Chen et al. 2011) Target Data Training % L = DS D T l U = D T u Adding a fraction of sources from the target domain to the source domain for training improves performance

Summary of challenges Characterize/Classify as much with as little data as possible Only a small fraction are rare - find/characterize them early A variety of parameters - choose judiciously Real-time computation is required - find ways to make that happen Metaclassification - combining diverse classifiers optimally

Summary of challenges Characterize/Classify as much with as little data as possible Only a small fraction are rare - find/characterize them early A variety of parameters - choose judiciously Real-time computation is required - find ways to make that happen Metaclassification - combining diverse classifiers optimally These challenges involve: Making sense of unparalleled volumes of structured and unstructured data in real-time, and Teaching machines how humans think by understanding pattern recognition when handling diverse types of data sources

Summary of challenges Characterize/Classify as much with as little data as possible Only a small fraction are rare - find/characterize them early A variety of parameters - choose judiciously Real-time computation is required - find ways to make that happen Metaclassification - combining diverse classifiers optimally These challenges involve: Making sense of unparalleled volumes of structured and unstructured data in real-time, and Teaching machines how humans think by understanding pattern recognition when handling diverse types of data sources Better tools to make sense of very sparse data and Streamlined workflows