Making precise and accurate measurements with data-driven models

Making precise and accurate measurements with data-driven models. David W. Hogg (Center for Cosmology and Particle Physics, Dept. Physics, NYU; Center for Data Science, NYU; Max-Planck-Institut für Astronomie, Heidelberg; Flatiron Institute, Simons Foundation, New York City). Hunstead Lecture, 2017 November 9.

conclusions: A data-driven model measures physical parameters of stars with better quality than any physics-driven pipeline (The Cannon; no physics is harmed). Connections to other physical systems and models. Connections to extra-solar-planet and Milky-Way science. Criticisms of vanilla machine learning. Everything open-source or public-domain. Melissa Ness (MPIA), Andy Casey (Cambridge), Anna Y. Q. Ho (Caltech), Lauren Anderson (Flatiron), Hans-Walter Rix (MPIA).

De-noising (Anderson)

Annie Jump Cannon: O B A F G K M temperature sequence! Alphabetical order is hydrogen-line-strength order. Cannon understood the temperature sequence of stars without the benefit of physical models: data-driven, non-linear dimensionality reduction; manifold learning (using a huge amount of prior knowledge). Namesake of The Cannon.

the paradoxes of contemporary physics: models are incredibly explanatory (QCD, ΛCDM, helioseismology) and yet... models are wrong (ruled out) in detail (χ²_ν). The χ² statistic is a measure of the size of your data! Data are abundant and precise; missing physics, approximation, gastrophysics. Models are fundamentally computational.
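
The remark that χ² measures the size of your data can be illustrated with a toy problem. The sketch below is not from the talk: it assumes a straight-line model fit to data that contain a small quadratic term the model omits (the hypothetical `missing` amplitude), with known Gaussian noise. For these assumed values the expected excess χ² is roughly 0.04 per data point, so the reduced χ² stays close to 1 while the significance of the misfit grows roughly as the square root of the number of data points.

```python
import numpy as np

rng = np.random.default_rng(42)

def chi2_summary(n_data, sigma=1.0, missing=0.7):
    """Fit a straight line to data that also contain a small quadratic term.

    Returns (reduced chi-squared, excess chi-squared in units of its
    expected standard deviation sqrt(2 * dof)).
    """
    x = np.linspace(-1.0, 1.0, n_data)
    truth = 1.0 + 2.0 * x + missing * x**2        # the "real" physics
    y = truth + sigma * rng.normal(size=n_data)   # noisy observations
    A = np.stack([np.ones(n_data), x], axis=1)    # model lacks the x**2 term
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    chi2 = np.sum(((y - A @ coeffs) / sigma) ** 2)
    dof = n_data - A.shape[1]
    return chi2 / dof, (chi2 - dof) / np.sqrt(2.0 * dof)

sizes = [100, 10_000, 1_000_000]
results = [chi2_summary(n) for n in sizes]
reduced = [r[0] for r in results]
significance = [r[1] for r in results]

for n, red, sig in zip(sizes, reduced, significance):
    print(f"N = {n:>9d}  reduced chi2 = {red:.3f}  excess = {sig:.1f} sigma")
```

With only 100 points the slightly wrong model is statistically acceptable; with a million points the same model is ruled out decisively, even though nothing about the model changed.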

context: Galactic archaeology. Stars populate orbits in the Milky Way: conserved actions (or chaotic equivalents). Stars are formed from particular gas clouds: stars have conserved surface abundances. The combined action-chemical space will be far more informative than either taken independently.

context: Galactic archaeology. Top priority for many new projects: Gaia & Gaia-ESO, HERMES & GALAH, SDSS-III APOGEE. Terrifying inconsistencies in current approaches. Models of stars are amazingly good... but chemical signatures are incredibly tiny.

context: extra-solar planets. Planets are measured relative to their host stars: transits, radial-velocity signals, astrometric signals. Some planet measurements are now very precise; need stellar properties for accuracy.

definition: physics-driven models (my usage). Put in everything you know: gravity, atomic and molecular transitions, radiation. Make approximations to make things computable: sub-grid models, mixing length, etc. Compute like hell.

definition: machine learning (my usage). The most extreme of data-driven models: the data is the model; none of your knowledge is relevant. Learn (fit) an exceedingly flexible model: explain or cluster the data; transformation from data to labels. Concept of non-parametrics. Concept of train, validate, and test. Many packages and implementations (and outrageous successes).

When does (vanilla) machine learning help you? Train & test situation. Training data are statistically identical to the test data: same noise amplitude, same distance or redshift distribution, same luminosity distribution; never true! Training data have accurate and precise labels. Therefore, we can't use vanilla machine learning! (Physicists rarely can.)

definition: data-driven models (my usage). Make use of things you strongly believe: noise model & instrument resolution; causal structure (shared parameters). Capitalize on huge amounts of data: exceedingly flexible model; concept of train, validate, and test. Every situation will be bespoke.

label transfer for stars. A few of your stars have good labels (from somewhere). Can you use this to label the other stars? Why would you want to do this? You don't have good models at your wavelengths? You want two surveys to be on the same system? You have some stars at high SNR, some at low SNR? You spent human time on some stars but can't on all?

stellar spectra. Stars are very close to black-bodies. To first order, a stellar spectrum depends on effective temperature T_eff, surface gravity log g, and metallicity [Fe/H]. To second order: tens of chemical abundances, rotation, turbulence, activity.

stellar spectra. All chemical information is in absorption lines corresponding to atomic and molecular transitions. Some 30 elements are visible in the best stars. Spectroscopy at R ≡ λ/Δλ > 20,000 is the primary tool.

stellar astrophysics [Figure: four example spectra, normalized flux f versus wavelength λ (Å) over roughly 15200 to 16800 Å, for stars with (T_eff, log g, [Fe/H]) = (4750, 3.0, 0.15), (4849, 2.2, -1.0), (3614, 0.4, -0.68), and (5003, 2.8, -0.71).]

SDSS-III APOGEE. Galactic archaeology. APOGEE DR12 & DR13: 156,000 stars (98,000 giants); R = 22,500 spectra in 1.5 < λ < 1.7 µm; precise RVs and stellar parameters; 15 to 19 abundances per star. (Our own home-built and special continuum normalization; ask me!) All data completely public!
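
To give a feel for what R = 22,500 means in practice, the short sketch below converts the resolving power into a wavelength resolution element and an equivalent velocity width. It assumes only the standard definition R ≡ λ/Δλ; the function names are illustrative and not part of any APOGEE software.

```python
C_KM_S = 299_792.458  # speed of light in km/s

def resolution_element(wavelength, resolving_power):
    """Width of one resolution element, in the same units as wavelength."""
    return wavelength / resolving_power

def velocity_resolution(resolving_power):
    """Velocity width of one resolution element, in km/s."""
    return C_KM_S / resolving_power

R_APOGEE = 22_500
for wavelength_angstrom in (15_000.0, 16_000.0, 17_000.0):
    delta = resolution_element(wavelength_angstrom, R_APOGEE)
    print(f"lambda = {wavelength_angstrom:.0f} A  ->  delta lambda = {delta:.3f} A")
print(f"velocity resolution = {velocity_resolution(R_APOGEE):.2f} km/s")
```

Across the 1.5 to 1.7 µm window the resolution element is a bit under one ångström, corresponding to roughly 13 km/s.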

SDSS-III APOGEE [Figure: four example APOGEE spectra, normalized flux f versus wavelength λ (Å) over roughly 15200 to 16800 Å, for stars with (T_eff, log g, [Fe/H]) = (4750, 3.0, 0.15), (4849, 2.2, -1.0), (3614, 0.4, -0.68), and (5003, 2.8, -0.71).]

train, validate, and test. Split the data into three disjoint subsets. In the training step you set the parameters of your model using the training set. The validation set is used to set hyperparameters or model complexity. In the test step you apply the model to the test set (new data) to make predictions or deliver results.
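
The three-way split described above can be sketched in a few lines. This is a generic illustration, not code from the talk: the 60/20/20 fractions, the seed, and the function name are assumptions, and in The Cannon's setting the test set is typically the rest of the survey rather than a held-out slice of the labeled stars.

```python
import numpy as np

def three_way_split(n_items, frac_train=0.6, frac_validate=0.2, seed=0):
    """Return three disjoint index arrays: train, validate, test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)          # shuffle once, then slice
    n_train = int(frac_train * n_items)
    n_validate = int(frac_validate * n_items)
    train = order[:n_train]
    validate = order[n_train:n_train + n_validate]
    test = order[n_train + n_validate:]       # everything left over
    return train, validate, test

# Example with the size of the Experiment 1 training set (543 stars).
train_idx, validate_idx, test_idx = three_way_split(543)
print(len(train_idx), len(validate_idx), len(test_idx))
```

Slicing a single permutation guarantees that the subsets are disjoint and that every item lands in exactly one of them.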

The Cannon: Experiment 1: training set. 543 stars (too few) from 19 clusters (too few). T_eff, log g, [Fe/H] labels from APOGEE (calling parameters and abundances "labels"). Slight adjustments to labels to get them onto possible isochrones. Terrible coverage of the main sequence: only the Pleiades; home-made Pleiades labels (by Ness); no [Fe/H] spread at high log g.

The Cannon: Experiment 1: training set [Figure: log g (dex) versus T_eff (K) for the training-set stars, one panel per cluster (M92, M15, M53, N5466, N4147, M2, M13, M3, M5, M107, M71, N2158, N2420, N188, M67, N7789, Pleiades, N6819, N6791), each with an isochrone labeled by age (0.15 to 13 Gyr) and metallicity.]

The Cannon: model. A generative model of the APOGEE spectra: given label vector l, predict flux vector f; probabilistic prediction p(f | l, θ); use every spectral pixel's uncertainty variance σ²_λn responsibly. Details: spectral expectation is quadratic in the labels; every wavelength λ treated independently; an intrinsic Gaussian scatter s²_λ at every wavelength λ; 80,000 free parameters in θ!

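The Cannon: model
\[ \ln p(f_n \mid l_n, \theta) = \sum_{\lambda=1}^{L} \ln p(f_{\lambda n} \mid l_n, \theta_\lambda, s_\lambda^2) \]
\[ \ln p(f_{\lambda n} \mid l_n, \theta_\lambda, s_\lambda^2) = -\frac{1}{2}\,\frac{[f_{\lambda n} - \theta_\lambda^T l_n]^2}{\sigma_{\lambda n}^2 + s_\lambda^2} - \frac{1}{2} \ln(\sigma_{\lambda n}^2 + s_\lambda^2) \]
\[ l^T \equiv \{1, T_{\rm eff}, \log g, [{\rm Fe/H}], T_{\rm eff}^2, T_{\rm eff} \log g, \ldots, [{\rm Fe/H}]^2\} \]
\[ \theta^T \equiv \{\theta_\lambda, s_\lambda^2\}_{\lambda=1}^{L} \]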

The Cannon: model, ln p(f_n | l_n, θ). Training step: optimize w.r.t. parameters θ at fixed labels l, using training-set data; linear least squares; every wavelength λ treated independently. Test step: optimize w.r.t. labels l at fixed parameters θ, using test-set (survey) data; non-linear optimization; every star treated independently.
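
The two steps above can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the published implementation of The Cannon: the intrinsic scatter s²_λ is fixed to zero (so the training step is exactly weighted linear least squares), the labels are assumed to be rescaled to order unity, the test step uses a simple Gauss-Newton iteration with a finite-difference Jacobian, and all function names (`label_vector`, `fit_coefficients`, `infer_labels`) are hypothetical.

```python
import numpy as np

def label_vector(labels):
    """Rows of [1, l_i, l_i * l_j (i <= j)] for an (n_stars, n_labels) array."""
    labels = np.atleast_2d(np.asarray(labels, dtype=float))
    n_stars, n_labels = labels.shape
    columns = [np.ones(n_stars)]
    columns += [labels[:, i] for i in range(n_labels)]
    columns += [labels[:, i] * labels[:, j]
                for i in range(n_labels) for j in range(i, n_labels)]
    return np.stack(columns, axis=1)

def fit_coefficients(flux, ivar, labels):
    """Training step: weighted linear least squares at each wavelength."""
    A = label_vector(labels)                        # (n_stars, n_coeff)
    theta = np.empty((flux.shape[1], A.shape[1]))
    for lam in range(flux.shape[1]):                # wavelengths are independent
        w = ivar[:, lam]
        theta[lam] = np.linalg.solve(A.T @ (w[:, None] * A),
                                     A.T @ (w * flux[:, lam]))
    return theta

def infer_labels(flux, ivar, theta, start, n_iter=20, eps=1e-6):
    """Test step: Gauss-Newton fit of one star's labels at fixed theta."""
    labels = np.array(start, dtype=float)
    for _ in range(n_iter):
        model = theta @ label_vector(labels)[0]
        jac = np.empty((theta.shape[0], labels.size))
        for i in range(labels.size):                # finite-difference Jacobian
            shifted = labels.copy()
            shifted[i] += eps
            jac[:, i] = (theta @ label_vector(shifted)[0] - model) / eps
        resid = flux - model
        labels += np.linalg.solve(jac.T @ (ivar[:, None] * jac),
                                  jac.T @ (ivar * resid))
    return labels

# Synthetic "survey": 50 pixels, 3 labels, 200 training stars (all made up).
rng = np.random.default_rng(0)
n_train, n_pix, n_labels, sigma = 200, 50, 3, 0.005
theta_true = np.concatenate([np.ones((n_pix, 1)),
                             0.1 * rng.normal(size=(n_pix, n_labels)),
                             0.02 * rng.normal(size=(n_pix, 6))], axis=1)
train_labels = rng.uniform(-1.0, 1.0, size=(n_train, n_labels))
train_flux = (label_vector(train_labels) @ theta_true.T
              + sigma * rng.normal(size=(n_train, n_pix)))
train_ivar = np.full((n_train, n_pix), 1.0 / sigma**2)
theta_hat = fit_coefficients(train_flux, train_ivar, train_labels)

true_labels = np.array([0.3, -0.5, 0.2])
star_flux = theta_true @ label_vector(true_labels)[0] + sigma * rng.normal(size=n_pix)
star_ivar = np.full(n_pix, 1.0 / sigma**2)
recovered = infer_labels(star_flux, star_ivar, theta_hat, start=np.zeros(n_labels))
print("true labels:     ", true_labels)
print("recovered labels:", recovered)
```

The structure mirrors the slide: training loops over wavelengths with the labels held fixed, and the test step loops over stars with the coefficients held fixed. Fitting s²_λ as well would make the training step mildly non-linear, which this sketch deliberately avoids.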

The Cannon: model training

The Cannon: model training cross-validation

The Cannon: results. The Cannon is far faster than physical modeling: model trains in seconds (thousands of fits); The Cannon labels 10^5 stars per hour (pure Python on a laptop). Labels appear sensible: The Cannon labels lie near sensible isochrones; scatter against APOGEE labels consistent with APOGEE precision; successfully puts labels on dwarfs.

The Cannon: Experiment 1: comparison with ASPCAP labels

The Cannon: Experiment 1: label veracity

The Cannon: works at low signal-to-noise

The Cannon: shortcuts and choices. No Bayes; no partial or noisy labels. Quadratic order: replacing polynomial with a Gaussian process; continuous model complexity; non-parametric spectral representation. Too-small training set. Only three labels: age, [α/Fe]; splitting the giant branch; how to go to many elements?

The Cannon: label transfer from APOGEE to LAMOST (Casey, Ho)

The Cannon: masses and ages for red giants (Ness)

The Cannon: detailed abundances (Casey) [Figure: grid of comparison panels, one per label: T_eff, log g, [Al/H], [Ca/H], [C/H], [Fe/H], [K/H], [Mg/H], [Mn/H], [Na/H], [Ni/H], [N/H], [O/H], [Si/H], [S/H], [Ti/H], [V/H].]

The Cannon: detailed abundances (Ness)

lessons learned: regressions are different from density estimators; value of convex regularization.
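
As a hedged illustration of why convex regularization can be valuable, the sketch below solves an L1-regularized least-squares problem with a basic proximal-gradient (ISTA) loop. The slide does not say which regularizer was used; the L1 penalty, the synthetic data, the value of `lam`, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1 by proximal gradient."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)               # gradient of the quadratic term
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Synthetic problem: only 2 of 10 coefficients are truly non-zero.
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 10))
x_true = np.zeros(10)
x_true[0], x_true[1] = 1.0, -0.5
y = A @ x_true + 0.01 * rng.normal(size=100)

x_l1 = lasso_ista(A, y, lam=5.0)
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)   # unregularized comparison
print("L1-regularized:", np.round(x_l1, 3))
print("least squares: ", np.round(x_ls, 3))
```

Because the objective is convex, the optimizer cannot get stuck in a local minimum, and the L1 penalty drives irrelevant coefficients to exactly zero at the cost of slightly shrinking the real ones. In a spectral model this kind of sparsity would tie each label to a small number of pixels.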

The Cannon: identification of lines (Casey) [Figure: normalized coefficients θ/max θ versus wavelength λ (Å) for [Al/H], [S/H], and [K/H], in windows near 15200 to 15360 Å and 16650 to 16800 Å.]

The Cannon: discovery of outliers (Ho)

The Cannon: chemical tagging

The future: Unsupervised (Anderson)

read more. Original paper on The Cannon and APOGEE: Ness et al., arXiv:1501.07604. Labeling LAMOST, RAVE: Ho et al., arXiv:1602.00303; Casey et al., arXiv:1609.02914. Chemical abundances: Casey et al., arXiv:1603.03040; Ness et al., arXiv:1701.07829. Red-giant masses and ages: Ness et al., arXiv:1511.08204; Ho et al., arXiv:1609.03195. Chemical tagging: Hogg et al., arXiv:1601.05413. De-noising Gaia: Anderson et al., arXiv:1706.05055. Eilers, in prep: working with missing and noisy labels. Price-Whelan, in prep: modeling spectroscopic binaries. Bedell, in prep: extreme-precision radial-velocity.
