Self-Organizing-Map and Deep-Learning: application to photometric redshift in HSC

Self-Organizing-Map and Deep-Learning: application to photometric redshift in HSC
Atsushi J. Nishizawa (Nagoya IAR/GS of Sci.), on behalf of the HSC collaboration
2016 Nov. 25 @ Hiroshima University

Today's plan
Introduction
- What is Machine Learning?
- Application to astronomy and astrophysics
HSC updates
- Current status of the SSP survey
- About the First Data Release (DR1)
Application to photo-z
- Introduction to photometric redshifts
- Self Organizing Map (SOM)
- Neural Network (NN) and Deep Learning (DL)
Beyond the photo-z
- Ability and limitation of clustering redshift
Summary

What is Machine Learning?

Machine Learning in Astrophysics
Examples
- galaxy formation history (Harshil+ 2016)
- source classification from images (Salzberg+ 1995)
- photometric redshift (Sadeh+ 2015, Collister & Lahav 2004)
ML can
- extract information from high-dimensional data (data mining)
- deal with general properties of the data; no physical interpretation is required
- handle huge (both in size and dimension) and complicated data sets

Hyper Suprime-Cam ~Subaru Strategic Program~
- HSC is installed at the prime focus of the Subaru Telescope
- HSC wide: 1,400 deg^2, deep: 25 deg^2, udeep: 5 deg^2
- Since Feb. 2014, 300 nights over 5 years have been awarded
- Superb observing conditions: seeing ~0.7 arcsec, ~100% transparency
- Wide field of view (9 moons in one shot)
- Science goals: weak lensing (w~5%), strong lensing, cluster science, galaxy evolution, AGN, supernovae, solar system, galactic archeology
- Full overlap with SDSS/BOSS and COSMOS/SXDS
- Other overlapping surveys: VVDS, GAMA, AEGIS, HectoMAP, CFHT, VIKING, UKIDSS, Spitzer, ACTPol, XMM, and eROSITA (Russia)
- The first data release will be in Feb. 2017
- The first-year papers will appear in a PASJ special issue in Feb(?)

Hyper Suprime-Cam ~SSP updates~
- We have finished 240 deg^2 in the g, r, i, z, and Y bands to full depth
- The first DR includes 100 deg^2 of full-depth, full-color images/catalogs, observed by the end of Nov. 2015
- Various high-level catalogs (e.g. shear catalog, photo-z catalog, cluster catalog) will also be available

Photometric Redshift Introductory

Why do we need photometric redshifts (photo-z)?
- For cosmic shear: DE is sensitive to both the cosmological distance and the growth of structure

$$P_\kappa(\ell) = \frac{9}{4}\,\Omega_m^2 H_0^4 \int_0^{\chi_H} d\chi\; a^{-2}(\chi)\, P_\delta\!\left(\frac{\ell}{f_K(\chi)};\chi\right) \left[\int_\chi^{\chi_H} d\chi'\, n(\chi')\, \frac{f_K(\chi'-\chi)}{f_K(\chi')}\right]^2$$

- For cluster lensing: foreground galaxies may dilute the signal (Okabe & Smith 2016)
- Galaxy studies, AGN science, and SNe also rely on photo-z

What is required for WL?
- Precise image reduction from the raw data
- Very precise measurement of galaxy shapes (WL is the deformation of galaxy images by gravity: it modifies galaxy ellipticities by 10^-4)
- Very precise measurement of the distances to the lensed galaxies

Spectra of typical galaxies
[Figure: fν(λ) versus observed wavelength [Å] for an elliptical and a spiral galaxy, with the Lyman break, the Balmer break, and the 4000 Å break marked.]
Taking spectra is observationally expensive!

Spectra of typical galaxies
A galaxy spectrum depends on
- redshift
- age of the galaxy
- metallicity of the galaxy
- SF history of the galaxy
- its own dust attenuation
- reddening by the Milky Way

$$f(\lambda_{\rm obs}) = f\left[(1+z)\,\lambda_{\rm emt};\; t,\, Z,\, \tau,\, A_V,\, \ldots\right]$$

In the traditional template fitting (TF) method, we find the optimal solution using synthetic galaxy spectra, with all of these as free parameters (e.g. minimum chi-square).
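As a rough illustration of the minimum chi-square template fitting mentioned above, the sketch below does a brute-force grid search over redshift and template index. It is not the HSC template-fitting code: the function name `chi2_template_fit`, the precomputed `templates` array, and the free overall amplitude per template are all assumptions made for this example.

```python
import numpy as np

def chi2_template_fit(flux, flux_err, templates, z_grid):
    """Grid-search minimum chi-square over redshift and template index.

    flux, flux_err : (n_band,) observed fluxes and their 1-sigma errors
    templates      : (n_z, n_tmpl, n_band) model fluxes on z_grid
                     (hypothetical precomputed array, e.g. from synthetic spectra)
    z_grid         : (n_z,) redshifts at which the templates were evaluated
    """
    w = 1.0 / flux_err**2
    # For each (z, template) the best-fit amplitude has a closed form.
    amp = (templates * flux * w).sum(-1) / (templates**2 * w).sum(-1)
    resid = flux - amp[..., None] * templates
    chi2 = (resid**2 * w).sum(-1)  # shape (n_z, n_tmpl)
    iz, it = np.unravel_index(np.argmin(chi2), chi2.shape)
    return z_grid[iz], int(it), chi2
```

The returned `chi2` surface can also be turned into a relative likelihood, exp(-chi2/2), if a redshift PDF rather than a point estimate is wanted.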

Photometric redshift Extracted physical quantities from CCD images

Drawback of the Photo-z

Drawback of the Photo-z, cont'd (SED)

Photo-z estimation with ML in HSC
Photo-z (~ 00s)
In HSC, we have
- Neural Network / DL (Sogo Mineo: NAOJ)
- k-nearest Neighbor (Jean Coupon: Geneva)
- Polynomial Fit (Bau-Ching Hsieh: ASIAA)
- Random Forest / Self Organizing Map / DL (AN: Nagoya)
- Hybrid of ML and template fitting (Joshua Speagle: Harvard)
- Authentic template fitting (Masayuki Tanaka: NAOJ)

Self Organizing Map (SOM)
- SOM is an unsupervised machine learning algorithm
- SOM is used for classifying the data into small groups having similar features
- The procedure is as follows:
  - Decide the geometry and resolution of the map (in two dimensions)
  - Let x_i be the data vector for the i-th galaxy, x_i = {x_i1, x_i2, x_i3, ..., x_in}
  - Each pixel of the map has a weight vector w_k = {w_k1, w_k2, w_k3, ..., w_kn}
  - The initial values (vectors) of the map are drawn from a random distribution
  - Then begin the iteration:
    1. Find the closest pixel for each galaxy using the (weighted) Euclidean distance; the minimizing cell is the best cell (k = k0)

$$d_i^{\rm best} = \min_k \left[\sum_{j}^{N} \frac{(x_{ij} - w_{kj})^2}{\sigma_{ij}^2}\right]^{1/2}$$

    2. Update the pixel vectors by (k0 is the index of the best cell, D(k,k0) is the distance on the map between the k-th cell and the best cell)

$$w_k(t+1) = w_k(t) + \alpha(t)\, e^{-D^2(k,k_0)/\sigma^2(t)} \left[x_i - w_k(t)\right]$$

    3. Repeat the process for the next object (t -> t+1)
  - The influenced region σ(t) and the update amount α(t) are linearly decreased
  - Repeat the whole process (typically O(100) times)
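The SOM procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used for HSC: the function names, the default values of `alpha0` and `sigma0`, the floor on σ(t), and the reading of the distance weights as the per-attribute measurement errors `x_err` are all assumptions. The update uses the standard SOM form, in which each cell is pulled toward the presented data vector.

```python
import numpy as np

def best_cell(xi, xi_err, w):
    """Index k0 of the cell minimizing the error-weighted Euclidean distance."""
    d2 = (((xi - w) / xi_err) ** 2).sum(axis=1)
    return int(np.argmin(d2))

def train_som(x, x_err, shape=(10, 10), n_epoch=100, alpha0=0.5, sigma0=None, seed=0):
    """x, x_err: (n_gal, n_dim) attributes and their errors. Returns (n_cell, n_dim) weights."""
    rng = np.random.default_rng(seed)
    ny, nx = shape
    n_gal, n_dim = x.shape
    # Random initial weight vectors, drawn within the range of the data.
    w = rng.uniform(x.min(axis=0), x.max(axis=0), size=(ny * nx, n_dim))
    # Grid coordinates of every cell, used for the map distance D(k, k0).
    gy, gx = np.divmod(np.arange(ny * nx), nx)
    grid = np.stack([gy, gx], axis=1).astype(float)
    if sigma0 is None:
        sigma0 = max(ny, nx) / 2.0  # assumed starting size of the influenced region
    n_step = n_epoch * n_gal
    t = 0
    for _ in range(n_epoch):                 # repeat the whole process
        for i in rng.permutation(n_gal):     # next object, t -> t+1
            frac = 1.0 - t / n_step          # linear decrease, stays in (0, 1]
            alpha = alpha0 * frac
            sigma = max(sigma0 * frac, 0.5)  # floor is an assumption of this sketch
            k0 = best_cell(x[i], x_err[i], w)
            d2 = ((grid - grid[k0]) ** 2).sum(axis=1)
            h = np.exp(-d2 / sigma**2)       # Gaussian window around the best cell
            w += alpha * h[:, None] * (x[i] - w)
            t += 1
    return w
```

A top-hat window could replace the Gaussian `h` by setting it to 1 inside a radius σ(t) and 0 outside.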

Self Organizing Map (SOM): illustration
1. Start from a random distribution
2. Find the best cell (best matched cell) for the first galaxy in the list of galaxies
3. Update the values of the best cell and its vicinity with a weight (Gaussian/top-hat window)
4. Move to the next galaxy in the list and repeat for all galaxies
5. Narrow the influenced region

k-fold cross validation: optimal ML upon human coaching
[Diagram: for each of the k realizations (1, 2, ..., k) the calibration sample is split into a Test set, a Validation set, and a Training set.]
ML hyper parameters
- Number of random samplings?
- Number of pixels?
- Number of iterations?
- Number of neighbors?
- Weighting scheme?
- Which attributes are used?
- Divide the calibration sample into Test / Validation / Training sets
- Test data should be untouched
- Optimize the hyper parameters for each realization of the cross-validation set
- Take the median (or mean, or best) hyper parameters to fix the configuration of the ML
- The test data is then used for the performance check
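The splitting scheme above might look like the following sketch: a test set is put aside once and never used for tuning, the remainder is cut into k folds, and the hyper parameter chosen on each fold is combined with a median. The function names, the 20% test fraction, and the `score` callback are hypothetical choices for this example.

```python
import numpy as np

def split_test_and_folds(n_obj, test_frac=0.2, k=5, seed=0):
    """Hold out an untouched test set, then build k (train, validation) index pairs."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_obj)
    n_test = int(round(test_frac * n_obj))
    test, rest = idx[:n_test], idx[n_test:]
    folds = np.array_split(rest, k)
    splits = []
    for j in range(k):
        valid = folds[j]
        train = np.concatenate([folds[m] for m in range(k) if m != j])
        splits.append((train, valid))
    return test, splits

def pick_hyperparameter(candidates, splits, score):
    """Optimize per realization, then take the median of the per-fold optima.

    score(param, train_idx, valid_idx) -> float, lower is better
    (hypothetical callback that trains on train_idx and evaluates on valid_idx).
    """
    best_per_fold = [min(candidates, key=lambda p: score(p, tr, va)) for tr, va in splits]
    return float(np.median(best_per_fold))
```

Only after the configuration is fixed this way would the `test` indices be used, once, for the performance check.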

Training data and Target data
- fluxes measured with 5 broadband filters (cmodel flux)
- flux ratios, i.e. colors (5C2 = 10)
- other flux systems: PSF, dev.+exp., aperture (optional)
- second-order moments -> size and ellipticity (optional)
- their measurement errors (emulated to wide depth)
- COSMOS 30-band photo-z, spec-z, grism-z (Training set only)
[Figure labels: bright mag-limited spec-z's; faint spec-z's + COSMOS]
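To make the "5C2 = 10" count concrete, the sketch below builds all ten pairwise colors from five fluxes. The band labels in `BANDS`, the function name, and the choice to express each flux ratio as a magnitude difference are assumptions of this example.

```python
import numpy as np
from itertools import combinations

BANDS = ("g", "r", "i", "z", "y")  # assumed labels for the 5 broadband filters

def flux_to_colors(flux):
    """flux: (n_gal, 5) positive fluxes -> (n_gal, 10) colors and their names."""
    mag = -2.5 * np.log10(flux)  # zero point drops out of the differences
    pairs = list(combinations(range(len(BANDS)), 2))
    colors = np.stack([mag[:, a] - mag[:, b] for a, b in pairs], axis=1)
    names = [f"{BANDS[a]}-{BANDS[b]}" for a, b in pairs]
    return colors, names
```

Only four of the ten colors are linearly independent, so the full set mainly changes how the attributes are weighted in a distance-based method.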

Results: HSC DR1 data (PRELIMINARY)
1. Using the Training set, we make one realization of the SOM.
2. Then we find the best matched cell for each object in the Test set.
3. For a given galaxy, we assign the redshift of the training set averaged within the best matched cell.
4. We repeat this for N_bootstrapped x N_MCed random samples to get the PDF (this properly propagates the photometric errors).
[Figure panels: Training sample and Test sample maps, color = mean redshift (for the Test sample the redshifts are unknown in reality).]
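Steps 2 to 4 can be sketched as follows, given the weight vectors `w` of an already trained map. This is only an outline under stated assumptions: the function names are invented here, and the loop covers just the Monte Carlo resampling of the photometry; the bootstrap over training samples would additionally require retraining the map for each resampled training set.

```python
import numpy as np

def best_cells(x, x_err, w):
    """Best matched cell for every row of x (error-weighted Euclidean distance)."""
    d2 = (((x[:, None, :] - w[None, :, :]) / x_err[:, None, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def cell_mean_redshift(w, x_train, err_train, z_train):
    """Mean training redshift per cell; NaN for cells with no training galaxy."""
    k = best_cells(x_train, err_train, w)
    n_cell = w.shape[0]
    zsum = np.bincount(k, weights=z_train, minlength=n_cell)
    cnt = np.bincount(k, minlength=n_cell)
    return np.where(cnt > 0, zsum / np.maximum(cnt, 1), np.nan)

def photoz_samples(w, z_cell, x, x_err, n_mc=200, seed=0):
    """Perturb the photometry within its errors and re-assign cells n_mc times."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_mc, x.shape[0]))
    for m in range(n_mc):
        x_mc = x + rng.normal(size=x.shape) * x_err
        out[m] = z_cell[best_cells(x_mc, x_err, w)]
    return out  # a histogram along axis 0 approximates each galaxy's PDF
```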

Results: HSC DR1 data
[Figures withheld: "confidential, sorry"]

Results: HSC DR1 data
[Panels "Bare" and "Weight Applied" withheld: "confidential, sorry"]

Photo-z with Deep-Learning
Extract physical quantities from CCD images of a galaxy:
- fluxes for 5 filters
- size of galaxies
- shape of galaxies
- light profile
Some of the information is discarded.
Break the degeneracy between dust reddening and redshift using the ellipticity of galaxies: small ellipticity goes with small dust extinction, large ellipticity with large dust extinction (Barden+ 2008).
(M. Tanaka; AN+ in prep.)
- To obtain the most accurate photo-z ever
- How does the information that is not quantified impact the photo-z accuracy?

Beyond the photo-z: clustering-z
The spatial correlation between a sample with known redshifts and a sample with unknown redshifts tells us the fractional number of objects in a given redshift range.

$$1 + w_{sp}(z_s, r_p) = \left\langle n_{\rm specz}(\theta', z_s)\, n_{\rm photo}(\theta' + \theta) \right\rangle_{\chi\theta = r_p}$$

$$\int dr\, W(r)\, w_{sp}(z_s, r) \propto \frac{dn}{dz}(z_s)$$

(E. Medezinski, AN+ in prep.)
HSC galaxies (0.5 < z_p < 0.7) are cross-correlated with the BOSS spec-z sample (LOWZ + CMASS + QSO).
This method evaluates the relevance of the photo-z's based on completely independent information, e.g. Newman 2008, Menard+ 2013, etc.

Application (1) - Rahman, Menard et al. 2015
Testing procedure
1. Take a subsample from the full spec-z sample
2. Divide the subsample into 1400 redshift bins (δz = 8e-4)
3. Cross-correlate the sub-divided subsample with the full sample to get dn/dz via cluster-z
The clustering-z seems to work for the subsample in color space.

For better prediction: clustering-z with SOM
Galaxies in each SOM cell should have similar properties in terms of color, flux, size, etc., and thus are expected to lie at similar redshifts. (AN, J. Speagle+ in prep.)
- A totally independent way to measure the redshift of individual galaxies
- Limitation = unknown galaxy bias

$$C^{sp}_{\ell,i} = \int d\chi\, \frac{W_{s,i}(\chi)\, W_p(\chi)}{\chi^2}\, P_{sp}(k = \ell/\chi)$$

For a narrow spec-z bin centered at $\bar z_i$, this reduces to a geometric prefactor in $H(\bar z_i)$ and $\chi(\bar z_i)$ times

$$\frac{dn}{dz}(\bar z_i)\, b_s(\bar z_i)\, b_p(\bar z_i)\, P_\delta\!\left[k = \ell/\chi(\bar z_i)\right], \qquad b_s = \sqrt{C^{ss}_{\ell,i} / P_\delta}$$

- Bias for the spec-z sample: from the autocorrelation measurement
- Bias for the unknown sample: assumption
- The absolute value is not required; only the redshift evolution is important.
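The bias correction described above can be sketched as follows: the spec-z bias comes from the auto-spectrum, the bias of the unknown sample is supplied as an assumption, and the result is normalized, so only the redshift evolution of the inputs matters. The function name, its arguments, and the optional `geom` array (standing in for the geometric prefactor in H and χ) are hypothetical; this is not the analysis code of the in-prep work.

```python
import numpy as np

def clustering_dndz(z, c_sp, c_ss, p_m, b_p, geom=None):
    """Normalized dn/dz estimate from cross- and auto-spectra in narrow spec-z bins.

    z    : (n_bin,) mean redshift of each spec-z bin
    c_sp : (n_bin,) cross-spectrum amplitude between bin i and the unknown sample
    c_ss : (n_bin,) auto-spectrum amplitude of the spec-z sample in bin i
    p_m  : (n_bin,) matter power spectrum evaluated at the same scale
    b_p  : (n_bin,) ASSUMED bias evolution of the unknown sample
    geom : (n_bin,) geometric prefactor; defaults to 1 (assumption of this sketch)
    """
    if geom is None:
        geom = np.ones_like(z)
    b_s = np.sqrt(c_ss / p_m)                # bias of the spec-z sample
    raw = c_sp / (geom * b_s * b_p * p_m)    # proportional to dn/dz
    norm = np.sum(0.5 * (raw[1:] + raw[:-1]) * np.diff(z))  # trapezoid rule
    return raw / norm
```

Because of the final normalization, multiplying `b_p` by a constant leaves the output unchanged, which mirrors the point that only the redshift evolution of the bias is needed.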

Application (2) - contamination rate for g-lensing
z_clusters in [0.4, 0.6] (Elinor Medezinski)
Okabe method to define the background sample

$$N_i \propto \frac{\bar{w}(z_i)}{b_p(z_i)\, b_s(z_i)}\, \Delta z_i$$

- Need to assume the full function b_p(z), but
- free from the noisy signal from higher z's

Summary
- Photo-z is a tool for cosmology and galaxy science
- HSC data will soon be public (Feb. 2017)
- Machine Learning opens new windows not only for photo-z measurement but also for various astrophysical data analyses, and will help us understand the physics behind them
- Basically, SOM is a method to classify data into small segments consisting of objects with similar physical properties
- We expect Deep Learning to discover new quantities that characterize photo-z's
- Cluster-z is a completely independent and complementary method to measure the redshift, but its limitation lies in understanding the unknown galaxy bias