METAPHOR Machine-learning Estimation Tool for Accurate Photometric Redshifts

V. Amaro (1), S. Cavuoti (2), M. Brescia (2), C. Vellucci (1), G. Longo (1,3)
(1) University of Napoli Federico II, Naples
(2) INAF - Astronomical Observatory of Capodimonte, Naples
(3) California Institute of Technology, Center for Data Driven Discovery, Pasadena, USA

A reliable PDF should be able to:
1) evaluate photometric error distributions;
2) assess the correlation between spectroscopic and photometric errors;
3) disentangle photometric uncertainties from those intrinsic to the method itself.

Source PDFs contain more information than simple redshift estimates. PDFs can improve the accuracy of cosmological measurements (Mandelbaum et al. 2008-2015).

Many PDF methods for ML have been developed over the past years, mostly based on:
- Supervised methods (ANN, RF, MLP, used both as regressors and classifiers)
- Unsupervised methods (SOMs, random atlas)

References: Rau et al. 2015, MNRAS, 452; Carrasco & Brunner 2013, MNRAS, 442; Bonnet 2013, MNRAS, 449; Sadeh et al. 2015, arXiv:1507.00490; Speagle et al. 2015, arXiv:1510.08073

METAPHOR workflow
- Data pre-processing: knowledge base (KB) preparation + photometry perturbation
- Photo-z prediction: train/test phase by means of any arbitrary ML model (in our case the multi-threaded MLPQNA)
- Probability Density Function estimation
(Cavuoti et al., 2017)

MLP + QNA / LEMON
The model architecture is a standard 4-layer Multi Layer Perceptron (MLP), trained by:
- the QNA rule, an implementation of the Quasi-Newton algorithm based on L-BFGS (limited-memory BFGS);
- alternatively, selected through one command-line parameter, the learning rule called LEMON (Levenberg-Marquardt Optimization Network).
METAPHOR embeds a multi-threaded version of MLPQNA, implemented through a Python wrapper, to improve the computational performance of training.
Regression error: least-squares error + Tikhonov regularization:
$$E = \sum_{i=1}^{N_{patterns}} (y_i - t_i)^2 + \frac{\lambda}{2} \lVert W \rVert_2^2$$
where $y_i$ are the network outputs, $t_i$ the targets, $W$ the network weights and $\lambda$ the regularization factor.
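As an illustration of the regression error above, here is a minimal sketch of a least-squares error with a Tikhonov (L2) penalty. It assumes the form E = sum_i (y_i - t_i)^2 + (lambda / 2) * ||W||^2; the function name, the argument names and the sample values are hypothetical and are not taken from the MLPQNA code.

```python
def regularized_error(outputs, targets, weights, lam):
    """Least-squares error plus a Tikhonov (L2) penalty on the weights.

    Assumed form: E = sum_i (y_i - t_i)^2 + (lam / 2) * sum_k w_k^2
    """
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    # Data term: sum of squared residuals over all patterns.
    data_term = sum((y - t) ** 2 for y, t in zip(outputs, targets))
    # Penalty term: discourages large weights, scaled by lam.
    penalty = 0.5 * lam * sum(w * w for w in weights)
    return data_term + penalty


# Hypothetical toy values: three patterns and two weights.
y = [0.5, 1.0, 1.5]
t = [0.0, 1.0, 2.0]
w = [1.0, -2.0]
err = regularized_error(y, t, w, lam=0.1)
print(err)
```

With lam = 0 the penalty vanishes and the function reduces to the plain sum of squared residuals.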

PDF estimation algorithm scheme
1. INPUT: the KB (train + test sets), the photo-z binning step $B$ (0.01 by default) and the zspec Region of Interest (RoI) $[Z_{min}, Z_{max}]$ (a typical use case is [0, 1]);
2. Produce $N$ photometric perturbations, thus obtaining $N$ additional test sets;
3. Perform 1 training (or $N+1$ trainings) and $N+1$ tests;
4. Derive:
   - the number of photo-z bins, $(Z_{max} - Z_{min})/B$;
   - the $N+1$ photo-z estimations;
   - the number of photo-z values $C_{B,i} \in [Z_i, Z_i + B[$ in each bin;
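The binning step of the scheme above can be sketched as follows. This is only an illustration for a single object: the N+1 photo-z estimates are counted in bins of width B over the RoI. Normalizing the counts by N+1 to obtain a per-bin probability is an assumption (the slide text stops at the counts), and all names are hypothetical.

```python
import math


def photoz_pdf(estimates, z_min=0.0, z_max=1.0, step=0.01):
    """Bin the N+1 photo-z estimates of one object into a discrete PDF.

    Returns one value per bin [Z_i, Z_i + step[. Dividing the counts by
    the number of estimates is an assumed normalization.
    """
    n_bins = int(round((z_max - z_min) / step))
    counts = [0] * n_bins
    for z in estimates:
        idx = math.floor((z - z_min) / step)
        # Estimates outside the RoI are not counted, so the PDF may
        # sum to less than 1.
        if 0 <= idx < n_bins:
            counts[idx] += 1
    total = len(estimates)
    return [c / total for c in counts]


# Hypothetical toy case: 5 estimates, coarse bins of width 0.1.
pdf = photoz_pdf([0.12, 0.15, 0.18, 0.31, 0.55], step=0.1)
print(pdf)
```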

The photometry perturbation procedure
The capability to determine a PDF in ML rests on a photometry perturbation law:
$$\tilde{m}_{ij} = m_{ij} + \alpha_i F_{ij} \cdot \mathrm{gaussRandom}(\mu = 0, \sigma = 1)$$
where $\tilde{m}_{ij}$ is the perturbed magnitude and:
- $\alpha_i$ is a multiplicative constant defined by the user;
- $F_{ij}$ is the weighting associated with each specific band, used to weight the Gaussian noise contribution to the magnitude values;
- gaussRandom(μ=0, σ=1) is a random value drawn from the standard normal distribution.
In particular, $\alpha_i F_{ij}$ is the term used to generate the set of perturbed replicates of the blind test set.
Generally, different types of weighting coefficients can be identified and tested:
- constant weight (flat);
- individual magnitude errors (indiv.);
- polynomial fitting (poly.);
- bimodal function (bimod.).
Settings used for all the bands: $\alpha_i = 0.9$; thresh = 0.03; $N_{pert} = 1000$.
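A minimal sketch of the perturbation law, for the magnitudes of a few objects in one band, is given below. It assumes a flat (constant) weighting and uses the alpha = 0.9 and Npert = 1000 settings quoted above; the function name, the fixed seed and the sample magnitudes are hypothetical.

```python
import random


def perturb_magnitudes(mags, weights, alpha=0.9, n_pert=1000, seed=0):
    """Generate n_pert perturbed replicates of the magnitudes in one band.

    Each replicate applies m + alpha * F * gauss(0, 1) to every object,
    where F is the weight of that object (weights[j] for mags[j]).
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    replicates = []
    for _ in range(n_pert):
        replicates.append(
            [m + alpha * f * rng.gauss(0.0, 1.0) for m, f in zip(mags, weights)]
        )
    return replicates


# Hypothetical magnitudes of three objects, flat weighting F = 0.03.
mags = [20.1, 21.4, 22.8]
flat = [0.03] * len(mags)
reps = perturb_magnitudes(mags, flat)
mean_first = sum(r[0] for r in reps) / len(reps)
print(len(reps), round(mean_first, 3))
```

Each replicate plays the role of one additional test set; swapping `flat` for per-object magnitude errors would give the "indiv." weighting.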

The photometry perturbation procedure (continued)
$F_{ij}$ has been chosen as a bimodal function, i.e. with a constant value representing the threshold under which the polynomial function is considered too low to provide a significant noise contribution to the perturbation.
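One way to read the bimodal weighting is as the larger of a constant threshold and a polynomial in magnitude. The sketch below follows that reading; the polynomial coefficients are purely hypothetical placeholders (in practice they would come from a fit of the photometric errors), and only the 0.03 threshold is taken from the slides.

```python
def bimodal_weight(mag, poly_coeffs, threshold=0.03):
    """Bimodal weighting: polynomial in magnitude with a constant floor.

    poly_coeffs are given highest degree first and evaluated with
    Horner's scheme. Below the threshold the constant value is used.
    """
    value = 0.0
    for c in poly_coeffs:
        value = value * mag + c
    return max(value, threshold)


# Hypothetical linear "fit": F(mag) = 0.02 * mag - 0.36
coeffs = [0.02, -0.36]
bright = bimodal_weight(19.0, coeffs)  # polynomial below the threshold
faint = bimodal_weight(22.0, coeffs)   # polynomial above the threshold
print(bright, faint)
```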

The photometry perturbation procedure (continued)
In the fainter region, $F_{ij}$ follows the derived photometric uncertainties.

METAPHOR statistics output
Statistics provided for $\Delta z = (z_{spec} - z_{phot})/(1 + z_{spec})$ on the blind test set:
- bias;
- standard deviation;
- NMAD;
- $\sigma_{68}$ (radius of the region including 68% of the residuals close to 0);
- $\eta_{0.15}$ (fraction of outliers with $|\Delta z| > 0.15$);
- skewness (asymmetry).
For the cumulative PDF performance, three estimators on the stacked residuals of the PDFs:
- $f_{0.05}$: percentage of residuals within ±0.05;
- $f_{0.15}$: percentage of residuals within ±0.15;
- $\langle \Delta z \rangle$: weighted average of all the residuals of the stacked PDFs.
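The point-estimate statistics listed above can be sketched as follows. The 1.4826 scaling for NMAD and the percentile convention for sigma_68 are common choices assumed here, not details confirmed by the slides; skewness is omitted for brevity, and the function name and toy redshifts are hypothetical.

```python
import math
import statistics


def photoz_point_stats(z_spec, z_phot, outlier_cut=0.15):
    """Statistics on dz = (zspec - zphot) / (1 + zspec)."""
    dz = [(s - p) / (1.0 + s) for s, p in zip(z_spec, z_phot)]
    abs_dz = sorted(abs(d) for d in dz)
    median = statistics.median(dz)
    # NMAD with the usual 1.4826 scaling (assumed convention).
    nmad = 1.4826 * statistics.median([abs(d - median) for d in dz])
    # sigma_68: smallest radius around 0 enclosing 68% of the residuals
    # (assumed convention).
    k = math.ceil(0.68 * len(abs_dz)) - 1
    return {
        "bias": statistics.mean(dz),
        "std": statistics.pstdev(dz),
        "nmad": nmad,
        "sigma68": abs_dz[k],
        # eta: fraction of outliers with |dz| above the cut.
        "eta": sum(a > outlier_cut for a in abs_dz) / len(abs_dz),
    }


# Hypothetical toy sample: four perfect estimates and one outlier.
stats = photoz_point_stats([0.1, 0.2, 0.3, 0.4, 0.5],
                           [0.1, 0.2, 0.3, 0.4, 1.0])
print(stats)
```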

METAPHOR for KiDS DR3
For the training, we used a KB prepared as follows:
- 214 tiles of KiDS cross-matched with SDSS DR9 and GAMA DR2;
- 120,047 objects before the cuts on the tails of the magnitude distributions and the NaN removal;
- Parameter space: 12 magnitudes in the four bands ugri (4″ and 6″ apertures + GAaP) and 9 derived natural colours, for a total of 21 features;
- Two experiments performed with two KBs: i) 0.01 ≤ zspec ≤ 1; ii) 0.01 ≤ zspec ≤ 3.5.
The final KBs consisted of: i) 66,731 train and 16,742 test samples; ii) 70,688 train and 17,559 test samples.
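The 21-feature parameter space can be illustrated with the sketch below. It assumes that the 9 colours are the adjacent-band colours (u-g, g-r, r-i) for each of the three magnitude types; that reading, the aperture labels and the sample magnitudes are all hypothetical and not confirmed by the slides.

```python
BANDS = ("u", "g", "r", "i")
# Hypothetical labels for the three magnitude types per band.
APERTURES = ("ap4", "ap6", "gaap")


def build_features(mags):
    """Build 12 magnitudes + 9 colours from a dict keyed by (aperture, band)."""
    # 3 magnitude types x 4 bands = 12 magnitudes.
    features = [mags[(a, b)] for a in APERTURES for b in BANDS]
    # 3 magnitude types x 3 adjacent-band colours = 9 colours (assumed).
    for a in APERTURES:
        for blue, red in zip(BANDS, BANDS[1:]):
            features.append(mags[(a, blue)] - mags[(a, red)])
    return features


# Hypothetical magnitudes: u = 22.0, g = 21.5, r = 21.0, i = 20.5.
mags = {(a, b): 22.0 - 0.5 * k for a in APERTURES for k, b in enumerate(BANDS)}
features = build_features(mags)
print(len(features))
```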

METAPHOR for KiDS DR3: Results (de Jong et al. 2017)

KiDS DR2 results with METAPHOR
The KB consisted of 15,180 training and 10,067 blind test objects (Cavuoti et al. 2015, MNRAS, 452).

Ongoing analysis on KiDS DR3
- Comparison of the overall photo-z statistics and of the stacked PDF performance of METAPHOR with respect to BPZ in the COSMOS and GAMA spectroscopic fields;
- Further comparison among methods (Bilicki et al. 2017, in prep.).
CM is the cross-match between zCOSMOS-bright and the KiDS DR3 training set (SDSS + GAMA).

Ongoing analysis on KiDS DR3: preliminary results
On the stacked PDFs (KiDS CM GAMA), MLPQNA:
$f_{0.05}$ = 63.2%
$f_{0.15}$ = 96.6%
$\langle \Delta z \rangle$ = -0.027
Residuals distribution for the cross-match KiDS DR3-GAMA: comparison between MLPQNA and BPZ.

Conclusions
General applicability:
- METAPHOR can be applied to any arbitrary empirical photo-z estimation model;
- It is able to take into account photometric errors due to both the measurements and the method itself.
Perspectives:
- Besides the ongoing comparison between the METAPHOR and BPZ KiDS DR3 PDFs, METAPHOR is going to be used for cosmology through weak lensing;
- Under preliminary testing in the Euclid PHZ pipeline;
- Available for other surveys on request.