Overlapping Astronomical Sources: Utilizing Spectral Information

Overlapping Astronomical Sources: Utilizing Spectral Information
David Jones
Advisor: Xiao-Li Meng
Collaborators: Vinay Kashyap (CfA) and David van Dyk (Imperial College)
CHASC Astrostatistics Group, April 15, 2014 1 / 30

Introduction
X-ray telescope data: spatial coordinates of photon detections, and photon energy.
Instrument error: diffraction in the telescope means recorded photon positions are spread out according to the point spread function (PSF). 2 / 30

Introduction
PSFs overlap for sources near each other.
Aim: inference for the number of sources and their intensities, positions, and spectral distributions.
Key points: (i) obtain the posterior distribution of the number of sources; (ii) use spectral information. 3 / 30

Basic Model and Notation
(x_i, y_i) = spatial coordinates of photon i
k = # sources
µ_j = centre of source j
s_i ∈ {0, 1, ..., k} = latent variable indicating which component photon i is from
n_j = Σ_{i=1}^n 1{s_i = j} = # photons detected from component j

(x_i, y_i) | s_i = j, µ_j, k ~ PSF_j centred at µ_j, for i = 1, ..., n
(n_0, n_1, ..., n_k) | w, k ~ Mult(n; (w_0, w_1, ..., w_k))
(w_0, w_1, ..., w_k) | k ~ Dirichlet(λ, λ, ..., λ)
µ_j | k ~ Uniform over the image, for j = 1, 2, ..., k
k ~ Pois(θ)

The component with label 0 is background and its PSF is uniform over the image (so its centre is irrelevant). Results are reasonably insensitive to θ, the prior mean number of sources. 4 / 30
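The hierarchy above is straightforward to simulate from. Below is a minimal forward-simulation sketch; the unit-square image, the Gaussian stand-in for the PSF, and all numeric settings (PSF_SIGMA, theta, lam, n) are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

IMAGE = (0.0, 1.0)   # assume a unit-square image
PSF_SIGMA = 0.02     # Gaussian stand-in for the PSF (the talk uses a King profile)
theta, lam, n = 4.0, 1.0, 2000

# k ~ Pois(theta); truncate at 1 so the example always has a source
k = max(rng.poisson(theta), 1)

# Component weights (index 0 = background) and source centres
w = rng.dirichlet(np.full(k + 1, lam))
mu = rng.uniform(*IMAGE, size=(k, 2))

# Latent assignments s_i, then photon positions given the assignments
s = rng.choice(k + 1, size=n, p=w)
xy = np.empty((n, 2))
bg = s == 0
xy[bg] = rng.uniform(*IMAGE, size=(bg.sum(), 2))   # background: uniform over the image
xy[~bg] = mu[s[~bg] - 1] + PSF_SIGMA * rng.standard_normal(((~bg).sum(), 2))
```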

3rd Dimension: Spectral Data
We can distinguish the background from the sources better if we jointly model spatial and spectral information:
e_i | s_i = j, α_j, β_j ~ Gamma(α_j, β_j) for j ∈ {1, ..., k}
e_i | s_i = 0 ~ Uniform up to some maximum energy
α_j ~ Gamma(a_α, b_α)
β_j ~ Gamma(a_β, b_β)
Using a (correctly) informative prior on α_{s_i} and β_{s_i} versus a diffuse prior made very little difference to the results. 5 / 30
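Extending the simulation sketch with the spectral dimension is one more conditional draw per photon. The hyperparameter values and the maximum background energy E_MAX below are illustrative only (roughly matched to the PI-channel scale of the real-data table later in the talk).

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins so this snippet runs alone; in the full sketch these come from the spatial model
k, n = 3, 2000
s = rng.choice(k + 1, size=n)   # component labels, 0 = background
bg = s == 0

E_MAX = 2000.0                  # illustrative max background energy (PI channels)
a_alpha, b_alpha, a_beta, b_beta = 2.0, 0.5, 2.0, 400.0  # illustrative hyperparameters

# numpy's gamma takes shape and *scale*, so pass 1/rate for Gamma(shape, rate) draws
alpha = rng.gamma(a_alpha, 1.0 / b_alpha, size=k)
beta = rng.gamma(a_beta, 1.0 / b_beta, size=k)

e = np.empty(n)
e[bg] = rng.uniform(0.0, E_MAX, size=bg.sum())   # background: flat spectrum
j = s[~bg] - 1                                   # source index of each source photon
e[~bg] = rng.gamma(alpha[j], 1.0 / beta[j])      # sources: Gamma(alpha_j, beta_j)
```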

Computation: RJMCMC
Similar to Richardson & Green (1997). Knowledge of the PSF makes things much easier.
Insensitive to the prior k ~ Pois(θ). E.g., the posterior when the true k = 10:
[Figure: average posterior probabilities of each value of k across ten datasets, for priors with (a) θ = 1 and (b) θ = 10.] 6 / 30
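The talk does not show sampler code. As a rough illustration of the trans-dimensional idea, here is a toy 1-D birth/death sampler under strong simplifications relative to the actual method: a known Gaussian PSF instead of the King profile, a fixed background weight w0 with equal source weights instead of Dirichlet weights, no spectral model, and no split/merge or position-update moves. With births proposed from the uniform prior, the prior and proposal densities for the new centre cancel, leaving the simple acceptance ratios noted in the docstring.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_lik(x, mu, w0, sigma, length=1.0):
    """Mixture log-likelihood: uniform background plus equal-weight Gaussian sources."""
    k = len(mu)
    phi = np.exp(-0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2)
    dens = w0 / length + (1 - w0) / k * phi.sum(axis=1) / (sigma * np.sqrt(2 * np.pi))
    return np.log(dens).sum()

def birth_death_sampler(x, theta=3.0, w0=0.2, sigma=0.02, n_iter=5000, length=1.0):
    """Toy sampler over the number of sources k (support truncated to k >= 1).

    Birth: propose a new centre from the uniform prior; accept with
    probability min(1, L_ratio * theta / (k + 1)).
    Death: remove a uniformly chosen centre; accept with
    probability min(1, L_ratio * k / theta).
    """
    mu = np.array([rng.uniform(0, length)])
    ll = log_lik(x, mu, w0, sigma, length)
    k_trace = []
    for _ in range(n_iter):
        k = len(mu)
        if rng.random() < 0.5:                      # birth move
            mu_new = np.append(mu, rng.uniform(0, length))
            ll_new = log_lik(x, mu_new, w0, sigma, length)
            if np.log(rng.random()) < ll_new - ll + np.log(theta / (k + 1)):
                mu, ll = mu_new, ll_new
        elif k > 1:                                 # death move (auto-reject at k = 1)
            mu_new = np.delete(mu, rng.integers(k))
            ll_new = log_lik(x, mu_new, w0, sigma, length)
            if np.log(rng.random()) < ll_new - ll + np.log(k / theta):
                mu, ll = mu_new, ll_new
        k_trace.append(len(mu))
    return np.array(k_trace)

# Synthetic check: two separated sources plus background; mass should pile up near k = 2
x = np.concatenate([rng.normal(0.3, 0.02, 400),
                    rng.normal(0.7, 0.02, 400),
                    rng.uniform(0, 1, 200)])
print(np.bincount(birth_death_sampler(x)))
```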

Simulation Study: Example
100 datasets simulated for each configuration. Analysis run with and without energy data. The posterior of k is summarized by the posterior probability of two sources. 7 / 30

Simulation Study: PSF (King 1962)
The King density has Cauchy tails; a Gaussian PSF leads to over-fitting in real data. 8 / 30
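For reference, a common analytic form of the King profile for X-ray PSFs is f(r) ∝ (1 + (r/r0)²)^(−η) with η > 1; the talk does not state its exact parameterisation, so the form and the default r0 and eta below are assumptions for illustration. The power-law tails are heavy (Cauchy-like), which is why a Gaussian PSF under-weights far-flung photons and over-fits. The radial CDF inverts in closed form, so sampling is by inversion:

```python
import numpy as np

def sample_king(n, r0=0.6, eta=1.5, rng=None):
    """Draw 2-D offsets from a King-type PSF, f(r) proportional to (1 + (r/r0)^2)^(-eta).

    For eta > 1 the radial CDF is F(r) = 1 - (1 + (r/r0)^2)^(1 - eta),
    whose inverse is r = r0 * sqrt((1 - u)^(1 / (1 - eta)) - 1).
    """
    rng = rng or np.random.default_rng()
    u = rng.random(n)
    r = r0 * np.sqrt((1.0 - u) ** (1.0 / (1.0 - eta)) - 1.0)
    phi = rng.uniform(0.0, 2.0 * np.pi, n)           # isotropic PSF: uniform angle
    return np.column_stack((r * np.cos(phi), r * np.sin(phi)))
```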

Simulation Study: Data Generation
Bright source: n_1 ~ Pois(m_1 = 1000). Dim source: n_2 ~ Pois(m_2 = 1000/r), where r = 1, 2, 10, 50 gives the relative intensity.
Source region: the region where the PSF density exceeds 10% of its maximum (essentially a circle with radius 1).
d = the probability that a photon from a source falls within its own source region. 9 / 30

Simulation Study: Data Generation
Background per source region: Pois(b·d·m_2), where the relative background b = 0.001, 0.01, 0.1, 1.
Overall background: n_0 ~ Pois((image area / source region area) · b·d·m_2).
Separation: the distance between the sources. Values: 0.5, 1, 1.5, 2. 10 / 30
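A minimal sketch of drawing the photon counts over this design grid. The detection probability D and the image-to-region area ratio are placeholders here, since their true values follow from the PSF and image geometry; this is not the talk's code.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)

M1 = 1000.0         # bright-source mean, m_1
D = 0.9             # placeholder: prob. a source photon lands in its own region
AREA_RATIO = 100.0  # placeholder: image area / source region area

grid = itertools.product([1, 2, 10, 50],           # relative intensity r
                         [0.001, 0.01, 0.1, 1.0],  # relative background b
                         [0.5, 1.0, 1.5, 2.0])     # separation between sources
for r, b, sep in grid:
    m2 = M1 / r
    n1 = rng.poisson(M1)                       # bright-source photons
    n2 = rng.poisson(m2)                       # dim-source photons
    n0 = rng.poisson(AREA_RATIO * b * D * m2)  # background photons over the whole image
    # ...then place n1 and n2 photons via the PSF at centres `sep` apart, n0 uniformly...
```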

Simulation Study: Data Generation
Note: energy units should be pulse invariant (PI) channels. 11 / 30

Median Posterior Probability at k=2: No Energy 12 / 30

Median Posterior Probability at k=2: Energy 13 / 30

Median SE of Dim Source Posterior Mean Position: No Energy 14 / 30

Median SE of Dim Source Posterior Mean Position: Energy 15 / 30

Strong Background Posterior Mean Positions: No Energy 16 / 30

Strong Background Posterior Mean Positions: Energy 17 / 30

XMM Data Additional question: how do the spectral distributions of the sources compare? 18 / 30

Parameter Inference
Table: Posterior parameter estimates for FK Aqr and FL Aqr (using spectral data)

            µ_11     µ_12     µ_21     µ_22    w_1    w_2    w_b    α_1    α_2    β_1    β_2
Mean     120.973  124.873  121.396  127.326  0.732  0.189  0.079  3.195  3.121  0.005  0.005
SD         0.001    0.001    0.002    0.002  0.001  0.001  0.000  0.008  0.014  0.000  0.000
SD/Mean    0.000    0.000    0.000    0.000  0.001  0.003  0.004  0.002  0.004  0.002  0.005

19 / 30

Componentwise posterior spectral distributions 20 / 30

Posteriors of source spectral parameters 21 / 30

Chandra Data 22 / 30

Gamma Mixture Spectral Model 23 / 30

Chandra k Results 24 / 30

Locations: 90% credible regions 25 / 30

Chandra data spectral model contribution
1. The spectral model places some constraints on the spectral distributions, helping us infer source properties more precisely:
   Posterior standard deviations are smaller with the spectral model.
   Without it, some fainter sources are occasionally not found.
2. It also offers some robustness to chance or systematic variations in the PSF and background:
   Background is more easily distinguished from sources.
   The spectral model plays a large role in the likelihood, so a strangely shaped source is unlikely to be split unless the spectral data also support two sources.
26 / 30

Summary and extensions
A coherent method for dealing with overlapping sources that uses spectral as well as spatial information. Flexibility to include other phenomena.
Temporal model? Flares and other activity change the intensity and spectral distribution of sources over time.
An approximation to the full method could be desirable in some cases. 27 / 30

References
S. Richardson and P. J. Green, On Bayesian analysis of mixtures with an unknown number of components (with discussion), J. R. Statist. Soc. B, 59 (1997), 731-792; corrigendum, 60 (1998), 661.
I. King, The structure of star clusters. I. An empirical density law, The Astronomical Journal, 67 (1962), 471.
C. M. Bishop, Pattern Recognition and Machine Learning, New York: Springer, 2006.
A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. R. Statist. Soc. B, 39 (1977), 1-38.
S. P. Brooks and A. Gelman, General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics, 7 (1998), 434-455.
28 / 30

XMM data spectral distribution 29 / 30

Four models 30 / 30