Bayesian Estimation of log N log S


Bayesian Estimation of log N log S Paul Baines, Irina Udaltsova Department of Statistics University of California, Davis July 12th, 2011

Introduction
What is log N log S? The cumulative number of sources detectable at a given sensitivity, defined as
    N(>S) = Σ_i I{S_i > S},
i.e., the number of sources brighter than a threshold S. In terms of the flux distribution of the population, this is related to the survival function: N(>S) = N (1 − F(S)). "log N log S" refers to the relationship between (or plot of) log10 N(>S) and log10 S.
Why do we care? It constrains evolutionary models, the dark matter distribution, etc.
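The empirical N(>S) construction above can be sketched as follows, using simulated Pareto fluxes; the values theta = 0.5 and S_min = 1e-13 are illustrative, not the talk's data.

```python
import numpy as np

# Simulated fluxes from a Pareto(theta, S_min) population (illustrative values).
rng = np.random.default_rng(42)
theta, S_min = 0.5, 1e-13
S = S_min * (1 - rng.random(500)) ** (-1 / theta)  # inverse-CDF Pareto draws

# N(>S): number of sources strictly brighter than each threshold.
grid = np.sort(S)
N_gt = np.array([(S > s).sum() for s in grid])

# Plotting log10 N(>S) against log10 S is approximately linear with
# slope -theta; the last point (where N(>S) = 0) is dropped before fitting.
logS, logN = np.log10(grid[:-1]), np.log10(N_gt[:-1])
slope = np.polyfit(logS, logN, 1)[0]
print(slope)  # roughly -theta
```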

Inferential Process
To infer the log N log S relationship there are a few steps:
1. Collect raw data images.
2. Run a detection algorithm to extract sources.
3. Produce a dataset describing the sources (and the uncertainty about them).
4. Infer the log N log S distribution from this dataset.
Our analysis is focused on the final step, accounting for some (but not all) of the detector-induced uncertainties... Adding further layers to the analysis so as to start from the raw images is possible, but that is for a later time...

The Data
The data is essentially just a list of photon counts, with some extra information about the background and detector properties:

Src_ID  Count  Src_area  Bkg   Off_axis  Eff_area
2       270    1720      3.16  4.98      734813.1074
3       117    96        0.19  5.72      670916.3154
7       33     396       0.61  6.17      670916.3154
18      7      128       0.22  6.34      319483.9597
19      12     604       0.96  4.51      670916.3154

Problems
The photon counts do not directly correspond to the source fluxes:
1. Background contamination
2. Natural (Poisson) variability
3. Detector efficiencies (PSF etc.)
Not all sources in the population will be detected:
1. Low-intensity sources
2. Sources close to the detection limit: background, natural variability and detection probabilities are all important.

Key Ideas
Our goals:
- Provide a complete analysis, accounting for all detector effects (especially those leading to unobserved sources)
- Allow for the incorporation of prior information
- Investigate parametric forms (testing) for log N log S (e.g., broken power laws)
- Investigate the data-prior inferential limit (e.g., for which S_min the information comes primarily from the model rather than the data)

Missing Data Overview
There are many potential causes of missing data in astronomical data:
- Background contamination (e.g., total = source + background)
- Low-count sources (below the detection threshold)
- Detector schedules (source not within detector range)
- Foreground contamination (objects between the source and detector)
- etc.
Some are more problematic than others...

Missing Data Mechanisms
In the nicest possible case, if which data are missing does not depend on any unobserved values, then we can essentially ignore the missing data. In this context, however, whether a source is observed is a function of its source count (intensity), which is unobserved for undetected sources. This missing-data mechanism is non-ignorable, and needs to be carefully accounted for in the analysis.
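A toy numerical illustration of the non-ignorability: fitting the Pareto slope to detected sources only is biased, because detection depends on the unobserved intensity. The detection curve and all numbers below are hypothetical, chosen only to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate a Pareto(theta, S_min) population of fluxes (illustrative units).
theta_true, S_min, n = 1.0, 1.0, 50_000
S = S_min * (1 - rng.random(n)) ** (-1 / theta_true)

# Detection depends on the (unobserved) intensity: brighter = more detectable.
p_detect = 1 - np.exp(-S / 2)          # hypothetical detection curve
detected = rng.random(n) < p_detect

# Complete-data MLE for the Pareto slope: theta_hat = n / sum(log(S / S_min)).
full = n / np.log(S / S_min).sum()
naive = detected.sum() / np.log(S[detected] / S_min).sum()
print(full, naive)  # naive estimate is pulled away from theta_true
```

The "naive" fit, which simply drops the missing sources, systematically flattens the slope because the survivors are preferentially bright.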

The Model
    N ~ NegBinom(α, β),
    S_i | S_min, θ ~iid Pareto(θ, S_min),  i = 1, ..., N,
    θ ~ Gamma(a, b),
    Y_i^src | S_i, L_i, E_i ~iid Pois(λ(S_i, L_i, E_i)),
    Y_i^bkg | L_i, E_i ~iid Pois(k(L_i, E_i)),
    I_i ~ Bernoulli(g(S_i, L_i, E_i)).

The Model
It turns out that in many contexts there is strong theory expecting the log N log S relationship to obey a power law:
    N(>S) = Σ_{i=1}^N I{S_i > S} ≈ α S^{−θ},  S > S_min.
Taking logarithms gives the linear log N(>S) vs log S relationship. The power-law relationship defines the marginal survival function of the population, and the marginal distribution of flux can be seen to be a Pareto distribution:
    S_i | S_min, θ ~iid Pareto(θ, S_min),  i = 1, ..., N.
The analyst must specify S_min, a threshold above which we seek to estimate θ.
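If all fluxes above S_min were observed, the Pareto likelihood would combine with the Gamma(a, b) prior on θ in closed form. The sketch below illustrates this standard conjugate update on simulated data (it is background algebra, not the talk's sampler; the missing sources are exactly what breaks this conjugacy and forces the MH steps described later).

```python
import numpy as np

rng = np.random.default_rng(0)

# Complete-data setting: n fluxes from Pareto(theta, S_min).
S_min, theta_true, n = 1e-13, 0.5, 2000
S = S_min * (1 + rng.pareto(theta_true, n))   # classical Pareto via numpy's Lomax

# Pareto likelihood: p(S_i | theta) = theta * S_min^theta * S_i^-(theta+1),
# so with a Gamma(a, b) prior (shape a, scale b),
#   theta | S ~ Gamma(a + n, rate = 1/b + sum(log(S_i / S_min))).
a, b = 20.0, 1 / 20.0                          # values from the simulated example
shape_post = a + n
rate_post = 1 / b + np.log(S / S_min).sum()
post_mean = shape_post / rate_post             # posterior mean of theta
print(post_mean)  # close to theta_true
```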

The Model cont...
The total number of sources (unobserved and observed), denoted by N, is modeled as:
    N ~ NegBinom(α, β).
We observe photon counts contaminated with background noise and other detector effects:
    Y_i^tot = Y_i^src + Y_i^bkg,
    Y_i^src | S_i, L_i, E_i ~iid Pois(λ(S_i, L_i, E_i)),
    Y_i^bkg | L_i, E_i ~iid Pois(k(L_i, E_i)).
The functions λ and k represent the intensities of the source and background, respectively, for a given flux S_i, location L_i and effective exposure E_i.

The Model cont...
The probability of a source being detected, g(S_i, L_i, E_i), is determined by the detector sensitivity, the background and the detection method. The marginal detection probability as a function of θ is defined as:
    π(θ) = ∫∫∫ g(S_i, L_i, E_i) p(S_i | θ) p(L_i, E_i) dS_i dE_i dL_i.
The prior on θ is assumed to be:
    θ ~ Gamma(a, b).
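A Monte Carlo approximation to π(θ) along these lines might look as follows. The form of g and its constants are taken from the simulated example later in the talk; the uniform draw over effective areas standing in for p(L, E), and the clipping of g to [0, 1], are assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

a0, a1, a2 = 11.12, 0.83, 0.43   # detection-curve constants (simulated example)
gamma_e = 1.6e-9                 # energy per photon (erg)
z = 0.0005                       # background intensity per unit effective area
S_min = 1e-13

def g(lam, k):
    """Detection probability; clipped to [0, 1] since the raw form can go negative."""
    return np.clip(1.0 - a0 * (lam + k) ** a1 * np.exp(-a2 * (lam + k)), 0.0, 1.0)

def pi_of_theta(theta, n_mc=100_000):
    """Monte Carlo estimate of pi(theta) = E[g] over the population model."""
    S = S_min * (1 + rng.pareto(theta, n_mc))    # Pareto(theta, S_min) fluxes
    E = rng.uniform(1_000, 100_000, n_mc)        # stand-in for p(L, E)
    lam = S * E / gamma_e                        # source intensity
    k = z * E                                    # background intensity
    return g(lam, k).mean()

p_hat = pi_of_theta(0.5)
print(p_hat)
```

In the actual sampler this quantity is pre-computed on a grid of θ values, so each MCMC iteration only looks it up.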

The Model cont...
The posterior distribution, marginalizing over the unobserved fluxes, can be shown to be:
    p(N, θ, S_obs | n, Y_obs^src, Y_obs^tot)
      ∝ ∫∫∫ p(N, θ, S_obs, S_mis, Y_obs^src, Y_mis^src, Y_mis^tot | n, Y_obs^tot) dY_mis^src dY_mis^tot dS_mis
      ∝ p(N) p(θ) p(n | N, θ) p(S_obs | n, θ) p(Y_obs^tot | n, S_obs) p(Y_obs^src | n, Y_obs^tot, S_obs).

Computational Details
The Gibbs sampler consists of four steps:
    [Y_obs^src | n, Y_obs^tot, S_obs, θ],  [S_obs | n, Y_obs^tot, Y_obs^src, θ],  [θ | n, N, S_obs],  [N | n, θ].
1. Sample the observed source photon counts:
    Y_obs,i^src | n, Y_obs,i^tot, S_obs,i ~ Binom(Y_obs,i^tot, λ(S_obs,i, L_obs,i, E_obs,i) / (λ(S_obs,i, L_obs,i, E_obs,i) + k(L_obs,i, E_obs,i))),  i = 1, ..., n.
2. Sample the fluxes S_obs,i, i = 1, ..., n (MH using a t proposal).
3. Sample the power-law slope θ (MH using a t proposal).
4. Compute the posterior distribution of the total number of sources, N, using numerical integration:
    p(N | n, θ) ∝ I{N ≥ n} · [Γ(N + α) / Γ(N − n + 1)] · ((1 − π(θ)) / (β + 1))^N.
Note: the (prior) marginal detection probability π(θ) is pre-computed via numerical integration.
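Steps 1 and 4 above can be sketched as follows. The counts and Bkg values come from the example data table; the λ values and the settings (α, β, π(θ)) are hypothetical stand-ins chosen for illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

# Step 1: split each observed total count into source + background photons,
#   Y_src | Y_tot, S ~ Binom(Y_tot, lam / (lam + k)).
y_tot = np.array([270, 117, 33, 7, 12])           # Count column of the table
lam = np.array([260.0, 110.0, 30.0, 5.0, 10.0])   # hypothetical lambda(S_i, L_i, E_i)
k = np.array([3.16, 0.19, 0.61, 0.22, 0.96])      # Bkg column of the table
y_src = rng.binomial(y_tot, lam / (lam + k))

# Step 4: p(N | n, theta) ∝ I{N >= n} * Gamma(N+alpha)/Gamma(N-n+1)
#                            * ((1 - pi_theta)/(beta + 1))^N,
# normalized numerically over a finite grid of N values.
n, alpha, beta, pi_theta = 5, 200.0, 2.0, 0.7
N_grid = np.arange(n, 2000)
log_p = np.array([math.lgamma(N + alpha) - math.lgamma(N - n + 1)
                  + N * math.log((1 - pi_theta) / (beta + 1)) for N in N_grid])
log_p -= log_p.max()                 # stabilize before exponentiating
p = np.exp(log_p)
p /= p.sum()
N_draw = rng.choice(N_grid, p=p)     # one posterior draw of N
print(y_src, N_draw)
```

Working on the log scale and subtracting the maximum before exponentiating keeps the unnormalized weights from overflowing for large N.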

Computational Notes
Some important things to note:
- Computation is fast (seconds), and insensitive to the number of missing sources.
- The fluxes of the missing sources are never imputed (only the number of missing sources).
- Most steps are not in closed form, so changing (some) assumptions has little computational impact.
- A broken power law (or other parametric forms) can be implemented by changing only one of the steps.
- Fluxes of missing sources can (optionally) be imputed to produce posterior draws of a corrected log N log S.

[Figure: RED = missing sources, BLACK = observed sources.]

Future Work
We currently do not include:
1. False sources (allowing that observed sources might actually be background/artifacts)
2. Spatially varying detection probabilities (straightforward, but needs implementing)

Simulated Example
Assumed parameter settings:
- N ~ NegBinom(α, β), where α = 200 (shape), β = 2 (scale)
- θ ~ Gamma(a, b), where a = 20 (shape), b = 1/20 (scale)
- S_i | θ ~ Pareto(θ, S_min), where S_min = 10^-13
- Y_i^src | S_i, L_i, E_i ~ Pois(λ(S_i, L_i, E_i))
- Y_i^bkg | L_i, E_i ~ Pois(k(L_i, E_i))
- λ_i = S_i E_i / γ, where the effective area E_i ∈ (1,000, 100,000) and the energy per photon γ = 1.6 × 10^-9
- k_i = z E_i, where z = 0.0005 is the rate of background photon-count intensity per million seconds
- n_iter = 21,000, burn-in = 1,000
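Under these settings, one simulated dataset might be generated as in the sketch below. It is a simplification: the detection/thinning step is omitted, and a uniform effective-area distribution stands in for the joint distribution of (L, E), both assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(3)

alpha, beta = 200.0, 2.0        # NegBinom shape / scale for N
a, b = 20.0, 1 / 20.0           # Gamma shape / scale for theta
S_min, gamma_e, z = 1e-13, 1.6e-9, 0.0005

theta = rng.gamma(a, b)                       # draw the power-law slope
N = rng.poisson(rng.gamma(alpha, beta))       # NegBinom via Poisson-Gamma mixture
S = S_min * (1 + rng.pareto(theta, N))        # Pareto(theta, S_min) fluxes
E = rng.uniform(1_000, 100_000, N)            # effective areas (assumed uniform)
lam, k = S * E / gamma_e, z * E               # source and background intensities
y_tot = rng.poisson(lam) + rng.poisson(k)     # total counts: source + background
print(theta, N, y_tot[:5])
```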

Simulated Example cont...
Detection probability:
    g(λ, k) = 1.0 − a_0 (λ + k)^{a_1} e^{−a_2 (λ + k)},
where a_0 = 11.12, a_1 = 0.83, a_2 = 0.43.
[Figure: marginal detection probability.]

Empirical Results of the MCMC Sampler
Actual coverage of nominal percentiles for all parameters, over M = 200 simulated validation datasets:

Percentile   50%   80%   90%   95%   98%   99%   99.9%
N            0.55  0.83  0.90  0.96  0.98  0.99  1.00
θ            0.50  0.82  0.92  0.97  0.99  0.99  1.00
all S_obs    0.51  0.81  0.90  0.95  0.98  0.99  1.00

Mean squared error of different estimators of N and θ for simulated data:

                          MSE (N)           MSE (θ)
Level of Effective Area   Median   Mean     Median    Mean
Low                       215.96   291.82   0.05439   0.07481
Medium                    121.26   168.91   0.05558   0.07407
High                      68.23    95.36    0.04578   0.05987

MCMC Draws

Posterior Correlations Posterior estimates for the power-law slope and the total number of sources.

MCMC Draws

Simulated log N log S Uncertainties in source fluxes and a display of the power-law relationship. Posterior draws (gray), truth (blue).

Posteriors of Parameters N and θ
Zezas et al. (2003) estimated a power-law slope of θ̂ = 0.45. The posterior median from our analysis is θ = 0.38, with the 95% posterior interval consistent with competing estimators.

Note: this is a posterior plot for the observed sources only (the corrected plot would be more useful...). There is evidence of a possible break in the power law in the observed log N log S. Given the possible non-linearity of the log N(>S) vs log S relationship, more work is needed to allow for a broken power law or more general parametric forms.

References
A. Zezas et al. (2004). Chandra survey of the Bar region of the SMC. Revista Mexicana de Astronomía y Astrofísica (Serie de Conferencias), Vol. 20, IAU Colloquium 194, pp. 205-205.
R.J.A. Little and D.B. Rubin (2002). Statistical Analysis with Missing Data. Wiley.