Two Statistical Problems in X-ray Astronomy

Alexander W Blocker
October 21, 2008

Outline
1 Introduction
2 Replacing stacking
  - Problem
  - Current method
  - Model
  - Further development
3 Time symmetry
  - Problem
  - Model
  - Computational approach

Introduction
Recent projects have focused on two areas:
- Analysis of faint (low-count) x-ray data with Bayesian models
- Analysis of events in time series
Each has presented a unique set of challenges.

Problem: General analysis of faint x-ray sources
- In multiwavelength x-ray studies, astronomers identify potential sources using catalogs in one waveband (typically optical or infrared) and observe the selected sources in x-rays.
- This frequently leads to a sample containing many faint, undetected sources.
- We want to combine information from these undetected sources to make inferences about our selected sample.

Current method: stacking
- Based on background subtraction.
- For source $i$, observe $c_{s,i}$ counts in the source aperture and $c_{b,i}$ counts in the background aperture.
- Calculate net counts as
  $c_{n,i} = c_{s,i} - \frac{A_{s,i}}{A_{b,i}} c_{b,i}$,
  where $A_{s,i}$ and $A_{b,i}$ are the effective areas for the source and background regions (taking into account exposures), respectively.
- Calculate stacked flux as
  $f_x = \frac{\overline{\mathrm{ECF}}}{\sum_i A_{s,i}} \sum_i c_{n,i}$,
  where $\overline{\mathrm{ECF}}$ is the mean energy conversion factor.
- Calculate stacked luminosity as
  $L_x = \frac{1}{N} \sum_i \mathrm{LCF}_i \, c_{n,i}$,
  where $\mathrm{LCF}_i$ is the luminosity conversion factor for source $i$:
  $\mathrm{LCF}_i = \frac{4 \pi d_{l,i}^2 \, \mathrm{ECF}_i \, A_{\mathrm{corr},i} \, K_{\mathrm{corr},i}}{A_{s,i}}$
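The stacking arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy numbers are hypothetical, and the formulas are the background-subtraction, stacked-flux and stacked-luminosity expressions as read from this slide.

```python
import math


def net_counts(c_s, c_b, a_s, a_b):
    """Background-subtracted counts: c_n = c_s - (A_s / A_b) * c_b."""
    return c_s - (a_s / a_b) * c_b


def luminosity_conversion(d_l, ecf, a_corr, k_corr, a_s):
    """LCF = 4 * pi * d_l^2 * ECF * A_corr * K_corr / A_s."""
    return 4.0 * math.pi * d_l ** 2 * ecf * a_corr * k_corr / a_s


def stack(c_s, c_b, a_s, a_b, mean_ecf, lcf):
    """Return (per-source net counts, stacked flux, stacked luminosity)."""
    c_n = [net_counts(cs, cb, as_, ab)
           for cs, cb, as_, ab in zip(c_s, c_b, a_s, a_b)]
    flux = mean_ecf * sum(c_n) / sum(a_s)                        # f_x
    lum = sum(f * c for f, c in zip(lcf, c_n)) / len(c_n)        # L_x
    return c_n, flux, lum


# Toy values in arbitrary units (hypothetical, not from the talk).
c_n, flux, lum = stack(c_s=[3, 1], c_b=[10, 20],
                       a_s=[1.0, 1.0], a_b=[10.0, 10.0],
                       mean_ecf=2.0, lcf=[1.0, 3.0])
```

In this toy sample the second source has more scaled background than source counts, so its net count is negative, and with these conversion factors the stacked luminosity is negative as well.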

Current method: Problems with conventional stacking
- Use of background subtraction implies a Gaussian assumption, which is clearly inappropriate here.
- This manifests as negative net counts; for sufficiently faint samples, it can lead to negative stacked fluxes and luminosities.
- No clean measure of uncertainties on luminosities.
- Solution: model the data as Poisson.

Model: A hierarchical Bayesian model for stacking
Observation model
- For source $i$, we assume that $c_{n,i} \sim \mathrm{Pois}(\lambda_{n,i})$.
- Also assume $c_{b,i} \sim \mathrm{Pois}\left(\lambda_{b,i} \frac{A_{b,i}}{A_{s,i}}\right)$.
- Finally, $c_{s,i} - c_{n,i} \sim \mathrm{Pois}(\lambda_{b,i})$.
Intensity model
- If redshifts are known, we can model luminosities directly and assume $L_i \sim \mathrm{Lognormal}(\mu_L, \sigma_L)$ (or $L_i \sim \Gamma(\alpha_L, \beta_L)$).
- Otherwise, we can apply an analogous framework to the flux $f_i$.
- Generally assume $\lambda_{b,i} \sim \Gamma(\alpha_b, \beta_b)$.
- Using noninformative (Jeffreys) priors on the hyperparameters.
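To make the generative structure concrete, here is a minimal simulation sketch of one source under this model. Several pieces are assumptions not stated on the slide: the link between luminosity and expected net counts (taken here as $\lambda_{n,i} = L_i / \mathrm{LCF}_i$), the reading of $\beta_b$ as a rate parameter, the helper names, and the toy parameter values.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the small rates typical of faint sources; not suited to
    large lam, where exp(-lam) underflows.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_source(rng, mu_l, sigma_l, alpha_b, beta_b, lcf, area_ratio):
    """Simulate one source; area_ratio is A_b / A_s."""
    lum = rng.lognormvariate(mu_l, sigma_l)           # L_i ~ Lognormal(mu_L, sigma_L)
    lam_n = lum / lcf                                 # ASSUMPTION: lambda_n = L_i / LCF_i
    lam_b = rng.gammavariate(alpha_b, 1.0 / beta_b)   # ASSUMPTION: beta_b is a rate
    c_n = poisson(rng, lam_n)                         # latent net counts
    c_b = poisson(rng, lam_b * area_ratio)            # background-aperture counts
    c_s = c_n + poisson(rng, lam_b)                   # source aperture = net + background
    return {"c_s": c_s, "c_b": c_b, "c_n": c_n, "lam_n": lam_n, "lam_b": lam_b}


# Toy hyperparameters in arbitrary units (hypothetical).
rng = random.Random(1)
sample = [simulate_source(rng, mu_l=-1.0, sigma_l=0.5, alpha_b=2.0, beta_b=2.0,
                          lcf=1.0, area_ratio=10.0)
          for _ in range(5)]
```

Only `c_s` and `c_b` would be observed in practice; `c_n` is latent, which is why the source-aperture counts are never below the net counts by construction.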

Model: A hierarchical Bayesian model for stacking, continued
Key assumptions
- For luminosity-based inference, we assume that redshifts are known. This is relatively plausible for spectroscopic redshifts; less so for photometric ones.
- We assume the spectra of the sources are known and identical, typically a power law with photon index 1.7.
- We attempt to make inferences only on the selected sample, for now; we are not dealing with selection effects, etc.

Model: Computation
- Using a data augmentation algorithm with $c_n$ as missing data.
- For $\Gamma$ hyperdistributions, using a Metropolis-Hastings step within the Gibbs sampler to draw $\alpha$ and $\beta$.
- For the Lognormal hyperdistribution, using a Gibbs step to draw $\mu_L$ and $\sigma_L$; a Metropolis-Hastings step is used to draw $\lambda_n$.
- The MH step here is very efficient; using Halley's method to identify posterior modes in parallel and tune the proposal distribution.
- From posterior simulations, we can retain the posterior mean and standard deviation of each source flux (and luminosity, if available) in addition to hyperparameter samples.
- This provides a great deal of information that is not available with conventional stacking, in addition to estimates of sample properties with uncertainties.
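The data augmentation idea can be illustrated with a deliberately simplified sketch for a single source. It is not the sampler described in the talk: the hyperparameters are held fixed (so no Metropolis-Hastings step is needed), a Gamma prior is placed directly on $\lambda_n$, and $\beta$ is read as a rate. All names and default values are hypothetical. The sketch relies on two standard facts: given the total $c_s$, the net part is Binomial with probability $\lambda_n / (\lambda_n + \lambda_b)$, and a Gamma prior is conjugate to a Poisson rate.

```python
import random


def gibbs_single_source(c_s, c_b, area_ratio, alpha_n=1.0, beta_n=1.0,
                        alpha_b=1.0, beta_b=1.0, n_iter=5000, burn_in=500,
                        seed=0):
    """Data-augmentation Gibbs sampler for one source with FIXED Gamma priors.

    area_ratio is A_b / A_s. Returns posterior draws of lambda_n.
    """
    rng = random.Random(seed)
    lam_n, lam_b = 1.0, 1.0
    draws = []
    for it in range(n_iter):
        # Augmentation: split the source-aperture counts into net and
        # background parts; c_n | c_s ~ Binomial(c_s, lam_n / (lam_n + lam_b)).
        p_net = lam_n / (lam_n + lam_b)
        c_n = sum(rng.random() < p_net for _ in range(c_s))
        # Conjugate Gamma updates given the completed data
        # (gammavariate takes shape and SCALE, hence the reciprocal rates).
        lam_n = rng.gammavariate(alpha_n + c_n, 1.0 / (beta_n + 1.0))
        lam_b = rng.gammavariate(alpha_b + c_b + (c_s - c_n),
                                 1.0 / (beta_b + area_ratio + 1.0))
        if it >= burn_in:
            draws.append(lam_n)
    return draws


draws = gibbs_single_source(c_s=2, c_b=10, area_ratio=10.0)
posterior_mean = sum(draws) / len(draws)
```

Every draw of `lam_n` is positive, so summaries built from these draws cannot go negative the way background-subtracted net counts can.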

Further development: Potential directions for further work
- Currently have a very fast method that requires no more data than conventional stacking (and makes few additional assumptions).
- Room for improvement in some areas:
  - Explicit handling of the PSF
  - Incorporation of spectral uncertainties
  - Incorporation of photometric redshift uncertainties

Problem: Testing time symmetry for astronomical events
- We have a set of x-ray light curves like the above, each of which is believed to contain an event (in this case, an occultation).
- We are interested in testing whether the event (a dimming, in this case) is time-symmetric.

Problem: Testing time symmetry for astronomical events
- Even for the Gaussian case, this is not entirely straightforward.
  - There is a question of how much structure to place on the shape of the event.
  - Taking the maximum over possible centers of the event, as a less structured approach, leads to a complex distribution of the test statistic.
- With Poisson data, we really need a structured model.

Model: Intensity model
- Define $\lambda_t$ to be the intensity (count rate) of our source at time $t$.
- We model $\lambda_t$ as
  $\lambda_t = c - \alpha \, g(t; \tau, \theta)$,
  where $\lim_{t \to -\infty} g(t; \tau, \theta) = \lim_{t \to \infty} g(t; \tau, \theta) = 0$ and $\sup_{t \in \mathbb{R}} g(t; \tau, \theta) = g(\tau; \tau, \theta) = 1$.
- Thus, $c$ characterizes our baseline source intensity, $\alpha$ characterizes the extent of the deviation from this baseline during the event, and $g(t; \tau, \theta)$ characterizes the shape of the event itself.

Model: Observation model
- Given our series of intensities $\lambda_t$, we then model the observed counts at time $t$ as $n_t \sim \mathrm{Pois}(\lambda_t)$.
- This approach generalizes easily to the high-count regime with only minor modifications.
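Under this observation model the log-likelihood of a binned light curve is a sum of Poisson terms. The sketch below assumes unit-width time bins (so the rate equals the expected counts per bin) and treats a non-positive rate as inadmissible; the function name and the toy data are hypothetical.

```python
import math


def poisson_loglik(counts, times, intensity):
    """Log-likelihood of binned counts n_t ~ Pois(lambda_t).

    `intensity` is any callable t -> lambda_t. Unit-width bins are assumed.
    """
    total = 0.0
    for n, t in zip(counts, times):
        lam = intensity(t)
        if lam <= 0.0:
            return float("-inf")   # non-positive rates are treated as inadmissible
        # log Pois(n | lam) = n * log(lam) - lam - log(n!)
        total += n * math.log(lam) - lam - math.lgamma(n + 1)
    return total


# Toy light curve evaluated under a constant rate of 2 counts per bin.
flat_loglik = poisson_loglik([0, 1, 2], [0.0, 1.0, 2.0], lambda t: 2.0)
```

Any event profile can be plugged in through the `intensity` callable, so the same function serves both an unrestricted fit and a restricted (for example, symmetric) one.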

Model: Testing
- We can then test the hypothesis of time symmetry by placing the appropriate restrictions on $\theta$ and calculating a likelihood-ratio test statistic.
- The challenge is then to find a parsimonious yet flexible form for the event profile $g(t; \tau, \theta)$.
- One possibility: a bilogistic event profile
  $g(t; \tau, h_1, h_2, k_1, k_2) = \frac{1 + e^{-h_t / k_t}}{1 + e^{(|t - \tau| - h_t) / k_t}}$,
  where
  $h_t = \begin{cases} h_1 & t < \tau \\ h_2 & t \geq \tau \end{cases}$ and $k_t = \begin{cases} k_1 & t < \tau \\ k_2 & t \geq \tau \end{cases}$
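A small sketch of a bilogistic profile follows. The exact functional form is an assumption: it is one reading of the slide's formula that satisfies the stated constraints (value 1 at $t = \tau$, decay to 0 far from $\tau$) when all widths $h$ and scales $k$ are positive. The function names and numbers are hypothetical, and the dip is written as $\lambda_t = c - \alpha g(t)$ to represent a dimming.

```python
import math


def bilogistic(t, tau, h1, h2, k1, k2):
    """Bilogistic event profile: 1 at t = tau, decaying to 0 far from tau.

    ASSUMED form: (1 + exp(-h/k)) / (1 + exp((|t - tau| - h) / k)), with
    (h, k) = (h1, k1) before tau and (h2, k2) from tau onward, all positive.
    """
    h, k = (h1, k1) if t < tau else (h2, k2)
    z = (abs(t - tau) - h) / k
    if z > 700.0:          # exp would overflow; the profile is effectively 0
        return 0.0
    return (1.0 + math.exp(-h / k)) / (1.0 + math.exp(z))


def intensity(t, c, alpha, tau, h1, h2, k1, k2):
    """Count rate lambda_t = c - alpha * g(t): a dip of depth alpha below c."""
    return c - alpha * bilogistic(t, tau, h1, h2, k1, k2)


# Under this parameterization, time symmetry is the restriction
# h1 == h2 and k1 == k2 (an assumption about the intended null hypothesis).
symmetric = [bilogistic(t, 0.0, 2.0, 2.0, 0.5, 0.5) for t in (-3.0, 3.0)]
asymmetric = [bilogistic(t, 0.0, 2.0, 4.0, 0.5, 0.5) for t in (-3.0, 3.0)]
```

With equal parameters on both sides the profile takes the same value at $\tau - 3$ and $\tau + 3$; widening only the right-hand side makes the event recover more slowly after its center.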

Model: Testing, continued
- Can also use a Gaussian profile for the event; there is a tradeoff between the degrees of freedom available to characterize the event and computational requirements.
- Because the data are non-Gaussian, we still need to simulate under the null hypothesis to obtain the actual distribution of the test statistic (we cannot necessarily rely on the $\chi^2$ approximation).

Computational approach: Maximizing the likelihood
- Another challenge: maximizing the likelihood for this model.
  - It is very multimodal (lots of small, annoying local maxima).
  - The good news: only the location parameter $\tau$ is truly troublesome.
- A solution:
  1 Randomly draw a set of starting values for $\tau$ (possibly based on scan statistics or another simple method).
  2 For each starting value, run a fast, local optimization algorithm (such as Gauss-Newton) until convergence.
  3 Take the maximum of the values given by the local algorithms.
- This approach parallelizes extremely well, making it ideal for use in a cluster environment (such as Odyssey).
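The three steps above can be sketched as a multi-start search over $\tau$. This is a toy illustration, not the talk's implementation: a simple derivative-free hill climb stands in for Gauss-Newton, the starting values are drawn uniformly rather than from a scan statistic, and the multimodal objective is a made-up function rather than the model likelihood. All names are hypothetical.

```python
import math
import random


def local_maximize(f, x0, step=1.0, tol=1e-6):
    """Derivative-free hill climb: move while improving, else halve the step."""
    x, fx = x0, f(x0)
    while step > tol:
        moved = False
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc > fx:
                x, fx, moved = cand, fc, True
                break
        if not moved:
            step /= 2.0
    return x, fx


def multistart_maximize(f, starts, step=1.0, tol=1e-6):
    """Run a local search from every start and keep the best (x, f(x))."""
    results = [local_maximize(f, x0, step, tol) for x0 in starts]
    return max(results, key=lambda r: r[1])


def toy_objective(tau):
    """Made-up multimodal objective with its global maximum at tau = 2."""
    return math.cos(3.0 * (tau - 2.0)) - 0.1 * (tau - 2.0) ** 2


# Step 1: random starting values; steps 2 and 3 happen inside the helper.
rng = random.Random(0)
starts = [rng.uniform(-3.0, 7.0) for _ in range(200)]
best_tau, best_value = multistart_maximize(toy_objective, starts)
```

Each local run depends only on its own starting value, so the list comprehension in `multistart_maximize` is the part that could be distributed across workers.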