Accounting for Calibration Uncertainty in Spectral Analysis. David A. van Dyk

Similar documents
Accounting for Calibration Uncertainty in Spectral Analysis. David A. van Dyk

Accounting for Calibration Uncertainty

Embedding Astronomical Computer Models into Principled Statistical Analyses. David A. van Dyk

Embedding Astronomical Computer Models into Complex Statistical Models. David A. van Dyk

Embedding Astronomical Computer Models into Complex Statistical Models. David A. van Dyk

pyblocxs Bayesian Low-Counts X-ray Spectral Analysis in Sherpa

New Results of Fully Bayesian

Fully Bayesian Analysis of Calibration Uncertainty In High Energy Spectral Analysis

1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008

MONTE CARLO METHODS FOR TREATING AND UNDERSTANDING HIGHLY-CORRELATED INSTRUMENT CALIBRATION UNCERTAINTIES

JEREMY DRAKE, PETE RATZLAFF, VINAY KASHYAP AND THE MC CALIBRATION UNCERTAINTIES TEAM MONTE CARLO CONSTRAINTS ON INSTRUMENT CALIBRATION

Fully Bayesian Analysis of Low-Count Astronomical Images

High Energy Astrophysics:

Fitting Narrow Emission Lines in X-ray Spectra

Fitting Narrow Emission Lines in X-ray Spectra

Overlapping Astronomical Sources: Utilizing Spectral Information

SEARCHING FOR NARROW EMISSION LINES IN X-RAY SPECTRA: COMPUTATION AND METHODS

Bayesian Estimation of log N log S

Statistical Tools and Techniques for Solar Astronomers

Multistage Anslysis on Solar Spectral Analyses with Uncertainties in Atomic Physical Models

Challenges and Methods for Massive Astronomical Data

Bayesian Estimation of log N log S

Analysis Challenges. Lynx Data: From Chandra to Lynx : 2017 Aug 8-10, Cambridge MA

IACHEC 13, Apr , La Tenuta dei Ciclamini. CalStat WG Report. Vinay Kashyap (CXC/CfA)

Bayesian Computation in Color-Magnitude Diagrams

Incorporating Spectra Into Periodic Timing: Bayesian Energy Quantiles

The High Energy Transmission Grating Spectrometer

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Using Bayes Factors for Model Selection in High-Energy Astrophysics

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

Time Delay Estimation: Microlensing

Three data analysis problems

Luminosity Functions

Advanced Statistical Methods. Lecture 6

Ages of stellar populations from color-magnitude diagrams. Paul Baines. September 30, 2008

Supplementary Note on Bayesian analysis

Bayesian Regression Linear and Logistic Regression

Two Statistical Problems in X-ray Astronomy

Introduction to Bayesian Methods

Solar Spectral Analyses with Uncertainties in Atomic Physical Models

X-ray observations of neutron stars and black holes in nearby galaxies

Bayesian Inference in Astronomy & Astrophysics A Short Course

Strong Lens Modeling (II): Statistical Methods

Spectral Analysis of High Resolution X-ray Binary Data

STA 4273H: Sta-s-cal Machine Learning

Bayesian Estimation of Hardness Ratios: Modeling and Computations

Introduction to Probabilistic Machine Learning

VCMC: Variational Consensus Monte Carlo

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

Fast Likelihood-Free Inference via Bayesian Optimization

Bayesian Inference and MCMC

Large Scale Bayesian Inference

X-ray spectral Analysis

arxiv: v1 [astro-ph.im] 15 Oct 2015

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

An introduction to Bayesian statistics and model calibration and a host of related topics

Embedding Computer Models for Stellar Evolution into a Coherent Statistical Analysis

Bayesian Deep Learning

Hierarchical Bayesian Modeling

Model Selection for Galaxy Shapes

Introduction to Bayesian Data Analysis

Introduction to Bayes

Non-Parametric Bayes

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring

SEARCH FOR AXION- LIKE PARTICLE SIGNATURES IN GAMMA-RAY DATA

Statistical Methods for Astronomy

Using training sets and SVD to separate global 21-cm signal from foreground and instrument systematics

Hierarchical Bayesian Modeling

Introduction to Computer Vision

A Statistical Input Pruning Method for Artificial Neural Networks Used in Environmental Modelling

A New Method for Characterizing Unresolved Point Sources: applications to Fermi Gamma-Ray Data

Embedding Supernova Cosmology into a Bayesian Hierarchical Model

Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model

The microwave sky as seen by Planck

Bayesian Networks BY: MOHAMAD ALSABBAGH

The Quality and Stability of Chandra Telescope Pointing and Spacial Resolution

arxiv:astro-ph/ v2 18 Oct 2002

Tools for Parameter Estimation and Propagation of Uncertainty

Bayesian Quadrature: Model-based Approximate Integration. David Duvenaud University of Cambridge

Contents. Part I: Fundamentals of Bayesian Inference 1

XMM-Newton Chandra Blazar Flux Comparison. M.J.S. Smith (ESAC) & H. Marshall (MIT) 7 th IACHEC, Napa, March 2012

Statistical Data Analysis Stat 3: p-values, parameter estimation

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

An introduction to Bayesian reasoning in particle physics

Statistical techniques for data analysis in Cosmology

Upper Limits. David A. van Dyk. Joint work with Vinay Kashyap 2 and Members of the California-Boston AstroStatistics Collaboration.

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Bayesian model selection: methodology, computation and applications

Rejection sampling - Acceptance probability. Review: How to sample from a multivariate normal in R. Review: Rejection sampling. Weighted resampling

Patterns of Scalable Bayesian Inference Background (Session 1)

State Space and Hidden Markov Models

Bayesian room-acoustic modal analysis

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Multimodal Nested Sampling

A. Chen (INAF-IASF Milano) On behalf of the Fermi collaboration

an introduction to bayesian inference

Nested Sampling. Brendon J. Brewer. brewer/ Department of Statistics The University of Auckland

Approximate Bayesian Computation and Particle Filters

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

A novel determination of the local dark matter density. Riccardo Catena. Institut für Theoretische Physik, Heidelberg

Transcription:

Accounting for in Spectral Analysis David A van Dyk Statistics Section, Imperial College London Joint work with Vinay Kashyap, Jin Xu, Alanna Connors, and Aneta Siegminowska IACHEC March 2013 David A van Dyk Accounting for in Spectral Analysis

Outline 1 Bayesian Statistical Methods Components of a Statistical Model Statistical Computation 2 The Calibration Sample The Effect of 3 Pragmatic and Fully Bayesian Solutions Empirical Illustration David A van Dyk Accounting for in Spectral Analysis

Outline Bayesian Statistical Methods Components of a Statistical Model Statistical Computation 1 Bayesian Statistical Methods Components of a Statistical Model Statistical Computation 2 The Calibration Sample The Effect of 3 Pragmatic and Fully Bayesian Solutions Empirical Illustration David A van Dyk Accounting for in Spectral Analysis

Components of a Statistical Model Statistical Computation Bayesian Statistical Analyses: Likelihood Likelihood Functions: The distribution of the data given the model parameters Eg, Y dist Poisson(λ S ): likelihood(λ S ) = e λ S λ Y S /Y! Maimum Likelihood Estimation: Suppose Y = 3 likelihood 000 010 020 0 2 4 6 8 10 12 lambda Can estimate λ S and its error bars The likelihood and its normal approimation David A van Dyk Accounting for in Spectral Analysis

Components of a Statistical Model Statistical Computation Bayesian Analyses: Prior and Posterior Dist ns Prior Distribution: Knowledge obtained prior to current data Bayes Theorem and Posterior Distribution: posterior(λ) likelihood(λ)prior(λ) Combine past and current information: prior 000 010 020 030 0 2 4 6 8 10 12 lambda posterior 000 010 020 0 2 4 6 8 10 12 lambda Bayesian analyses rely on probability theory David A van Dyk Accounting for in Spectral Analysis

Multi-Level Models Components of a Statistical Model Statistical Computation A Poisson Multi-Level Model: LEVEL 1: Y Y B, λ S dist Poisson(λ S ) Y B, LEVEL 2: Y B λ B dist Pois(λ B ) and X λ B dist Pois(λ B 24), LEVEL 3: specify a prior distribution for λ B, λ S Each level of the model specifies a dist n given unobserved quantities whose dist ns are given in lower levels 00 01 02 03 04 05 06 07 prior 0 2 4 6 8 lams 00 01 02 03 04 05 06 07 posterior 0 2 4 6 8 lams lamb 15 20 25 08 05 02 005 joint posterior with flat prior 0 1 2 3 4 lams David A van Dyk Accounting for in Spectral Analysis

Components of a Statistical Model Statistical Computation Multi-Level Models: X-ray Spectral Analysis counts per unit counts per unit counts per unit 0 50 150 250 0 50 100 150 0 40 80 120 1 2 3 4 5 6 energy (kev) 1 2 3 4 5 6 energy (kev) absorbtion and submaimal effective area instrument response pile-up background counts per unit counts per unit 0 50 150 0 40 80 120 1 2 3 4 5 6 energy (kev) 1 2 3 4 5 6 energy (kev) 1 2 3 4 5 6 energy (kev) PyBLoCXS: Bayesian Low-Count X-ray Spectral Analysis David A van Dyk Accounting for in Spectral Analysis

Components of a Statistical Model Statistical Computation Embedding into a Statistical Model We aim to include calibration uncertainty as a component of the multi-level statistical model fit in PyBLoCXS Build a multi-level model In addition to spectral parameters, calibration products are treated as unknown quantities Calibration products are high dimensional unknowns Calibration scientists provide valuable prior information about these quantities We must quantify this information into a prior distribution Computation becomes a real issue! David A van Dyk Accounting for in Spectral Analysis

Bayesian Statistical Methods Components of a Statistical Model Statistical Computation Markov Chain Monte Carlo Eploring the posterior distribution via Monte Carlo prior lams 0 2 4 6 8 00 02 04 06 0 2 4 6 8 10 00 02 04 posterior lams joint posterior with flat prior lams lamb 0 2 4 6 8 15 20 25 David A van Dyk Accounting for in Spectral Analysis

Components of a Statistical Model Statistical Computation Model Fitting: Comple Posterior Distributions Highly non-linear relationship among parameters David A van Dyk Accounting for in Spectral Analysis

Components of a Statistical Model Statistical Computation Model Fitting: Comple Posterior Distributions The classification of certain stars as field or cluster stars can cause multiple modes in the distributions of other parameters David A van Dyk Accounting for in Spectral Analysis

Outline Bayesian Statistical Methods The Calibration Sample The Effect of 1 Bayesian Statistical Methods Components of a Statistical Model Statistical Computation 2 The Calibration Sample The Effect of 3 Pragmatic and Fully Bayesian Solutions Empirical Illustration David A van Dyk Accounting for in Spectral Analysis

The Basic Statistical Model The Calibration Sample The Effect of Computer Models Background Basic Physics Parametric Models Photon Absorption Observed Data (mean=λ) Multi-Scale Models High-Energy Telescope Effective Area (ARF) Blurring (RMF) Optical Telescope Gaussian Measurement Errors Embed physical models into multi-level statistical models Must account for compleities of data generation State of the art computational techniques enable us to fit the resulting highly-structured model David A van Dyk Accounting for in Spectral Analysis

Calibration Products The Calibration Sample The Effect of Analysis is highly dependent on Calibration Products: Effective Area Cuves Energy Redistribution Matrices Eposure Maps Point Spread Functions In this talk we focus on uncertainty in the effective area ACIS S effective area (cm 2 ) 0 200 400 600 800 02 1 10 E [kev] A Chandra effective area Sample Chandra psf s (Karovska et al, ADASS X) EGRET eposure map (area time) David A van Dyk Accounting for in Spectral Analysis

Calibration Sample The Calibration Sample The Effect of Calibration sample of 1000 representative curves 1 Sample ehibits comple variability Requires storing many high dimensional calibration products ACIS S effective area (cm 2 ) 0 200 400 600 800 default subtracted effective area (cm 2 ) 50 0 50 02 1 10 E [kev] 02 1 10 E [kev] 1 Thanks to Jeremy Drake and Pete Ratzlaff David A van Dyk Accounting for in Spectral Analysis

The Calibration Sample The Effect of Generating Calibration Products on the Fly We use Principal Component Analysis to represent uncertainty: A A 0 δ m e j r j v j, j=1 A 0 : default effective area, δ: mean deviation from A 0, r j and v j : first m principle component eigenvalues & vectors, e j : independent standard normal deviations Capture 99% of variability with m = 18 David A van Dyk Accounting for in Spectral Analysis

Checking the PCA Emulator The Calibration Sample The Effect of A rep! A o (cm 2 ) 100 50 0 50 03 1 10 E (kev) 0000 0010 @ 10 kev 500 550 600 A (cm 2 ) 0000 0004 0008 0012 @ 15 kev 600 650 700 750 800 A (cm 2 ) 0000 0005 0010 0015 @ 25 kev 550 600 650 700 A (cm 2 ) David A van Dyk Accounting for in Spectral Analysis

The Simulation Studies The Calibration Sample The Effect of Simulated Spectra Spectra were simulated using an absorbed power law, f (E j ) = αe N Hσ(E j ) E Γ j, accounting for instrumental effects; E j is the energy of bin j Parameters (Γ and N H ) and sample size/eposure times: Effective Area Nominal Counts Spectal Model Default Etreme 10 5 10 4 Harder Softer SIM 1 X X X SIM 2 X X X SIM 3 X X X An absorbed powerlaw with Γ = 2, N H = 10 23 /cm 2 An absorbed powerlaw with Γ = 1, N H = 10 21 /cm 2 David A van Dyk Accounting for in Spectral Analysis

The Calibration Sample The Effect of 30 Most Etreme Effective Areas in Calibration Sample A! A (cm 2 ) 50 o 0 50 03 1 8 E (kev) 15 largest and 15 smallest determined by maimum value David A van Dyk Accounting for in Spectral Analysis

The Calibration Sample The Effect of The Effect of SIMULATION 1 SIMULATION 2 0 5 10 15 22 N H(10 cm "2 ) 90 100 110 0 1 2 3 4 17 18 19 20 21 22 23! 17 18 19 20 21 22 23! 0 20 40 22 N H(10 cm "2 ) 007 010 013 0 50 150 090 095 100 105 110 115! 090 095 100 105 110 115! Columns represent two simulated spectra True parameters are horizontal lines Posterior under default calibration is plotted in black The posterior is highly sensitive to the choice of effective area! 90 95 100 105 N H(10 22 cm "2 ) 008 010 012 014 N H(10 22 cm "2 ) David A van Dyk Accounting for in Spectral Analysis

The Effect of Sample Size The Calibration Sample The Effect of 0 1 2 3 4 5 6 10 4 counts 16 18 20 22 24 Γ 0 5 10 15 10 5 counts 16 18 20 22 24 Γ 00 05 10 15 85 90 95 100 105 110 115 N H(10 22 cm 2 ) 0 1 2 3 4 5 85 90 95 100 105 110 115 N H(10 22 cm 2 ) The effect of is more pronounced with larger sample sizes David A van Dyk Accounting for in Spectral Analysis

Outline Bayesian Statistical Methods Pragmatic and Fully Bayesian Solutions Empirical Illustration 1 Bayesian Statistical Methods Components of a Statistical Model Statistical Computation 2 The Calibration Sample The Effect of 3 Pragmatic and Fully Bayesian Solutions Empirical Illustration David A van Dyk Accounting for in Spectral Analysis

Pragmatic and Fully Bayesian Solutions Empirical Illustration Accounting for We consider inference under: A PRAGMATIC BAYESIAN TARGET: π 0 (A, θ) = p(a)p(θ A, Y ) THE FULLY BAYESIAN POSTERIOR: π(a, θ) = p(a Y )p(θ A, Y ) Here: p(θ A, Y ): PyBLoCXS posterior with known effective area, A p(a Y ): The posterior distribution for A p(a): The prior distribution for A Should we let the current data inform inference for calibration products? David A van Dyk Accounting for in Spectral Analysis

Two Possible Target Distributions Pragmatic and Fully Bayesian Solutions Empirical Illustration We consider inference under: A PRAGMATIC BAYESIAN TARGET: π 0 (A, θ) = p(a)p(θ A, Y ) THE FULLY BAYESIAN POSTERIOR: π(a, θ) = p(a Y )p(θ A, Y ) Concerns: Statistical Fully Bayesian target is correct Cultural Astronomers have concerns about letting the current data influence calibration products Computational Both targets pose challenges, but pragmatic Bayesian target is easier to sample Practical How different are p(a) and p(a Y )? David A van Dyk Accounting for in Spectral Analysis

0 5 10 15 Bayesian Statistical Methods Pragmatic and Fully Bayesian Solutions Empirical Illustration Using Multiple Imputation to Account for Uncertainty N H (10 22 cm "2 ) 90 100 110 0 1 2 3 4 17 18 19 20 21 22 23! 17 18 19 20 21 22 23! 90 95 100 105 N H (10 22 cm "2 ) 0 20 40 N H (10 22 cm "2 ) 007 010 013 MI is a simple, but approimate, method: 0 50 150 090 095 100 105 110 115! 090 095 100 105 110 115! Fit the model M 1000 times, each with random effective area curve from the calibration sample Estimate parameters by averaging the M fitted values 008 010 012 014 N H (10 22 cm "2 ) Estimate uncertainty by combining the within and between fit errors David A van Dyk Accounting for in Spectral Analysis

Densi 0 5 10 Pragmatic and Fully Bayesian Solutions Empirical Illustration The Multiple Imputation 17 18 19 Combining 20 21 22 23 Rules B = 1 M 1 N H (10 22 cm "2 ) 90 100 110 ˆθ = 1 M 0 1 2 3 4 17 18 19 20 21 22 23! M ˆθ m, m=1 90 95 100 105 N H (10 22 cm "2 )! W = 1 M Densi 0 20 N H (10 22 cm "2 ) 007 010 013 M Var(ˆθ m ), m=1 M (ˆθ m ˆθ)(ˆθ m ˆθ), T = W m=1 0 50 150 ( 1 1 M 090 095 100! 090 095 100! ) B The total variance combines the variances within and between the M analyses 008 010 N H (10 22 c David A van Dyk Accounting for in Spectral Analysis

Pragmatic and Fully Bayesian Solutions Empirical Illustration Accounting for with MCMC When using MCMC in a Bayesian setting we can: Sample a different effective area from the calibration sample at each iteration according to: Pragmatic Bayes: p(a) Fully Bayes: p(a Y ) Computational challenges arise in both cases We focus on comparing the results of the two methods A Simulation Sampled 10 5 counts from a power law spectrum: E 2 A true is 15σ from the center of the calibration sample David A van Dyk Accounting for in Spectral Analysis

Sampling From the Full Posterior Pragmatic and Fully Bayesian Solutions Empirical Illustration Default Effective Area Pragmatic Bayes Fully Bayes θ 2 196 200 204 090 095 100 105 θ 2 196 200 204 090 095 100 105 θ 2 196 200 204 090 095 100 105 θ 1 Spectral Model (purple bullet = truth): θ 1 θ 1 f (E j ) = θ 1 e θ 1σ(E j ) E θ 2 j Pragmatic Bayes is clearly better than current practice, but a Fully Bayesian Method is the ultimate goal David A van Dyk Accounting for in Spectral Analysis

Bayesian Statistical Methods Pragmatic and Fully Bayesian Solutions Empirical Illustration The Effect in an Analysis of a Quasar Spectrum 050 055 060 065 110 120 130 140 Default Effective Area 1000θ 1 θ 2 050 055 060 065 110 120 130 140 Pragmatic Bayes 1000θ 1 θ 2 050 055 060 065 110 120 130 140 Fully Bayes 1000θ 1 θ 2 006 008 010 012 014 110 120 130 140 Default Effective Area θ 3 θ 2 006 008 010 012 014 110 120 130 140 Pragmatic Bayes θ 3 θ 2 006 008 010 012 014 110 120 130 140 Fully Bayes θ 3 θ 2 (obs ID = 866) David A van Dyk Accounting for in Spectral Analysis

Pragmatic and Fully Bayesian Solutions Empirical Illustration Results: 95% Intervals Standardized by Standard Fit σ fi(γ) 002 005 010 020 050 377 836 866 1602 3055 3056 3097 3098 3100 3101 3102 3103 3104 3105 3106 3107 σ fi(γ) 002 005 010 020 050 377 836 866 1602 3055 3056 3097 3098 3100 3101 3102 3103 3104 3105 3106 3107 4 2 0 2 4 µ^prag(γ) 4 2 0 2 4 µ^full(γ) For large spectra calibration uncertainty swamps statistical error In large spectra fully Bayes identifies A and shifts interval David A van Dyk Accounting for in Spectral Analysis

Thanks Bayesian Statistical Methods Pragmatic and Fully Bayesian Solutions Empirical Illustration Collaborative work with: Vinay Kashyap Jin Xu Alanna Connors Hyunsook Lee Aneta Siegminowska California-Harvard Astro-Statistics Collaboration If you want more details Lee, H, Kashyap, V, van Dyk, D, Connors, A, Drake, J, Izem, R, Min, S, Park, T, Ratzlaff, P, Siemiginowska, A, and Zezas, A Accounting for Calibration Uncertainties in X-ray Analysis: Effective Area in Spectral Fitting The Astrophysical Journal, 731, 126 144, 2011 David A van Dyk Accounting for in Spectral Analysis