Fully Bayesian Analysis of Calibration Uncertainty in High Energy Spectral Analysis


Department of Statistics, UCI. February 26, 2013

Outline: Model Building, Principal Component Analysis, Three Inferential Models, Simulation, Quasar Analysis, Doubly-intractable Distribution, Other Calibration Uncertainty

Outline: High-Energy Astrophysics, Spectral Analysis, Calibration Products, Scientific Goals

High-Energy Astrophysics provides insight into the high-energy regions of the Universe. The Chandra X-ray Observatory is designed to observe X-rays from these regions. X-ray detectors typically count a small number of photons in each of a large number of pixels. Spectral analysis aims to characterize the parameterized relationship between photon counts and energy.

Quasar: the Chandra X-ray image of the quasar PKS 1127-145, a highly luminous source of X-rays and visible light about 10 billion light years from Earth.

An Example of One Dataset. TITLE = EXTENDED EMISSION AROUND A GIGAHERTZ PEAKED RADIO SOURCE; DATE = 2006-12-29T16:10:48

Calibration Uncertainty. The effective area records sensitivity as a function of energy; the energy redistribution matrix can vary with energy and location; point spread functions can vary with energy and location. [Figure: ACIS-S effective area (cm²) versus E (keV).]

Incorporating Calibration Uncertainty. Calibration uncertainty has generally been ignored in astronomical analysis, and no robust, principled method is available. Our goal is to incorporate the uncertainty with Bayesian methods. In this talk we focus on uncertainty in the effective area.

Problem Description. The true effective area curve cannot be observed. The lack of a parameterized form for the density of the effective area curve complicates model fitting. Simple MCMC is quite expensive because of the complexity of the astronomical model.

Generating a Calibration Sample. Drake et al. (2006) suggest generating a calibration sample of effective area curves to represent the uncertainty. The plot shows the coverage of a sample of 1000 effective area curves, with the default curve ($A_0$) as a black line. Calibration sample: $\{A_1, A_2, A_3, \ldots, A_L\}$.

Three Main Steps. Use Principal Component Analysis to parameterize the effective area curve; build the model, combining the source model with calibration uncertainty; fit three inferential models.

A simplified model of the telescope: $E[Y(E_i)] = A(E_i)\,S(E_i)$, with $Y(E_i) \sim \mathrm{Poisson}(E[Y(E_i)])$, where $Y(E_i)$ is the observed photon count in energy bin $E_i$; $S(E_i)$ is the true source model, which we set to $S(E_i) = \exp(-n_H\,\alpha(E_i))\,a\,E_i^{-\Gamma} + b$; $A(E_i)$ is the effective area curve; $\alpha(E_i)$ is the photo-electric cross-section; $\theta = \{n_H, a, \Gamma, b\}$ are the source parameters; and $b$ is the background intensity.
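To make the model concrete, here is a minimal Python sketch that simulates counts from it; the effective area curve, cross-section, and energy binning are toy placeholders, not Chandra calibration products.

```python
# Toy simulation of Y(E_i) ~ Poisson(A(E_i) * S(E_i)) with an absorbed power law.
import numpy as np

rng = np.random.default_rng(42)
E = np.linspace(0.3, 7.0, 1024)              # energy grid in keV (hypothetical binning)

def source_model(E, nH, a, Gamma, b, alpha):
    """S(E) = exp(-nH * alpha(E)) * a * E**(-Gamma) + b."""
    return np.exp(-nH * alpha(E)) * a * E**(-Gamma) + b

alpha = lambda E: E**-3.0                    # toy photo-electric cross-section
A0 = 500.0 * np.exp(-0.5 * np.log(E)**2)     # toy default effective area (cm^2)

S = source_model(E, nH=0.1, a=1e-3, Gamma=2.0, b=1e-6, alpha=alpha)
Y = rng.poisson(A0 * S)                      # observed counts per energy bin
```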

Use PCA to represent the effective area curve: $A = A_0 + \bar\delta + \sum_{j=1}^{m} e_j r_j v_j$, where $A_0$ is the default effective area, $\bar\delta$ the mean deviation from $A_0$, $r_j$ and $v_j$ the first $m$ principal component eigenvalues and eigenvectors, and $e_j$ independent standard normal deviates. This captures 95% of the uncertainty with $m = 6$-$9$ (Lee et al. 2011, ApJ).
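A direct implementation of this representation is short; the sketch below assumes the PCA summary ($A_0$, $\bar\delta$, $r$, $v$) has already been computed from the calibration sample.

```python
# Draw a new effective area curve from A = A0 + delta_bar + sum_j e_j * r_j * v_j.
import numpy as np

def sample_effective_area(A0, delta_bar, r, V, rng):
    """A0, delta_bar: (n_bins,); r: (m,) eigenvalues; V: (m, n_bins) eigenvectors."""
    e = rng.standard_normal(len(r))          # independent standard normal deviates
    return A0 + delta_bar + (e * r) @ V
```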

Three Inferential Models:
1. Fixed Effective Area Model (standard approach).
2. Pragmatic Bayesian Model: the original scheme (Lee et al. 2011, ApJ) and an efficient scheme.
3. Fully Bayesian Model: a Gibbs sampling scheme and an importance sampling scheme.

Model One: Fixed Effective Area (standard approach). Model: $p(\theta \mid Y, A_0)$. We assume $A = A_0$, where $A_0$ is the default effective area curve, which may not be the true one. This model does not incorporate calibration uncertainty; it is widely used because of its simplicity, but the estimates may be biased and the error bars underestimated. Only one sampling step is involved: $p(\theta \mid Y, A_0) \propto L(Y \mid \theta, A_0)\,\pi(\theta)$. A mix of Metropolis and Metropolis-Hastings steps is used in the sampling, as sketched below.
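As a rough illustration of the sampling step (the talk uses a mix of Metropolis and Metropolis-Hastings; this sketch shows only a plain random-walk Metropolis, with `log_post` standing in for $\log L(Y \mid \theta, A_0) + \log\pi(\theta)$):

```python
# Minimal random-walk Metropolis sampler for an unnormalized log posterior.
import numpy as np

def metropolis(log_post, theta0, step, n_iter, rng):
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    draws = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.shape)  # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:                # accept/reject
            theta, lp = prop, lp_prop
        draws.append(theta.copy())
    return np.array(draws)
```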

Model Two: Pragmatic Bayesian (Lee et al. 2011, ApJ). Model: $P_{\mathrm{prag}}(\theta, A \mid Y) = p(\theta \mid A, Y)\,\pi(A)$, a doubly-intractable distribution! The main purpose is to reduce the complexity of sampling. Step one: sample $A$ from $\pi(A)$. Step two: sample $\theta$ from $p(\theta \mid Y, A) \propto L(Y \mid \theta, A)\,\pi(\theta)$. A mix of Metropolis and Metropolis-Hastings steps is used in step two.

Model Two: Efficient Pragmatic Bayesian. After each draw of $A_i$ ($i$ from 1 to $n$) from $\pi(A)$, we must find the best Metropolis-Hastings proposal for $p(\theta \mid Y, A)$, which takes a long and relatively constant time, say $T_1$ (maximum likelihood fitting is involved). Once the proposal distribution is fixed given $A_i$, each draw of $\theta$ from $p(\theta \mid Y, A)$ takes a much shorter time, say $T_2$ ($T_1 > T_2$). To obtain the most effective samples of $\theta$, we draw $m$ values of $\theta$ per $A_i$, say $\theta_{ij}$ ($j$ from 1 to $m$).

Example. [Figure: two trace plots of $n_H$ (0.08-0.13) against iteration index (0-800).]

Model Two: Efficient Pragmatic Bayesian. The problem then reduces to an optimization: minimize $\mathrm{Var}\big(\tfrac{1}{n}\sum_i \tfrac{1}{m}\sum_j \theta_{ij}\big)$ subject to $T = nT_1 + nmT_2$, where $T$ is the total time budget; when $m = 1$ the scheme reduces to the original Pragmatic Bayesian of Lee et al. (2011, ApJ). There is a simple analytical solution: $m = \sqrt{WT_1/(BT_2)}$ and $n = BT/(BT_1 + \sqrt{BWT_1T_2})$, where $B = \sigma^2_{\theta} - \sigma^2_{\theta \mid A}$, $W = \sigma^2_{\theta \mid A}$, and we assume the $\theta$'s given $A$ are independent of each other. Notice that $m$ does not depend on $T$.
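For completeness, here is a short derivation of the stated optimum under the independence assumption above:

$$\mathrm{Var}\Big(\frac{1}{nm}\sum_{i,j}\theta_{ij}\Big) = \frac{1}{n}\Big(B + \frac{W}{m}\Big), \qquad n = \frac{T}{T_1 + mT_2},$$

so the objective as a function of $m$ alone is $f(m) = \tfrac{1}{T}\big(B + \tfrac{W}{m}\big)(T_1 + mT_2)$. Setting $f'(m) = BT_2 - WT_1/m^2 = 0$ gives $m = \sqrt{WT_1/(BT_2)}$, and substituting back yields $n = BT/(BT_1 + \sqrt{BWT_1T_2})$. Since $T$ enters only through $n$, $m$ does not depend on $T$.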

Model Two: Efficient Pragmatic Bayesian. The assumption that the $\theta$'s given $A$ are independent can be achieved by thinning the iterations within each $A$ by a large factor. If instead the $\theta$'s given one $A$ follow an AR(1) process with neighbor correlation $\rho$, then $\mathrm{Var}(\bar Y) = \tfrac{1}{n}\big(B + \tfrac{W}{m}\cdot\tfrac{1+\rho}{1-\rho}\big)$, and we still obtain the same optimization solution as above after replacing $W$ by $W\,\tfrac{1+\rho}{1-\rho}$.

Model Two: Efficient Pragmatic Bayesian Sampling. Two ways to draw $N$ $\theta$'s:
1. Pilot scheme: run a pilot of $(n_1, m_1)$ draws, estimate $(\hat B, \hat W)$, compute $\hat m$, then set $\hat n = (N - n_1 m_1)/\hat m$.
2. Adaptive scheme (sketched below): while $n_0 < N$ do { update $B$ and $W$; calculate $m$; sample $A$; sample $m$ $\theta$'s from $p(\theta \mid Y, A)$; $n_0 = n_0 + m$; }.
The second, adaptive scheme hasn't been verified yet!
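The adaptive scheme, written as runnable-shaped Python; `sample_A`, `sample_theta`, and `estimate_B_W` are placeholders for the samplers and variance-component estimators described above, not code from the talk.

```python
# Shape of the (unverified) adaptive Efficient Pragmatic Bayesian loop.
import numpy as np

def adaptive_epb(N, T1, T2, sample_A, sample_theta, estimate_B_W, rng):
    draws, n0 = [], 0
    while n0 < N:
        B, W = estimate_B_W(draws)                       # update variance components
        m = max(1, int(round(np.sqrt(W * T1 / (B * T2)))))  # theta draws per A
        A = sample_A(rng)                                # A ~ pi(A)
        draws.extend(sample_theta(A, m, rng))            # m draws from p(theta | Y, A)
        n0 += m
    return draws
```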

Model Two: Efficient Pragmatic Bayesian Sampling. Here are two chains for quasar dataset 3104, from the original Pragmatic Bayesian and the Efficient Pragmatic Bayesian samplers. [Figure: trace plots of abs.nh (0.02-0.08) over 3000 iterations, original Pragmatic Bayesian (left) versus Efficient Pragmatic Bayesian (right).]

Model Two: Efficient Pragmatic Bayesian Sampling. QQ plot of these two chains. [Figure: quantiles of the Efficient Pragmatic Bayesian chain against the original Pragmatic Bayesian chain.]

Model Two: Efficient Pragmatic Bayesian Sampling for dataset 3104. Before sampling, $T_1$, $T_2$, $B$ and $W$ are estimated: $T_1 = 7.1$ sec, $T_2 = 0.045$ sec, $B = 4.01\text{e-}5$, $W = 2.40\text{e-}5$. The optimal allocation to draw 3000 $\theta$'s is then $m = 10$, $n = 300$.

                                µ̂(abs.nh)   σ̂²(abs.nh)   σ̂²(µ̂)      T
  Pragmatic Bayesian            0.0447       6.36e-05     2.12e-08    5.9 hrs
  Efficient Pragmatic Bayesian  0.0443       6.10e-05     1.13e-07    0.6 hrs
  Ratio (PB / EPB)                                        0.187       ~10

EPB almost doubles the effective sample size per unit time (the variance of $\hat\mu$ grows by a factor of $1/0.187 \approx 5.3$ while the run time shrinks by a factor of about 10).
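These numbers can be reproduced directly from the allocation formulas:

```python
# Reproducing the reported allocation for dataset 3104 from the formulas above.
import math

T1, T2 = 7.1, 0.045        # sec: proposal tuning per A, and per theta draw
B, W = 4.01e-5, 2.40e-5    # between-A and within-A variance components
N = 3000                   # total theta draws wanted

m = math.sqrt(W * T1 / (B * T2))   # optimal theta draws per A: ~9.7, rounded to 10
n = N / round(m)                   # effective-area draws: 300
print(round(m), n)                 # prints: 10 300.0
```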

Model Three: Fully Bayesian using Gibbs sampling. This uses the correct Bayesian approach: $P_{\mathrm{full}}(\theta, A \mid Y) = p(\theta \mid A, Y)\,p(A \mid Y)$. The model allows the current data to influence the calibration products. Step one: sample $A$ from $p(A \mid Y, \theta) \propto L(Y \mid \theta, A)\,\pi(A)$. Step two: sample $\theta$ from $p(\theta \mid Y, A) \propto L(Y \mid \theta, A)\,\pi(\theta)$. A mix of Metropolis and Metropolis-Hastings steps is used in both steps. This is the most difficult approach to sample.

Importance sampling for Fully Bayesian. Fully Bayesian sampling via Gibbs involves many choices of proposal distributions, and these choices strongly influence the performance of the chains. Here we introduce importance sampling for the Fully Bayesian model, which takes advantage of the draws from the Pragmatic Bayesian model (whose posterior has larger variance). [Figure: scatter plots of p1.gamma against abs1.nh under the Fixed ARF, Pragmatic Bayesian, and Fully Bayesian models.]

Importance sampling for Fully Bayesian. Steps:
1. Get the draws from the Pragmatic Bayesian method.
2. Approximate $P_{\mathrm{prag}}(\theta \mid A, Y)$ by a multivariate normal distribution, called $P^{\mathrm{new}}_{\mathrm{prag}}(\theta \mid A, Y)$; $P_{\mathrm{prag}}(\theta \mid A, Y)$ itself cannot be evaluated because of the doubly-intractable distribution. The 18 parameters of $A$ are all independent standard normal.
3. Get new draws by sampling $\pi(A)$ and $P^{\mathrm{new}}_{\mathrm{prag}}(\theta \mid A, Y)$.
4. Calculate the ratio $r = P_{\mathrm{full}}(A, \theta \mid Y)/P^{\mathrm{new}}_{\mathrm{prag}}(A, \theta \mid Y)$.
5. Use the ratios to resample, yielding Fully Bayesian draws (a sketch follows).
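A minimal sketch of steps 3-5; `log_full` and `log_prag_new` are assumed to return the two unnormalized log densities.

```python
# Importance resampling: weight Pragmatic-style draws toward the Fully Bayesian
# posterior, then resample in proportion to the weights.
import numpy as np

def importance_resample(A_draws, theta_draws, log_full, log_prag_new, rng):
    logw = np.array([log_full(A, t) - log_prag_new(A, t)
                     for A, t in zip(A_draws, theta_draws)])
    w = np.exp(logw - logw.max())            # stabilize before normalizing
    w /= w.sum()
    idx = rng.choice(len(w), size=len(w), replace=True, p=w)
    return [theta_draws[i] for i in idx]
```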

Importance sampling for Fully Bayesian. The great benefit of the new scheme is that everything works automatically, avoiding the trouble of choosing good proposal distributions in the Gibbs-sampling version of the Fully Bayesian model. The disadvantage is that every time we fit the Fully Bayesian model we must run the Pragmatic Bayesian model first; in practice, astronomers usually want to fit all three models and compare them anyway. The results of Fully Bayesian inference via Gibbs sampling and via importance sampling are usually identical.

Importance sampling for Fully Bayesian. Here are two scatter plots, from Fully Bayesian using Gibbs sampling (black) and importance sampling (red), for quasar dataset 3105. [Figure: $\Gamma$ (1.45-1.60) against $n_H$ (0.000-0.020).]

Eight simulated data sets. The first four were simulated without background contamination using the XSPEC model wabs*powerlaw, the nominal default effective area $A_0$ from the calibration sample of Drake et al. (2006), and a default RMF.
Simulation 1: $\Gamma = 2$, $N_H = 10^{23}\,\mathrm{cm}^{-2}$, $10^5$ counts;
Simulation 2: $\Gamma = 1$, $N_H = 10^{21}\,\mathrm{cm}^{-2}$, $10^5$ counts;
Simulation 3: $\Gamma = 2$, $N_H = 10^{23}\,\mathrm{cm}^{-2}$, $10^4$ counts;
Simulation 4: $\Gamma = 1$, $N_H = 10^{21}\,\mathrm{cm}^{-2}$, $10^4$ counts.
The other four data sets (Simulations 5-8) were generated using an extreme instance of an effective area.

Results for Simulation 2. [Figure: posterior densities of $\Gamma$ (0.8-1.3) under fix ARF, fully Bayesian, and pragmatic Bayesian; SIM 2: $N_H = 10^{21}$, $\Gamma = 1$, $N = 10^5$.]

Results for Simulation 3. [Figure: posterior densities of $\Gamma$ (1.6-2.8); SIM 3: $N_H = 10^{23}$, $\Gamma = 2$, $N = 10^4$.]

Results for Simulation 6. [Figure: posterior densities of $\Gamma$ (0.8-1.3); SIM 6: $N_H = 10^{21}$, $\Gamma = 1$, $N = 10^5$.]

Results for Simulation 7. [Figure: posterior densities of $\Gamma$ (1.6-2.8); SIM 7: $N_H = 10^{23}$, $\Gamma = 2$, $N = 10^4$.]

Quasar results. 16 quasar data sets were fit with the three models: 377, 836, 866, 1602, 3055, 3056, 3097, 3098, 3100, 3101, 3102, 3103, 3104, 3105, 3106, 3107. The most interesting finding for the Fully Bayesian model is a shift in the fitted parameters, in addition to the change in standard errors. Comparisons of both means and standard errors among the three models are shown below.

Mean: fix vs. prag. The Fixed Effective Area Curve model gives almost the same parameter fits as the Pragmatic Bayesian model. [Figure: $\mu_{\mathrm{prag}}(\Gamma)$ against $\mu_{\mathrm{fix}}(\Gamma)$ for the 16 quasar datasets.]

Mean: fix vs. full. The Fully Bayesian model shifts the parameter fits, which means the data themselves influence the choice of effective area curve. [Figure: $\mu_{\mathrm{full}}(\Gamma)$ against $\mu_{\mathrm{fix}}(\Gamma)$ for the 16 quasar datasets.]

SD: fix vs. prag. The Pragmatic Bayesian model has larger parameter standard deviations than the Fixed Effective Area Curve model, especially for large datasets. [Figure: $\sigma_{\mathrm{prag}}(\Gamma)$ against $\sigma_{\mathrm{fix}}(\Gamma)$ for the 16 quasar datasets, log scale.]

SD: fix vs. full. The Fully Bayesian model also usually has larger parameter standard deviations than the Fixed Effective Area Curve model. [Figure: $\sigma_{\mathrm{full}}(\Gamma)$ against $\sigma_{\mathrm{fix}}(\Gamma)$ for the 16 quasar datasets, log scale.]

SD: prag vs. full. Generally, the parameter standard deviation from the Fully Bayesian model lies between those of the Pragmatic Bayesian and Fixed Effective Area Curve models. [Figure: $\sigma_{\mathrm{full}}(\Gamma)$ against $\sigma_{\mathrm{prag}}(\Gamma)$ for the 16 quasar datasets, log scale.]

More plots. $\hat\mu_{\mathrm{prag}}(\Gamma) = \dfrac{\mu_{\mathrm{prag}}(\Gamma) - \mu_{\mathrm{fix}}(\Gamma)}{\sigma_{\mathrm{fix}}(\Gamma)}$; the lines cover $\pm 2$ sd. [Figure: $\hat\mu_{\mathrm{prag}}(\Gamma)$ against $\sigma_{\mathrm{fix}}(\Gamma)$ for the 16 quasar datasets.]

More plots. $\hat\mu_{\mathrm{full}}(\Gamma) = \dfrac{\mu_{\mathrm{full}}(\Gamma) - \mu_{\mathrm{fix}}(\Gamma)}{\sigma_{\mathrm{fix}}(\Gamma)}$; the lines cover $\pm 2$ sd. [Figure: $\hat\mu_{\mathrm{full}}(\Gamma)$ against $\sigma_{\mathrm{fix}}(\Gamma)$ for the 16 quasar datasets.]

Doubly-intractable Distribution. Murray et al. (2012) define a doubly-intractable distribution as $p(\theta \mid y) = \dfrac{f(y;\theta)\,\pi(\theta)}{Z(\theta)\,p(y)}$, where $Z(\theta) = \int f(y;\theta)\,dy$ is an unknown constant that cannot be computed. Doubly-intractable distributions are widespread, and the best-known method for them, the Auxiliary Variable Method, was introduced by Møller et al. (2006). One of the most popular models involving a doubly-intractable distribution is the Ising model.

Doubly-intractable Distribution. Our Pragmatic Bayesian model involves $P_{\mathrm{prag}}(\theta, A \mid Y) = p(\theta \mid A, Y)\,\pi(A) = \dfrac{L(Y \mid \theta, A)\,\pi(\theta)}{p(Y \mid A)}\,\pi(A)$, where $p(Y \mid A)$ is an unknown normalizing constant. Interestingly, unlike other doubly-intractable distributions, the Pragmatic Bayesian model can avoid this constant easily and yield posterior draws. However, when we use these draws to help sample the Fully Bayesian model by importance sampling, the ratio becomes $r = p(Y \mid A)/p(Y)$: the constant can no longer be avoided! That is why we re-estimate the density of the draws with a multivariate normal and use new draws from that multivariate Gaussian for the importance sampling, avoiding the constant once more.
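Written out, the cancellation that works, and the one that fails, is:

$$r = \frac{P_{\mathrm{full}}(A, \theta \mid Y)}{P_{\mathrm{prag}}(A, \theta \mid Y)} = \frac{p(\theta \mid A, Y)\,p(A \mid Y)}{p(\theta \mid A, Y)\,\pi(A)} = \frac{p(A \mid Y)}{\pi(A)} = \frac{p(Y \mid A)}{p(Y)},$$

which still contains the intractable $p(Y \mid A)$. With the multivariate normal approximation, $P^{\mathrm{new}}_{\mathrm{prag}}(A, \theta \mid Y) = \pi(A)\,P^{\mathrm{new}}_{\mathrm{prag}}(\theta \mid A, Y)$ is directly computable, so $r = \dfrac{L(Y \mid \theta, A)\,\pi(\theta)}{p(Y)\,P^{\mathrm{new}}_{\mathrm{prag}}(\theta \mid A, Y)}$ involves only the single constant $p(Y)$, which cancels in self-normalized importance sampling.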

Other Calibration Uncertainty. For the effective area curve, Lee et al. first generate 1000 curve samples and then use PCA to parameterize the effective area curve.

Other Calibration Uncertainty. PCA has unique advantages: it is easy to parameterize vector samples, it greatly reduces dimension, and it is easy to sample from afterwards. However, there is no principled way to use PCA to parameterize matrix samples, and the other calibration products, the RMF and the PSF, are both matrices. Our goal is to parameterize these matrix samples, define a probability density over the matrices, and cleanly sample new matrices.

Other Calibration Uncertainty. Some possible approaches: Diffusion maps map the data into a coordinate system that efficiently reveals its geometric structure, but because of the nonlinear transformation, the map is not reversible. Delaigle et al. (2010) introduced a way to define a probability density for a distribution of random functions, but how to extend it to two-dimensional space remains open. Wavelet techniques provide a good way to extract information from two-dimensional data; moreover, complementary wavelets allow us to recover the original information with minimal loss.

Other Calibration Uncertainty. Here is our potential approach (a sketch follows):
1. Use wavelets to analyze all the matrix samples.
2. Use PCA to summarize the wavelet parameters from step one.
3. Sample independent standard normal deviates to construct new wavelet parameters.
4. Use the new wavelet parameters to construct a new matrix sample.
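A sketch of this pipeline, assuming the PyWavelets and scikit-learn libraries; the wavelet family, decomposition level, and number of components are placeholders, not choices from the talk.

```python
# Wavelet-then-PCA parameterization of matrix samples, and sampling a new matrix.
import numpy as np
import pywt
from sklearn.decomposition import PCA

def fit_matrix_prior(samples, wavelet="haar", level=2, m=6):
    """samples: (L, nrow, ncol) array of calibration matrices."""
    coeff_rows, slices = [], None
    for M in samples:
        arr, slices = pywt.coeffs_to_array(pywt.wavedec2(M, wavelet, level=level))
        coeff_rows.append(arr.ravel())       # flatten wavelet coefficients
    shape = arr.shape
    pca = PCA(n_components=m).fit(np.array(coeff_rows))
    return pca, slices, shape

def sample_matrix(pca, slices, shape, wavelet="haar", rng=np.random.default_rng()):
    # Scores drawn as N(0, explained variance), matching the PCA score spread.
    e = rng.standard_normal(pca.n_components_) * np.sqrt(pca.explained_variance_)
    arr = pca.inverse_transform(e[None, :])[0].reshape(shape)
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs, wavelet)    # back to a matrix sample
```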

Citations:
Lee, Hyunsook, et al. "Accounting for calibration uncertainties in X-ray analysis: effective areas in spectral fitting." The Astrophysical Journal 731.2 (2011): 126.
Murray, Iain, Zoubin Ghahramani, and David MacKay. "MCMC for doubly-intractable distributions." arXiv preprint arXiv:1206.6848 (2012).
Delaigle, Aurore, and Peter Hall. "Defining probability density for a distribution of random functions." The Annals of Statistics 38.2 (2010): 1171-1193.
Lee, Ann B., and Larry Wasserman. "Spectral connectivity analysis." Journal of the American Statistical Association 105.491 (2010): 1241-1255.

Dataset 1878. Model: xsphabs.abs1*(xsapec.kt1+xsapec.kt2), with kt2.abundanc set equal to kt1.abundanc. Six source parameters: abs1.nh, kt1.kt, kt1.abundanc, kt1.norm, kt2.kt, kt2.norm. Sherpa's fit() with the Nelder-Mead method fails to find the MLE: it gets stuck at the boundaries (abs1.nh = 0 and kt1.abundanc = 5), and there are multiple modes.

Fixed ARF. The results are highly sensitive to starting values: although the chains might seem to converge, they are just stuck in a local mode. [Figure: scatter plot of kt1.kt (0.1-0.6) against abs1.nh (0.00-0.25).]

Fully Bayesian. The same happens with the Fully Bayesian model: once it is stuck in a mode, it seems impossible to accept a new ARF. [Figure: trace plot of abs1.nh (0.010-0.035) over 3000 iterations.]

Redistribution Matrix File (RMF). The matrix contains the probability that an incoming photon of energy E will be detected in output detector channel I. The matrix has dimension 1078×1024, but the data are stored in a compressed format. For example, rmf0001:
n_grp = UInt64[1078] stores the number of non-zero groups in each row;
f_chan = UInt32[1395] stores the starting point of each group;
n_chan = UInt32[1395] stores the number of elements in each group;
matrix = Float64[384450] stores all the non-zero values.

Redistribution Matrix File (RMF). The sum of each row equals one; consistently, sum(n_grp) = 1395, sum(n_chan) = 384450, and sum(matrix) = 1078.
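The compressed format can be expanded into the dense matrix as follows; the array names follow the slide, while treating f_chan as 0-based is an assumption to verify against the FITS conventions of the actual file.

```python
# Expand the compressed RMF (n_grp, f_chan, n_chan, matrix) into a dense matrix.
import numpy as np

def expand_rmf(n_grp, f_chan, n_chan, matrix, shape=(1078, 1024)):
    dense = np.zeros(shape)
    g = v = 0                                  # cursors into group/value arrays
    for row in range(shape[0]):
        for _ in range(n_grp[row]):            # non-zero groups in this row
            start, width = int(f_chan[g]), int(n_chan[g])
            dense[row, start:start + width] = matrix[v:v + width]
            g += 1
            v += width
    return dense                               # each row should sum to ~1
```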

[Figure: log(rmf0001 + 1e-6), the RMF displayed on a log scale.]

[Figure: log(rmf0002 + 1e-6).]

Preparing for PCA. Raw data dimension: (1000, 1078×1024); discarding the bottom-right zeros gives (1000, 445380). Using the Python DMP NIPALS (Nonlinear Iterative Partial Least Squares) algorithm: (1000, 445380) raises a MemoryError; (100, 445380) raises a MemoryError; (1000, 100000) raises a MemoryError; (1000, 10000) runs in 1 hour and yields 5 principal components with variances (3.6e-02, 8.8e-03, 1.02e-04, 3.05e-05, 9.74e-06).
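One way around these MemoryErrors (a substitute for NIPALS, not the talk's approach) is scikit-learn's IncrementalPCA, which fits in batches and never holds the full (1000, 445380) matrix in memory.

```python
# Batched PCA over a stream of coefficient blocks to avoid MemoryError.
from sklearn.decomposition import IncrementalPCA

def incremental_pca(sample_iter, n_components=5):
    """sample_iter yields (batch_size, n_features) blocks, batch_size >= n_components."""
    ipca = IncrementalPCA(n_components=n_components)
    for batch in sample_iter:
        ipca.partial_fit(batch)                # updates components one block at a time
    return ipca
```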

Simulating RMFs.