Fitting Narrow Emission Lines in X-ray Spectra
1 Fitting Narrow Emission Lines in X-ray Spectra. Taeyoung Park, Department of Statistics, University of Pittsburgh. October 11, 2007.
2 Outline of Presentation. This talk has three components: A. Motivation: looking for narrow emission lines in high-energy spectra, and spectral analysis of the high redshift quasar, PG. B. Building and fitting highly structured spectral models. C. Model-based statistical inference for line locations, and posterior predictive checks for a model component.
3 Astronomical Source. [Figure: an astronomical source.]
4 Chandra X-ray Observatory. [Figure: the Chandra X-ray Observatory.]
5 Spectrum of an Astronomical Source. An observed spectrum consists of photon counts in a number of energy bins, which we model with a non-homogeneous Poisson process. The photons originate from two components: (1) the continuum and (2) several emission lines. [Figure: observed and expected photon counts per energy bin.]
6 Typical Questions of Interest. Observed spectrum of the high redshift quasar, PG: 1. Estimate the parameters that describe the spectrum. 2. Quantify the uncertainty of the estimates. 3. Test the evidence for the emission lines. [Figure: observed photon counts per energy bin.]
7 The Basic Source Model. A simplified Poisson process for the scientific model:

X_i ~ Poisson( Λ_i = α·exp(β·i) + λ·π_i(µ, σ²) ),

where α·exp(β·i) is the continuum, λ·π_i(µ, σ²) is the emission line, and i = 1, 2, ..., n indexes the energy bins. We construct a narrow emission line model using the delta function so that (1) the emission line is contained entirely in one energy bin, but (2) we do not know which bin. That is, {π_i} can be parameterized in terms of a single location parameter µ, with σ² set equal to zero.
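The basic source model above can be sketched as a short simulation. This is illustrative only: the function and parameter names are hypothetical, and the continuum is written as α·exp(β·i) exactly as on the slide (β would typically be negative for a decaying continuum).

```python
import numpy as np

def simulate_spectrum(alpha, beta, lam, mu_bin, n_bins, rng):
    """Simulate X_i ~ Poisson(Lambda_i) for the basic source model:
    a continuum alpha*exp(beta*i) plus a delta-function emission line
    whose entire expected intensity lam falls in the single bin mu_bin.
    Names are illustrative; beta would typically be negative so the
    continuum decays with energy."""
    i = np.arange(1, n_bins + 1)              # energy-bin index i = 1, ..., n
    continuum = alpha * np.exp(beta * i)
    pi = np.zeros(n_bins)
    pi[mu_bin] = 1.0                          # delta-function line: one bin holds it all
    lam_total = continuum + lam * pi
    return rng.poisson(lam_total), lam_total

rng = np.random.default_rng(1)
counts, intensity = simulate_spectrum(100.0, -0.05, 50.0, 30, 100, rng)
```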
8 The Basic Source Model (Cont'd). The Poisson process: X_i ~ Poisson( Λ_i = α·exp(β·i) + λ·π_i(µ) ). To fit this finite mixture model with data augmentation, define

Z_il = (indicator that photon l in bin i corresponds to the emission line).

With the target posterior distribution p(Z, ψ, µ | X), where ψ = (α, β, λ), the Gibbs sampler iterates among
1. Draw Z ~ p(Z | ψ, µ, X), via Z_il ~ Bernoulli( λ·π_i(µ) / (α·exp(β·i) + λ·π_i(µ)) ),
2. Draw ψ ~ p(ψ | Z, µ, X), and
3. Draw µ ~ p(µ | Z, ψ, X).
In this case, the Gibbs sampler breaks down!
10 Why the Gibbs Sampler Fails. [Figure: total counts per bin decomposed into continuum counts and line counts, showing the current line location µ⁽⁰⁾ with π_i(µ⁽⁰⁾), a candidate µ⁽¹⁾, and the indicator draws Z_il ~ Bernoulli( λ·π_i(µ) / (α·exp(β·i) + λ·π_i(µ)) ).]
12 Improving the Convergence of the Gibbs Sampler. The Gibbs sampler breaks down because the line location µ cannot move from its starting value. Because of the binning, there are only finitely many possible line locations, so we can evaluate the observed posterior probability of µ at the midpoint of each bin. The observed posterior distribution for µ is given by

p(µ | ψ, X) = Multinomial( 1; { p(µ | ψ, X) }_{µ = E_i} ).

To improve convergence, can we capitalize on p(µ | ψ, X)? Note that p(µ | ψ, X) ∝ ∫ p(Z, ψ, µ | X) dZ, where p(Z, ψ, µ | X) is the target distribution.
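The Multinomial(1; ·) draw over bin midpoints above amounts to sampling a bin index from a discrete distribution. A minimal sketch, with illustrative names, working from unnormalized log posterior values for numerical stability:

```python
import numpy as np

def draw_mu(log_post, rng):
    """Draw the line-location bin index from the discrete distribution
    p(mu | psi, X) evaluated at every bin midpoint (a Multinomial(1; .)
    draw). `log_post` holds unnormalized log posterior values, one per
    candidate bin."""
    p = np.exp(log_post - np.max(log_post))  # subtract the max before exponentiating
    p = p / p.sum()                          # normalize to a probability vector
    return int(rng.choice(len(p), p=p))

rng = np.random.default_rng(0)
# One bin dominates the posterior by a factor of exp(50).
bin_drawn = draw_mu(np.array([0.0, 0.0, 50.0, 0.0]), rng)
```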
13 Introducing Incompatibility into the Gibbs Sampler. The target posterior distribution is given by p(Z, ψ, µ | X). Iteratively sample from the complete conditional distributions:
Step 1: Draw Z ~ p(Z | ψ, µ, X),
Step 2: Draw ψ ~ p(ψ | Z, µ, X), and
Step 3: Draw µ ~ p(µ | Z, ψ, X).
These steps form a Markov chain with stationary distribution p(Z, ψ, µ | X). Basic question: when p(µ | ψ, X) is available, should we use it or not? The answer is... use it, with caution!
15 Construction of Partially Collapsed Gibbs Samplers. Steps of construction, starting from the ordinary Gibbs sampler p(Z | ψ, µ, X) → p(ψ | Z, µ, X) → p(µ | Z, ψ, X):
1. Marginalization: move Z to the left of the conditioning sign in Step 3, giving p(Z | ψ, µ, X) → p(ψ | Z, µ, X) → p(Z, µ | ψ, X). This does not alter the stationary distribution, but improves the rate of convergence.
2. Permutation: permute the order of the steps, giving p(Z, µ | ψ, X) → p(Z | ψ, µ, X) → p(ψ | Z, µ, X). This can have minor effects on the rate of convergence, but does not affect the stationary distribution.
3. Trimming: remove Z from the draw in Step 1, since the transition kernel does not depend on this quantity, giving p(µ | ψ, X) → p(Z | ψ, µ, X) → p(ψ | Z, µ, X).
These three steps, Marginalization, Permutation, and Trimming, form a general strategy for constructing partially collapsed Gibbs (PCG) samplers.
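The trimmed schedule above can be written as a generic sampler skeleton. The callables are placeholders for the model-specific conditional draws (this is a sketch of the step ordering, not of any particular model); note that µ must be drawn before Z for the trimmed chain to keep the target stationary distribution.

```python
def pcg_step_schedule(draw_mu_marg, draw_Z, draw_psi, psi0, n_iter):
    """Skeleton of the sampler produced by Marginalization, Permutation,
    and Trimming: Step 1 draws mu with Z integrated out, then Z and psi
    are drawn as in the ordinary Gibbs sampler. The three callables are
    placeholders for the model-specific conditional draws."""
    psi = psi0
    draws = []
    for _ in range(n_iter):
        mu = draw_mu_marg(psi)   # p(mu | psi, X): Z marginalized out
        Z = draw_Z(psi, mu)      # p(Z | psi, mu, X)
        psi = draw_psi(Z, mu)    # p(psi | Z, mu, X)
        draws.append((mu, psi))
    return draws

# Deterministic stand-in draws, purely to exercise the schedule.
out = pcg_step_schedule(
    draw_mu_marg=lambda psi: psi + 1,
    draw_Z=lambda psi, mu: 2 * mu,
    draw_psi=lambda Z, mu: Z + mu,
    psi0=0,
    n_iter=2,
)
```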
17 Improved Convergence Characteristics. Theorem (Park and van Dyk, 2007): Sampling more model components in any single step of a Gibbs sampler reduces the resulting maximal autocorrelation. Directly assessing the convergence rate is difficult: the spectral radius of the forward operator generally governs convergence (Liu et al., 1995) and is bounded above by the maximal autocorrelation. By reducing conditioning in any step (i.e., partial marginalization), we reduce the maximal autocorrelation, the upper bound on the spectral radius. We thus expect partially collapsed Gibbs samplers to converge faster than ordinary Gibbs samplers.
18 Highly Structured Multilevel Model. In practice we do not observe the latent Poisson process X_i ~ Poisson( Λ_i = α·exp(β·i) + λ·π_i(µ) ). Rather, we observe

Y_j ~ Poisson( Σ_i M_ij·γ_i·Λ_i + θ_j^B ),

because the photons are subject to a series of non-trivial physical processes: γ_i, non-homogeneous stochastic censoring (absorption); M_ij, stochastic redistribution (blurring); and θ_j^B, background contamination.
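The observation model above can be sketched as a forward simulation. The index convention (absorption γ applied to the ideal bins before the redistribution matrix M) is an assumption of this sketch, and the names are illustrative:

```python
import numpy as np

def observe(lam_ideal, M, gamma, theta_bg, rng):
    """Push ideal bin intensities Lambda_i through absorption, blurring,
    and background contamination to get observed counts:
    Y_j ~ Poisson( sum_i M[i, j] * gamma[i] * Lambda_i + theta_bg[j] ).
    The index convention is an assumption of this sketch."""
    mean_obs = (gamma * lam_ideal) @ M + theta_bg   # sum over ideal bins i
    return rng.poisson(mean_obs), mean_obs

rng = np.random.default_rng(0)
lam = np.array([5.0, 2.0, 8.0])
# With identity blurring, no absorption loss, and no background,
# the observed intensity equals the ideal intensity.
y, mean_obs = observe(lam, np.eye(3), np.ones(3), np.zeros(3), rng)
```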
19 Data Augmentation. We consider a hierarchical missing data structure:
- Ideal data separated by mixture indicators: Z
- Mixed ideal data
- Mixed ideal data after absorption: X = {X_i}
- Mixed ideal data after absorption and blurring
- Mixed ideal data after absorption, blurring, and background contamination, i.e., the observed data: Y_obs = {Y_j}
20 The Gibbs Sampler. The target posterior distribution is p(X, Z, ψ, µ | Y_obs). The Gibbs sampler now iterates among
Step 1: Draw (X, Z) ~ p(X, Z | ψ, µ, Y_obs),
Step 2: Draw ψ ~ p(ψ | X, Z, µ, Y_obs), and
Step 3: Draw µ ~ p(µ | X, Z, ψ, Y_obs),
where ψ represents all model parameters but µ. With a delta function line model, this Gibbs sampler fails! The observed posterior distribution for µ is computed as

p(µ | ψ, Y_obs) = Multinomial( 1; { p(µ | ψ, Y_obs) }_{µ = E_j} ).

To improve convergence, we capitalize on p(µ | ψ, Y_obs).
21 PCG Sampler I. p(µ | ψ, Y_obs) is available, so we partially marginalize over all of the missing data, (X, Z). Starting from the Gibbs sampler p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs) → p(µ | X, Z, ψ, Y_obs):
Marginalize: p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs) → p(X, Z, µ | ψ, Y_obs).
Permute: p(X, Z, µ | ψ, Y_obs) → p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs).
Trim: p(µ | ψ, Y_obs) → p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs), which composes to p(X, Z, µ | ψ, Y_obs) → p(ψ | X, Z, µ, Y_obs).
The resulting sampler corresponds to a blocked version of the original Gibbs sampler!
23 PCG Sampler II. p(µ | X, ψ, Y_obs) is also available and quick to compute, so we partially marginalize over part of the missing data, Z. Starting from the Gibbs sampler p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs) → p(µ | X, Z, ψ, Y_obs):
Marginalize: p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs) → p(Z, µ | X, ψ, Y_obs).
Permute: p(Z, µ | X, ψ, Y_obs) → p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs).
Trim: p(µ | X, ψ, Y_obs) → p(X, Z | ψ, µ, Y_obs) → p(ψ | X, Z, µ, Y_obs).
The resulting sampler cannot be blocked: it is a genuinely partially collapsed Gibbs sampler.
25 Computational Gains. Compare the Gibbs sampler, PCG I, and PCG II. The Gibbs sampler does not move from its starting value. PCG I has better convergence characteristics than PCG II, but each iteration of PCG I is more expensive. [Figure: trace plots and autocorrelation functions of µ (keV) and log λ (log counts) for PCG I and PCG II.]
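The convergence comparison in the figure rests on sample autocorrelations of the chains. A minimal helper for the lag-k autocorrelation (higher autocorrelation at a given lag means slower mixing):

```python
import numpy as np

def autocorr(chain, lag):
    """Sample lag-k autocorrelation of an MCMC chain: the normalized
    inner product of the centered chain with its lag-shifted copy."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))
```

For a perfectly alternating chain of length 10, the lag-1 autocorrelation is -(n-1)/n = -0.9, while a well-mixing chain would sit near zero.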
26 A Simulation Study. To illustrate the statistical properties of fitted spectra, we run a simulation. We generate 10 data sets (1500 counts each) from 6 spectra, using a typical continuum, effective area, and instrumental response function. The spectra contain 0, 1, or 2 lines. Each line is of moderate width (SD = 4 bins) or broad width (SD = 21 bins), and is weak or strong. Fitted models include one or two emission lines. [Figure: the six simulation spectra, Cases 1 through 6.]
27 Results: Highly Multimodal Posterior Distributions. We run the PCG samplers for model fitting; these are results with one delta function line in the fitted model. The marginal posterior density is estimated using Gaussian kernel smoothing (SD = 0.01), which smooths out lower posterior probabilities while not flattening higher posterior probabilities. The posterior density is highly multimodal. [Figure: posterior probability density functions of the line location for Cases 1 through 6.]
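The kernel-smoothed density estimate described above can be sketched directly: evaluate a Gaussian kernel of SD `bw` around each MCMC draw and average. Names are illustrative.

```python
import numpy as np

def kde_on_grid(draws, grid, bw=0.01):
    """Gaussian kernel density estimate of the marginal posterior of the
    line location from MCMC draws, evaluated on `grid`, with kernel SD
    `bw` (the slides use SD = 0.01)."""
    d = np.asarray(draws, dtype=float)[:, None]
    z = (np.asarray(grid, dtype=float)[None, :] - d) / bw
    # Average the Gaussian kernels centered at the draws.
    return np.exp(-0.5 * z ** 2).sum(axis=0) / (d.shape[0] * bw * np.sqrt(2.0 * np.pi))

# Density at the single draw's location equals the kernel peak 1/(bw*sqrt(2*pi)).
peak = kde_on_grid([0.0], np.array([0.0]), bw=1.0)[0]

# The estimate integrates to (approximately) one.
grid = np.linspace(-5.0, 5.0, 2001)
dens = kde_on_grid([0.0, 0.2], grid, bw=0.5)
mass = dens.sum() * (grid[1] - grid[0])
```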
28 Posterior Regions. The highly multimodal posterior distribution cannot be summarized in the standard ways familiar to astronomers (e.g., 5 ± 2, or asymmetric intervals). Instead, we use a transformation of the posterior density function to visualize HPD regions of varying levels: construct an HPD region for each of 100 levels equally spaced between 1% and 100%, and summarize the highly multimodal posterior distribution with this HPD distribution. [Figure: level of HPD versus line location (keV) for Cases 1 through 6.]
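For a binned posterior, an HPD region at a given level can be built greedily: include bins in decreasing order of posterior probability until the level is reached. A sketch with illustrative names; repeating it over many levels yields the "HPD distribution" summary above.

```python
import numpy as np

def hpd_mask(probs, level):
    """Highest-posterior-density region for a discrete (binned) posterior:
    add bins in decreasing order of posterior probability until at least
    `level` of the mass is covered. Returns a boolean mask over bins."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]           # bins from most to least probable
    cum = np.cumsum(probs[order])
    k = int(np.searchsorted(cum, level)) + 1  # smallest prefix reaching the level
    mask = np.zeros(len(probs), dtype=bool)
    mask[order[:k]] = True
    return mask
```

For a multimodal posterior this mask is typically a union of disjoint intervals, which is exactly why a single central interval is a poor summary here.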
29 Fitting Two Emission Lines. The multimodal posterior distribution may suggest multiple lines. [Figure: the joint posterior distribution of two line locations, with data generated under Case 2 (one narrow weak line at 2.85 keV); panels show the line location closest to the MLE and the other line location.]
30 Fitting Two Emission Lines (Cont'd). [Figure: the joint posterior distribution of two line locations, with data generated under Case 5 (one narrow weak line at 2.85 keV and one narrow strong line at 1.2 keV); panels show the line location closest to the MLE and the other line location.]
31 An Advantage of Model Misspecification? Consider a simple Gaussian model with known SD: Y ~ N(µ, σ²). A 95% CI for µ is given by Y ± 1.96σ. Misspecifying σ with some ς < σ results in a shorter interval, but with lower coverage. Now consider misspecifying the width of a spectral line: might there be an advantage to using a delta function line rather than a Gaussian line (with fitted width) if we know the spectral line is not too broad? The tradeoff is not as simple as in the Gaussian model, so we return to the simulation study.
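The coverage/length tradeoff in the simple Gaussian model can be checked by Monte Carlo (a sketch with illustrative names; µ is taken as 0 without loss of generality):

```python
import numpy as np

def coverage_and_length(sigma_true, sigma_used, n_sims, seed=0):
    """Monte Carlo illustration of the tradeoff: the interval
    Y +/- 1.96*sigma_used is shorter when sigma_used < sigma_true,
    but its coverage of the true mean (0 here) drops below 95%."""
    rng = np.random.default_rng(seed)
    y = rng.normal(0.0, sigma_true, n_sims)
    half = 1.96 * sigma_used
    coverage = float(np.mean(np.abs(y) <= half))  # P(interval contains 0)
    return coverage, 2.0 * half

cov_correct, len_correct = coverage_and_length(1.0, 1.0, 200_000)
cov_short, len_short = coverage_and_length(1.0, 0.5, 200_000)  # sigma misspecified low
```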
32 More Precise Inference? [Figure: HPD regions by test data index for Cases 1 through 6, comparing the delta function line model with the Gaussian line model.]
33 A Closer Look. Coverage and mean length of 95% HPD regions under the delta function line and Gaussian line models:

Case | Line type | Delta coverage(2) | Delta mean length | Gaussian coverage(2) | Gaussian mean length
1 | no lines | NA | NA | NA | NA
2 | one narrow(1) | 90% | | % |
3 | one wide | 50% | | % |
4 | one narrow | 100% | | % |
5 | two narrow | 90% | | % |
6 | two narrow | 100% | | % |
total | narrow | 95% | | % |

(1) Narrow lines are 17 bins wide (four SDs); wide lines are 85 bins wide.
(2) Coverage: % of ten 95% HPD regions containing at least one true line location.

Misspecification appears to give better mean length in all cases and better coverage for narrow lines.
34 Proceed with Caution. Exhaustive simulations are difficult: fitting involves MCMC sampling, which is slow and requires some supervision, and results may depend on the line location, line strength, line width, characteristics of the continuum, sample size, etc. Delta function emission lines are useful for exploratory data analysis and for inference when the true line is believed to be narrow, and they also show promise for inference with moderately broad lines.
35 Posterior Distribution of the Line Location. Six observations of PG were independently obtained. Under the delta function line profile, the combined posterior is

p(µ | Y_obs) = ∫ ∏_{i=1}^{6} p(µ, ψ_i | Y_i^obs) dψ_1 ... dψ_6 = ∏_{i=1}^{6} ∫ p(µ, ψ_i | Y_i^obs) dψ_i = ∏_{i=1}^{6} p(µ | Y_i^obs).

[Figure: posterior probability density functions of the line location for obs ids 47, 62, 69, 70, 71, and 1269, under the delta function line profile.]
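The factorization above combines the six observations by multiplying their per-observation posteriors bin by bin and renormalizing. A flat-prior sketch in log space (illustrative names; each input posterior is assumed strictly positive on every bin):

```python
import numpy as np

def combine_posteriors(per_obs):
    """Combine independent observations by multiplying their binned
    posteriors p(mu | Y_i^obs) bin by bin and renormalizing, following
    the slide's factorization. Works in log space for stability; every
    input posterior must be strictly positive on every bin."""
    logp = np.sum(np.log(np.asarray(per_obs, dtype=float)), axis=0)
    p = np.exp(logp - logp.max())   # shift before exponentiating
    return p / p.sum()

# Two toy observations over two bins: the product sharpens toward bin 0.
combined = combine_posteriors([[0.5, 0.5], [0.8, 0.2]])
```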
36 Results. Under the delta function emission line model: given all six observations, the posterior mode of the line location is identified at keV. The nominal 95% posterior region consists of (2.83 keV, 2.92 keV) with 94.8% of the posterior mass and (0.50 keV, 0.51 keV) with 2.2%. The detected line corresponds to 6.69 keV in the quasar rest frame, which indicates the ionization state of iron.
37 Posterior Predictive Checks. Quantify the evidence for the emission line in the spectrum using ppp-values (Meng, 1994; Protassov et al., 2002). Compare the following three models:
MODEL 0: no emission line in the spectrum.
MODEL 1: one line at 2.74 keV with unknown intensity.
MODEL 2: one line with unknown location and intensity.
The test statistic is the sum of the log likelihood ratios:

T_m(Ỹ^(l)) = Σ_{i=1}^{6} log[ sup_{θ∈Θ_m} L(θ | Ỹ_i^(l)) / sup_{θ∈Θ_0} L(θ | Ỹ_i^(l)) ], m = 1 and 2,

where Θ_m is the parameter space under MODEL m and Ỹ^(l) are the data sets simulated under MODEL 0.
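Given the observed statistic and its replicate values under MODEL 0, the ppp-value itself is a simple tail fraction. A sketch (the likelihood-ratio computation is model-specific and omitted here):

```python
import numpy as np

def ppp_value(t_obs, t_reps):
    """Posterior predictive p-value: the share of replicate test
    statistics T_m, computed from data simulated under MODEL 0 with
    parameters drawn from the posterior, that are at least as large as
    the observed T_m(Y_obs)."""
    return float(np.mean(np.asarray(t_reps, dtype=float) >= t_obs))

# Toy replicates: half of them reach the observed value 2.0.
p = ppp_value(2.0, [1.0, 2.0, 3.0, 0.5])
```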
38 Posterior Predictive P-Values. The posterior predictive method is a parameterized bootstrap that accounts for posterior uncertainty in the parameters. Small ppp-values give strong evidence for the presence of the delta function emission line in the X-ray spectrum. [Figure: null distributions of T_1(Ỹ^(l)) and T_2(Ỹ^(l)) with the observed values marked; T_1(Y_obs) = 7.450 for Model 0 vs. Model 1, and T_2(Y_obs) for Model 0 vs. Model 2, each with its ppp-value.]
39 Summary. Estimating the location of narrow emission lines causes problems for standard statistical methods and algorithms. We devise partially collapsed Gibbs (PCG) samplers that generalize the composition of conditional distributions, generalize the blocked Gibbs sampler (Liu et al., 1995), and can be viewed as a stochastic counterpart to the ECME algorithm (Liu and Rubin, 1994) and the AECM algorithm (Meng and van Dyk, 1997). We summarize the highly multimodal posterior distribution using an HPD distribution with varying levels, and use posterior predictive checks (Protassov et al., 2002) to test the evidence for a narrow emission line.
40 Thank you!
More informationLinear Dynamical Systems
Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations
More information13 : Variational Inference: Loopy Belief Propagation and Mean Field
10-708: Probabilistic Graphical Models 10-708, Spring 2012 13 : Variational Inference: Loopy Belief Propagation and Mean Field Lecturer: Eric P. Xing Scribes: Peter Schulam and William Wang 1 Introduction
More informationSTAT 425: Introduction to Bayesian Analysis
STAT 425: Introduction to Bayesian Analysis Marina Vannucci Rice University, USA Fall 2017 Marina Vannucci (Rice University, USA) Bayesian Analysis (Part 2) Fall 2017 1 / 19 Part 2: Markov chain Monte
More informationMARGINAL MARKOV CHAIN MONTE CARLO METHODS
Statistica Sinica 20 (2010), 1423-1454 MARGINAL MARKOV CHAIN MONTE CARLO METHODS David A. van Dyk University of California, Irvine Abstract: Marginal Data Augmentation and Parameter-Expanded Data Augmentation
More informationMCMC Sampling for Bayesian Inference using L1-type Priors
MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling
More informationComputer Vision Group Prof. Daniel Cremers. 14. Sampling Methods
Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric
More informationClustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014.
Clustering K-means Machine Learning CSE546 Carlos Guestrin University of Washington November 4, 2014 1 Clustering images Set of Images [Goldberger et al.] 2 1 K-means Randomly initialize k centers µ (0)
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More information10. Exchangeability and hierarchical models Objective. Recommended reading
10. Exchangeability and hierarchical models Objective Introduce exchangeability and its relation to Bayesian hierarchical models. Show how to fit such models using fully and empirical Bayesian methods.
More informationParameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1
Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin
More informationProbabilistic Time Series Classification
Probabilistic Time Series Classification Y. Cem Sübakan Boğaziçi University 25.06.2013 Y. Cem Sübakan (Boğaziçi University) M.Sc. Thesis Defense 25.06.2013 1 / 54 Problem Statement The goal is to assign
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationGibbs Sampling. Héctor Corrada Bravo. University of Maryland, College Park, USA CMSC 644:
Gibbs Sampling Héctor Corrada Bravo University of Maryland, College Park, USA CMSC 644: 2019 03 27 Latent semantic analysis Documents as mixtures of topics (Hoffman 1999) 1 / 60 Latent semantic analysis
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationpyblocxs Bayesian Low-Counts X-ray Spectral Analysis in Sherpa
pyblocxs Bayesian Low-Counts X-ray Spectral Analysis in Sherpa Aneta Siemiginowska Harvard-Smithsonian Center for Astrophysics Vinay Kashyap (CfA), David Van Dyk (UCI), Alanna Connors (Eureka), T.Park(Korea)+CHASC
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationLikelihood NIPS July 30, Gaussian Process Regression with Student-t. Likelihood. Jarno Vanhatalo, Pasi Jylanki and Aki Vehtari NIPS-2009
with with July 30, 2010 with 1 2 3 Representation Representation for Distribution Inference for the Augmented Model 4 Approximate Laplacian Approximation Introduction to Laplacian Approximation Laplacian
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationStat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC
Stat 451 Lecture Notes 07 12 Markov Chain Monte Carlo Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapters 8 9 in Givens & Hoeting, Chapters 25 27 in Lange 2 Updated: April 4, 2016 1 / 42 Outline
More informationEM Algorithm II. September 11, 2018
EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data
More informationComputer Vision Group Prof. Daniel Cremers. 11. Sampling Methods
Prof. Daniel Cremers 11. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric
More informationMarkov Chain Monte Carlo (MCMC)
Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can
More information(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis
Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationBayesian Networks in Educational Assessment
Bayesian Networks in Educational Assessment Estimating Parameters with MCMC Bayesian Inference: Expanding Our Context Roy Levy Arizona State University Roy.Levy@asu.edu 2017 Roy Levy MCMC 1 MCMC 2 Posterior
More informationLuminosity Functions
Department of Statistics, Harvard University May 15th, 2012 Introduction: What is a Luminosity Function? Figure: A galaxy cluster. Introduction: Project Goal The Luminosity Function specifies the relative
More informationSTAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01
STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationLecture 13 : Variational Inference: Mean Field Approximation
10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 13 : Variational Inference: Mean Field Approximation Lecturer: Willie Neiswanger Scribes: Xupeng Tong, Minxing Liu 1 Problem Setup 1.1
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationIntroduction to Bayesian methods in inverse problems
Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction
More informationBayesian Nonparametric Models
Bayesian Nonparametric Models David M. Blei Columbia University December 15, 2015 Introduction We have been looking at models that posit latent structure in high dimensional data. We use the posterior
More informationTo Center or Not to Center: That is Not the Question An Ancillarity-Sufficiency Interweaving Strategy (ASIS) for Boosting MCMC Efficiency
To Center or Not to Center: That is Not the Question An Ancillarity-Sufficiency Interweaving Strategy (ASIS) for Boosting MCMC Efficiency Yaming Yu Department of Statistics, University of California, Irvine
More informationBayesian Inference and MCMC
Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the
More information17 : Markov Chain Monte Carlo
10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo
More informationSequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them
HMM, MEMM and CRF 40-957 Special opics in Artificial Intelligence: Probabilistic Graphical Models Sharif University of echnology Soleymani Spring 2014 Sequence labeling aking collective a set of interrelated
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationGraphical Models and Kernel Methods
Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.
More information