SAMSI Astrostatistics Tutorial. Models with Gaussian Uncertainties (lecture 2)


SAMSI Astrostatistics Tutorial: Models with Gaussian Uncertainties (lecture 2). Phil Gregory, University of British Columbia, 2006

The rewards of data analysis: "The universe is full of magical things, patiently waiting for our wits to grow sharper." Eden Phillpotts (1862-1960), author and playwright

Resources and solutions:
- Book Preface (11 Kb)
- Mathematica Tutorial (1415 Kb)
- Errata (206 Kb)
- Revisions (268 Kb)
Book website: www.cambridge.org/052184150x

Bayesian Logical Data Analysis for the Physical Sciences. Contents:
1. Role of probability theory in science
2. Probability theory as extended logic
3. The how-to of Bayesian inference
4. Assigning probabilities
5. Frequentist statistical inference
6. What is a statistic?
7. Frequentist hypothesis testing
8. Maximum entropy probabilities
9. Bayesian inference (Gaussian errors)
10. Linear model fitting (Gaussian errors)
11. Nonlinear model fitting
12. Markov chain Monte Carlo
13. Bayesian spectral analysis
14. Bayesian inference (Poisson sampling)
Appendix A. Singular value decomposition
Appendix B. Discrete Fourier Transform
Appendix C. Difference in two samples
Appendix D. Poisson ON/OFF details
Appendix E. Multivariate Gaussian from maximum entropy

Outline
1. More on Gaussian uncertainties
2. A Bayesian revolution in spectral/(time series) analysis
   - Fourier power spectrum: new Bayesian insight
   - Bayesian spectral analysis with strong prior information about the signal model
     - Bretthorst periodogram
     - Kepler periodogram
   - Bayesian spectral analysis with weak prior information about the signal model
     - Gaussian version of the GL periodogram
     - Applications
3. Conclusions

Gaussian uncertainties. According to the Central Limit Theorem, the probability distribution of the average of m independent measurements of some quantity tends to a Gaussian as m increases, regardless of the probability distribution of the individual measurements, provided it has a finite variance. A generalization of the Central Limit Theorem: "Any quantity that stems from a large number of sub-processes is expected to have a Gaussian (normal) distribution." The normal (Gaussian) distribution occurs so often because measured quantities are frequently the result of a large number of effects, i.e., some kind of averaged resultant of these effects. Since the distribution of the average of a collection of random variables tends to a Gaussian, this is often what we observe. Of course, by working with average values we can ensure that the likelihood function is well described by a Gaussian.
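The CLT statement above is easy to check numerically. Here is a minimal Python sketch (not from the lecture; all names and numbers are illustrative) that averages m draws from a decidedly non-Gaussian exponential distribution and watches the skewness shrink toward the Gaussian value of zero as m grows:

```python
import numpy as np

rng = np.random.default_rng(1)

for m in (1, 5, 50):
    # 10^5 averages, each over m independent exponential draws (mean 1, variance 1)
    averages = rng.exponential(scale=1.0, size=(100_000, m)).mean(axis=1)
    # the CLT predicts mean 1, standard deviation 1/sqrt(m), and skewness -> 0
    skew = np.mean((averages - averages.mean())**3) / averages.std()**3
    print(f"m = {m:3d}: mean = {averages.mean():.3f}, "
          f"std = {averages.std():.3f} (predicted {1/np.sqrt(m):.3f}), "
          f"skewness = {skew:.3f}")
```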

Gaussian uncertainties, alternate justification (Chapter 8). In Bayesian inference, probabilities are a measure of our state of knowledge about nature, not a measure of nature itself; equivalently, they are a measure of our state of ignorance about nature. What if we have very limited knowledge about the choice of sampling distribution to be used in calculating the likelihood, and not enough measurements (< 5) to ensure a Gaussian distribution by averaging? We appeal to the Maximum Entropy Principle. It says that unless we have some additional prior information which justifies the use of some other sampling distribution, use a Gaussian sampling distribution. It makes the fewest assumptions about the information you don't have and leads to the most conservative estimates (i.e., greater uncertainty than you would get from choosing a more appropriate distribution based on more information). The Maximum Entropy Principle, developed by E. T. Jaynes, is based on Claude Shannon's landmark 1948 paper on the use of entropy to quantify the uncertainty of a discrete probability distribution.
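To connect this with the formalism, here is the standard variational argument (textbook material, not from the slides) showing why maximum entropy with known mean and variance singles out the Gaussian:

```latex
\max_{p}\; H[p] = -\int p(x)\ln p(x)\,dx
\quad\text{subject to}\quad
\int p\,dx = 1,\;\;
\int x\,p\,dx = \mu,\;\;
\int (x-\mu)^2\,p\,dx = \sigma^2 .
% Lagrange multipliers give the stationarity condition
%   -\ln p - 1 - \lambda_0 - \lambda_1 x - \lambda_2 (x-\mu)^2 = 0,
% so p(x) \propto \exp[-\lambda_1 x - \lambda_2 (x-\mu)^2];
% matching the three constraints then yields exactly
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,
       \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right].
```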

Panel (a) shows the true bimodal distribution of measurement errors for some instrument. Panels (b), (c), (d), (e) & (f) show a comparison of the posterior PDFs derived from five simulated data sets drawn from (a). Each sample consists of the two data points indicated by the two arrows at the top of each panel. The solid curve shows the result obtained using the true sampling distribution. The dotted curve shows the result using a Gaussian with unknown σ, marginalizing over σ.

A Bayesian Revolution in Spectral Analysis
1) Fourier power spectrum (periodogram). The use of the Discrete Fourier Transform (DFT) is ubiquitous in spectral analysis as a result of the FFT introduced by Cooley and Tukey in 1965. The (Schuster) periodogram is

C(f_n) = \frac{1}{N} \left| \sum_{k=1}^{N} d_k \, e^{i 2\pi f_n t_k} \right|^2 = \frac{1}{N} \left| \mathrm{FFT}(d) \right|^2 .

2) New insights on the periodogram from Bayesian Probability Theory (BPT). In 1987 E. T. Jaynes derived the DFT and periodogram directly from the principles of BPT and showed that the periodogram is an optimum statistic for the detection of a single stationary sinusoidal signal in the presence of independent Gaussian noise. He showed that the probability of the frequency of a periodic signal is given, to a very good approximation, by

p(f_n \mid D, I) \propto \exp\!\left[ \frac{C(f_n)}{\sigma^2} \right].

Thus C(f_n) is indeed fundamental to spectral analysis, but not because it is itself a satisfactory spectrum estimator. The proper algorithm to convert C(f_n) into p(f_n | D, I) involves first dividing C(f_n) by the noise variance and then exponentiating:

p(f_n \mid D, I) \propto \exp\!\left[ \frac{C(f_n)}{\sigma^2} \right].

This naturally suppresses spurious ripples at the base of the periodogram as effectively as linear smoothing would, but it does so by attenuation rather than smearing, and therefore does not lose any resolution. The Bayesian nonlinear processing of C(f_n) also yields, when the data give evidence for them, arbitrarily sharp spectral peaks.
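As a concrete numerical sketch of this nonlinear processing (illustrative, not from the lecture): compute the Schuster periodogram with an FFT, divide by the noise variance, and exponentiate.

```python
import numpy as np

rng = np.random.default_rng(2)
N, dt, sigma = 256, 1.0, 1.0
t = np.arange(N) * dt
f0 = 25 / (N * dt)                         # put the signal on the FFT frequency grid
d = 0.5 * np.cos(2 * np.pi * f0 * t) + rng.normal(0.0, sigma, N)

# Schuster periodogram  C(f_n) = |FFT(d)|^2 / N  at the non-negative frequencies
C = np.abs(np.fft.rfft(d))**2 / N
f = np.fft.rfftfreq(N, dt)

# Jaynes (1987):  p(f_n | D, I) proportional to exp[C(f_n)/sigma^2]
log_post = C / sigma**2
post = np.exp(log_post - log_post.max())   # subtract the max for numerical safety
post /= post.sum()                         # normalize on the frequency grid

print(f"posterior peak at f = {f[np.argmax(post)]:.4f}, true f = {f0:.4f}")
```

The exponentiation is what sharpens the peak: the periodogram peak has width of order 1/(N dt), but the posterior peak narrows further as the signal-to-noise ratio grows.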

Stronger signal example

What if σ is unknown? The equation p(f_n | D, I) ∝ exp[C(f_n)/σ²] assumes that the noise variance is a known quantity. In some situations the noise is not well understood, i.e., our state of knowledge is less certain. Even if the measurement apparatus noise is well understood, the data may contain a greater complexity of phenomena than the current signal model incorporates. Again, Bayesian inference can readily handle this situation by treating the noise variance as a nuisance parameter with a prior distribution reflecting our uncertainty in this parameter. We saw how to do that when estimating a mean in the previous lecture. The resulting posterior can be expressed in the form of a Student's t distribution. The corresponding result for estimating the frequency of a single sinusoidal signal (Bretthorst 1988, Bayesian Spectrum Analysis and Parameter Estimation, Springer) is given approximately by

p(f \mid D, I) \propto \left[ 1 - \frac{2\,C(f)}{N\,\overline{d^2}} \right]^{(2-N)/2},
\quad \text{where} \quad
\overline{d^2} = \frac{1}{N} \sum_{k=1}^{N} d_k^2 .
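A matching sketch of the unknown-σ case (again illustrative; the expression follows Bretthorst 1988 as quoted above): the exponential is replaced by the Student's t-like power law, which needs no noise-variance input.

```python
import numpy as np

rng = np.random.default_rng(2)
N, dt = 256, 1.0
t = np.arange(N) * dt
f0 = 25 / (N * dt)
d = 0.5 * np.cos(2 * np.pi * f0 * t) + rng.normal(0.0, 1.0, N)

C = np.abs(np.fft.rfft(d))**2 / N          # Schuster periodogram
f = np.fft.rfftfreq(N, dt)
d2bar = np.mean(d**2)                      # mean-square data value

# p(f | D, I) proportional to [1 - 2C(f)/(N d2bar)]^((2-N)/2), in log form;
# skip the f = 0 and Nyquist bins, where the two-sided power convention differs
log_post = 0.5 * (2 - N) * np.log(1.0 - 2.0 * C[1:-1] / (N * d2bar))
post = np.exp(log_post - log_post.max())
post /= post.sum()
print(f"unknown-sigma posterior peak at f = {f[1:-1][np.argmax(post)]:.4f}, "
      f"true f = {f0:.4f}")
```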

The Bretthorst periodogram: a Bayesian generalization of the Lomb-Scargle periodogram (Bretthorst, G. L. 2001, American Institute of Physics Conference Proceedings, 568, p. 241). Bretthorst (2001) generalized Jaynes' insights to a broader range of single-frequency estimation problems and sampling conditions. In the course of this development, Bretthorst established a connection between the Bayesian results and an existing frequentist statistic known as the Lomb-Scargle periodogram, which is a widely used replacement for the Schuster periodogram in the case of non-uniform sampling. Bretthorst's analysis allows for the following complications: 1. Either real or quadrature data sampling. Quadrature data involves measurements of the real and imaginary components of a complex signal. The figure shows an example of quadrature signals occurring in NMR.

The Bretthorst periodogram. Let d_R(t_i) denote the real data at time t_i and d_I(t'_j) the imaginary data at time t'_j. There are N_R real samples and N_I imaginary samples, for a total of N = N_R + N_I samples. 2. Allows for uniform or non-uniform sampling, and for quadrature data with non-simultaneous sampling: the analysis does not require the t_i and t'_j to be simultaneous, and successive samples can be unequally spaced in time. 3. Allows for a non-stationary single-sinusoid model of the form

d_R(t_i) = A \cos(2\pi f t_i - \theta)\, Z(t_i) + B \sin(2\pi f t_i - \theta)\, Z(t_i) + e_R(t_i),
d_I(t'_j) = A \cos(2\pi f t'_j - \theta)\, Z(t'_j) + B \sin(2\pi f t'_j - \theta)\, Z(t'_j) + e_I(t'_j).

The function Z(t) describes an arbitrary modulation of the amplitude, e.g., the exponential decay exhibited in NMR signals. Z(t) is sometimes called a weighting function or apodizing function. In this analysis Z(t) is assumed to be known, but in principle it might have unknown parameters.

The Bretthorst periodogram. The angle θ is defined in such a way as to make the cosine and sine functions orthogonal on the discretely sampled times. In general, θ is frequency dependent. 4. The noise terms e_R(t_i) and e_I(t'_j) are assumed to be IID Gaussian with an unknown σ. Thus σ is a nuisance parameter, which is assumed to have a Jeffreys prior. By marginalizing over σ, any variability in the data that is not described by the model is treated as noise.
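A short sketch (illustrative parameter values, not Bretthorst's code) of how synthetic data for the non-stationary quadrature model above would be generated, with an exponential-decay Z(t) as in NMR:

```python
import numpy as np

rng = np.random.default_rng(3)
A, B, f, theta, tau, sigma = 1.0, 0.5, 0.2, 0.3, 20.0, 0.1

# real- and imaginary-channel sample times need not coincide or be uniform
t_R = np.sort(rng.uniform(0.0, 60.0, 48))
t_I = np.sort(rng.uniform(0.0, 60.0, 40))

def Z(t):
    """Known amplitude modulation (apodizing function): exponential decay."""
    return np.exp(-t / tau)

def signal(t):
    return (A * np.cos(2 * np.pi * f * t - theta)
            + B * np.sin(2 * np.pi * f * t - theta)) * Z(t)

d_R = signal(t_R) + rng.normal(0.0, sigma, t_R.size)   # e_R: IID Gaussian noise
d_I = signal(t_I) + rng.normal(0.0, sigma, t_I.size)   # e_I: same unknown sigma

print(f"N_R = {t_R.size}, N_I = {t_I.size}, N = {t_R.size + t_I.size}")
```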

The Bretthorst periodogram. In this problem the parameter of interest is the frequency f. The nuisance parameters are the amplitudes A and B, as well as σ, the standard deviation of the Gaussian noise probabilities used to assign the likelihoods. To compute p(f | D, I) we need to marginalize over the three nuisance parameters A, B, σ:

p(f \mid D, I) = \int dA \int dB \int d\sigma \; p(f, A, B, \sigma \mid D_R, D_I, I).

The right-hand side of this equation can be factored using Bayes' theorem and the product rule to yield

p(f \mid D, I) \propto \int dA \int dB \int d\sigma \;
p(f \mid I)\, p(A \mid I)\, p(B \mid I)\, p(\sigma \mid I)\,
p(D_R \mid f, A, B, \sigma, I)\, p(D_I \mid f, A, B, \sigma, I).

Bretthorst assigns uniform priors for f, A & B and a Jeffreys prior for σ.

The Bretthorst periodogram. It turns out that the triple integral can be performed analytically using simple changes of variables. The final Bayesian expression for p(f | D, I), after marginalizing over the amplitudes A, B & σ (assuming independent uniform priors for A, B and a Jeffreys prior for σ), again has the Student's t form, approximately

p(f \mid D, I) \propto \left[ 1 - \frac{2\,\bar{h}^2(f)}{N\,\overline{d^2}} \right]^{(2-N)/2},

where \bar{h}^2(f) is the sufficient statistic, built from the projections of the data onto the orthogonalized cosine and sine model functions, and \overline{d^2} is the mean square of all N data values.

The Bretthorst periodogram

The Bretthorst periodogram: simplifications. 1. When the data are real and the sinusoid is stationary, the sufficient statistic \bar{h}^2 for single-frequency estimation is the Lomb-Scargle periodogram, not the Schuster periodogram (power spectrum). 2. When the data are real but Z(t) is not constant, \bar{h}^2 generalizes the Lomb-Scargle periodogram in a very straightforward manner to account for the decay of the signal. 3. For uniformly sampled quadrature data with a stationary sinusoid, \bar{h}^2 reduces to a Schuster periodogram of the data. The Schuster periodogram is not a sufficient statistic for frequency estimation in real, i.e., non-quadrature, data. However, it is often an excellent approximation and is much faster to compute.
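For simplification 1, the statistic can be computed directly with SciPy's Lomb-Scargle routine. A minimal sketch on synthetic non-uniformly sampled data (illustrative numbers, not the HD 73526 measurements):

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0.0, 100.0, 120))          # non-uniform sample times
d = 10.0 * np.sin(2 * np.pi * t / 12.8) + rng.normal(0.0, 3.0, t.size)
d -= d.mean()                                      # classical L-S assumes zero-mean data

periods = np.linspace(2.0, 50.0, 5000)
power = lombscargle(t, d, 2 * np.pi / periods)     # takes angular frequencies

print(f"Lomb-Scargle peak at P = {periods[np.argmax(power)]:.2f} d (true 12.80 d)")
```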

Application to extra-solar planet data. Precision radial velocity measurements for HD 73526 (ref. Tinney, C. G. et al. 2003, Astrophysical Journal, 587, p. 423). [Figure: velocity (m s⁻¹) versus Julian day number − 2,451,212.1302.]

Comparison of the Lomb-Scargle periodogram with Bretthorst's Bayesian generalization. [Figure: three panels versus period in days (1-1000, logarithmic axis): the Lomb-Scargle periodogram (power density); Bretthorst's Bayesian generalization (probability density); and the same on a log₁₀ probability density scale. The Bayesian version shows sharp peaks at periods of 128, 190 and 376 days.]

HD 73526 orbital solution of Tinney et al. (2003), obtained from a nonlinear least-squares fit procedure.

Problem: the Lomb-Scargle periodogram, and Bretthorst's Bayesian generalization, assume a sinusoidal signal, which is only optimum for circular orbits. Why not develop a Bayesian Kepler periodogram designed for all Kepler orbits? This has recently been accomplished using Markov chain Monte Carlo (MCMC) algorithms by two different individuals: Ford, E. B. (2005), AJ, 129, 1706; Gregory, P. C. (2005), Ap. J., 631, 1198; Gregory, P. C. (2005), AIP Conference Proceedings 803, p. 139.

Schematic of a Bayesian Markov chain Monte Carlo program for nonlinear model fitting. Inputs: data D, model M, prior information I; target posterior p({X_α} | D, M, I); n = no. of iterations; {X_α}_init = start parameters; {σ_α}_init = start proposal σ's; {β} = tempering levels. Control system parameters: n₁ = major cycle iterations, n₂ = minor cycle iterations, λ = acceptance ratio, γ = damping constant. Outputs: control system diagnostics; {X_α} iterations; summary statistics; best-fit model & residuals; {X_α} marginals; {X_α} 68.3% credible regions; p(D | M, I), the global likelihood for model comparison. The program incorporates a control system that automates the selection of the Gaussian proposal distribution σ's.

With the same MCMC nonlinear model fitting program: if you input a Kepler model, the algorithm becomes a Kepler periodogram, optimum for finding all Kepler orbits and evaluating their probabilities. It is capable of simultaneously fitting multiple planet models: a multi-planet Kepler periodogram.

Advantages of the Bayesian multi-planet Kepler periodogram:
1) Good initial guesses of the parameter values are not required. The algorithm is capable of efficiently exploring all regions of joint prior parameter space having significant probability.
2) The mathematical form of a planetary signal embedded in the host star's reflex motion is well known and is built into the Bayesian analysis as part of the prior information. There is thus no need to carry out a separate search for possible orbital period(s).
3) The method is capable of fitting a portion of an orbital period, so search periods longer than the duration of the data are possible.
4) More complicated (two or more planet) models contain larger numbers of parameters and thus incur a larger Occam penalty. A quantitative Occam's razor is automatically incorporated in a Bayesian model selection analysis. The analysis yields the relative probability of each of the models explored.
5) The analysis yields the full marginal posterior probability density function (PDF) for each model parameter, not just the maximum a posteriori (MAP) values.
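A heavily simplified sketch of the underlying idea (this is not Gregory's algorithm: it fits only a circular one-planet model, uses fixed proposal σ's instead of the control system, has no parallel tempering, and starts near the periodogram peak; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# synthetic radial-velocity data with a known noise sigma
t = np.sort(rng.uniform(0.0, 1200.0, 40))
sigma = 10.0
v = 80.0 * np.sin(2 * np.pi * t / 128.0 + 1.0) + rng.normal(0.0, sigma, t.size)

def log_post(theta):
    """Log posterior: Gaussian likelihood, uniform priors inside the bounds."""
    P, K, phi = theta
    if not (50.0 < P < 500.0 and 0.0 < K < 300.0):
        return -np.inf
    model = K * np.sin(2 * np.pi * t / P + phi)
    return -0.5 * np.sum((v - model)**2) / sigma**2

theta = np.array([130.0, 60.0, 0.0])       # start near the 128 d periodogram peak
step = np.array([1.0, 3.0, 0.1])           # fixed Gaussian proposal sigmas
lp = log_post(theta)
samples = []
for _ in range(50_000):
    prop = theta + step * rng.normal(size=3)
    prop[2] %= 2 * np.pi                   # keep the phase in [0, 2*pi)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        theta, lp = prop, lp_prop
    samples.append(theta.copy())

burned = np.array(samples[10_000:])        # discard burn-in
print("posterior means (P, K, phi):", burned.mean(axis=0).round(3))
```

A full Kepler model would replace the sinusoid with the solution of Kepler's equation (period, velocity amplitude, eccentricity, longitude of periastron, periastron passage, systemic velocity), and the parallel-tempering {β} levels in the schematic would remove the need for a good starting point.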

HD 73526 results for a one-planet model. [Figure: ω = −1.75884; chain β = 1; max log₁₀(prior × likelihood) = −38.9007.] A plot of log₁₀(prior × likelihood) versus orbital period P for the β = 1 simulation. The upper envelope of the points is the projection of the joint posterior for the parameters onto the period versus log₁₀(prior × likelihood) plane. The solid vertical lines are located at 1/2, 1, 3/2, 2 and 3 times the 128 day period of the highest peak. The dashed vertical line corresponds to the Earth's orbital period.

Best-fit orbits versus orbital phase. [Figure: three panels of velocity (m s⁻¹) versus orbital phase (plotted over two cycles) for P = 128 days (RMS residual = 11 m s⁻¹), P = 190 days (RMS residual = 15 m s⁻¹) and P = 376 days (RMS residual = 14 m s⁻¹). The mean data error is 10 m s⁻¹.]

Bayesian Spectrum Analysis with Weak Prior Information about the Signal Model. In the early 1990s, Tom Loredo and I attacked the question of how to detect a signal of unknown shape in a time series, and we developed a useful Bayesian solution to this problem. We were initially motivated by the problem of detecting periodic signals in X-ray astronomy data. In this case the time series consists of individual photon arrival times, for which the appropriate sampling distribution is the Poisson distribution. X-ray pulsars exhibit a wide range of pulse shapes.

Gregory-Loredo (GL) method (Poisson case; Ap. J. 398, 1992). To address the detection problem we compute the ratio of the probabilities (odds) of two models, M_per and M_1. Model M_per is a family of periodic models capable of describing a background + periodic signal of arbitrary shape. Each member of the family is a histogram with m bins, where m ≥ 2; M_m represents the periodic model with m bins. Three examples are shown (m = 2, m = 8, m = 15). Model M_1 assumes the data are consistent with a constant event rate A; M_1 is the special case of M_m with m = 1 bin.

GL method continued. The search parameters for one of the m-bin periodic models are the unknown period and phase, plus the m shape parameters. A special feature of the histogram models is that the search in the m shape parameters can be carried out analytically, permitting the method to be computationally tractable. The Bayesian posterior probability for a particular periodic model, M_m, contains a term* which quantifies Occam's razor, penalizing successively more complicated periodic models (increasing m) for their greater complexity. (*This term arises automatically in any Bayesian model comparison; it is not added in some ad hoc fashion.) The calculation thus balances model simplicity with goodness of fit, allowing us to determine both whether there is evidence for a periodic signal, and the optimum number of bins for describing the structure in the data. The Bayesian solution in the Poisson case is very satisfying: in the absence of knowledge about the shape of the signal, the method identifies the most organized (minimum entropy) significant periodic structure in the model parameter space. What structure is significant is determined through the built-in quantified Occam penalties in the calculation.
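The following sketch shows only the combinatorial heart of the method for arrival-time data: fold the events at a trial period into m phase bins and score how organized the folded histogram is via the log inverse multiplicity ln[(n₁! n₂! … n_m!)/n!]. Normalization constants, the phase marginalization, and the full model-comparison machinery of the GL papers are omitted; all numbers are illustrative.

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(6)

# synthetic photon arrival times: constant background plus a periodic burst
T, P_true = 1000.0, 26.5
t_bg = rng.uniform(0.0, T, 300)
t_burst = (rng.integers(0, int(T / P_true), 200)
           + rng.normal(0.5, 0.05, 200)) * P_true   # bursts near phase 0.5
times = np.concatenate([t_bg, t_burst])
times = times[(times >= 0) & (times < T)]

m, n = 12, times.size
periods = np.linspace(20.0, 35.0, 3000)
score = np.empty_like(periods)
for i, P in enumerate(periods):
    phase_bin = ((times % P) / P * m).astype(int)   # bin index 0..m-1
    counts = np.bincount(phase_bin, minlength=m)
    # log inverse multiplicity: large when counts pile up in few bins
    score[i] = gammaln(counts + 1).sum() - gammaln(n + 1)

print(f"most organized folding at P = {periods[np.argmax(score)]:.2f} d "
      f"(true {P_true} d)")
```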

The GGL periodogram (Gregory, Ap. J. 520, pp. 361-375, 1999) is a version of the original Gregory-Loredo algorithm modified for Gaussian uncertainties. Both assume the shape of a periodic signal is unknown but can be modelled with a family of histograms differing in the number of bins. The GGL version allows for a different Gaussian noise σ for each point. [Figure: GGL periodogram (PDF versus period in days) for HD 73526 in three panels (a)-(c) at successively finer period scales; the highest peak is at P = 128 days.]

Periodic radio outbursts from LSI+61°303. [Figure: 2.2 GHz and 8.3 GHz flux density versus Julian Day − 2,440,000.0 from the Green Bank Interferometer NASA-Naval Observatory radio monitoring program, showing outbursts with period P₁ ≈ 26.5 d.] There are 23 years of data since the discovery in 1977; the last 6 years are from the GBI radio monitoring program.

Discovery of a 4.6 year modulation of the phase of the radio outbursts. [Figure: the binary orbit plotted in units of r/r*, with orbital phases 0.1-1.0 marked, for an eccentricity e = 0.82; the outburst timing residuals span ~0.5 of an orbit over the 4.6 yr modulation.]

Conclusions
1. In these examples I have tried to demonstrate the power of the Bayesian approach in the arena of spectral analysis.
2. A Bayesian analysis can sometimes yield orders of magnitude improvement in model parameter estimation, through the inclusion of valuable prior information such as our knowledge of the signal model.
3. For some problems a Bayesian analysis may lead to a familiar statistic. Even in this situation (as we saw with the periodogram), it often leads to new insights concerning the interpretation and generalization of the statistic.
4. As a theory of extended logic, it can be used to address the question of what is the optimal answer to a particular scientific question for a given state of knowledge, in contrast to a numerical recipe approach.
5. As with all useful theoretical advances, we expect to gain powerful new insights. This has already been amply demonstrated in the case of BPT, and it is still early days. One of these important new insights: BPT provides a means of assessing competing theories at the forefront of science by quantifying Occam's razor.