Statistical Applications in the Astronomy Literature II Jogesh Babu. Center for Astrostatistics PennState University, USA

Similar documents
1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Parameter estimation in epoch folding analysis

IACHEC 13, Apr , La Tenuta dei Ciclamini. CalStat WG Report. Vinay Kashyap (CXC/CfA)

The Challenges of Heteroscedastic Measurement Error in Astronomical Data

Statistical Methods for Astronomy

Introduction to astrostatistics

Statistical Data Analysis Stat 3: p-values, parameter estimation

Fall 2012 Analysis of Experimental Measurements B. Eisenstein/rev. S. Errede

Statistical techniques for data analysis in Cosmology

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007

Censoring and truncation in astronomical surveys. Eric Feigelson (Astronomy & Astrophysics, Penn State)

Two Statistical Problems in X-ray Astronomy

Hypothesis testing:power, test statistic CMS:

arxiv:cond-mat/ v1 [cond-mat.stat-mech] 25 Feb 2003

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Space Telescope Science Institute statistics mini-course. October Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses

Statistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements.

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Model Fitting, Bootstrap, & Model Selection

Some Statistics. V. Lindberg. May 16, 2007

Linear regression issues in astronomy. G. Jogesh Babu Center for Astrostatistics

Probability distributions. Probability Distribution Functions. Probability distributions (contd.) Binomial distribution

Discussion of Maximization by Parts in Likelihood Inference

STATISTICS OF OBSERVATIONS & SAMPLING THEORY. Parent Distributions

The γ-ray Milky Way above 10 GeV: Distinguishing Sources from Diffuse Emission

Period. End of Story?

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University babu.

If we want to analyze experimental or simulated data we might encounter the following tasks:

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Recall the Basics of Hypothesis Testing

Monte Carlo Simulations

Model Fitting and Model Selection. Model Fitting in Astronomy. Model Selection in Astronomy.

Practical Statistics

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

Probability Distributions

On characterizing the variability properties of X-ray light curves from active galaxies

Fourier analysis of AGN light curves: practicalities, problems and solutions. Iossif Papadakis

Data modelling Parameter estimation

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

The Spectral Density Estimation of Stationary Time Series with Missing Data

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

FOURIER TRANSFORMS. f n = 1 N. ie. cyclic. dw = F k 2 = N 1. See Brault & White 1971, A&A, , for a tutorial introduction.

Statistical Methods in Particle Physics

Fully Bayesian Analysis of Low-Count Astronomical Images

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC

Using Bayes Factors for Model Selection in High-Energy Astrophysics

Composite Hypotheses and Generalized Likelihood Ratio Tests

The Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13

Physics 509: Bootstrap and Robust Parameter Estimation

Primer on statistics:

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring

Testing Algebraic Hypotheses

STATISTICS SYLLABUS UNIT I

Chapter 11. Hypothesis Testing (II)

Cosmic Statistics of Statistics: N-point Correlations

Statistics for Particle Physics. Kyle Cranmer. New York University. Kyle Cranmer (NYU) CERN Academic Training, Feb 2-5, 2009

Maximum Likelihood Estimation. only training data is available to design a classifier

Robustness and Distribution Assumptions

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary

Non-Imaging Data Analysis

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Hypothesis Testing, Bayes Theorem, & Parameter Estimation

Statistical Methods for Astronomy

Physics 403 Probability Distributions II: More Properties of PDFs and PMFs

Declarative Statistics

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

Bayesian Methods for Machine Learning

Introduction to Bayesian Methods

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Loglikelihood and Confidence Intervals

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

arxiv: v2 [astro-ph.co] 21 Dec 2015

Why Try Bayesian Methods? (Lecture 5)

Statistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart

Lectures in AstroStatistics: Topics in Machine Learning for Astronomers

1 Random walks and data

Miscellany : Long Run Behavior of Bayesian Methods; Bayesian Experimental Design (Lecture 4)

James Theiler, NIS-2 Jeff Bloch, NIS-2

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

EE/CpE 345. Modeling and Simulation. Fall Class 10 November 18, 2002

Summer School in Statistics for Astronomers & Physicists June 5-10, 2005

Adjusted Empirical Likelihood for Long-memory Time Series Models

EE/CpE 345. Modeling and Simulation. Fall Class 9

Algorithms for Higher Order Spatial Statistics

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

X-ray spectral Analysis

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

REGULARIZED ESTIMATION OF LINE SPECTRA FROM IRREGULARLY SAMPLED ASTROPHYSICAL DATA. Sébastien Bourguignon, Hervé Carfantan, Loïc Jahan

15.6 Confidence Limits on Estimated Model Parameters

Zero-Inflated Models in Statistical Process Control

Second Workshop, Third Summary

Introduction to Statistics and Error Analysis II

10. Composite Hypothesis Testing. ECE 830, Spring 2014

Transcription:

Statistical Applications in the Astronomy Literature II Jogesh Babu Center for Astrostatistics PennState University, USA 1

The likelihood ratio test (LRT) and the related F-test Protassov et al. (2002, ApJ, 571, 545) This paper by an astrostatistical group discusses how astronomers often misuse the likelihood ratio test (LRT) (known in X-ray astronomy as Cash's C-statistic) for comparing two models. The standard chi-squared probabilities are incorrect to establish whether a faint additional feature (e.g. spectral line) is present within a simpler (e.g. continuum) model. Known to statisticians as Nested models

The LRT and the F-test, popularized in astrophysics by Eadie and coworkers in 1971 Bevington in 1969 Lampton, Margon & Bowyer in 1976 Cash in 1979 Avni in 1978 do not (even asymptotically) adhere to their nominal χ 2 and F-distributions in many statistical tests, thereby casting many marginal line or source detections and nondetections into doubt.

There are numerous cases of the inappropriate use of the LRT and similar tests in the literature, bringing substantive scientific results into question. It is common practice to use the LRT or the F-test to detect a line in a spectral model or a source above background despite the lack of certain required regularity conditions. In these and other settings that involve testing a hypothesis that is on the boundary of the parameter space, contrary to common practice, the nominal χ 2 distribution for the LRT or the F-distribution for the F- test should not be used. In this paper, the authors characterize an important class of problems in which the LRT and the F-test fail and illustrate this nonstandard behavior.

To understand the difficulty with using these tests in this setting, begin with a formal statement of the asymptotic result that underlies the LRT. Suppose( x 1 ;... ; x n ) is an independent sample (e.g., measured counts per PHA [pulse height analyzer] channel or counts per imaging pixel) from the common probability distribution f(x) =f(x θ), with parameters θ=(θ 1 ;... ; θ p ). We denote the the numerator, the θ Τ terms represent parameters that are not varied but held at their true values, i.e., the values assumed under the null model. Under some regularity conditions, if (θ 1 ;... ; θ p ) actually equals (θ Τ 1 ;... ; θτ p ), the distribution of the LRT statistic converges to a Chi-square distribution with q degrees of freedom as the sample size n increases without bound. The degrees of freedom is the difference between the number of free parameters specified.

STATISTICS: HANDLE WITH Although the LRT is a valuable statistical tool with many astrophysical applications, it is not a universal solution for model selection. When testing a model on the boundary of the parameter space (e.g., testing for a spectral line), the (asymptotic) distribution of the LRT statistic is unknown. Using this LRT and its nominal Chi-square distribution can lead to unpredictable results (e.g., false positive rates varying from 1.5% to 31.5% in the nominal 5% false positive rate test in Monte Carlo studies). There is no replacement for an appreciation of the subtleties involved in any statistical method. Practitioners of statistics are forever searching for statistical black boxes. For sophisticated models that are common in spectral, spatial, as well as other applications in astrophysics, such black boxes simply do not exist. The highly hierarchical structures inherent in the data must be, at some level, reflected in the statistical model.

Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data Detection of periodic signal hidden in noise This paper studies the reliability and efficiency of detection with the periodogram, when the observation times are unevenly spaced. A modification of the classical definition of periodogram is suggested. With this modification, periodogram 7

A physical variable X is measured at a set of times tj, the resulting time series data {X(tj), j= 1,2,..., N}, are sum of signal and random observational errors (noise): Xj= X(tj)= Xs(tj)+ R(tj) The signal Xs is taken to be strictly periodic, R (tj) are i.i.d. normal with mean 0 & variance 1.

The Periodogram The basic tool of spectral analysis is the discrete Fourier transform (DFT) FTX(ω) = N j=1 X(tj) exp(-iωtj) The classical periodogram is PX(ω) = (1/N) FTX(ω) 2 = (1/N)[( j X(tj) cos ωtj) 2 +( j X(tj) sin ωtj) 2 ] (1) This expression is traditionally evaluated at M=N/2. 10

If X contains a sinusoidal component of frequency ω0 then P would be large at and near ω= ω0 If the observational times are evenly spaced, say tj=j, PX(ω) = (1/N) j X(tj) exp(-ij ω ) 2 (2) Proposed version ( j X(tj) cos ωtj) 2 ( j X(tj) sin ωtj) 2 PX(ω) = (1/2) --------------------- + ----------------- j cos 2 ωtj This does not have exponential distribution unless, j sin 2 ωtj

Invariance to time translation is a useful property possessed by the classical periodogram, i.e., it is unchanged if tj is changed to T+tj for every j. There are many ways to restore invariance. [ j X(tj) cos ω(tj-τ)] 2 [ j X(tj) sin ω (tj-τ)] 2 PX(ω) = (1/2) ------------------------ + ------------------------, j cos 2 ω (tj-τ) (tj-τ) where j sin 2ωtj τ= (1/2ω) arctan -------------- j cos 2ωtj j sin 2 ω

Unevenly-sampled signals: a general formalism of the Lomb-Scargle periodogram R. Vio1, P. Andreani, and A. Biggs A&A preprint doi http://dx.doi.org10.1051/0004-6361/201014079 The authors suggest methods to handle data with non-gaussian errors.

These issues were discussed in Babu & Feigelson (2006 ASPC, 351, 127). Hou et al. (2009 ApJ 702, 1199) This is a nice and straightforward study of nonparametric goodness-of-fit tests in the context of a problem in extragalactic astronomy, It shows that the poorly-known Anderson-Darling test performs better than the popular Kolmogorov-Smirnov test. However, they still make an error by not noting that the tabulated probabilities are incorrect for models with parameters derived from the same dataset.

Hinshaw et al. (2003 ApJ S, 148, 35 & Genovese et al. (2004) http://projecteuclid.org/dpubs? service=ui&version=1.0&verb=display&handle=euclid.ss/ 1105714165 Here are two contrasting ways to fit functions derived from astrophysical theory to a dataset. The problem is estimation of the LambdaCDM cosmological model parameters from WMAP cosmic microwave background radiation fluctuations. Hinshaw et al. (Appendix A) give a classical maximum likelihood estimation approach with error analysis from the Fisher Information Matrix. Genovese et al. are statisticians who use new techniques of `semiparametric regression' to estimate the curve (and its uncertainty) without assuming a cosmological model. A brief discussion, of when parametric vs. semi-parametric modeling is appropriate, is given by Feigelson (2007) http:// adsabs.harvard.edu/abs/2007aspc..371..280f).

Hundreds of astronomical papers seek the slope of a power-law distribution of an astronomical properties in a sample by binning the data and computing the least-squares regression line. This is done for cosmic ray energy spectra, stellar Initial Mass Functions, interstellar molecular cloud clump functions, galaxy luminosity function, and so forth. Yet, the method gives both biased and inaccurate results, even for large samples. A very simple maximum likelihood estimator has been known since 1957 with excellent performance that does not require binning. Here are a few papers on the issue. http://adsabs.harvard.edu/abs/1970apj...162..405c http://adsabs.harvard.edu/abs/2004epjb...41..255g http://adsabs.harvard.edu/abs/2007epjb...58..167b

http://en.wikipedia.org/wiki/ Binomial_proportion_confidence_interval http://www.projecteuclid.org/dpubs? service=ui&version=1.0&verb=display&handle=euclid.ss/ 1009213286 http://adsabs.harvard.edu/abs/2006apj...652..610p Astronomers sometimes want to compute the ratios of two Poisson-distributed quantities. This might be a signal-to-noise ratio or a hardness ratio in fields such as X-ray or gamma-ray astronomy. The ratio of two Poissons (known in statistics as the `binomial proportion problem') is surprisingly tricky; for example, the maximum likelihood estimator is unstable and chaotic. Several analytical solutions are discussed but none is obviously best. When background is subtracted from the numerator and/or denominator, the problem requires numerical