Period. End of Story?

Similar documents
Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects

Lecture 10: F -Tests, ANOVA and R 2

lightcurve Data Processing program v1.0

Take the measurement of a person's height as an example. Assuming that her height has been determined to be 5' 8", how accurate is our result?

TIME SERIES ANALYSIS

appstats27.notebook April 06, 2017

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons

Descriptive Statistics (And a little bit on rounding and significant digits)

ORF 245 Fundamentals of Statistics Practice Final Exam

Chapter 1: January 26 January 30

Parameter estimation in epoch folding analysis

Solving simple equations, a review/primer

THE SIMPLE PROOF OF GOLDBACH'S CONJECTURE. by Miles Mathis

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

LECTURE 15: SIMPLE LINEAR REGRESSION I

A=randn(500,100); mu=mean(a); sigma_a=std(a); std_a=sigma_a/sqrt(500); [std(mu) mean(std_a)] % compare standard deviation of means % vs standard error

Up/down-sampling & interpolation Centre for Doctoral Training in Healthcare Innovation

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Chapter 27 Summary Inferences for Regression

POL 681 Lecture Notes: Statistical Interactions

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

False Alarm Probability based on bootstrap and extreme-value methods for periodogram peaks

Sampler of Interdisciplinary Measurement Error and Complex Data Problems

Algebra Year 10. Language

Algebra Exam. Solutions and Grading Guide

My data doesn t look like that..

Statistical Applications in the Astronomy Literature II Jogesh Babu. Center for Astrostatistics PennState University, USA

Equivalent Width Abundance Analysis In Moog

Signal Modeling, Statistical Inference and Data Mining in Astrophysics

Homework #2 Due Monday, April 18, 2012

Introduction to Random Effects of Time and Model Estimation

Chapter 14. Simple Linear Regression Preliminary Remarks The Simple Linear Regression Model

BRIDGE CIRCUITS EXPERIMENT 5: DC AND AC BRIDGE CIRCUITS 10/2/13

Least Squares. Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Winter UCSD

Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

6.867 Machine learning

ISQS 5349 Spring 2013 Final Exam

Sums of Squares (FNS 195-S) Fall 2014

Conditional probabilities and graphical models

LINEAR MODELS IN STATISTICAL SIGNAL PROCESSING

MITOCW 6. Standing Waves Part I

Section 3: Simple Linear Regression

Semester 2, 2015/2016

Please bring the task to your first physics lesson and hand it to the teacher.

Introduction to Algebra: The First Week

Math (P)Review Part I:

Data Analysis and Statistical Methods Statistics 651

Take the Anxiety Out of Word Problems

Finite Mathematics : A Business Approach

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Algebra Review. Finding Zeros (Roots) of Quadratics, Cubics, and Quartics. Kasten, Algebra 2. Algebra Review

ECON The Simple Regression Model

STA121: Applied Regression Analysis

Econ 836 Final Exam. 2 w N 2 u N 2. 2 v N

Generalized Linear Models for Non-Normal Data

Chapter 26: Comparing Counts (Chi Square)

A Re-Introduction to General Linear Models (GLM)

UCSD SIO221c: (Gille) 1

Introduction to Within-Person Analysis and RM ANOVA

CS 5014: Research Methods in Computer Science

Univariate analysis. Simple and Multiple Regression. Univariate analysis. Simple Regression How best to summarise the data?

Linear Systems Theory Handout

Regression Analysis: Basic Concepts

PS2: Two Variable Statistics

Supplemental Resource: Brain and Cognitive Sciences Statistics & Visualization for Data Analysis & Inference January (IAP) 2009

6.435, System Identification

Linear Regression. Chapter 3

Introduction to Logarithms What Is a Logarithm? W. L. Culbertson, Ph.D.

In economics, the amount of a good x demanded is a function of the price of that good. In other words,

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Regression. Estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables X and Y.

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

IB Paper 6: Signal and Data Analysis

The Simple Linear Regression Model

Simple Linear Regression Using Ordinary Least Squares

The Derivative of a Function

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

arxiv: v1 [astro-ph.sr] 6 Jul 2013

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Time Series. Anthony Davison. c

WEIGHTED LEAST SQUARES. Model Assumptions for Weighted Least Squares: Recall: We can fit least squares estimates just assuming a linear mean function.

Statistical View of Least Squares

Bridging the gap between GCSE and A level mathematics

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

Chapter 1 Review of Equations and Inequalities

Confidence intervals CE 311S

MA 1128: Lecture 08 03/02/2018. Linear Equations from Graphs And Linear Inequalities

R 2 and F -Tests and ANOVA

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Investigating the use of the Lomb-Scargle Periodogram for Heart Rate Variability Quantification

Should the Residuals be Normal?

L6: Regression II. JJ Chen. July 2, 2015

NCSS Statistical Software. Harmonic Regression. This section provides the technical details of the model that is fit by this procedure.

Point-to-point response to reviewers comments

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Unstable Oscillations!

Transcription:

Period. End of Story? Twice Around With Period Detection: Updated Algorithms Eric L. Michelsen UCSD CASS Journal Club, January 9, 2015

Can you read this? 36 pt This is big stuff 24 pt This makes a point 20 pt This is very legible 18 pt This may be the minimum usable 16 pt For reference only 12 pt Are you kidding me? 10 pt Don t even try it January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 2

Papers Zechmeister, M. and M. K urster, The generalised Lomb- Scargle periodogram, Astronomy & Astrophysics, January 20, 2009 Swarzenberg-Czerny, The distribution of empirical periodograms: Lomb Scargle and PDM spectra, A. Schwarzenberg-Czerny, Monthly Notices of the Royal Astronomical Society, 301, 831 840 (1998), and several similar. Some of my own work: http://physics.ucsd.edu/~emichels/, Funky Mathematical Physics Concepts, 2014. January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 3

My one sentence Detecting periods is harder than you think 70 s and 80 s algorithms are out of date L-S, PDM, EF, (DFT was already unusable) Seriously suboptimal Standard formulas incorrect Updates exist: join the 21 st century Fit for DC Use your uncertainties Use the right statistic Accomodate your non-normal residuals January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 4

The task: finding a periodic component in a sequenc of measurements It may be a time series Light curve: intensity vs. time Or a function of space Temperature vs. position intensity time January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 5

Four common problems with period finding algorithms 1. Subtracting DC (instead of fitting) 2. Ignoring individual uncertainties 3. Computing the wrong statistic Most commonly, using s A2 /s T2 instead of s A2 /s E2 as the F- statistic 4. Assuming gaussian residuals January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 6

0. Background 1. Subtracting DC (instead of fitting) 2. Ignoring individual uncertainties 3. Computing the wrong statistic 4. Assuming gaussian residuals

Use your words Uncertainty: the 1σ unknowable noise in your measurements DC: the constant a 0 in the linear model yt () b bf() t... 0 1 1 Residuals: what s left after subtracting the fit Errors: we don t use that word It s too vague, and therefore confusing January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 8

Periodicity: The gopher in the garden Sometimes, you know the gopher is there, even if you can t see him You can see his residue But you can t tell what color his coat is Periodicity is the same way You may be confident there is some periodicity, even if you can t describe it January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 9

Linear fit (aka linear regression ) Sinusoidal fits are linear regression But Lomb-Scargle uses nonlinear algorithm PDM is linear regression Analysis of Variance (ANOVA) is linear regression The master equation of linear regression: the sum-of-squares identity: SST SSA SSE total variation modeled variation unmodeled variation + noise Absolute mathematical identity for linear least-squares fit January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 10

Gaussian noise special case For pure, gaussian noise, SSA and SSE also provide independent unbiased estimates of the population noise variance n n n 2 2 2 yi y ymod, i y yi ymod, i i 1 i 1 i 1 SST SSA SSE Phase Dispersion Minimization (PDM) is a single-treatment p-level ANOVA Deprecated Lomb-Scargle is a 2-parameter (p = 2) linear function fit January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 11

Notation confusion Some references use k number of parameters Some use p = k + 1 number of parameters I use this This affects how we write the dof But not what the dof actually is January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 12

Fit for DC 0. Background 1. Fit for DC 2. Ignoring individual uncertainties 3. Computing the wrong statistic 4. Assuming gaussian residuals

DC matters (for sinusoids) DC? Can t I just subtract it and forget it? No Simultaneous fit (DC and signal) is better than one-at-a-time with subtraction You would never fit a straight line without fitting for DC Why would you do it with a cosine? y b x 1 best-fit is good best-fit is heavily suppressed best-fit slope = 0 This is what Scargle-1982 does! January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 14

DC doesn t matter (for PDM) Not a theoretical issue with PDM Round-off error usually insignificant I disagree with Swarzenberg-Czerny on this The fit-functions subsume a DC component I.e., DC is redundant with the other functions folding period fit functions are orthogonal time January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 15

Uncertainty Matters 0. Background 1. Fit for DC 2. Use your uncertainties 3. Computing the wrong statistic 4. Assuming gaussian residuals

Uncertainty matters Just putting weights in the obvious places doesn t always work Some things aren t obvious E.g., the correct weighted sample variance is complicated: n 2 w y y i i n n 2 i 1 2 1 i, 2 i V1 V2/ V1 i 1 i 1 s where V w V w Doesn t work for (deprecated) Lomb-Scargle, either Uses a non-linear trick to compute a linear fit I saw two papers that used incorrect (but obvious ) formulas January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 17

Unweighted vs. Weighted unweighted weighted Adam s data uncertainties ~.004 -.008 PDM F-statistic frequency (cpd) January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 18

Transformation from heteroskedastic to equivalent homoskedastic measurements Scale both measurements and predictors by 1/uncertainty Allows using standard (homoskedastic) library routines Doesn t work for nonlinear algorithms E.g., Lomb-Scargle doesn t work τ nonlinear in y i OK, cuz L-S is dead, anyway 2 (y i, σ) 1 (y i, u i ) 1 2 3 January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 19

Use the Right Statistic What the F? 0. Background 1. Fit for DC 2. Use your uncertainties 3. Use the proper F statistic 4. Assuming gaussian residuals

Correct and stable The PDF for the traditional Lomb- Scargle and PDM detection parameters is a beta( ) Not F! SSA/ dof SST / dof But... beta( ) is numerically unstable near the critical values Right where we need it to be accurate I recommend using the correct F statistic, which is wellbehaved [Schwarzenberg-Czerny] SSA/ dof f Pure noise f 1 SSE / dof Not the incorrect F statistic in [Scargle 1982] Shuffle simulations hide some of these common errors But it s best to do it right, and also do shuffle simulations A T β A E similar to 2 SSA r SST f January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 21

Lomb-Scargle was a good horse while it rode... But it s got 3½ broken legs It doesn t include DC: leaves extra noise It doesn t handle uncertainties Scargle recommended the wrong detection statistic January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 22

But sinusoidal periodograms live on The concept is perfectly valid The periodogram concepts of my previous journal club are still relevant Zechmeister s Modified Lomb-Scargle is just a weighted linear fit, straight out of the book E.g., Bevington Crucially, it includes DC and uncertainties Lomb-Scargle is dead I call it CSD: Cosine Sine DC Cosine + sine is usually not convenient People want amplitude and phase, not cosine & sine Especially, people want the uncertainty in phase Can approximate U(phase) from U(cosine) and U(sine) January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 23

Cycles of dependence: pdf xy, pdf ( x)pdf ( y) xy, x y spectral window function Relevant to sinusoidal detection: CSD Or DFT, and deprecated L-S Only indirectly connected to PDM and Epoch-Folding 1 0.8 correlation r 2 0.6 0.4 0.2 0 2 4 6 8 10 12 Frequency (cpd) January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 24

Escape From Normalcy 0. Background 1. Fit for DC 2. Use your uncertainties 3. Use the proper F statistic 4. Shuffle for your critical values

Your residuals aren t normal It s not personal Nobody s are But it s OK Your residuals are fine, just the way they are The critical values must be determined from your actual data The Astronomy Shuffle January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 26

Comparison of gaussian vs. uniform residuals Non-gaussian squared residuals are not χ 2 Using gaussian-based statistics will mask otherwise detectable signals uniform residual PDF sum of squares of 50 RVs gaussian residual January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 27

The Astronomy Shuffle The purpose of shuffling is to determine the critical values for declaring a detection from your actual data You set your critical values (thresholds) according to your chosen p FA (aka ) The purpose is not to examine the spectral window function January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 28

The shuffle process (1) Randomize the order of your measurements But not predictors Uncertainties that are independent of measurements remain fixed Uncertainties tied to measurements (e.g., photon counts) get shuffled with the data Combinations should be separated into independent & dependent parts Removes any periodicity from the data Be careful not to use the look for an opening algorithm Fast: O(n) operations Repeat the following 1,000 to 10,000 times: Compute a periodogram from randomized data Save the max detection parameter, D max, in a list Different frequencies are dependent, so we can only count on one independent frequency per periodogram (slow) Then... : 1.4 0.8 1.6 0.7 1.2 : January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 29

The shuffle process (2) Sort list of D max Find your D critical from p FA using the sorted list Shuffling covers-up a lot of statistical processing errors But it is best to use proper F-ish statistics, and shuffle simulations 99% at 5 for gaussian noise peak detection level 14 12 10 8 6 4 2 Adam s Data 99% at 8.4 < 90% p FA = 1% 99% 8.5 8.3 7.9 : 1.2 0.8 0.7 0.4 0 0 200 400 600 800 1000 January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 30

Future Directions

Future directions Schwarzenberg-Czerny has a paper on complex orthogonal polynomials as basis functions Similar to fitting for the first few components of a Fourier Series Allows for uncertainties and DC You choose how many components you want (~3 or 4) Small computational cost: similar to CSD January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 32

Summary Linear fits: Use your uncertainties 3-parameter CSD fit (cosine, sine, DC ) replaces Lomb-Scargle Fit for DC Stabilizes and speeds search for best-fit sinusoid over freq Also provides real power, real phase Lomb-Scargle is dead Phase Dispersion Minimization is a non-sinusoidal period-finding algorithm So is Epoch Folding Use the correct F-statistic: f (SSA/dof A )/(SSE/dof E ) Not the wrong one in [Scargle 1982] and other PDM papers Use shuffle simulations for your critical values Other tips: Use simultaneous fits for multiple frequencies Don t subtract one frequency at a time (From last time) Know the difference between aliasing, window function correlation, and sidebands These speak to the underlying physics January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 33

Thank you to... Eve Armstrong for helpful comments on my practice presentation January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 34

PDM vs. Epoch Folding vs. Lomb-Scargle PDM and EF respond at all subharmonics CSD doesn t PDM and EF respond to any waveform CSD responds to sinusoidal components Most realistic periodic signals have strong sinusoidal components Exceptions include planet transits January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 35

My recommendations for period finding of smooth-signal 1. Make dense CSD periodogram n f ~= 2n, but max of 4000 You have to shuffle to determine critical values The pure noise critical values are way off 2. Fit and subtract a sinusoid near the CSD peak f 1 3. From residuals, make another CSD periodogram Note the peak, f 2 4. From original data, simultaneously fit two sinusoids starting from f 1 and f 2 Simultaneous regression is better than step-wise 5. From these residuals, look at the periodogram January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 36

My recommendation for finding a nonsmooth signal (e.g., transit) Make dense PDM periodogram n f ~= 2n, but max of 4000 Look at the folded (aka phased ) signal Do any patterns catch your eye? January 9, 2015 Eric L. Michelsen, UCSD CASS Journal Club 37