Parameter estimation in epoch folding analysis

Similar documents
Statistical Applications in the Astronomy Literature II Jogesh Babu. Center for Astrostatistics PennState University, USA

Basic Structure of the High-energy Observations. Preparation of the Data for the Analysis. Methods to Examine the Periodic Signal

SUPPLEMENTARY INFORMATION

Unstable Oscillations!

Variable and Periodic Signals in Astronomy

TIME SERIES ANALYSIS

Variable and Periodic Signals in Astronomy

Fourier analysis of AGN light curves: practicalities, problems and solutions. Iossif Papadakis

Period. End of Story?

Periodogram of a sinusoid + spike Single high value is sum of cosine curves all in phase at time t 0 :

Lecture 4 - Spectral Estimation

A6523 Modeling, Inference, and Mining Jim Cordes, Cornell University. False Positives in Fourier Spectra. For N = DFT length: Lecture 5 Reading

Frequency analysis of five short periodic pulsators

On characterizing the variability properties of X-ray light curves from active galaxies

Statistical Data Analysis Stat 3: p-values, parameter estimation

SEISMIC WAVE PROPAGATION. Lecture 2: Fourier Analysis

Results of the ESO-SEST Key Programme: CO in the Magellanic Clouds. V. Further CO observations of the Small Magellanic Cloud

1 Statistics Aneta Siemiginowska a chapter for X-ray Astronomy Handbook October 2008

Tutorial Sheet #2 discrete vs. continuous functions, periodicity, sampling

ANALOG AND DIGITAL SIGNAL PROCESSING ADSP - Chapter 8

Correlator I. Basics. Chapter Introduction. 8.2 Digitization Sampling. D. Anish Roshi

PAijpam.eu ADAPTIVE K-S TESTS FOR WHITE NOISE IN THE FREQUENCY DOMAIN Hossein Arsham University of Baltimore Baltimore, MD 21201, USA

Notes on Noncoherent Integration Gain

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011

The Same Physics Underlying SGRs, AXPs and Radio Pulsars

ON THE SHORT-TERM ORBITAL PERIOD MODULATION OF Y LEONIS

arxiv:astro-ph/ v1 17 Oct 2001

Do red giants have short mode lifetimes?

We will of course continue using the discrete form of the power spectrum described via calculation of α and β.

Degrees-of-freedom estimation in the Student-t noise model.

THE FOURIER TRANSFORM (Fourier series for a function whose period is very, very long) Reading: Main 11.3

Statistical Comparison and Improvement of Methods for Combining Random and Harmonic Loads

Signal Modeling, Statistical Inference and Data Mining in Astrophysics

NOTES AND CORRESPONDENCE. A Quantitative Estimate of the Effect of Aliasing in Climatological Time Series

Fit assessment for nonlinear model. Application to astrometric binary orbits

A METHOD FOR NONLINEAR SYSTEM CLASSIFICATION IN THE TIME-FREQUENCY PLANE IN PRESENCE OF FRACTAL NOISE. Lorenzo Galleani, Letizia Lo Presti

Pulsation of the δ Scuti star VZ Cancri

arxiv:astro-ph/ v1 15 Dec 2006

The Spectral Density Estimation of Stationary Time Series with Missing Data

arxiv:astro-ph/ v1 16 Dec 1999

arxiv: v1 [astro-ph.sr] 6 Jul 2013

Periodogram of a sinusoid + spike Single high value is sum of cosine curves all in phase at time t 0 :

p(d θ ) l(θ ) 1.2 x x x

arxiv:astro-ph/ v1 2 Nov 2000

Design Criteria for the Quadratically Interpolated FFT Method (I): Bias due to Interpolation

Time-Delay Estimation *

arxiv: v1 [astro-ph] 22 Dec 2008

PROGRAM WWZ: WAVELET ANALYSIS OF ASTRONOMICAL SIGNALS WITH IRREGULARLY SPACED ARGUMENTS

Distortion Analysis T

Physics 215b: Problem Set 5

Frequency estimation uncertainties of periodic signals

V551 Aur, an oea binary with g-mode pulsations?

Practical Statistics

An extensive photometric study of the recently discovered intermediate polar V647 Aur (1RXS J )

Noncoherent Integration Gain, and its Approximation Mark A. Richards. June 9, Signal-to-Noise Ratio and Integration in Radar Signal Processing

arxiv: v1 [astro-ph.sr] 2 Dec 2015

Mass hierarchy determination in reactor antineutrino experiments at intermediate distances. Promises and challenges. Petr Vogel, Caltech

Correlation spectroscopy

Lecture 9. PMTs and Laser Noise. Lecture 9. Photon Counting. Photomultiplier Tubes (PMTs) Laser Phase Noise. Relative Intensity

Estimate of solar radius from f-mode frequencies

Randomly Modulated Periodic Signals

1 Data Arrays and Decompositions

Analysing the Hipparcos epoch photometry of λ Bootis stars

Error Reporting Recommendations: A Report of the Standards and Criteria Committee

Chapter 4 Discrete Fourier Transform (DFT) And Signal Spectrum

3.1.1 Lightcurve, colour-colour and hardness intensity diagram

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

Helioseismology: GONG/BiSON/SoHO

arxiv:astro-ph/ v1 6 Jun 2002

Continuous Fourier transform of a Gaussian Function

arxiv:astro-ph/ v2 7 Apr 1999

Some Expectations of a Non-Central Chi-Square Distribution With an Even Number of Degrees of Freedom

FOURIER TRANSFORMS. f n = 1 N. ie. cyclic. dw = F k 2 = N 1. See Brault & White 1971, A&A, , for a tutorial introduction.

The search for continuous gravitational waves: analyses from LIGO s second science run

Statistical Methods in Particle Physics

Composite Hypotheses and Generalized Likelihood Ratio Tests

2 u 1-D: 3-D: x + 2 u

Extra RR Lyrae Stars in the Original Kepler Field of View

arxiv: v1 [astro-ph] 15 Jun 2007

Physics 380H - Wave Theory Homework #10 - Solutions Fall 2004 Due 12:01 PM, Monday 2004/11/29

Geophysical Data Analysis: Discrete Inverse Theory

False Alarm Probability based on bootstrap and extreme-value methods for periodogram peaks

10. Composite Hypothesis Testing. ECE 830, Spring 2014

An Analytic Iterative Approach to Solving the Time-Independent Schrödinger Equation CHAD JUNKERMEIER, MARK TRANSTRUM, MANUEL BERRONDO

RAD229: Midterm Exam 2015/2016 October 19, Minutes. Please do not proceed to the next page until the exam begins.

A BINARY STAR WITH A SCUTI COMPONENT: AB CASSIOPEIAE E. Soydugan, 1 O. Dem_ircan, 1 M. C. Akan, 2 and F. Soydugan 1

arxiv: v2 [astro-ph.sr] 16 May 2013

EE301 Signals and Systems In-Class Exam Exam 3 Thursday, Apr. 20, Cover Sheet

PDV Uncertainty Estimation & Method Comparisons

Time and Spatial Series and Transforms

5.74 Introductory Quantum Mechanics II

ENSC327 Communications Systems 2: Fourier Representations. Jie Liang School of Engineering Science Simon Fraser University

16 SUPERPOSITION & STANDING WAVES

9.1 Years of All-Sky Hard X-ray Monitoring with BATSE

AUTOMATIC MORPHOLOGICAL CLASSIFICATION OF GALAXIES. 1. Introduction

Analysis of Variance

Region 16 Board of Education. Precalculus Curriculum

CHAPTER 4 FOURIER SERIES S A B A R I N A I S M A I L

LIMITS AND DERIVATIVES

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2013 Lecture 4 For next week be sure to have read Chapter 5 of

Transcription:

ASTRONOMY & ASTROPHYSICS MAY II 1996, PAGE 197 SUPPLEMENT SERIES Astron. Astrophys. Suppl. Ser. 117, 197-21 (1996) Parameter estimation in epoch folding analysis S. Larsson Stockholm Observatory, S-13336 Saltsjöbaden, Sweden Internet: stefan@astro.su.se Received January 16; accepted October 3, 1995 Abstract. We describe a procedure that we have implemented to use epoch folding to estimate pulse period, shape, amplitude and phase (with uncertainties) for coherent oscillations in time series data. We improve on traditional techniques by fitting the χ 2 as a function of test period with a response function which takes into account both data sampling and the oscillation pulse shape. It is shown that the epoch folding procedure makes optimum usage of the full pulse shape information, or equivalently all its Fourier components, in the period determination. An analytic expression for the period error is also given for this general, non-sinusoidal, case. The error estimate, which is equivalent to that for a least-squares fit of a set of harmonically related Fourier components, is verified by Monte Carlo simulations. Key words: methods: data analysis 1. Introduction The problem of estimating periodicities in time series data is common in astronomy as well as in many other sciences. A large variety of methods, with different statistical properties, are in use. The choice of the method, such as Fourier transform, epoch folding, Rayleigh folding, etc., will depend on signal-to-noise ratio, data length, evenness of sampling and on the character of the signal to be analyzed. In this paper we will consider epoch folding, which is often used in cases where one wants to search for a coherent signal in large amounts of data with low or moderate signal-to-noise ratio. One advantage of epoch folding is that it is more easily applied to cases of nonevenly sampled data than e.g. Fourier methods. Investigations of the statistical properties of epoch folding have mainly concerned the estimate of detection significances (Schwarzenberg-Czerny 1989 and Davies 199, 1991). In this paper we describe a procedure by which we use epoch folding to estimate pulse period, shape, amplitude and phase (with uncertainties) for a single coherent oscillation in time series data. Our two main improvements of the traditional epoch folding techniques are: 1. We estimate the oscillation period by fitting a χ 2 response function which takes into account both data sampling and oscillation pulse shape. 2. We give an analytic expression for the period uncertainty in the general, non-sinusoidal case, and show that our method gives errors consistent with basic statistical limits for least-squares fits to the time series. For non-sinusoidal oscillations the epoch folding automatically adds together the signal of all harmonic components, so that the period determination makes maximum use of the available information. Our parameter estimation procedure works well also for non-evenly sampled data, and as long as the sampling is reasonably even the error estimates will still provide quite reliable uncertainty limits for the oscillation parameters. 2. Method In most applications of epoch folding, χ 2 (over the folded pulse) is calculated for a range of test periods and the oscillation period is taken as the period at the largest χ 2 value or at the χ 2 maximum estimated from a polynomial fit near the peak. A suggestion by Leahy (1987) to estimate period and amplitude by fitting an analytic χ 2 function to χ 2 (P test ) was the starting point for the development of our estimation procedure. In summary the analysis contain the following steps which will be further elaborated below. 1. Calculate χ 2 as a function of test folding period. 2. Calculate the χ 2 response function by sampling a synthetic pulse at the same times as the data. Then fit this χ 2 function to that of the data to estimate pulse period and amplitude. 3. Fold the data with the best fit period and make a Fourier decomposition of the pulse profile. 4. Using the Fourier pulse determination, go back to step 2. It is in most cases sufficient to rerun steps 2 & 3 only once or twice.

198 S. Larsson: Parameter estimation in epoch folding analysis 2.1. Calculatingχ 2 In epoch folding the data is folded modulo a test period and coadded into N b phase bins. We then calculate χ 2 = N b i=1 (x i x) 2, (1) so that χ 2 N b 1 if the scatter of the bin points over the pulse cycle is as expected for Gaussian noise with standard deviations σ i.ahighχ 2 value ( N b 1) will signal the presence of a periodicity for which the significance can be estimated from the χ 2 distribution function for N b 1 degrees of freedom. We want to make two comments on our definition and use of χ 2 from Eq. (1). First, since our aim is to determine the parameters for an oscillation we already know (or assume) is in the data, the σ i values are calculated from either the variance of the data points in each phase bin or from initial error estimates associated to each data point. To estimate the significance of a χ 2 value the standard deviations should instead (under the null hypothesis) be defined as σ i = σ tot / n i,whereσ tot is the standard deviation of the unfolded time series and n i is the number of data points in bin i. Secondly, for short time series a more proper test statistic than χ 2 is to use the L-statistic for estimates of significances (Schwarzenberg- Czerny 1989 and Davies 199, 1991). 2.2. Estimating pulse period Just as in the case of power density spectra the peak in the χ 2 (P test ) is a broadened function with sidelobes. The χ 2 (P test ) function depends on the time series sampling andlengthaswellasonthepulseshape.foranevenly sampled sinusoidal signal χ 2 sin 2 (x)/x 2 (Leahy 1987) as shown in Fig. 1. Leahy suggested that the pulsation period and amplitude be determined by fitting the analytic χ 2 -function for a sinusoid to χ 2 (P test )calculated from the data. From Monte Carlo simulations Leahy also gave a relation for estimating the uncertainty in the period and amplitude determinations. To estimate the error in period we instead use an analytic expression derived for power density spectra and least-squares fitted sinusoids. Kovács (19) estimated the frequency shift in power density spectra by nearby white noise frequency components and found the gaussian frequency error to be 2aσtot σ f =, (2) NAT In this equation, N is the total number of data points, A is the sinusoidal amplitude and T is the total time length for the data. For the parameter a, Kovács had to rely on a Monte Carlo estimate which gave a value of a.45. Gilliland & Fisher (1985), in the analysis of chromospheric activity data, also used Eq. (2) but with a value corresponding to a =.53. An expression which is essentially the same as Eq. (2) is given by Bloomfield (1976), σ 2 i 24σtot σ ω = + smaller terms, (3) N 3/2 A Since ω is given in radians per sampling time interval this corresponds to a =.551. Note that Bloomfield assumes that the time series is evenly sampled. Using the analytic value of a derived by Bloomfield, we can rewrite Eq. (2) in terms of period error as, σp 2 = 6σ2 tot, (4) π 2 NA 2 T 2P4 Let us now consider a non-sinusoidal oscillation. In this case the time series can be described as a set of Fourier components with frequencies, ν k = kν 1. Since the frequencies are related by exact (integer) multiples, it is possible to increase the precision of the period determination by taking a weighted mean of the fundamental period associated with these harmonics. In epoch folding this is done automatically since the higher harmonics are folded with k cycles over one fundamental pulse cycle. We can then also generalize Eq. (4) to give σp 2 = 6σ2 tot P 4 π 2 NT 2 m k=1 k2 A 2, (5) k Toapplythisestimateinpracticeweneedtoknowthe harmonic content of the signal, which is determined in the next step of the analysis. 2.5 1 4 2. 1 4 1.5 1 4 1. 1 4 5. 1 3.97.9.99 1. 1.1 1.2 1.3 Fig. 1. The plot shows the expected χ 2 calculated over the pulse shape as a function of the folding period when the evenly sampled time series contains a pure sinusoidal oscillation 2.3. Fourier decomposition of the pulse profile The Fourier decomposition is done as follows. The best fit period gives a folded pulse with value x i and standard error σ i (in bin i). After subtraction of the mean, the number of degrees of freedom is N b 1. Each Fourier component has two parameters so fitting (N b 1)/2 harmonics will absorb all white noise in the folded pulse, or equivalently the

S. Larsson: Parameter estimation in epoch folding analysis 199 white noise power per fitted harmonic is σ 2 /[(N b 1)/2]. After a linear regression fit of the desired number of Fourier components, this white noise contribution should be subtracted from each harmonic variance (A 2 k /2). Each harmonic is fitted as a k sin(kω 1 t)+b k cos(kω 1 t)fromwhich the phase for each harmonic can be calculated. The fitted amplitude values have to be corrected for binning, which is done by multiplying with the factor g(n b )= 1 f(nb ) = 1 1 π2 3N 2 b + 2π4 45N 4 b, (6) (where f(n b ) is the related factor used by Leahy 1987.) To select significant Fourier components we use the uncorrected amplitudes and the noise variance to estimate the false alarm probability for each component. This can be done in the same way as for peaks in power density spectra. In our case we have error estimates for the points in the pulse profile, so we can test the significance of each Fourier amplitude. For Gaussian white noise, the variance estimates, V k, at the Fourier frequencies are independently distributed with a probability distribution proportional to a χ 2 with two degrees of freedom, i.e. V k σn 2 = χ 2 2, (7) b This distribution is exponential and gives a probability for V k /σn 2 b to be larger than some value, z of (see e.g. Priestley 1981, p. 388), P [( V z k 1 σn 2 ) >z]= b 2 exp ( s 2 )ds=exp( z ), (8) 2 2.4. Iteration For the Fourier decomposition we now select the components with amplitudes above some significance level, using the result in Eq. (8). This Fourier pulse is then sampled at the same time points as the data and a new χ 2 response function is calculated and fitted to the χ 2 -values for the data. It is usually sufficient to rerun steps 2 & 3, only once or twice. 3. Example and verification In order to verify our parameter and uncertainty estimation procedure we have made extensive Monte Carlo simulations injecting coherent oscillations with a range of different pulse shapes and signal-to-noise ratios. In each case the error estimates were compared with the standard deviation in the parameter estimates for some 2 to simulations. The results from these simulations were found to be consistent with the estimates in Sect. 2. In particular the deviation of the period estimate for a set of 24 14 6 5 15 2 25 14 6 2 4 6 8 1 14 6 2 4 6 8 1 Fig. 2. a-c). An example of a data simulation used to test the analysis procedure. The data is evenly sampled (25 points per time unit) except for the two time gaps. a) The full time series, 224 points. b) The first part of the time series without any noise added. The injected oscillation has a period equal to one and is composed of two equally strong (A =5)Fourier components (the fundamental and the first overtone). c) The same time series segment as in b) but after adding Poisson noise

2 S. Larsson: Parameter estimation in epoch folding analysis a 6 4 2 11 6 4 2.98 1. 1.2 b.98 1. 1.2 Fig. 3. a-b). χ 2 calculated over the pulse shape for a range of different folding periods near the true fundamental period. The solid line is for the simulated data plotted in Fig. 2a. a) Best fit (dashed curve) theoretical χ 2 -function for the case of a sinusoidal oscillation. b) Best fit (dashed curve) numerically calculated χ 2 -function for the particular data sampling and pulse shape (determined from the folded pulse) of this simulation simulations containing a single sinusoidal oscillation was compared to the predicted value from our Eq. (4). The ratio of the observed to predicted standard deviation was 1.26±.18. The corresponding value for the parameter a is.565±.1, to be compared with the theoretical value of.551 and the estimates of.45 and.53 given by Kovács (19) and Gilliland & Fisher (1985) respectively. In our simulation the data was folded in 16 phase bins. The binning effectively reduces the amplitude of the folded data by the factor f(n b ) (see Eq. (6)). The expected standard deviation does therefore, in this case, correspond to a =.555 so our simulation estimate of a is consistent with the analytic expression to within the statistical uncertainty (1σ). Note that these results were obtained for evenly sampled data. As already mentioned the error estimates will be quite reasonable also for mildly non-even 9 1. 1.5 2. 2.5 3. 15 1 5-5 -1-15..5 1. 1.5 2. 11 9..5 1. 1.5 2. Fig. 4. a-c). Oscillation pulse shapes (two cycles). a) Pulse shape obtained by folding the simulated data at the best fit period. b) Fourier pulse obtained by fitting Fourier components to the pulse in f. c) The shape of the original pulse injected into the simulation

S. Larsson: Parameter estimation in epoch folding analysis 21 sampling. One example of a simulations with two substantial data gaps is shown together with analysis results in Figs. 2-4. The simulated data is shown in Figs. 2 a-c with a blow up of the first 1% of the data to show the injected oscillation (with two equally strong Fourier components) before and after the Poisson noise was added. The calculated χ 2 as a function of test period is plotted in Figs. 3a and 3b. In the first of these figures it is shown together with the best fit χ 2 function for an evenly sampled sinusoid. The narrow width of the χ 2 peak compared to that of the analytic function is because it contains a Fourier component also at the first harmonic. A more precise period determination is of course possible when the peak is more narrow. This is just a graphical representation of how the period error in Eq. (5) depends on the harmonic content of the pulse shape. In Fig. 3b we show the resulting fit after we have applied our full parameter estimation procedure. Itisclearfromthisfitthatmostoftheχ 2 modulation pattern is an effect of windowing and pulse shape. Finally in Fig. 4 we compare the pulse shape of the inject oscillation with the folded pulse shape for the best fit period and the pulse shape resulting from the Fourier decomposition of the folded pulse. For unevenly sampled data the uncertainty in the period estimate depends on the actual sampling distribution. The more asymmetric and clumped the data distribution is, the larger is the deviation of errors from the estimate of Eq. (5). For data that consist of evenly sampled segments separated by gaps (such as in Fig. 2) Eq. (5) will remain a good error estimate as long as the data is not strongly concentrated to one part of the total time range. This is what one would expect intuitively, but it is also what we found from simulations with 1 12 data gaps covering up to 99% of the length of the time series. The two most extensive sets of simulations that we ran gave period deviations corresponding to a =.469 ±.22 and.532 ±.37. The first set consisted of 3 simulations with a sampling distribution almost identical to that in Fig. 2, i.e. 224 points that were evenly sampled except for two gaps covering a fraction.74 of the total time series length. The second set of 16 simulations had the same number of points, total length and gap fraction but with six gaps instead of two. These gaped data simulations were made with the same pulsation parameters as in the previous example (Figs. 2-4). In the limit of a time series with few data points our folding procedure provides no advantage to directly fitting Fourier components to the data. The method can still be applied in such cases however, as long as there is good, but not necessarily complete, coverage of the oscillation cycle. Also, for time series with gaps the length of data subsegments with respect to the oscillation period will only have an affect on the aliasing problem (the sidelobe pattern of the χ 2 function) and not on the applicability of the method. This last statement is true if the time series contains no variability component in addition to the coherent oscillation and the white noise. This is really the most important assumption on which our analysis procedure is based. It is also the main limitation that has to be considered when applying the method to real observational data. 4. Summary We have presented a procedure by which epoch folding is used to estimate pulse period, shape, amplitude and phase for coherent oscillations in time series data. We also give a generalized analytic expression for the uncertainty in the period estimate. The error estimate is verified by Monte Carlo simulations for evenly sampled data with and without data gaps. Acknowledgements. This work was supported by the Swedish Natural Science Research Council and the Swedish National Space Board. References Bloomfield P., 1976, Fourier analysis of time series: An introduction. John Wiley & Sons, New York Davies S.R., 199, MNRAS 244, 93 Davies S.R., 1991, MNRAS 251, 64 Gilliland R.L., Fisher R., 1985, PASP 97, 285 Kovács G., 1981, Ap&SS 78, 175 Leahy D.A., 1987, A&A 1, 275 Priestley M.B., 1981, Spect. Anal. of Time Ser. Academic Press, London Schwarzenberg-Czerny A., 1989, MNRAS 241, 153