SPECTRAL ANALYSIS OF NON-UNIFORMLY SAMPLED DATA: A NEW APPROACH VERSUS THE PERIODOGRAM

Hao He, Jian Li and Petre Stoica
Dept. of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
Dept. of Information Technology, Uppsala University, Uppsala, Sweden

ABSTRACT

We begin by revisiting the plain least-squares periodogram (LSP) for real-valued data. Then we introduce a new method for spectral analysis of non-uniformly sampled data via iterative weighting, and we name the new method the real-valued iterative adaptive approach (RIAA). LSP and RIAA are most suitable for data sequences with discrete spectra. For such data, we present a procedure to obtain a parametric spectral estimate, from the LSP or RIAA nonparametric estimate, by means of the Bayesian information criterion (BIC). We also discuss a possible strategy for designing the sampling pattern of future measurements. Several numerical examples are provided to illustrate the performance of the proposed approaches.

Index Terms: Spectral analysis, non-uniformly sampled data, the periodogram, the least-squares method, the iterative adaptive approach, BIC.

1. INTRODUCTION

Spectral analysis of non-uniformly sampled (NUS) data is widely used in many areas, such as economics, biomedicine, and especially astronomy. The LSP can be readily applied to the NUS data case (see, e.g., [1]), but it suffers from both local and global leakage problems. Many new approaches have been proposed (see, e.g., [2][3][4]) to outperform the periodogram. A recent detailed discussion can be found in [5], where the well-known Capon, MUSIC and ESPRIT methods are extended from the uniform to the NUS data case. The amplitude and phase estimation (APES) method, proposed in [6] for uniformly sampled data, has significantly less leakage than the periodogram. We follow here the ideas in [7] to extend APES to the NUS data case, and the resulting new method is referred to as RIAA.
Among all the methods mentioned above, Capon (when used with the same covariance matrix as RIAA) is essentially the same as the first iteration of RIAA, and the other parametric methods all have special requirements (e.g., a guess of the number of sinusoidal components, or further assumptions on the noise) which are not as favorable as the non-parametric RIAA. We also combine LSP and RIAA with BIC (see, e.g., [8]) to provide parametric spectral estimates in the form of a number of estimated sinusoidal components that are deemed to fit the data well. The use of BIC bypasses the need for testing the significance of the periodogram peaks, for which, to the best of our knowledge, no satisfactory procedure exists in the NUS data case (see [9] and the references therein). Finally, we present a method for designing an optimal sampling pattern (see, e.g., [10]) that minimizes an objective function based on the spectral window. In doing so, we assume that a sufficient number of observations are already available, from which we can get a reasonably accurate spectral estimate. We make use of this spectral estimate to design the sampling times at which future measurements should be performed.

(This work was supported in part by the National Science Foundation under Grants No. CCF-0634786 and ECCS-0729727, and by the Swedish Research Council (VR).)

2. LSP AND THE SPECTRAL WINDOW

Let {y(t_n)}_{n=1}^N denote the zero-mean, real-valued sequence whose spectral analysis is our main goal. The classical Fourier transform-based periodogram (FP)

P_F(ω) = (1/N) |Σ_{n=1}^N y(t_n) e^{-jωt_n}|^2

can be written as P_F(ω) = N |β̂(ω)|^2, where β̂(ω) comes from the following least-squares (LS) data-fitting problem:

β̂(ω) = arg min_{β(ω)} Σ_{n=1}^N |y(t_n) − β(ω) e^{jωt_n}|^2. (1)

Because {y(t_n)}_{n=1}^N ⊂ R, it is better to use α cos(ωt_n + φ) instead of β e^{jωt_n} to fit {y(t_n)}_{n=1}^N, and the so-obtained LS criterion is:

min_{a,b} Σ_{n=1}^N [y(t_n) − a cos(ωt_n) − b sin(ωt_n)]^2, (2)

where a = α cos φ and b = −α sin φ (we omit the dependence of α and φ on ω, for notational simplicity). With the following additional notation,

y = [y(t_1) … y(t_N)]^T,
c(ω) = [cos(ωt_1) … cos(ωt_N)]^T, s(ω) = [sin(ωt_1) … sin(ωt_N)]^T,
A(ω) = [c(ω) s(ω)], θ(ω) = [a(ω) b(ω)]^T, (3)

the LS fitting criterion in Eq. (2) can be written in the following vector form:

min_θ ‖y − Aθ‖^2, (4)

where ‖·‖ denotes the Euclidean norm (and the dependence of A and θ on ω is omitted). The solution to the minimization problem in Eq. (4) is well known to be:

θ̂ = (A^T A)^{-1} A^T y, (5)

and the LS periodogram (LSP) is accordingly given by

P_LS(ω) = (1/N) (Aθ̂)^T (Aθ̂) = (1/N) θ̂^T (A^T A) θ̂. (6)

To analyze the performance of LSP, consider a possibly existing sinusoidal component ȳ = [ȳ(t_1) … ȳ(t_N)]^T, where ȳ(t_n) = ā cos(ω̄ t_n) + b̄ sin(ω̄ t_n), n = 1,…,N. Its effect on Eq. (5) is to introduce an error term (A^T A)^{-1} A^T ȳ. For non-pathological sampling patterns, A^T A ≈ (N/2) I and Σ_{n=1}^N cos((ω + ω̄) t_n) ≈ 0 ≈ Σ_{n=1}^N sin((ω + ω̄) t_n). Then the squared Euclidean norm of the error term can be well approximated by

(1/N^2) P(ω̄) |Σ_{n=1}^N e^{j(ω − ω̄) t_n}|^2, (7)

where P(ω̄) = ā^2(ω̄) + b̄^2(ω̄). The part of Eq. (7) that depends on the sampling pattern, viz.

W(ω) = |Σ_{n=1}^N e^{jω t_n}|^2, (8)

is called the spectral window. We then follow [11] to define the maximum frequency interval [0, ω_max] that can be dealt with unambiguously. The spectral window attains its maximum value of N^2 at ω = 0: W(0) = N^2 ≥ W(ω), ∀ω. We evaluate W(ω) for ω > 0 to find the smallest frequency value ω̃ at which the spectral window has a peak whose height is close to N^2. Then ω_max = ω̃/2 can be claimed to be the largest frequency of the data sequence in question.

3. RIAA AND BIC

Let Δω denote the frequency grid size (small enough so as not to affect the attainable resolution), and let K = ⌊ω_max/Δω⌋ denote the number of grid points (⌊x⌋ is the largest integer ≤ x). Thus the frequency grid is ω_k = kΔω, k = 1,…,K. The notation in Eq. (3) is accordingly modified to A_k = A(ω_k), θ_k = θ(ω_k). Now consider again the LS problem:

min_{θ_k} ‖y − A_k θ_k‖^2, k = 1,…,K. (9)
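The real-valued LS periodogram of Eqs. (2)-(6) and the spectral window of Eq. (8) are straightforward to compute. The following sketch illustrates both on a one-sinusoid test signal; the sampling times, frequency grid, amplitude and noise level are illustrative choices for the demo, not the paper's settings.

```python
# Minimal sketch of the LS periodogram (Eqs. (2)-(6)) and the spectral
# window (Eq. (8)) for non-uniformly sampled real-valued data.
import numpy as np

def ls_periodogram(y, t, omegas):
    """P_LS(w) = (1/N) theta^T (A^T A) theta, with A = [cos(w t) sin(w t)]."""
    N = len(y)
    P = np.empty(len(omegas))
    for i, w in enumerate(omegas):
        A = np.column_stack((np.cos(w * t), np.sin(w * t)))  # Eq. (3)
        theta = np.linalg.solve(A.T @ A, A.T @ y)            # Eq. (5)
        P[i] = (theta @ (A.T @ A) @ theta) / N               # Eq. (6)
    return P

def spectral_window(t, omegas):
    """W(w) = |sum_n exp(j w t_n)|^2; note W(0) = N^2."""
    return np.abs(np.exp(1j * np.outer(omegas, t)).sum(axis=1)) ** 2

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, size=64))   # non-uniform sampling times
w0 = 2 * np.pi * 0.1                            # one sinusoid at 0.1 Hz
y = 3.0 * np.cos(w0 * t + 0.7) + 0.1 * rng.standard_normal(t.size)
omegas = 2 * np.pi * np.linspace(0.01, 0.5, 500)
P = ls_periodogram(y, t, omegas)
print(abs(omegas[np.argmax(P)] - w0) < 0.05)    # peak lands near the true frequency
```

For a single strong sinusoid the LSP peaks at the true frequency; the leakage discussed around Eq. (7) shows up as sidelobes whose shape follows the spectral window of the particular sampling pattern.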
With respect to the sinusoidal component with frequency ω_k, all the other sinusoidal components can be considered as noise, and their contribution to the spectrum can be described by the covariance matrix

Q_k = Σ_{p=1, p≠k}^K A_p D_p A_p^T, D_p = ((a^2(ω_p) + b^2(ω_p))/2) I_2. (10)

Assuming that Q_k is available, and that it is invertible (a necessary condition is that 2(K − 1) ≥ N, which is easily satisfied in general), it would make sense to consider the following weighted LS (WLS) criterion (which is more accurate than the corresponding LS problem, under quite general conditions), instead of Eq. (9):

min_{θ_k} [y − A_k θ_k]^T Q_k^{-1} [y − A_k θ_k]. (11)

The vector θ_k that minimizes Eq. (11) is given by:

θ̂_k = (A_k^T Q_k^{-1} A_k)^{-1} (A_k^T Q_k^{-1} y). (12)

Using the matrix inversion lemma, we obtain the following alternative expression for the WLS estimate:

θ̂_k = (A_k^T Γ^{-1} A_k)^{-1} (A_k^T Γ^{-1} y), (13)

where

Γ = Σ_{p=1}^K A_p D_p A_p^T = Q_k + A_k D_k A_k^T. (14)

Eq. (13) is computationally more appealing than Eq. (12), as Γ^{-1} in Eq. (13) needs to be computed only once for all values of k = 1,…,K. Because Γ in Eq. (13) depends on the very quantities that we want to estimate, viz. {θ_k}_{k=1}^K, the only apparent solution is some form of iterative process, for example the RIAA algorithm outlined in Table 1. In most applications, the algorithm is expected to require no more than 15 iterations (see, e.g., the numerical examples in Section 5).

Next we present BIC as an alternative way of testing the significance of the dominant peaks. Let {P(ω_k)}_{k=1}^K denote the values taken by either the LSP or the RIAA periodogram, and let {ω̃_m, ă_m, b̆_m}_{m=1}^M denote the frequency, amplitude and phase related parameters corresponding to the M largest peaks of {P(ω_k)}_{k=1}^K, arranged in decreasing order:

P(ω̃_1) ≥ P(ω̃_2) ≥ … ≥ P(ω̃_M), (17)
Table 1. The RIAA Algorithm

Initialization: Use the LS method (see Eq. (5)) to obtain initial estimates of {θ_k}, denoted by {θ̂_k^0}.
Iteration: Let {θ̂_k^i} denote the estimates of {θ_k} at the i-th iteration, and let Γ̂^i denote the estimate of Γ obtained from {θ̂_k^i}. For i = 0, 1, 2, …, compute:

θ̂_k^{i+1} = [A_k^T (Γ̂^i)^{-1} A_k]^{-1} [A_k^T (Γ̂^i)^{-1} y], k = 1,…,K, (15)

until a given number of iterations is performed.
Periodogram calculation: Let {θ̂_k^I} denote the estimates of {θ_k} obtained by the above iterative process. The RIAA periodogram is computed as (see Eq. (6)):

P(ω_k) = (1/N) (θ̂_k^I)^T (A_k^T A_k) (θ̂_k^I), k = 1,…,K. (16)

for M ≥ 1. Under the idealizing assumptions that the data sequence consists of a finite number of sinusoidal components and of normal white noise, and that {ω̃_m, ă_m, b̆_m}_{m=1}^M are the maximum likelihood (ML) estimates, the BIC rule estimates M as follows:

M̂ = arg min_{M = 0,1,2,…} BIC(M), (18)

where

BIC(M) = ρ M ln N + N ln [ Σ_{n=1}^N ( y(t_n) − Σ_{m=1}^M (ă_m cos(ω̃_m t_n) + b̆_m sin(ω̃_m t_n)) )^2 ] (19)

and where ρ = ρ_ML = 5 (see, e.g., [12][8]). It follows from the derivation of BIC (see, e.g., the cited works) that the large complexity penalty ρ_ML is mainly due to the high accuracy of the ML estimates. By comparison, the {ω̃_m, ă_m, b̆_m}_{m=1}^M estimated with LSP and RIAA are expected to be less accurate than the ML estimates. Consequently, we suggest the use of BIC with values of ρ smaller than ρ_ML = 5 in the case of LSP and RIAA, namely:

ρ_LSP = 1 and ρ_RIAA = 2. (20)

Admittedly, these choices are somewhat ad hoc, but the corresponding BIC rules are simple to use and they appear to provide accurate estimates of M (see Section 5 for details).

4. SAMPLING PATTERN DESIGN

Suppose that the data samples {y(t_n)}_{n=1}^N have already been collected and a spectral estimate {P(ω_k)}_{k=1}^K has been obtained from them; {τ_g}_{g=1}^G = S is the set of times at which future measurements of y(t) could in principle be performed.
Our goal is to choose the sampling times {x_f}_{f=1}^F from {τ_g}_{g=1}^G for future measurements in an optimal manner. The previous sections have evidenced that the performance of both LSP and RIAA depends on the spectral window. With the {P(ω_k)}_{k=1}^K imposed as weights, the integrated spectral window at ω = ω_p is given by (see Eq. (7)):

Σ_{k=1}^K P(ω_k) | Σ_{n=1}^N e^{j(ω_p − ω_k) t_n} + Σ_{g=1}^G e^{j(ω_p − ω_k) τ_g} |^2. (21)

A well-known fact is that the integrated spectral window is a constant in the uniform sampling case. This is also approximately true in the non-uniform sampling case, which is the reason why we have to assume that the prior information {P(ω_k)}_{k=1}^K is available (otherwise the integrated spectral window would be insensitive to the choice of {x_f}_{f=1}^F). For a given ω_p, the larger P(ω_k), the more emphasis is put on minimizing the sidelobe level of the spectral window corresponding to ω_k. The function in Eq. (21) still depends on ω_p. We eliminate this frequency dependence by integrating Eq. (21), weighted by P(ω_p), with respect to ω_p as well, and obtain the following minimization problem:

min_{{μ_g}_{g=1}^G} Σ_{p=1}^K Σ_{k=1}^K P(ω_p) P(ω_k) | Σ_{n=1}^N e^{j(ω_p − ω_k) t_n} + Σ_{g=1}^G μ_g e^{j(ω_p − ω_k) τ_g} |^2,
s.t. μ_g ∈ {0, 1} and Σ_{g=1}^G μ_g = F, (22)

where {μ_g}_{g=1}^G are Boolean variables. Because the Boolean constraint makes the problem hard, we relax it to:

0 ≤ μ_g ≤ 1, g = 1,…,G, (23)

with which Eq. (22) becomes a linearly constrained quadratic program that can be solved efficiently, e.g., by the MATLAB¹ function quadprog. Once the solution {μ_g}_{g=1}^G is found, we choose its F largest elements and set them to 1 (and the rest to 0). Letting g_1 < g_2 < … < g_F denote the indices of these elements, we use the sampling times

x_f = τ_{g_f}, f = 1,…,F, (24)

¹MATLAB is a registered trademark of The MathWorks, Inc.
as an approximate solution to the minimization problem of Eq. (22). (See, e.g., [13] for a general discussion of this type of approximation to a Boolean optimization problem.)

5. NUMERICAL EXAMPLES

5.1. Simulated Data

Consider a data sequence consisting of M = 3 sinusoidal components with frequencies .1, . and .1 Hz, and amplitudes , and 5, respectively. The phases of the 3 sinusoids are independently and uniformly distributed over [0, 2π], and the additive noise is white and normally distributed with mean 0 and variance σ² = .1. The sampling pattern follows a Poisson process with parameter λ = .1 s⁻¹; that is, the inter-sample intervals are exponentially distributed with mean μ = 1/λ. We generate N samples, and the sampling times are rounded off to ten decimals (see [11]).

Figure 1 presents the spectral estimates averaged over the independent realizations, while Figure 2 shows the overlapped estimates from the first 15 Monte-Carlo trials. LSP nearly misses the smallest sinusoid, while RIAA successfully resolves all three sinusoids. Note also that RIAA suffers from much less variability from one trial to another than LSP.

Next, we use BIC to select the significant peaks of the spectra obtained by LSP and RIAA. Let M_BIC denote the number of sinusoids picked by BIC. Then the probabilities of correct detection, false alarm and miss are defined as P_D = Prob(M_BIC = 3), P_FA = Prob(M_BIC > 3) and P_M = Prob(M_BIC < 3). Figure 3 shows the scatter plots of the estimates of the dominant sinusoids obtained via LSP+BIC and RIAA+BIC. The LSP amplitude estimates of the two closely-spaced sinusoids are biased, and the smallest sinusoid is frequently missed: for LSP, P_D is only .3, P_FA is ., and P_M is as high as .97. Compared to LSP, RIAA shows much better stability and accuracy, as illustrated in Figure 3: for RIAA, P_D = .9, P_FA = ., and P_M = ..

We also note that usually after 15 iterations RIAA's performance does not improve visibly. So in the above and all subsequent examples, we terminate RIAA after 15 iterations.
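Under the relaxation of Eq. (23), Eq. (22) becomes a box-constrained quadratic program. The sketch below solves it with SciPy's general-purpose SLSQP solver in place of MATLAB's quadprog; the grids, candidate times and uniform prior weights P(ω_k) = 1 are illustrative stand-ins for a previously obtained spectral estimate.

```python
# Sketch of the sampling-pattern design of Section 4: relax the Boolean
# problem of Eq. (22) via Eq. (23), solve the constrained program, then
# round the F largest entries of mu to 1 (Eq. (24)).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 100.0, 30))         # existing sampling times
tau = np.sort(rng.uniform(100.0, 200.0, 20))     # candidate future times, G = 20
F = 5                                            # number of new samples to pick
omegas = 2 * np.pi * np.linspace(0.02, 0.3, 15)  # frequency grid
Pw = np.ones_like(omegas)                        # prior spectral weights P(w_k)

# Pairwise frequency differences, weights, and the fixed/candidate terms.
dw = (omegas[:, None] - omegas[None, :]).ravel()        # all (w_p - w_k)
wgt = (Pw[:, None] * Pw[None, :]).ravel()               # P(w_p) P(w_k)
c = np.exp(1j * dw[:, None] * t[None, :]).sum(axis=1)   # sum_n e^{j dw t_n}
E = np.exp(1j * dw[:, None] * tau[None, :])             # e^{j dw tau_g}

def objective(mu):
    # Doubly integrated, weighted spectral window of Eq. (22).
    return float(np.sum(wgt * np.abs(c + E @ mu) ** 2))

cons = ({'type': 'eq', 'fun': lambda mu: mu.sum() - F},)  # sum_g mu_g = F
res = minimize(objective, np.full(tau.size, F / tau.size),
               bounds=[(0.0, 1.0)] * tau.size, constraints=cons)
idx = np.argsort(res.x)[-F:]              # F largest elements -> mu_g = 1
x_design = np.sort(tau[idx])              # designed sampling times, Eq. (24)
print(x_design)
```

The uniform initial point is feasible, so the solver can only improve the integrated spectral window before the rounding step of Eq. (24) is applied.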
5.2. Sampling Pattern Design

Suppose we have already collected data at the times {t_n}_{n=1}^{N=3}, which are the sampling times of the measured astronomical data HD 102195 (whose original analysis led to the discovery of an extrasolar planet named ET-1, see [14]). The unit of the sampling times is changed from days to seconds for simplicity. Figure 4 shows this sampling pattern and the corresponding spectral window. The smallest frequency f̃ > 0 at which the spectral window has a peak with height close to N^2 is approximately 1 Hz. According to the discussion in Section 2, f_max = f̃/2 = .5 Hz. The set of sampling times of possible future measurements, {τ_g}_{g=1}^G, is randomly selected

Fig. 1. Average spectral estimates from the Monte-Carlo trials. The solid line is the estimated spectrum and the circles represent the true frequencies and amplitudes of the 3 sinusoids: LSP and RIAA.

Fig. 2. Spectral estimates from 15 Monte-Carlo trials. The solid lines are the estimated spectra and the circles represent the true frequencies and amplitudes of the 3 sinusoids: LSP and RIAA.
Fig. 3. Scatter plot of the dominant sinusoidal components selected by BIC in the Monte-Carlo trials. Dots symbolize the estimates while circles represent the true frequencies and amplitudes of the 3 sinusoids: LSP+BIC and RIAA+BIC.

from the interval [, ] s, which covers approximately the same period of time as {t_n}_{n=1}^{N=3}. Consider the same data sequence as in the previous subsection, except that the amplitudes of the two closely-spaced sinusoids (at . and .1 Hz) are now set to and 5, respectively. Figure 5 shows the scatter plots of the estimated frequencies and amplitudes using this data sequence sampled at {t_n}_{n=1}^{N=3}. In most trials LSP+BIC catches only the strongest sinusoid. RIAA+BIC, on the other hand, has a much better detection rate, although the amplitude estimates are somewhat biased. The corresponding probabilities P_D, P_FA and P_M are .1, . and .99 for LSP, and .73, .1 and . for RIAA, respectively.

Next, we use the optimal sampling design technique outlined in Section 4 to choose {x_f}_{f=1}^{F=3} from {τ_g}_{g=1}^G. In Eq. (22) the {P(ω_k)}_{k=1}^K are set equal to the spectral estimates obtained from the given 3 samples, and N is chosen as 0 because we want to achieve better estimation from only the designed samples. Figure 6 shows the designed sampling pattern and the corresponding spectral window (although the inferred f_max is much larger than .5 Hz, we still choose f_max = .5 Hz for a fair comparison). With the data sequence sampled at {x_f}_{f=1}^{F=3}, we obtain the scatter plots of the new estimates in Figure 7. The new probabilities P_D, P_FA and P_M are .5, . and .95 for LSP, and .9, . and . for RIAA. All 3 sinusoidal components are now clearly resolved by RIAA, and the amplitude estimates are also quite accurate.

6. CONCLUDING REMARKS

We have presented a new spectral estimation method, named RIAA, for real-valued non-uniformly sampled data. RIAA is non-parametric, has better resolution than LSP, and is most suitable for sinusoidal data, in which case it can be combined with BIC to obtain a parametric spectral estimate.
We have also presented a sampling pattern design strategy, which makes use of the spectral window and the already obtained spectral estimates to design the future sampling times in an optimal manner. Several numerical examples have been provided to show the properties of RIAA, especially in comparison with the benchmark periodogram.

7. REFERENCES

[1] J. D. Scargle, "Studies in astronomical time series analysis II: Statistical aspects of spectral analysis of unevenly spaced data," The Astrophysical Journal, vol. 263, pp. 835-853, December 1982.
[2] R. H. Jones, "Fitting a continuous time autoregression to discrete data," in Applied Time Series Analysis II, pp. 651-682, 1981.
[3] R. J. Martin, "Autoregression and irregular sampling: Spectral estimation," Signal Processing, vol. 77, pp. 139-157, 1999.
[4] A. Rivoira and G. A. Fleury, "A consistent nonparametric spectral estimator for randomly sampled signals," IEEE Transactions on Signal Processing, vol. 52, pp. 2383-2395, 2004.
[5] P. Stoica and N. Sandgren, "Spectral analysis of irregularly-sampled data: Paralleling the regularly-sampled data approaches," Digital Signal Processing, vol. 16, no. 6, pp. 712-734, November 2006.
[6] J. Li and P. Stoica, "An adaptive filtering approach to spectral estimation and SAR imaging," IEEE Transactions on Signal Processing, vol. 44, no. 6, pp. 1469-1484, June 1996.
[7] T. Yardibi, J. Li, P. Stoica, M. Xue, and A. B. Baggeroer, "Source localization and sensing: A nonparametric iterative adaptive approach based on weighted least squares," submitted to IEEE Transactions on Aerospace and Electronic Systems, 2007.
[8] P. Stoica and Y. Selén, "Model-order selection: A review of information criterion rules," IEEE Signal Processing Magazine, vol. 21, no. 4, pp. 36-47, July 2004.
[9] F. A. M. Frescura, C. A. Engelbrecht, and B. S. Frank, "Significance tests for periodogram peaks," e-print arXiv:0706.2225, June 2007.
[10] E. S. Saunders, T. Naylor, and A. Allan, "Optimal placement of a limited number of observations for period searches," Astronomy & Astrophysics, vol. 455, pp. 757-763, May 2006.
[11] L. Eyer and P. Bartholdi, "Variable stars: Which Nyquist frequency?," Astronomy and Astrophysics, Supplement Series, vol. 135, pp. 1-3, February 1999.
[12] P. Stoica and R. L. Moses, Spectral Analysis of Signals, Prentice-Hall, Upper Saddle River, NJ, 2005.
[13] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, United Kingdom, 2004.
[14] J. Ge, J. van Eyken, S. Mahadevan, C. DeWitt, S. R. Kane, R. Cohen, A. Vanden Heuvel, S. W. Fleming, and P. Guo, "The first extrasolar planet discovered with a new-generation high-throughput Doppler instrument," The Astrophysical Journal, vol. 648, pp. 683-695, September 2006.
Fig. 4. Sampling pattern and spectral window for the given sampling times in the sampling design example: the sampling pattern used for the Monte-Carlo trials of Figure 5 (the distance between two consecutive bars represents the sampling interval) and the corresponding spectral window (dB).

Fig. 6. Sampling pattern and spectral window for the designed sampling times in the sampling design example: the sampling pattern used for the Monte-Carlo trials of Figure 7 (the distance between two consecutive bars represents the sampling interval) and the corresponding spectral window (dB).

Fig. 5. Scatter plot of the dominant sinusoidal components selected by BIC in the Monte-Carlo trials, using the given 3 sampling times. Dots symbolize the estimates while circles represent the true frequencies and amplitudes of the 3 sinusoids: LSP+BIC and RIAA+BIC.

Fig. 7. Scatter plot of the dominant sinusoidal components selected by BIC in the Monte-Carlo trials, using the designed 3 sampling times. Dots symbolize the estimates while circles represent the true frequencies and amplitudes of the 3 sinusoids: LSP+BIC and RIAA+BIC.