Utility of Correlation Functions

1. As a means for estimating power spectra (e.g. a correlator + the Wiener-Khinchin theorem).
2. For establishing characteristic time scales in time series (width of the ACF or ACV).
3. For testing whether two time series are related (CCF, CCV).
4. As a basis for calculating correlation matrices used in estimation, principal component analysis, etc.
5. As the measurement basis in interferometry (the CCF of signals from different antennas, the visibility function).
6. etc.
Correlation Function Example

Consider the autocorrelation function (ACF) of a zero-mean WSS process. We want to see how the ACF estimate converges and to understand what the ACF actually quantifies. The figure shows a time series and its ACF, along with an ACF averaged over 10 realizations of the time series.

Figure 1: Top panel: time series of Gaussian noise with unit variance and a correlation time of 21 steps. Bottom panel: ACF of the time series in the top panel, along with the ACF averaged over 10 realizations of the time series.
The time series was created by taking a realization of white Gaussian noise and smoothing it with a boxcar filter of width 21 samples. Features of the ACF include:

1. The maximum at zero lag has a value equal to the variance of the time series (set to unity).
2. The feature that maximizes at zero lag is the same in both ACFs.
3. The decay from the maximum occurs on a time scale of about 20 steps, which is of order the smoothing time used to create the time series.
4. There are statistical variations centered on zero correlation. These variations are larger in the single-realization ACF and are estimation errors in the ACF.

The width of the persistent feature in the ACF is the autocorrelation time of the process, which we define as $W_y$. This quantifies the time interval over which the process decorrelates.
The estimation error in the ACF at larger lags is determined by the number of independent fluctuations $N_i$ in the time series used. In a single time series of length $T$, this number is $N_i \approx T/W_y$. For the example, $N_i \approx 1024/21 \approx 50$. The estimation error in the ACF for a single time series will be $\delta c_y \approx C_y(0)/\sqrt{N_i} \approx 0.14$, so we expect variations about zero at approximately this level. For the 10-realization average, we expect the estimation errors to decrease by another factor of $1/\sqrt{10}$, to about 0.045.
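These error levels can be checked numerically. The sketch below (assuming the stated parameters: 1024 samples, boxcar width 21) builds the boxcar-smoothed noise, estimates its ACF, and measures the rms fluctuation at large lags for one realization and for a 10-realization average; the specific function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N, W = 1024, 21  # samples and boxcar width, as in the example

def smoothed_noise(rng, N, W):
    """White Gaussian noise smoothed by a boxcar of width W, unit variance."""
    x = np.convolve(rng.standard_normal(N + W),
                    np.ones(W) / np.sqrt(W), mode="same")[:N]
    return x

def acf_unbiased(x, max_lag):
    """ACF estimate normalized by the number of lagged products, N - tau."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / (n - k)
                     for k in range(max_lag + 1)])

r = acf_unbiased(smoothed_noise(rng, N, W), 512)
tail_rms = r[100:].std()          # fluctuation level well past the ACF feature

# Averaging 10 independent realizations should reduce the error by ~1/sqrt(10)
r10 = np.mean([acf_unbiased(smoothed_noise(rng, N, W), 512)
               for _ in range(10)], axis=0)
tail_rms10 = r10[100:].std()
```

The single-realization tail rms comes out near the predicted $C_y(0)/\sqrt{N_i} \approx 0.14$, and the 10-realization average is visibly smaller.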
Calculating Correlation Functions

Consider a discrete data set with $N$ equally spaced samples, $x_i$, $i = 1, \ldots, N$. Two slightly different estimators can be used to calculate the ACF. The first normalizes by the number of terms (lagged products) in the sum for each lag, $N - \tau$:
$$\hat R_\tau = \frac{1}{N - \tau} \sum_{i=1}^{N - \tau} x_i x_{i+\tau}, \qquad 0 \le \tau < N, \qquad \hat R_{-\tau} = \hat R_\tau \ \ \text{for } \tau < 0.$$
This normalization yields an unbiased result because it can be shown that $\langle \hat R_\tau \rangle = R_\tau$, i.e. the ensemble average of the estimator equals the ensemble-average ACF. However, for $\tau \to N$, the estimation errors are large because there are few terms in the sum and their departure from the true ACF is amplified by the $1/(N - \tau)$ factor.

An alternative normalization divides by $N$ instead of $N - \tau$. This biases the estimator, but only at large lags, owing to the presence of a triangle function, $1 - |\tau|/N$, that multiplies the true ACF. The advantage of this normalization is that it keeps the estimation errors at large lags small.
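The two normalizations can be written side by side; a minimal sketch (function names are ours) that also verifies they differ by exactly the triangle factor $(N - \tau)/N$:

```python
import numpy as np

def acf_unbiased(x, max_lag):
    """Normalize each lag by the number of lagged products, N - tau (unbiased)."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / (n - k)
                     for k in range(max_lag + 1)])

def acf_biased(x, max_lag):
    """Normalize by N: biased by the triangle 1 - |tau|/N, but with
    smaller estimation errors at large lags."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
ru = acf_unbiased(x, 250)
rb = acf_biased(x, 250)
triangle = (256 - np.arange(251)) / 256   # (N - tau)/N
```

For any data set the two estimators are related deterministically: `rb == ru * triangle`, which is exactly the bias factor discussed above.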
Unequal sampling: The ACF of a time series that isn't sampled uniformly can be calculated by binning in lag. For example, we can write (where now $\tau$ is in time units rather than being an index)
$$\hat R_\tau = \frac{1}{N_\tau} \sum_{\substack{i,j \\ |t_i - t_j| \,\in\, \tau \pm \Delta\tau/2}} x_i x_j, \qquad N_\tau = \sum_{\substack{i,j \\ |t_i - t_j| \,\in\, \tau \pm \Delta\tau/2}} 1.$$
Here $N_\tau$ is the number of lagged products summed in a bin of size $\Delta\tau$. Clearly the bin size $\Delta\tau$ needs to be chosen carefully so that structure in the ACF is not lost (bins too large) and so that there are not too few counts per bin (bins too small). Note also that (a) bins do not have to be equal and (b) in fact, they can be spaced logarithmically, e.g. $\Delta\tau/\tau = $ constant.
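A minimal sketch of the lag-binning estimator (the function and its arguments are illustrative, not a standard routine), applied to white noise at random sample times so the binned ACF should hover near zero:

```python
import numpy as np

def binned_acf(t, x, bin_edges):
    """ACF of an irregularly sampled series, binned in lag.

    Returns bin centers, the estimate R_hat, and the pair counts N_tau.
    """
    # All pairwise lags |t_i - t_j| and products x_i x_j (each pair used once)
    dt = np.abs(t[:, None] - t[None, :])
    pp = x[:, None] * x[None, :]
    iu = np.triu_indices(len(t), k=1)
    lags, prods = dt[iu], pp[iu]
    counts, _ = np.histogram(lags, bins=bin_edges)
    sums, _ = np.histogram(lags, bins=bin_edges, weights=prods)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    r_hat = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
    return centers, r_hat, counts

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 100.0, 300))   # nonuniform sample times
x = rng.standard_normal(300)                # white noise: true ACF = 0 off zero lag
centers, r_hat, n_tau = binned_acf(t, x, np.linspace(0.0, 10.0, 21))
```

Logarithmic bins would simply be passed as `np.geomspace(...)` edges; the estimator itself is unchanged, which is the point of (a) and (b) above.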
Estimation Errors in Correlation Functions

Errors in the ACF have several causes:
1. Stochasticity of the signal part of the time series.
2. Additive noise.

Let the measurement model be the sum of a signal $s$ and contaminating zero-mean, white noise $n$ (both real):
$$x(t) = s(t) + n(t)$$
The ACF of $x$ has an ensemble average
$$R_x(\tau) = \langle [s(t) + n(t)][s(t+\tau) + n(t+\tau)] \rangle = \langle s(t)s(t+\tau) \rangle + \langle n(t)n(t+\tau) \rangle + \text{cross terms} = R_s(\tau) + R_n(\tau)$$
(variances add; the cross terms vanish because $s$ and $n$ are independent and $n$ has zero mean).

What determines the errors in the estimated ACF? Both the signal (if stochastic) and the additive noise. Many results in the literature consider only the errors from additive noise. As we have seen in Figure 1, however, a signal with no additive noise produces an ACF estimate with errors.

Considering again the unbiased estimate $\hat R_\tau$, as with any estimator we calculate the mean and variance of the estimated quantity. The mean is equal to the true ensemble average, as before.
The mean square is (for $\tau \ge 0$ and $x$ still real)
$$\langle \hat R_\tau^2 \rangle = \frac{1}{(N - \tau)^2} \sum_i \sum_j \langle x_i x_{i+\tau} x_j x_{j+\tau} \rangle.$$
If $x$ has Gaussian statistics, it is easy to work out the fourth moment in the summand. How many terms does the fourth moment expand into? In this case, the mean square ACF estimate and the estimation error can be worked out.
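The key tool here is the Isserlis (Wick) expansion of a Gaussian fourth moment into pairwise second moments. A Monte Carlo sketch (covariance matrix chosen by us for illustration) checks the expansion directly:

```python
import numpy as np

# For zero-mean jointly Gaussian a, b, c, d the fourth moment expands into
# three pair products:  <abcd> = <ab><cd> + <ac><bd> + <ad><bc>.
rng = np.random.default_rng(3)

# Illustrative 4x4 covariance with geometrically decaying correlation 0.5^|i-j|
C = 0.5 ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))

z = rng.multivariate_normal(np.zeros(4), C, size=1_000_000)
emp = np.mean(z[:, 0] * z[:, 1] * z[:, 2] * z[:, 3])   # empirical <abcd>
wick = C[0, 1] * C[2, 3] + C[0, 2] * C[1, 3] + C[0, 3] * C[1, 2]
```

Applying this expansion to each summand $\langle x_i x_{i+\tau} x_j x_{j+\tau} \rangle$ is what lets the mean square ACF estimate, and hence the estimation error, be written in terms of the true ACF alone.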
Structure Functions

Recall (for the WSS case)
$$R_x(\tau) = \langle x(t) x(t+\tau) \rangle \qquad \text{(autocorrelation)}$$
$$C_x(\tau) = \langle [x(t) - \langle x(t) \rangle][x(t+\tau) - \langle x(t+\tau) \rangle] \rangle \qquad \text{(autocovariance)}$$
which we have said are useful in time series analysis in several ways. But there are processes for which $R_x(\tau)$ is not a function only of lag (i.e. non-WSS), and there are data sets for which the data interval $[0, T]$ does not satisfy $W_x \ll T$, where $W_x$ = correlation time of the process (if it exists). E.g.
1. random walks
2. 1/f noise and other red-noise processes with power-law spectra $\propto f^{-\alpha}$.

In these cases the sample mean $\bar{x}(t)$ may not be estimable (it will vary wildly over a realization and between realizations), and the nonstationarity needs to be contended with in other ways (whether the process is "signal" or "noise").
Sometimes the cure is the structure function. Define the first increment
$$\Delta x(t, \tau) = x(t) - x(t+\tau).$$
Being a difference, this quantity acts in some ways as a derivative. The second moment of $\Delta x(t, \tau)$ is the first-order structure function:
$$D_x(\tau) \equiv \langle [x(t) - x(t+\tau)]^2 \rangle$$
Advantages:
1) No estimate of the sample mean is needed.
2) Sometimes the SF is a function of only $\tau$ when $R_x$ is not (stationary increments). A first-order random walk can be generated by a running integral of white noise. The random walk is nonstationary but the white noise is not (by assumption). Therefore the structure function will depend only on the lag $\tau$ and not on the absolute time.

For a WSS process this becomes
$$D_x(\tau) = \underbrace{\langle x^2(t) \rangle + \langle x^2(t+\tau) \rangle}_{2\sigma_x^2 \,=\, 2R_x(0)} - \ 2\underbrace{\langle x(t) x(t+\tau) \rangle}_{R_x(\tau)} = 2[R_x(0) - R_x(\tau)],$$
which clearly yields nothing more than the ACF does.
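The WSS identity $D_x(\tau) = 2[R_x(0) - R_x(\tau)]$ is easy to verify numerically. A sketch using the same boxcar-smoothed noise construction as earlier (parameters are ours):

```python
import numpy as np

rng = np.random.default_rng(4)
W, N = 21, 100_000
# WSS series: white noise smoothed by a boxcar of width W
x = np.convolve(rng.standard_normal(N + W),
                np.ones(W) / np.sqrt(W), mode="same")[:N]

def structure_function(x, lags):
    """First-order structure function D_x(tau) = <[x(t) - x(t+tau)]^2>."""
    return np.array([np.mean((x[:-k] - x[k:]) ** 2) for k in lags])

def acf(x, lags):
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / (n - k) for k in lags])

lags = np.arange(1, 60)
d = structure_function(x, lags)
r0 = np.dot(x, x) / N              # R_x(0)
d_from_acf = 2.0 * (r0 - acf(x, lags))
```

Both estimates are computed from the same realization, so they agree to well within the sampling error, illustrating that for a WSS process the SF carries the same information as the ACF.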
Structure Function for a Random Walk

Consider a random walk [= shot noise with $h(t) = U(t)$]:
$$S(t) = \sum_i a_i h(t - t_i)$$
where $h(t) = U(t)$ = unit step function and the $t_i$ are Poisson distributed. One can show that if $\langle a_i \rangle = 0$ and $t_1 < t_2$, then
$$R_s(t_1, t_2) = \lambda \langle a^2 \rangle \int_0^\infty d\alpha \, U(t_1 - \alpha) U(t_2 - \alpha) = \lambda \langle a^2 \rangle \int_0^{t_1} d\alpha = \lambda \langle a^2 \rangle t_1.$$
So generally
$$R_s(t_1, t_2) = \lambda \langle a^2 \rangle t_<, \qquad t_< = \min(t_1, t_2).$$
Now compute the structure function (for $\tau > 0$):
$$D_x(\tau) = \langle [x(t) - x(t+\tau)]^2 \rangle = \langle x^2(t) \rangle + \langle x^2(t+\tau) \rangle - 2\langle x(t) x(t+\tau) \rangle = \langle a^2 \rangle [\lambda t + \lambda(t+\tau) - 2\lambda t]$$
$$D_x(\tau) = \lambda \langle a^2 \rangle \tau$$
There is dependence only on $\tau$: the random walk has stationary increments.
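The prediction $D_x(\tau) = \lambda \langle a^2 \rangle \tau$ can be checked by simulation. A sketch (our parameters: rate $\lambda = 5$, unit-variance Gaussian amplitudes so $\langle a^2 \rangle = 1$, averaged over realizations):

```python
import numpy as np

rng = np.random.default_rng(5)
lam, T, n_real, dt = 5.0, 200.0, 400, 0.5
taus = np.array([1.0, 2.0, 4.0, 8.0])
tgrid = np.arange(0.0, T, dt)

def random_walk(rng, lam, T, tgrid):
    """Shot noise with h(t) = U(t): steps of random amplitude at Poisson times."""
    n_ev = rng.poisson(lam * T)
    t_ev = np.sort(rng.uniform(0.0, T, n_ev))
    a = rng.standard_normal(n_ev)              # zero-mean amplitudes, <a^2> = 1
    idx = np.searchsorted(t_ev, tgrid, side="right")
    return np.concatenate(([0.0], np.cumsum(a)))[idx]   # sum of steps up to t

d = np.zeros(len(taus))
for _ in range(n_real):
    x = random_walk(rng, lam, T, tgrid)
    for k, tau in enumerate(taus):
        s = int(round(tau / dt))
        d[k] += np.mean((x[:-s] - x[s:]) ** 2)
d /= n_real
# Prediction: D(tau) = lam * <a^2> * tau, independent of absolute time t
```

The estimated $D$ grows linearly with $\tau$ even though the walk itself is nonstationary, which is the stationary-increments property in action.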
Allan Variance

Structure functions are most useful in analyses of nonstationary random processes. Why? Because they can remove trends in the data before calculating a second moment. SFs provide standardized statistics for phase and frequency instabilities in precision frequency sources (flicker noise = 1/f noise).

Let $\nu(t)$ be the time-dependent frequency of a clock or synthesizer and define the normalized frequency,
$$y(t) = \frac{\nu(t)}{\langle \nu \rangle}.$$
Define a running average frequency as
$$\bar y(t) = \frac{1}{\tau} \int_t^{t+\tau} dt' \, y(t').$$
Then the Allan variance is
$$\sigma_y^2(\tau) = \tfrac{1}{2} \langle [\bar y(t+\tau) - \bar y(t)]^2 \rangle.$$
This is proportional to the first-order structure function of the averaged frequency $\bar y$:
$$\sigma_y^2(\tau) = \tfrac{1}{2} D_{\bar y}(\tau).$$
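A minimal non-overlapping Allan variance sketch (discrete samples at unit spacing; the white-frequency-noise test case and all parameters are ours). For white frequency noise, $\sigma_y^2(\tau)$ should fall as $1/\tau$:

```python
import numpy as np

def allan_variance(y, m):
    """Non-overlapping Allan variance from fractional-frequency samples y
    taken at unit intervals; m = averaging length in samples (tau = m)."""
    n = (len(y) // m) * m
    ybar = y[:n].reshape(-1, m).mean(axis=1)   # consecutive tau-averages
    return 0.5 * np.mean(np.diff(ybar) ** 2)   # (1/2) <[ybar(t+tau) - ybar(t)]^2>

rng = np.random.default_rng(6)
y = rng.standard_normal(1_000_000)   # white frequency noise, unit variance

av1 = allan_variance(y, 1)     # expect ~1
av100 = allan_variance(y, 100) # expect ~1/100
```

The $1/\tau$ slope on a log-log plot of $\sigma_y(\tau)$ is exactly how noise types (white FM, flicker, random-walk FM) are diagnosed from curves like those in Figure 2.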
Figure 2: Plot of the square root of the Allan variance for a few frequency standards (left) and contributions to the Allan variance in a typical frequency standard (right).
Higher-Order Structure Functions

$D_x(\tau)$ removes any mean component and thus acts like a first derivative or difference. The first increment is
$$\Delta_x^{(1)}(t, \tau) = x(t+\tau) - x(t).$$
The second-order increment is
$$\Delta_x^{(2)}(t, \tau) = x(t+2\tau) - 2x(t+\tau) + x(t),$$
whose second moment is the second-order structure function,
$$D_x^{(2)}(\tau) \equiv \langle [\Delta_x^{(2)}(t, \tau)]^2 \rangle.$$
It removes a ramp function in a time series just as the first-order structure function removes the mean.
Continuing, the $m$th increment of the phase (following Rutman 1978) has a variance that is the $m$th-order structure function:
$$\Delta_x^{(m)}(t, \tau) \equiv \sum_{l=0}^{m} (-1)^l \binom{m}{l} x[t + (m-l)\tau]$$
$$D_x^{(m)}(t, \tau) \equiv \langle [\Delta_x^{(m)}(t, \tau)]^2 \rangle$$
The $m$th increment is useful for identifying a deterministic $t^m$ power-law term in a time series and for identifying step functions in the $(m-1)$th derivative: $\Delta_x^{(m)} = 0$ if $x$ is a polynomial of order $p < m$, and $\Delta_x^{(m)}$ is independent of $t$ for $p = m$. Step functions in the $k$th derivative of $x$ have increments $\Delta_x^{(k+1)}(t, \tau)$ that are pulses in time $t$ comprising piecewise polynomials (of order $k$) with an amplitude $a_k \tau^k$ (where $a_k$ is the amplitude of the step function). For example, a step function in $\dot x$ yields a triangular pulse $\Delta_x^{(2)} = \dot x \, \tau (1 - |t|/\tau)$ for $|t| \le \tau$. Correspondingly, the $m$th-order structure function may be time invariant whereas $x$ itself may be nonstationary.
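The polynomial-annihilation property is mechanical to verify. A sketch (the helper name and test polynomial are ours) showing that for a quadratic, the second increment is constant in $t$ (order $p = m$) while the third increment vanishes (order $p < m$):

```python
import numpy as np
from math import comb

def increment(x, tau, m):
    """m-th order increment: sum_{l=0}^m (-1)^l C(m,l) x[t + (m-l)*tau],
    evaluated on an equally spaced series with tau in samples."""
    n = len(x) - m * tau
    out = np.zeros(n)
    for l in range(m + 1):
        out += (-1) ** l * comb(m, l) * x[(m - l) * tau : (m - l) * tau + n]
    return out

t = np.arange(100, dtype=float)
quadratic = 3.0 + 2.0 * t + 0.5 * t ** 2   # polynomial of order p = 2

d2 = increment(quadratic, 5, 2)   # p = m: constant in t (here 0.5 * 2 * tau^2 = 25)
d3 = increment(quadratic, 5, 3)   # p < m: identically zero
```

This is why the $m$th-order structure function of a process with a deterministic $t^m$ trend can still be time invariant: the increment reduces the trend to a constant before squaring.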
Utility of Structure Functions in Wave Propagation

Modeling and simulations of propagation through random phase screens are commonly done for studies of propagation through the atmosphere, the ionosphere, the interplanetary medium, and the interstellar medium. Though these media are quite different physically, the underlying mathematics of wave propagation is the same. Consider the simple case of a plane wave propagating through a thin screen that changes the electromagnetic (EM) phase randomly:

plane wave $e^{ikz}$ → phase screen → altered phase fronts $e^{i[kz - \omega t + \phi(x)]}$ → observation plane
The screen phase can be written as
$$\phi(x) = k \int dz \, \delta n_r(x), \qquad k \equiv 2\pi/\lambda,$$
where $\delta n_r(x)$ describes the variable part of the index of refraction. The (scalar) electric field emerging from the screen is
$$E(x, z) = e^{i[kz - \omega t + \phi(x)]}.$$
The autocorrelation function of the field is (letting $x_1 = x$ and $x_2 = x + b$)
$$R_E(x_1, x_2) = \langle E(x, z) E^*(x + b, z) \rangle = \langle e^{i[\phi(x) - \phi(x+b)]} \rangle = \langle e^{i \Delta\phi(x, b)} \rangle.$$
The phase difference $\Delta\phi(x, b)$ is a random process. By inspection, you might describe $R_E(x_1, x_2)$ as the characteristic function of the phase difference $\Delta\phi$ evaluated at $\omega = 1$.

Now assume that $\phi(x)$ is a Gaussian process with stationary statistics. Since it is a spatially varying quantity, it is said to have homogeneous statistics. This means that $\Delta\phi$ is also a Gaussian process. Why? (A difference of jointly Gaussian variables is itself Gaussian.)
For a general Gaussian random variable $Y$ with zero mean, the characteristic function is
$$\Phi_Y(\omega) = \langle e^{i\omega Y} \rangle = e^{-\frac{1}{2} \omega^2 \sigma^2}.$$
We can apply this result to the correlation function defined above to get
$$R_E(x_1, x_2) \to R_E(b) = e^{-\frac{1}{2} \langle [\Delta\phi(b)]^2 \rangle}.$$
We can rewrite this in terms of the phase structure function, defined as
$$D_\phi(b) \equiv \langle [\Delta\phi(b)]^2 \rangle,$$
so that
$$R_E(b) = e^{-\frac{1}{2} D_\phi(b)}.$$
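The relation $R_E(b) = e^{-D_\phi(b)/2}$ can be checked against a simulated Gaussian phase screen. A one-dimensional sketch (screen construction, rms phase, and baseline are our choices): the empirical field correlation $\langle e^{i\Delta\phi} \rangle$ should match the characteristic-function prediction built from $D_\phi(b)$ alone.

```python
import numpy as np

rng = np.random.default_rng(7)
# 1-D Gaussian phase screen: smoothed white noise, scaled to rms phase sigma_phi
N, W, sigma_phi = 200_000, 50, 1.2
phi = np.convolve(rng.standard_normal(N + W),
                  np.ones(W) / np.sqrt(W), mode="same")[:N]
phi *= sigma_phi / phi.std()

b = 10                                     # baseline in samples (b < screen scale W)
dphi = phi[:-b] - phi[b:]                  # phase difference across baseline b
R_E = np.mean(np.exp(1j * dphi))           # empirical field correlation <e^{i dphi}>
D_phi = np.mean(dphi ** 2)                 # phase structure function at b
pred = np.exp(-0.5 * D_phi)                # Gaussian prediction exp(-D_phi/2)
```

The imaginary part of `R_E` is consistent with zero (the phase difference is symmetric about zero), and the real part follows the $e^{-D_\phi/2}$ law.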
Propagation to the observation plane: Usually one wants to know what the wave field is in a plane downstream of the screen. A fundamental result is that the field correlation function $R_E(b)$ propagates unaltered from the screen to the observer. This can be shown using the wave equation (see the literature). Second, the van Cittert-Zernike theorem relates $R_E(b)$ to the apparent image of the source of plane waves, $I(\theta)$, as a Fourier-transform pair:
$$R_E(b) \leftrightarrow I(\theta).$$
Example: Suppose the phase structure function has a square-law form,
$$D_\phi(b) = 2\sigma_\phi^2 \left( \frac{b}{b_1} \right)^2.$$
When $\sigma_\phi \gg 1$, the field correlation function is Gaussian in form and then so too is the image. The wave propagation describes scattering, and this result tells us that the scattered image has a Gaussian shape. For the atmospheric case, the scattered image $I(\theta)$ is called the seeing disk.

Note that all of the above involves ensemble averages. Individual realizations of scattered sources show speckles that average into the Gaussian shape. All of the above is also one-dimensional. It is straightforward to extend the results to two-dimensional screens.