Large sample distribution for fully functional periodicity tests

Similar documents
TESTING FOR PERIODICITY IN FUNCTIONAL TIME SERIES

Extremogram and ex-periodogram for heavy-tailed time series

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Extremogram and Ex-Periodogram for heavy-tailed time series

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case

Robust Backtesting Tests for Value-at-Risk Models

Inference For High Dimensional M-estimates. Fixed Design Results

Can we do statistical inference in a non-asymptotic way? 1

Adjusted Empirical Likelihood for Long-memory Time Series Models

A Conditional Approach to Modeling Multivariate Extremes

Robust Testing and Variable Selection for High-Dimensional Time Series

Estimation of large dimensional sparse covariance matrices

Bootstrapping high dimensional vector: interplay between dependence and dimensionality

Lecture 5: Hypothesis tests for more than one sample

Chapter 4 - Fundamentals of spatial processes Lecture notes

Robustní monitorování stability v modelu CAPM

Nonlinear Time Series Modeling

Inference For High Dimensional M-estimates: Fixed Design Results

Inference on distributions and quantiles using a finite-sample Dirichlet process

Hypothesis Testing For Multilayer Network Data

STAT 730 Chapter 5: Hypothesis Testing

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Guarding against Spurious Discoveries in High Dimension. Jianqing Fan

Cross-Validation with Confidence

Extreme inference in stationary time series

First Year Examination Department of Statistics, University of Florida

Weakly dependent functional data. Piotr Kokoszka. Utah State University. Siegfried Hörmann. University of Utah

A Multiple Testing Approach to the Regularisation of Large Sample Correlation Matrices

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

Econometría 2: Análisis de series de Tiempo

Algorithms for Uncertainty Quantification

Construction of an Informative Hierarchical Prior Distribution: Application to Electricity Load Forecasting

Long Memory through Marginalization

A Spatio-Temporal Point Process Model for Ambulance Demand

Stat 700 HW2 Solutions, 9/25/09

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

Announcements Monday, September 18

Supremum of simple stochastic processes

Lecture 3. Inference about multivariate normal distribution

Negative Association, Ordering and Convergence of Resampling Methods

Model Counting for Logical Theories

High-dimensional Ordinary Least-squares Projection for Screening Variables

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

You can compute the maximum likelihood estimate for the correlation

The problem is to infer on the underlying probability distribution that gives rise to the data S.

Discussion of Hypothesis testing by convex optimization

STAT 461/561- Assignments, Year 2015

STA 294: Stochastic Processes & Bayesian Nonparametrics

Multivariate Time Series: VAR(p) Processes and Models

Part III Example Sheet 1 - Solutions YC/Lent 2015 Comments and corrections should be ed to

Stein s Method and Characteristic Functions

the long tau-path for detecting monotone association in an unspecified subpopulation

Cross-Validation with Confidence

On corrections of classical multivariate tests for high-dimensional data

On the Power of Tests for Regime Switching

Max stable Processes & Random Fields: Representations, Models, and Prediction

Overview of Spatial Statistics with Applications to fmri

High-dimensional covariance estimation based on Gaussian graphical models

Multiscale Adaptive Inference on Conditional Moment Inequalities

Minimax Estimation of Kernel Mean Embeddings

On detection of unit roots generalizing the classic Dickey-Fuller approach

Problem Set 1 Solution Sketches Time Series Analysis Spring 2010

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

Spring 2012 Math 541B Exam 1

10708 Graphical Models: Homework 2

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing

Lecture 32: Asymptotic confidence sets and likelihoods

GARCH Models Estimation and Inference

Limit theorems for dependent regularly varying functions of Markov chains

Circle a single answer for each multiple choice question. Your choice should be made clearly.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

SDS : Theoretical Statistics

Emma Simpson. 6 September 2013

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Multivariate Analysis and Likelihood Inference

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Chapter 7, continued: MANOVA

Solar Spectral Analyses with Uncertainties in Atomic Physical Models

Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models

Large-Scale Multiple Testing of Correlations

Extreme Value Analysis and Spatial Extremes

If we want to analyze experimental or simulated data we might encounter the following tasks:

Detection of structural breaks in multivariate time series

PREDICTING SOLAR GENERATION FROM WEATHER FORECASTS. Chenlin Wu Yuhan Lou

Optimal Estimation of a Nonsmooth Functional

Lecture 6: Single-classification multivariate ANOVA (k-group( MANOVA)

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

STA 732: Inference. Notes 10. Parameter Estimation from a Decision Theoretic Angle. Other resources

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems

Least squares problems

A comparison of four different block bootstrap methods

A framework for adaptive Monte-Carlo procedures

Purchasing power parity: A nonlinear multivariate perspective. Abstract

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

A Test of Cointegration Rank Based Title Component Analysis.

Math 181B Homework 1 Solution

Symmetry and Separability In Spatial-Temporal Processes

Transcription:

Large sample distribution for fully functional periodicity tests Siegfried Hörmann Institute for Statistics Graz University of Technology Based on joint work with Piotr Kokoszka (Colorado State) and Gilles Nisol (ULB) Clément Cerovecki (ULB) and Vaidotas Characiejus (ULB) Sept 21, 2018

Pollution data N = 167 observations of PM10 measured in Graz (Austria) from 10/1/15 to 03/15/16. 3 4 5 6 7 8 9 MO TU WE TH FR SA SU Is there a significant difference in the daily average w.r.t. day of week?

Pollution data We use ANOVA: H 0 : µ MO = = µ SU v.s. H A : H 0 doesn t hold. We obtain p-value of 0.72 = do not reject H 0.

Pollution data Same time period with NO (nitrogen monoxide). 2 4 6 8 10 12 14 MO TU WE TH FR SA SU We obtain p-value of 0.25 = do not reject H 0.

Dependence Autocorrelation functions of daily averages of PM10 and NO. Series PM10 Series NO ACF 0.0 0.2 0.4 0.6 0.8 1.0 ACF 0.0 0.2 0.4 0.6 0.8 1.0 0 5 10 15 20 Lag 0 5 10 15 20 Lag

Functional data PM10 data in Graz are available in 30 minutes resolution. = Functional time series Goal: check whether there is underlying periodic signal PM10 20 40 60 80 0 2 4 6 8 10 Time

Weekday averages Intraday mean curves from half-hourly observations. PM10 Mean curves 4.5 5.0 5.5 6.0 6.5 PM10 NO 4 6 8 10 NO mean curves Monday Tuesday Wednesday Thursday Friday Saturday Sunday Time

Three models for periodic functional data Y t (u) = µ(u) + [α cos(tθ) + β sin(tθ)] w 0 (u) +Z t (u), (1) Y t (u) = µ(u) + s t w 0 (u) +Z t (u), (2) Y t (u) = µ(u) + w t (u) +Z t (u), (3) with and s t = s t+d and w t (u) = w t+d (u) d s t = 0 and d w t (u) = 0. t=1 t=1

Settings Part 1: Part 2: Noise: θ is known. θ is unknown. We start with Z t i.i.d. N (0, Γ). Later we generalize to very general stationary processes.

Projections Choose a set of functions v 1 (u),..., v p (u) and compute Y tk = Y t (u)v k (u)du, which gives rise to the vector models Y t = µ + [α cos(tθ) + β sin(tθ)] w 0 + Z t ; Y t = µ + s t w 0 + Z t ; Y t = µ + w t + Z t.

LR tests for projections Let us assume that d is odd and set q = (d 1)/2, ϑ k = 2πk d. Theorem Assume Σ = Var(Z 1 ) > 0 known. LR-test statistics are: T M 1 := A(ϑ1 )Σ 1 A (ϑ 1 ) T M 2 := A(ϑ1,..., ϑ q )Σ 1 A (ϑ 1,..., ϑ q ) T M 3 := A(ϑ1,..., ϑ q )Σ 1 A (ϑ 1,..., ϑ q ) tr Here A(θ i1,..., θ ik ) = [ R(θ i1 ),..., R(θ ik ), C(θ i1 ),..., C(θ ik ) ] where R(θ) + ic(θ) = 1 N Y k e ikθ. N k=1

Model 3: relation to MANOVA Note that where T M 3 = A(ϑ 1,..., ϑ q )Σ 1 A (ϑ 1,..., ϑ q ) tr T MAV T MAV = d k=1 N d (Y k Y ) Σ 1 (Y k Y ), where Y k = 1 n n t=1 Y (t 1)d+k, 1 k d. (Balanced design).

Remarks Approach is easy and purely multivariate. In practice we replace Σ by estimator. NOTE: If we plug in Σ A this test is asymptotically equivalent to Wilks Lambda. For all three tests explicit null-distributions are available. The downside of this method: we need to have a good basis which picks up the signal.

Remarks The standard multivariate test of MacNeill for Model 1 is S M1 := D N (ϑ 1 ) 2 = A(ϑ 1 )Σ 1 A (ϑ 1 ) tr. (Here D N is the DFT of standardized data.)

Rescaling The tests are computed from the standardized data = asymptotically pivotal tests; Σ 1/2 Y 1,..., Σ 1/2 Y N. = component processes with larger variation are not concealing potential periodic patterns in components with little variance. While this is clearly a very desirable property for multivariate data samples, one may favor a different perspective for functional data.

Fully functional approach Let A(θ i1,..., θ ik ) be a 2k-vector of functions computed from functional DFT 1 N Y t (u)e itθ. N t=1 Fully functional test for Model (3) ANOVA model: Again T F 3 := A(ϑ 1,..., ϑ q )A (ϑ 1,..., ϑ q ) tr. T F3 T FAV = d k=1 N d Y k Y 2. Asymptotic distribution of T F3 (functional of DFT s) can be derived and then carried over to T FAV.

Theorem Under L 2 -m-approximability the functional ANOVA test statistic satisfies (d 1)/2 d N d Y k Y 2 d 2 Ξ k, k=1 k=1 where Ξ k ind. HExp(λ 1 (ϑ k ), λ 2 (ϑ k ),...), and λ l (ϑ k ) are the eigenvalues of F ϑk. Remarks: 1. non-pivotal test; 2. consistent estimation of spectral density operator and its eigenvalues is possible; 3. good empirical approximation.

Empirical levels Synthetic but realistic data: PM-residuals resampled. Used to generate a functional MA(5) process. 1.000 repetitions. α = 5% α = 10% T M 1 T M 2 T M 3 T F 3 T M 1 T M 2 T M 3 T F 3 FF 5.0 8.3 4.7 9.9 p = 1 5.0 3.9 3.9 9.6 8.7 8.7 5.9 4.4 4.4 9.8 9.9 9.9 p = 2 6.4 4.3 3.9 11.2 8.7 8.0 6.0 5.4 4.1 10.6 9.8 9.1 p = 3 6.8 3.8 3.9 12.2 7.9 6.9 5.8 5.5 4.2 10.6 9.6 7.9 p = 5 7.4 6.4 6.2 15.5 11.6 11.4 6.7 6.4 5.7 12.5 12.2 10.7 Table 1: Empirical size (in %) at the nominal level α of 5% and 10% for dependent time series with sample sizes N = 210 (top rows) and N = 420 (bottom rows).

Power To study a realistic alternative we add back the weekday means to previous time series. The signal-to-noise ratio in this example is approx 1/30. N = 210 N = 420 T M1 T M2 T M3 T F3 T M1 T M2 T M3 T F3 FF 72.9 99.9 p = 1 14.3 26.5 26.5 21.7 56.6 56.6 p = 2 50.2 89.4 89.9 76.9 99.7 99.8 p = 3 73.4 96.4 98.1 92.2 100 100 p = 5 99.22 100 100 100 100 100 Table 2: Empirical rates (in %) when testing at nominal level α of 5%. Results are based on 1,000 Monte Carlo simulation runs.

Real data tests for NO data T M1 T M2 T M3 T F3 FF (100%) 0.006 v(u) 1 0.305 0.099 0.099 p = 1 (68%) 0.247 0.076 0.076 p = 2 (81%) 0.495 0.204 0.172 p = 3 (87%) < 10 5 < 10 5 < 10 5 p = 5 (97%) < 10 5 < 10 5 < 10 5 Table 3: The p-values for NO data. In parentheses, the percentage of variance explained by the first p principal components on which the curves are projected.

Uniform Test Assume now that we would like to test for a periodic pattern when the frequency θ is unknown. We consider a test based on M N = max D N(θ j ) 2 = max A(θ j)a (θ j ) tr. j=1,...,q j=1,...,q

Scalar setting In the scalar setting, we know that M N log q d G, q N/2 (4) where G is the standard Gumbel distribution. Simple explanation: If data are iid Gaussian, then D N (θ j ) 2 iid Exp(1). Without Gaussianity: we typically only have that ( DN (ω 1 ) 2,..., D N (ω q ) 2) d (E 1,..., E q ), E j iid Exp(1) if q is fixed. Nevertheless, Davis and Mikosch (1999) obtained (4) for i.i.d. sequences and linear processes, and later to a more general notion of dependence in Lin and Liu (2009).

Main results For p 1, we define M p N = max j=1,...,q D p N (ω j) 2, where D p K.L. N N (ω) = N 1/2 t=1 k=1 p X t, v k v k e itω and the v k s are the eigenfunctions of the operator C 0, associated to eigenvalues λ 1 > > λ p. Theorem 1. Theorem 2. M p N bp N λ 1 d G, for a fixed p 1 M p N N b p N N λ 1 d G, for some p = pn + Theorem 3. M N b N λ 1 d G where b p N = λ 1 log q λ p 1 j=2 log(1 λ j/λ 1 ) b N = lim p + bp N λ 1 log q + E X 0 2

Idea of proof (with λ 1 = 1) Write M N b N = M N M p N }{{} A 1 p + MN bp N }{{} A 2 p + bn b N. }{{} A 3 Obviously A 3 0 when p. We can show that A 1 0 if p fast enough. The challenge is A 2! We will need that p grows slowly enough and hope that the intersection of sequences p = p N in A 2 and A 1 is not empty.

Idea of proof Let M p N be computed as Mp N, with a Gaussians sequence Y 1,..., Y N such that Var(Y 1 ) = Var(X 1 ), and we define ρ N,p = sup P(M p N x) P( M p N x) x R Then P ( M p N bp N x) e e x ρn,p + P( M p N bp N x) e e x. }{{}

Gaussian approximation We have that where {M p N x} = {S X N A}, X t, v 1 cos(ω 1 ) X t, v 1 sin(ω 1 ) X t =. R2pq X t, v p sin(ω q ) (2pq Np) and A is 2p sparsely convex, i.e. finite intersections of sets whose indicator function depends on at most 2p components.

Gaussian approximation We use a result of Chernozhukov, Chetverikov, and Kato (2017) sup P(SN X A) P(S N Y A), A A sp where A sp is the class of s sparsely convex subsets of R d. We had to work a bit around since in this bound there was one constant left, which depends (translated into our setup) in an unspecified way on λ p. Doesn t matter if p is fixed! In the end we deduce that ρ N,p p3 log(n) λ 1/2 p N. 1/6

Conclusion of the proof In particular Th. 3 holds if: o λ k ρ k, ρ < 1 and E X 1 s <, with s > 10. (p N log(n)) o λ k 1 k ν, ν > 1 and E X 1 s <, with s very big. (p N N γ )

Comments o All the results are still valid when b N is replaced by b n. o We work under E X 1 4 <, but when p is fixed, we can use a truncation argument to relax it to E X 1 2+ɛ <. Hence, we have extended the result of Davis and Mikosch (1999). o If p is fixed, we have considered the dependent case, i.e. of multivariate linear processes X t = l=1 Ψ l(z t l ), where the Ψ l s are matrices and (Z t ) t Z is an i.i.d. sequence in H. In that case, we rather work with 1 2 L N = max F ω j D N (ω j ) 2 j=1,...,q

Uniform test Consider a functional time series X t = s t + ε t where t s t is periodic of unknown frequency ω and (ε t ) t Z is some iid noise in H. We wish to test whether { H 0 : s t = 0 t, H 1 : t such that s t > 0 From the previous result, we deduce that under H 0, T = M N b N λ 1 d G. On the other hand, we have that T P + under H 1.

Simulation We consider 3 different alternative hypothesis: o low frequency, s (1) t =.5 cos(t 2π 40 ) o high frequency, s (2) o mixture, s (3) t t =.5 cos(t 2π 7 ) =.3 cos(t 2π 2π 40 ) +.3 cos(t 20 ) +.3 cos(t 2π 7 ) 4 2 0 2 4 0 5 10 15 20 25 time

Simulations Rejection rates under the null and 3 different alternatives at level α = 0.05 with increasing sample size, computed with 1000 replications. N = 100 N = 200 N = 500 N = 1000 X (0) 0.06 0.05 0.06 0.06 X (1) 0.18 0.87 0.97 1.00 X (2) 0.31 0.49 0.97 1.00 X (3) 0.14 0.39 0.87 1.00 X (i) t = s (i) t + ε t, t = 1,..., n with s (0) t := 0, and the curves ε t are bootstrapped from the PM 10 data set.

Simulations We set ω = argmax D N (ω j ) 2 j=1,...,q X (3) X (2) 0.0 1.0 2.0 3.0 0.0 1.0 2.0 3.0 X (1) 0.0 1.0 2.0 3.0 X (0) 0.0 1.0 2.0 3.0 n=100 n=200 n=500 n=1000

Summary We have proposed several tests for periodicity of functional data. We distinguish between projection based approach and fully functional approach. Fully functional approach theoretically and practically appealing. Projection based approach very powerful, but needs tuning. We developed our tests from LR approach which improves also multivariate theory. Extension to dependent data. Extension to unknown length of period.