
Appendix to: Hypothesis Testing for Multiple Mean and Correlation Curves with Functional Data

Ao Yuan^1, Hong-Bin Fang^1, Haiou Li^1, Colin O. Wu^2, Ming T. Tan^1

^1 Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University Medical Center, Washington DC 20057, USA
^2 Office of Biostatistics Research, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA

Appendix I. Description of the simulation.

1 Simulation

To investigate the finite-sample properties of the proposed methods, and to compare them with the commonly used local linear smoothing (loess) and spline methods, we present here several simulation studies designed to mimic practical situations with moderate sample sizes. We consider separately the tests for the equality of two mean curves and the tests for the correlation function between two stochastic processes. For each case, the simulation is based on 5000 replications, and the mean values of the estimates over the replications are reported.

1.1 Testing the Equality of Mean Curves

1.1.1 Simulation for Test with Two-Sided Alternatives

For testing the hypotheses (9), we generate the observations $\{X_{n_1}, Y_{n_2}, (X_{n_{xy}}, Y_{n_{xy}})\}$ of $\{X(t), Y(t) : t \in \mathcal{T}\}$ using the data structure (2) with $n_1 = n_2 = 50$ and $n_{xy} = 20$ on $k(n) = 50$ equally spaced time points $\{t_j = j : j = 1, \dots, 50\}$, so that $n_x = n_y = 30$.

For subjects with only $X(t)$ or $Y(t)$ observed, we generate

\[
X_i(t_j) + \epsilon_i(t_j) = \mu(t_j) + r_i \sin(8 + t_j/10)/30 + N\big(0, \sigma^2(t_j)\big), \qquad \mu(t) = \big[(t + E[r]) \sin(8 + t/10)\big]/30,
\]

where the $r_i$'s and $r$ are iid random integers uniformly distributed on $\{1, \dots, 50\}$, used to make the curves look more wiggly; similarly,

\[
Y_i(t_j) + \xi_i(t_j) = \eta(t_j) + C r_i \cos(8 + t_j/10)/100 + N\big(0, \sigma^2(t_j)\big), \qquad \eta(t) = \mu(t) + C E[r] \cos(8 + t/10)/100,
\]

where $\sigma^2(t) = 0.001\, t$ and $C$ is a constant characterizing the difference between $\mu(t)$ and $\eta(t)$.

For subjects with $(X(t), Y(t))$ observed, we generate

\[
\Big\{ \big(X_i(t_j) + \epsilon_i(t_j),\; Y_i(t_j) + \xi_i(t_j)\big)^T \sim N\Big(\big(\mu(t_j), \eta(t_j)\big)^T, \Sigma(t_j)\Big) : i = n_x + 1, \dots, n_1 \Big\},
\]

where the covariance matrix $\Sigma(t)$ is composed of the variance $\sigma^2(t)$ and the correlation coefficient $\rho(t) = 0.01\,(t/50)$.

1.1.2 Simulation for Test with One-Sided Alternatives

For testing the hypotheses (10), we generate the observations $\{X_{n_1}, Y_{n_2}, (X_{n_{xy}}, Y_{n_{xy}})\}$ of $\{X(t), Y(t) : t \in \mathcal{T}\}$ using the same method as in Section 5.1.1, except that $Y_i(t_j) + \xi_i(t_j)$ is replaced with $Y_i(t_j) + \xi_i(t_j) \sim N\big(\eta^*(t_j), \sigma^2(t_j)\big)$, where $\eta^*(t) = \mu(t) + C^*\big[(t + E[r]) \cos(8 + t/3)\big]/100$. Here $C^*$, which plays a similar role as $C$, is a constant characterizing the difference between $\mu(t)$ and $\eta^*(t)$.
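To make the design concrete, the following is a minimal numpy sketch of one replication of the scheme above, under our reading of the transcribed constants ($n_{xy} = 20$, $\sigma^2(t) = 0.001\,t$, $\rho(t) = 0.01\,(t/50)$); the variable names are ours, and the paired curves are drawn by a standard Gaussian-copula construction, which the paper does not prescribe.

```python
import numpy as np

rng = np.random.default_rng(0)

n1 = n2 = 50                 # X- and Y-sample sizes
n_xy = 20                    # subjects with both curves observed
n_x = n_y = n1 - n_xy        # subjects with only X (resp. Y) observed
t = np.arange(1.0, 51.0)     # k(n) = 50 time points t_j = j
C = 1.0                      # mean-difference constant (C = 0 under H0 of (9))

E_r = 25.5                   # E[r] for r uniform on {1,...,50}
mu = (t + E_r) * np.sin(8 + t / 10) / 30
eta = mu + C * E_r * np.cos(8 + t / 10) / 100
sd = np.sqrt(0.001 * t)      # sigma(t) from sigma^2(t) = 0.001 t

# Subjects with only one curve observed, with the integer "wiggle" terms r_i.
r_x = rng.integers(1, 51, size=(n_x, 1))
X_only = mu + r_x * np.sin(8 + t / 10) / 30 + rng.normal(0, sd, (n_x, 50))
r_y = rng.integers(1, 51, size=(n_y, 1))
Y_only = eta + C * r_y * np.cos(8 + t / 10) / 100 + rng.normal(0, sd, (n_y, 50))

# Paired subjects: bivariate normal at each t_j with correlation rho(t).
rho = 0.01 * (t / 50)
z1 = rng.normal(size=(n_xy, 50))
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.normal(size=(n_xy, 50))
X_pair = mu + sd * z1
Y_pair = eta + sd * z2
```

Setting $C = 0$ draws from the null of (9); the one-sided design of Section 1.1.2 only replaces $\eta(t)$ by $\eta^*(t)$ in the Y-draws.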

1.2 Testing Correlation Functions

1.2.1 Simulation for Test with Two-Sided Alternatives

For simplicity, each of our simulated samples contains $n_1 = n_2 = n_{xy} = 50$ subjects observed on $k(n) = 50$ time points $\{t_j = j : j = 1, \dots, 50\}$, so that, in view of (2), the sample contains only the paired observations $\{(X_i(t_j) + \epsilon_i(t_j),\, Y_i(t_j) + \xi_i(t_j))^T : i = 1, \dots, 50;\ j = 1, \dots, 50\}$. For testing the hypotheses in (15) using the test statistic $S_n$, we generate in each sample $X_i(t_j) + \epsilon_i(t_j) \sim N\big(\mu(t_j), \sigma_1^2(t_j)\big)$, where $\mu(t) = t \sin(8 + t/10)/30$ and $\sigma_1^2(t) = 0.01\, t$, and, conditional on $X_i(t_j) = x_i(t_j)$, $Y_i(t_j) + \xi_i(t_j)$ has the conditional normal distribution

\[
Y_i(t_j) + \xi_i(t_j) \,\big|\, \big(x_i(t_j) + \epsilon_i(t_j)\big) \sim N\Big( \mu(t_j) + \rho(t_j)\big[\sigma_2(t_j)/\sigma_1(t_j)\big]\big[x_i(t_j) + \epsilon_i(t_j) - \mu(t_j)\big],\; \big(1 - \rho^2(t_j)\big)\sigma_2^2(t_j) \Big), \tag{29}
\]

where $\sigma_2^2(t) = 0.01\, t$ and $\rho(t) = \rho^* \sin(8 - t/10)$ for some $\rho^* \ge 0$. Here, for any $t \in \{1, \dots, 50\}$, $\rho(t)$ is the true correlation coefficient between $X_i(t)$ and $Y_i(t)$, and $\rho^*$ determines the difference of the correlation curve $R(t)$ from zero.
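The conditional-normal construction in (29) is equally direct to code. A minimal sketch, assuming an illustrative value of $\rho^*$ (the paper varies it) and our reading of the phase in $\rho(t)$:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 50                     # n1 = n2 = n_xy = 50 paired subjects
t = np.arange(1.0, 51.0)   # 50 time points t_j = j
rho_star = 0.3             # illustrative rho* >= 0; rho* = 0 gives H0 of (15)

mu = t * np.sin(8 + t / 10) / 30
sd1 = np.sqrt(0.01 * t)    # sigma_1(t)
sd2 = np.sqrt(0.01 * t)    # sigma_2(t)
rho = rho_star * np.sin(8 - t / 10)

# X_i(t_j) + eps_i(t_j) ~ N(mu(t_j), sigma_1^2(t_j))
X = mu + sd1 * rng.normal(size=(n, 50))

# Y given X, following the conditional distribution in (29)
cond_mean = mu + rho * (sd2 / sd1) * (X - mu)
cond_sd = np.sqrt(1 - rho**2) * sd2
Y = cond_mean + cond_sd * rng.normal(size=(n, 50))

# Sanity check: the pointwise sample correlations should track rho(t).
emp_rho = np.array([np.corrcoef(X[:, j], Y[:, j])[0, 1] for j in range(50)])
```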

Appendix II. Proofs.

Proof of Theorem 1. (i). Since, by (3) and (4), $X_{i,k(n)}(\cdot)$ and $Y_{i,k(n)}(\cdot)$ are two stochastic processes on $\mathcal{T}$, we denote by

\[
\mu_{k(n)}(t) = E[X_{i,k(n)}(t)] \quad \text{and} \quad \eta_{k(n)}(t) = E[Y_{i,k(n)}(t)] \tag{A.1}
\]

the expectations of the random variables $X_{i,k(n)}(t)$ and $Y_{i,k(n)}(t)$, respectively, for each fixed $t \in \mathcal{T}$. Then we have

\[
\sqrt{n_1}\big\{ [\hat\mu_1(t) - \mu(t)] - [\hat\eta_2(t) - \eta(t)] \big\}
= \sqrt{n_1}\big\{ [\hat\mu_1(t) - \mu_{k(n)}(t)] - [\hat\eta_2(t) - \eta_{k(n)}(t)] \big\}
+ \sqrt{n_1}\big\{ [\mu_{k(n)}(t) - \mu(t)] - [\eta_{k(n)}(t) - \eta(t)] \big\}.
\]

Note that, by (2), (4) and the assumption $E[\epsilon_i(t)] = 0$ for all $t$,

\[
\mu_{k(n)}(t) = \frac{(t_{j+1} - t)\, E[X_i(t_j) + \epsilon_i(t_j)] + (t - t_j)\, E[X_i(t_{j+1}) + \epsilon_i(t_{j+1})]}{t_{j+1} - t_j}
= \frac{(t_{j+1} - t)\, \mu(t_j) + (t - t_j)\, \mu(t_{j+1})}{t_{j+1} - t_j}, \quad t \in [t_j, t_{j+1}]. \tag{A.2}
\]

Thus, $\mu_{k(n)}(\cdot)$ is the linear interpolation of $\mu(\cdot)$ on the $[t_j, t_{j+1})$'s for $j = 0, \dots, k(n)$, with $t_0 = \inf\{t \in \mathcal{T}\}$ and $t_{k(n)+1} = \sup\{t \in \mathcal{T}\}$. Similarly, $\eta_{k(n)}(\cdot)$ is the linear interpolation of $\eta(\cdot)$ on $\mathcal{T}$.

By the assumptions that $\mu(\cdot)$ and $\eta(\cdot)$ are Lipschitz continuous on $\mathcal{T}$, which is bounded, it follows that $\mu_{k(n)}(t)$ and $\eta_{k(n)}(t)$ are uniformly continuous on $\mathcal{T}$. Thus, we have

\[
\inf_{s \in [t_j, t_{j+1})} \mu(s) \le \mu_{k(n)}(t) \le \sup_{s \in [t_j, t_{j+1})} \mu(s)
\quad \text{and} \quad
\inf_{s \in [t_j, t_{j+1})} \eta(s) \le \eta_{k(n)}(t) \le \sup_{s \in [t_j, t_{j+1})} \eta(s)
\]

for $t \in [t_j, t_{j+1})$, $j = 0, \dots, k(n)$. Let $\delta_{k(n)} = \max\{t_{j+1} - t_j : j = 0, 1, \dots, k(n)\}$. The assumption of first order Lipschitz continuity implies that there are $0 < c_1, c_2 < \infty$ such that

\[
\sup_{s, t \in \mathcal{T},\, |t - s| \le \delta_{k(n)}} |\mu(t) - \mu(s)| \le c_1 \delta_{k(n)}
\quad \text{and} \quad
\sup_{s, t \in \mathcal{T},\, |t - s| \le \delta_{k(n)}} |\eta(t) - \eta(s)| \le c_2 \delta_{k(n)}.
\]

Thus, by the condition $\sqrt{n_1}\, \delta_{k(n)} \to 0$, we get

\[
\sqrt{n_1} \sup_{t \in \mathcal{T}} \Big| \big[\mu_{k(n)}(t) - \mu(t)\big] - \big[\eta_{k(n)}(t) - \eta(t)\big] \Big| \le \sqrt{n_1}\,(c_1 + c_2)\, \delta_{k(n)} \to 0. \tag{A.3}
\]

Now, it suffices to show that, in $\ell^\infty(\mathcal{T})$,

\[
\sqrt{n_1}\big\{ [\hat\mu_1(\cdot) - \mu_{k(n)}(\cdot)] - [\hat\eta_2(\cdot) - \eta_{k(n)}(\cdot)] \big\} \overset{D}{\to} W(\cdot). \tag{A.4}
\]

To prove (A.4), it suffices to show that, in $\ell^\infty(\mathcal{T})$,

\[
\sqrt{n_1}\, [\hat\mu_1(\cdot) - \mu_{k(n)}(\cdot)] \overset{D}{\to} W_1(\cdot)
\quad \text{and} \quad
\sqrt{n_2}\, [\hat\eta_2(\cdot) - \eta_{k(n)}(\cdot)] \overset{D}{\to} W_2(\cdot) \tag{A.5}
\]

for some Gaussian processes $W_1(\cdot)$ and $W_2(\cdot)$. We will use Theorem 2.11.23 in van der Vaart and Wellner (1996, p. 221) to prove (A.5). We only show the first claim in (A.5); the proof of the second is the same. Denote $\tilde X_i(\cdot) = X_i(\cdot) + \epsilon_i(\cdot)$; then

\[
\hat\mu_1(t) = \frac{1}{n_1} \sum_{i=1}^{n_1} \frac{(t_{j+1} - t)\, \tilde X_i(t_j) + (t - t_j)\, \tilde X_i(t_{j+1})}{t_{j+1} - t_j}
:= \frac{1}{n_1} \sum_{i=1}^{n_1} g_{n,t}(\tilde X_i), \quad t \in [t_j, t_{j+1}],
\]

where $g_{n,t}$ is the linear interpolation functional, with knots $\{t_1, \dots, t_{k(n)}\}$, evaluated at $t \in \mathcal{T}$, and $\mu_{k(n)}(t) = E[g_{n,t}(\tilde X_i)] := P g_{n,t}(\tilde X_i)$. Denote by $P_{n_1}$ the empirical measure of $\tilde X_1, \dots, \tilde X_{n_1}$; then the first claim in (A.5) is written as

\[
n_1^{1/2} (P_{n_1} - P)\, g_{n,\cdot}(\tilde X_i) \overset{D}{\to} W_1(\cdot), \quad \text{in } \ell^\infty(\mathcal{T}). \tag{A.6}
\]

To show the above, we only need to check the conditions of Theorem 2.11.23. For any $s, t \in \mathcal{T}$, let $\rho(s,t) = |t - s|$; then $(\mathcal{T}, \rho)$ is a totally bounded semi-metric space. Let $\tilde X$ be an iid copy of the $\tilde X_i$'s, and define $\tilde Y_i$ and $\tilde Y$ similarly. Let $\mathcal{G}_n = \{g_{n,t}(\tilde X) : t \in \mathcal{T}\}$. Then $G_n = G = \sup_{t \in \mathcal{T}} [\tilde X^2(t) + \tilde Y^2(t)]^{1/2}$ is an envelope for $\mathcal{G}_n$. By the given condition $P G^2 < \infty$, so $P G_n^2 = P G^2 = O(1)$, and $P\big[G_n^2 I(G_n > \delta \sqrt{n})\big] = P\big[G^2 I(G > \delta \sqrt{n})\big] \to 0$ for every $\delta > 0$. Also, for every $\delta_n \downarrow 0$, by the given condition

\[
\sup_{|t - s| \le \delta_n} E\Big( \big[X(t) + \epsilon(t) - X(s) - \epsilon(s)\big]^2 + \big[Y(t) + \xi(t) - Y(s) - \xi(s)\big]^2 \Big) \to 0,
\]

we have

\[
\sup_{\rho(s,t) < \delta_n} P\big( g_{n,s}(\tilde X) - g_{n,t}(\tilde X) \big)^2 \le \sup_{\rho(s,t) < \delta_n} P\big( \tilde X(s) - \tilde X(t) \big)^2 \to 0.
\]

Thus (2.11.21) in van der Vaart and Wellner (1996, p. 220) is satisfied. Let $N_{[\,]}\big(\epsilon, \mathcal{G}_n, L_2(P)\big)$ be the number of $\epsilon$-brackets needed to cover $\mathcal{G}_n$ under the $L_2(P)$ metric. Since for each $n$ and $t$ there is one member $g_{n,t} \in \mathcal{G}_n$, let $l_{n,t} = u_{n,t} = g_{n,t}$; then $l_{n,t}(\tilde X) \le g_{n,t}(\tilde X) \le u_{n,t}(\tilde X)$ ($t \in \mathcal{T}$), and for all $\epsilon > 0$, $P\big( u_{n,t}(\tilde X) - l_{n,t}(\tilde X) \big)^2 = 0 < \epsilon^2 \|G\|^2_{L_2(P)}$, i.e., $(l_{n,t}, u_{n,t})$ is an $\epsilon$-bracket of $\mathcal{G}_n$ under the $L_2(P)$ norm. Hence we have $N_{[\,]}\big(\epsilon \|G\|_{L_2(P)}, \mathcal{G}_n, L_2(P)\big) = 1$, and thus

\[
\int_0^{\delta_n} \sqrt{\log N_{[\,]}\big(\epsilon \|G\|_{L_2(P)}, \mathcal{G}_n, L_2(P)\big)}\; d\epsilon \to 0, \quad \text{for every } \delta_n \downarrow 0.
\]

Now, by Theorem 2.11.23 in van der Vaart and Wellner (1996, p. 221), (A.6) is true.

Next we identify the weak limit $W(\cdot)$. For each fixed interpolation $g_{n,t} \in \mathcal{G}_n$, $g_{n,t}(\tilde X)$ is a random function in $t$, so $W(\cdot)$ is a process on $\mathcal{T}$. For each positive integer $k$ and fixed $(t_1, \dots, t_k)$, by the central limit theorem for double arrays,

$(W(t_1), \dots, W(t_k))^T$ is the weak limit of the vector $\sqrt{n_1}\big\{ [\hat\mu_1(t_j) - \mu_{k(n)}(t_j)] - [\hat\eta_2(t_j) - \eta_{k(n)}(t_j)] : j = 1, \dots, k \big\}$. So $(W(t_1), \dots, W(t_k))^T$ is a mean zero normal random vector, and by the uniform weak convergence shown above, $W(\cdot)$ is a Gaussian process on $\mathcal{T}$. Clearly $E[W(\cdot)] = 0$. The covariance function $R(s,t) = E[W(s)W(t)]$ is given by

\[
R(s,t) = \lim_{n \to \infty} n_1 \operatorname{Cov}\big\{ [\hat\mu_1(s) - \mu_{k(n)}(s)] - [\hat\eta_2(s) - \eta_{k(n)}(s)],\; [\hat\mu_1(t) - \mu_{k(n)}(t)] - [\hat\eta_2(t) - \eta_{k(n)}(t)] \big\}.
\]

Since

\begin{align*}
&\operatorname{Cov}\big\{ [\hat\mu_1(s) - \mu_{k(n)}(s)] - [\hat\eta_2(s) - \eta_{k(n)}(s)],\; [\hat\mu_1(t) - \mu_{k(n)}(t)] - [\hat\eta_2(t) - \eta_{k(n)}(t)] \big\} \\
&= \operatorname{Cov}\Big\{ \frac{1}{n_1}\sum_{i=1}^{n_x} \big[X_{i,k(n)}(s) - \mu_{k(n)}(s)\big] + \frac{1}{n_1}\sum_{i=n_x+1}^{n_1} \big[X_{i,k(n)}(s) - \mu_{k(n)}(s)\big]
- \frac{1}{n_2}\sum_{i=n_x+1}^{n_1} \big[Y_{i,k(n)}(s) - \eta_{k(n)}(s)\big] - \frac{1}{n_2}\sum_{i=n_1+1}^{n} \big[Y_{i,k(n)}(s) - \eta_{k(n)}(s)\big], \\
&\qquad\ \ \frac{1}{n_1}\sum_{i=1}^{n_x} \big[X_{i,k(n)}(t) - \mu_{k(n)}(t)\big] + \frac{1}{n_1}\sum_{i=n_x+1}^{n_1} \big[X_{i,k(n)}(t) - \mu_{k(n)}(t)\big]
- \frac{1}{n_2}\sum_{i=n_x+1}^{n_1} \big[Y_{i,k(n)}(t) - \eta_{k(n)}(t)\big] - \frac{1}{n_2}\sum_{i=n_1+1}^{n} \big[Y_{i,k(n)}(t) - \eta_{k(n)}(t)\big] \Big\} \\
&= \frac{1}{n_1} \operatorname{Cov}\big[X_{1,k(n)}(s), X_{1,k(n)}(t)\big] - \frac{n_{xy}}{n_1 n_2} \operatorname{Cov}\big[X_{1,k(n)}(s), Y_{1,k(n)}(t)\big]
- \frac{n_{xy}}{n_1 n_2} \operatorname{Cov}\big[Y_{1,k(n)}(s), X_{1,k(n)}(t)\big] + \frac{1}{n_2} \operatorname{Cov}\big[Y_{1,k(n)}(s), Y_{1,k(n)}(t)\big],
\end{align*}

we have

\begin{align*}
R(s,t) &= \lim_{n \to \infty} n_1 \Big\{ \frac{1}{n_1} \operatorname{Cov}\big[X_{1,k(n)}(s), X_{1,k(n)}(t)\big] - \frac{n_1 - n_x}{n_1 n_2} \operatorname{Cov}\big[X_{1,k(n)}(s), Y_{1,k(n)}(t)\big] \\
&\qquad\qquad - \frac{n_1 - n_x}{n_1 n_2} \operatorname{Cov}\big[Y_{1,k(n)}(s), X_{1,k(n)}(t)\big] + \frac{1}{n_2} \operatorname{Cov}\big[Y_{1,k(n)}(s), Y_{1,k(n)}(t)\big] \Big\} \\
&= \gamma R_{11}(s,t) - \gamma_1 \big[R_{12}(s,t) + R_{21}(s,t)\big] + \gamma_2 R_{22}(s,t).
\end{align*}
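In practice $R(s,t)$ is unknown, and the last display suggests a plug-in estimate that combines the empirical covariances $R_{11}, R_{12}, R_{21}, R_{22}$ with the sample-size ratios. A minimal numpy sketch under this reading; the function name is ours, and $\gamma, \gamma_1, \gamma_2$ (the ratio limits defined in the main text, which we do not restate) are passed in explicitly:

```python
import numpy as np

def R_hat(X_only, X_pair, Y_pair, Y_only, gamma, gamma1, gamma2):
    """Plug-in estimate of R(s,t) = gamma*R11 - gamma1*(R12 + R21) + gamma2*R22.

    X_only, Y_only : curves from subjects with only X (resp. Y) observed
    X_pair, Y_pair : curves from the n_xy paired subjects (matched rows)
    gamma, gamma1, gamma2 : sample-size ratio limits from the main text
    """
    X = np.vstack([X_only, X_pair])        # all n1 X-curves, shape (n1, m)
    Y = np.vstack([Y_pair, Y_only])        # all n2 Y-curves, shape (n2, m)
    R11 = np.cov(X, rowvar=False)          # Cov[X(s), X(t)]
    R22 = np.cov(Y, rowvar=False)          # Cov[Y(s), Y(t)]
    Xc = X_pair - X_pair.mean(axis=0)      # centered paired curves
    Yc = Y_pair - Y_pair.mean(axis=0)
    R12 = Xc.T @ Yc / (len(X_pair) - 1)    # Cov[X(s), Y(t)], paired subjects only
    return gamma * R11 - gamma1 * (R12 + R12.T) + gamma2 * R22
```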

Proof of Theorem. (i). By Theorem 1, we have that, uder H 0 of (9), L D 1 T T W (t)dt. (A.9) Sice R(, ) is almost everywhere cotiuous ad T is bouded, R (, ) is itegrable, that is, T T R (s, t) dsdt <. By Mercer s Theorem (cf. Theorem 5..1 of Shorack ad Weller (1986), page 08), we have that R(s, t) = λ j h j (s)h j (t), (A.10) where λ j 0, j = 1,,..., are the eigevalues of R(, ), ad h j ( ), j = 1,,..., are the correspodig orthoormal eigefuctios. Let { Z 1,..., Z m,... } be the set of idepedet idetically distributed radom variables with Z m N(0, 1). The Z(t) = λj Z j h j (t) is a Gaussia process o T with mea zero ad covariace fuctio R(s, t). W (t) ad Z(t), have the same distributio o T, Thus, the two stochastic processes, W (t) d = Z(t) = λj Z j h j (t) (A.11) ad, by (A.9) ad (A.10), 1 T T W (t)dt = d 1 T T λj Z j h j (t)] dt = 1 T λ j Zj. (A.1) The result of Theorem (i) follows from (A.9), (A.11) ad (A.1). (ii). By Theorem 1, we have that, uder H 0 of (10), D D 1 T t T W (t) dt = U, (A.13) where U has ormal distributio with mea zero. To compute the variace of U, we cosider the partitio { s j, s j+1 ) : j = 1,..., m } of T with δ = max { s j+1 s j : j = 7

(ii). By Theorem 1, we have that, under $H_0$ of (10),

\[
D_n \overset{D}{\to} \frac{1}{|\mathcal{T}|} \int_{t \in \mathcal{T}} W(t)\, dt = U, \tag{A.13}
\]

where $U$ has a normal distribution with mean zero. To compute the variance of $U$, we consider the partition $\{[s_j, s_{j+1}) : j = 1, \dots, m\}$ of $\mathcal{T}$ with $\delta = \max\{s_{j+1} - s_j : j = 1, \dots, m\}$. Then, it follows from (A.13) that

\[
U = \lim_{\delta \to 0} \frac{1}{|\mathcal{T}|} \sum_{j=1}^{m} W(s_j)(s_{j+1} - s_j). \tag{A.14}
\]

Since $E[W(s_j)] = 0$ for each fixed $j$, we have, by (A.14) and the continuity condition of $R(\cdot,\cdot)$,

\begin{align*}
\operatorname{Var}(U) &= \lim_{\delta \to 0} \frac{1}{|\mathcal{T}|^2} \sum_{i=1}^{m} \sum_{j=1}^{m} E\big[W(s_i) W(s_j)\big](s_{i+1} - s_i)(s_{j+1} - s_j) \\
&= \lim_{\delta \to 0} \frac{1}{|\mathcal{T}|^2} \sum_{i=1}^{m} \sum_{j=1}^{m} R(s_i, s_j)(s_{i+1} - s_i)(s_{j+1} - s_j)
= \frac{1}{|\mathcal{T}|^2} \int_{s \in \mathcal{T}} \int_{t \in \mathcal{T}} R(s,t)\, ds\, dt.
\end{align*}

The result of Theorem 2 (ii) follows from (A.13) and $\operatorname{Var}(U)$.

Proof of Theorem 3. The proof is similar to the derivation of (A.5), substituting $\mu(\cdot)$ and $\mu_n(\cdot)$ with $\boldsymbol{\mu}(\cdot)$ and $\boldsymbol{\mu}_n(\cdot)$, respectively. The difference here is that we have order two polynomial interpolations for the terms $\big(X^2_{i,k(n)}(\cdot),\, Y^2_{i,k(n)}(\cdot),\, X_{i,k(n)}(\cdot)\, Y_{i,k(n)}(\cdot)\big)$ in addition to the linear interpolations for $X_{i,k(n)}(\cdot)$ and $Y_{i,k(n)}(\cdot)$, with $G^2$ playing the role of $G$ in the derivation of (A.5). The rest of the derivation proceeds in the same way. Then the delta method leads to the claimed result.

To identify the matrix covariance function $\Omega(s,t)$, we note that

\[
\boldsymbol{\mu}_n(t) = \frac{1}{n} \sum_{i=1}^{n} \big( X_{i,k(n)}(t),\; Y_{i,k(n)}(t),\; X^2_{i,k(n)}(t),\; Y^2_{i,k(n)}(t),\; X_{i,k(n)}(t)\, Y_{i,k(n)}(t) \big)^T := \frac{1}{n} \sum_{i=1}^{n} Z_i(t).
\]

Then $\Omega(s,t) = \operatorname{Cov}\big(Z(s), Z(t)\big)$ gives the expression for $\Omega(s,t)$.
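Since $\Omega(s,t)$ is just the covariance function of the moment vector $Z_i(t)$, its sample analogue is immediate. A small numpy sketch (names ours; X and Y are the noisy paired curves $X_i(t_j) + \epsilon_i(t_j)$ and $Y_i(t_j) + \xi_i(t_j)$ on a common grid):

```python
import numpy as np

def Omega_hat(X, Y):
    """Sample version of Omega(s,t) = Cov(Z(s), Z(t)) for the 5-vector
    Z_i(t) = (X_i(t), Y_i(t), X_i(t)^2, Y_i(t)^2, X_i(t) * Y_i(t)).

    X, Y : (n, m) arrays of paired curves; returns an (m, m, 5, 5) array.
    """
    n, m = X.shape
    Z = np.stack([X, Y, X**2, Y**2, X * Y], axis=-1)   # shape (n, m, 5)
    Zc = Z - Z.mean(axis=0)                            # center over subjects
    # Omega[s, t] = (1/(n-1)) * sum_i Zc_i(s) Zc_i(t)^T
    return np.einsum('nsa,ntb->stab', Zc, Zc) / (n - 1)
```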

Proof of Theorem 4. The proof here focuses on the derivation of (28), as the proof of (27) can proceed using the same approach together with the results of Theorem 3. We first note that $W_n(t) = \sqrt{n}\, H[\boldsymbol{\mu}_n(t)]$ and, under $H_0$ of (15) and (16), $H[\boldsymbol{\mu}(t)] = 0$. It follows that

\[
W_n(t) = \sqrt{n}\, \big\{ H[\boldsymbol{\mu}_n(t)] - H[\boldsymbol{\mu}(t)] \big\}
= \big[1 + o_p(1)\big]\, \dot H[\boldsymbol{\mu}(t)]\, \sqrt{n}\, \big[\boldsymbol{\mu}_n(t) - \boldsymbol{\mu}(t)\big].
\]

Using the result of Theorem 3, the delta method and derivations similar to those in the proof of Theorem 1, we have $W_n(\cdot) \overset{D}{\to} W(\cdot)$ in $\ell^\infty(\mathcal{T})$ uniformly in $P$, where $W(\cdot)$ is the mean zero Gaussian process on $\mathcal{T}$ with covariance function

\[
Q(s,t) = \operatorname{Cov}\big\{ \dot H[\boldsymbol{\mu}(s)]\, Z(s),\; \dot H[\boldsymbol{\mu}(t)]\, Z(t) \big\}
= \dot H[\boldsymbol{\mu}(s)]\, \operatorname{Cov}\big[Z(s), Z(t)\big]\, \dot H^T[\boldsymbol{\mu}(t)]
= \dot H[\boldsymbol{\mu}(s)]\, \Omega(s,t)\, \dot H^T[\boldsymbol{\mu}(t)].
\]

The rest of the proof is the same as in that of Theorem 2.
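For completeness, the sandwich form of $Q(s,t)$ translates directly into code. The sketch below differentiates a user-supplied functional $H$ numerically and combines it with the $\Omega$ estimate above; all names are ours, and the example $H$ in the docstring (the correlation functional matching the moment vector $Z_i(t)$) is our assumption for the $H$ of (16).

```python
import numpy as np

def Q_hat(H, mu, Omega, eps=1e-6):
    """Delta-method covariance Q(s,t) = Hdot[mu(s)] Omega(s,t) Hdot[mu(t)]^T.

    H     : callable R^5 -> R, e.g. the correlation functional
            H(m) = (m[4] - m[0]*m[1]) / np.sqrt((m[2] - m[0]**2) * (m[3] - m[1]**2))
    mu    : (m, 5) array of the moment curves mu(t) on the grid
    Omega : (m, m, 5, 5) array, e.g. from Omega_hat above
    """
    m, p = mu.shape
    grad = np.empty((m, p))      # forward-difference gradient of H at each t
    for k in range(p):
        e = np.zeros(p)
        e[k] = eps
        grad[:, k] = [(H(mu[j] + e) - H(mu[j])) / eps for j in range(m)]
    return np.einsum('sa,stab,tb->st', grad, Omega, grad)
```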