Testing the Equality of Covariance Operators in Functional Samples


Scandinavian Journal of Statistics, Vol. 40: 138-152, 2013. doi:./j x
Board of the Foundation of the Scandinavian Journal of Statistics. Published by Blackwell Publishing Ltd.

STEFAN FREMDT and JOSEF G. STEINEBACH
Mathematical Institute, University of Cologne

LAJOS HORVÁTH
Department of Mathematics, University of Utah

PIOTR KOKOSZKA
Department of Statistics, Colorado State University

ABSTRACT. We propose a non-parametric test for the equality of the covariance structures in two functional samples. The test statistic has a chi-square asymptotic distribution with a known number of degrees of freedom, which depends on the level of dimension reduction needed to represent the data. Detailed analysis of the asymptotic properties is developed. Finite sample performance is examined by a simulation study and an application to egg-laying curves of fruit flies.

Key words: asymptotic distribution, covariance operator, functional data, quadratic forms, two sample problem.

1. Introduction

The last decade has seen increasing interest in methods of functional data analysis, which offer novel and effective tools for dealing with problems where curves can naturally be viewed as data objects. The books by Ramsay & Silverman (2005) and Ramsay et al. (2009) offer comprehensive introductions to the subject, the collection by Ferraty & Romain (2011) reviews some recent developments focusing on advances in the relevant theory, while the monographs of Bosq (2000), Ferraty & Vieu (2006) and Horváth & Kokoszka (2012) develop the field in several important directions. Despite the emergence of many alternative ways of looking at functional data, and many dimension reduction approaches, the functional principal components (FPCs) remain the most important starting point for many functional data analysis procedures; Reiss & Ogden (2007), Gervini (2008), Yao & Müller (2010) and Gabrys et al. (2010) are just a handful of illustrative references.
The FPCs are the eigenfunctions of the covariance operator. This paper focuses on testing whether the covariance operators of two functional samples are equal. By the Karhunen-Loève expansion, this is equivalent to testing whether both samples have the same set of FPCs. Benko et al. (2009) developed bootstrap procedures for testing the equality of specific FPCs. Panaretos et al. (2010) proposed a test of the type we consider, but under the assumption that the curves have a Gaussian distribution. The main result of Panaretos et al. (2010) follows as a corollary of our more general approach (theorem 2). A generalization to non-Gaussian data was discussed in Panaretos et al. (2010, 2011). For some recent work, confer also Boente et al. (2011), who studied a related approach together with a corresponding bootstrap procedure.

Despite their importance, two sample problems for functional data have received relatively little attention. In addition to the work of Benko et al. (2009) and Panaretos et al. (2010), the relevant references are Horváth et al. (2009) and Horváth et al. (2012), who focus, respectively, on the regression kernels in functional linear models and on the mean of functional data exhibiting temporal dependence. For a recent contribution, see also Gaines et al. (2011), who

use a likelihood ratio-type approach for testing the equality of two covariance operators.

Clearly, if some population parameters of two functional samples are different, estimating them using the pooled sample may lead to spurious conclusions. Due to the importance of the FPCs, a relatively simple and non-parametric procedure for testing the equality of the covariance operators is called for.

The remainder of this paper is organized as follows. Section 2 sets out the notation and definitions. The construction of the test statistic and its asymptotic properties are developed in section 3. Section 4 reports the results of a simulation study and illustrates the procedure by application to egg-laying curves of Mediterranean fruit flies. The proofs of the asymptotic results of section 3 are given in section 5.

2. Preliminaries

Let $X_1, X_2, \ldots, X_N$ be independent, identically distributed random variables with values in $L^2[0,1]$, the Hilbert space of square-integrable $\mathbb{R}$-valued functions on $[0,1]$, and set $EX_i(t) = \mu(t)$ and $\mathrm{cov}(X_i(t), X_i(s)) = C(t,s)$. We assume that another sample $X_1^*, X_2^*, \ldots, X_M^*$ is also available and let $\mu^*(t) = EX_i^*(t)$ and $C^*(t,s) = \mathrm{cov}(X_i^*(t), X_i^*(s))$ for $t, s \in [0,1]$. We wish to test the null hypothesis

\[ H_0: C = C^* \]

against the alternative $H_A$ that $H_0$ does not hold. A crucial assumption concerning the asymptotics of our test procedure is that

\[ \Theta_{N,M} = \frac{N}{M+N} \to \Theta \in (0,1) \quad \text{as } N, M \to \infty. \tag{1} \]

For the construction of our test procedure, we will use an estimate of the asymptotic pooled covariance operator $R$ of the two given samples [cf. (4)], which is defined by the kernel

\[ R(t,s) = \Theta\, C(t,s) + (1-\Theta)\, C^*(t,s). \]

In the case of samples $X_i$ and $X_j^*$ of Gaussian random functions, the latter approach has successfully been applied by Panaretos et al. (2010) to construct an asymptotic test for checking the equality of two covariance operators (see also Panaretos et al., 2011).
Denote by $(\lambda_1, \varphi_1), (\lambda_2, \varphi_2), \ldots$ the eigenvalue/eigenfunction pairs of $R$, which are defined by

\[ \lambda_k \varphi_k(t) = R\varphi_k(t) = \int_0^1 R(t,s)\varphi_k(s)\,ds, \quad t \in [0,1],\ 1 \le k < \infty. \tag{2} \]

Throughout this paper, we assume

\[ \lambda_1 > \lambda_2 > \cdots > \lambda_p > \lambda_{p+1}, \tag{3} \]

i.e. there exist at least $p$ distinct (positive) eigenvalues. Under assumption (3), we can uniquely (up to signs) choose $\varphi_1, \ldots, \varphi_p$ satisfying (2), if we require $\|\varphi_i\| = 1$, where $\|\cdot\|$ always denotes the $L^2$-norm, e.g. for $x \in L^2([0,1])$,

\[ \|x\| = \left( \int_0^1 x^2(t)\,dt \right)^{1/2}. \]

Thus, under (3), $\{\varphi_i, 1 \le i \le p\}$ is an orthonormal system that can be extended to an orthonormal basis $\{\varphi_i, 1 \le i < \infty\}$.
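For a general kernel the eigenvalue equation for $R$ has no closed-form solution, so it is usually solved on a grid. The following is a minimal sketch under stated assumptions, not the authors' implementation: the kernel is assumed to be available on a midpoint grid of $[0,1]$, the integral is replaced by a midpoint rule, and the Brownian motion kernel $\min(t,s)$ serves only as an illustration. The function name `kernel_eigenpairs` and the grid size are hypothetical.

```python
import numpy as np

def kernel_eigenpairs(R, p):
    """Approximate the p leading eigenpairs of the integral operator with kernel R.

    R is an (n, n) array of kernel values on the midpoint grid
    t_a = (a + 0.5) / n of [0, 1]; the integral is approximated by a
    midpoint rule with weight 1 / n.
    """
    n = R.shape[0]
    # Symmetric eigenproblem for the discretized operator R / n.
    vals, vecs = np.linalg.eigh(R / n)
    order = np.argsort(vals)[::-1][:p]        # largest eigenvalues first
    lam = vals[order]
    # Rescale so that the discrete L2 norm, mean(phi ** 2), equals 1.
    phi = vecs[:, order].T * np.sqrt(n)
    return lam, phi

n = 400                                       # hypothetical grid size
t = (np.arange(n) + 0.5) / n
R = np.minimum.outer(t, t)                    # illustrative kernel min(t, s)
lam, phi = kernel_eigenpairs(R, p=3)
```

For this illustrative kernel the exact eigenvalues are $1/((k - 1/2)^2 \pi^2)$, which gives a simple check of the discretization. The signs of the returned eigenfunctions are arbitrary, in line with the uniqueness "up to signs" noted in the text.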

If $H_0$ holds, then $(\lambda_i, \varphi_i)$, $1 \le i < \infty$, are also the eigenvalues/eigenfunctions of the covariance operators $C$ of the first and $C^*$ of the second sample. To construct a test statistic which converges under $H_0$, we can therefore pool the two samples, as explained in section 3.

3. The test and the asymptotic results

Along the lines of Panaretos et al. (2010), our procedure is also based on projecting the observations onto a suitably chosen finite-dimensional space. To define this space, introduce the empirical pooled covariance operator $\hat{R}_{N,M}$ defined by the kernel

\[ \hat{R}_{N,M}(t,s) = \frac{1}{N+M} \left\{ \sum_{k=1}^{N} (X_k(t) - \bar{X}_N(t))(X_k(s) - \bar{X}_N(s)) + \sum_{k=1}^{M} (X_k^*(t) - \bar{X}_M^*(t))(X_k^*(s) - \bar{X}_M^*(s)) \right\}, \tag{4} \]

where

\[ \bar{X}_N(t) = \frac{1}{N} \sum_{k=1}^{N} X_k(t) \quad \text{and} \quad \bar{X}_M^*(t) = \frac{1}{M} \sum_{k=1}^{M} X_k^*(t) \]

are the sample mean functions. Let $(\hat\lambda_i, \hat\varphi_i)$ denote the eigenvalues/eigenfunctions of $\hat{R}_{N,M}$, i.e.

\[ \hat\lambda_i \hat\varphi_i(t) = \hat{R}_{N,M}\hat\varphi_i(t) = \int_0^1 \hat{R}_{N,M}(t,s)\hat\varphi_i(s)\,ds, \quad t \in [0,1],\ 1 \le i \le N+M, \]

with $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots$. We can and will assume that the $\hat\varphi_i$ form an orthonormal system. We consider the projections

\[ \hat{a}_k(i) = \langle X_k - \bar{X}_N, \hat\varphi_i \rangle = \int_0^1 (X_k(t) - \bar{X}_N(t))\hat\varphi_i(t)\,dt \tag{5} \]

and

\[ \hat{a}_k^*(j) = \langle X_k^* - \bar{X}_M^*, \hat\varphi_j \rangle = \int_0^1 (X_k^*(t) - \bar{X}_M^*(t))\hat\varphi_j(t)\,dt, \tag{6} \]

where $\langle \cdot, \cdot \rangle$ denotes the inner product of two elements of the Hilbert space $L^2[0,1]$. To test $H_0$, we compare the matrices $\hat\Delta_N$ and $\hat\Delta_M^*$ with entries

\[ \hat\Delta_N(i,j) = \frac{1}{N} \sum_{k=1}^{N} \hat{a}_k(i)\hat{a}_k(j), \quad 1 \le i, j \le p, \]

and

\[ \hat\Delta_M^*(i,j) = \frac{1}{M} \sum_{k=1}^{M} \hat{a}_k^*(i)\hat{a}_k^*(j), \quad 1 \le i, j \le p. \]

We note that $\hat\Delta_N(i,j) - \hat\Delta_M^*(i,j)$ is the projection of $\hat{C}_N(t,s) - \hat{C}_M^*(t,s)$ in the direction of $\hat\varphi_i(t)\hat\varphi_j(s)$, where

\[ \hat{C}_N(t,s) = \frac{1}{N} \sum_{k=1}^{N} (X_k(t) - \bar{X}_N(t))(X_k(s) - \bar{X}_N(s)) \]

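and

\[ \hat{C}_M^*(t,s) = \frac{1}{M} \sum_{k=1}^{M} (X_k^*(t) - \bar{X}_M^*(t))(X_k^*(s) - \bar{X}_M^*(s)) \]

are the empirical covariances of the two samples. We create the vector $\hat\xi_{N,M}$ from the columns below the diagonal of $\hat\Delta_N - \hat\Delta_M^*$ as follows:

\[ \hat\xi_{N,M} = \mathrm{vech}(\hat\Delta_N - \hat\Delta_M^*) = \begin{pmatrix} \hat\Delta_N(1,1) - \hat\Delta_M^*(1,1) \\ \hat\Delta_N(2,1) - \hat\Delta_M^*(2,1) \\ \vdots \\ \hat\Delta_N(p,p) - \hat\Delta_M^*(p,p) \end{pmatrix}. \tag{7} \]

For the properties of the vech operator, we refer to Abadir & Magnus (2005). Next, we estimate the asymptotic covariance matrix of $(MN/(N+M))^{1/2}\hat\xi_{N,M}$. Note that, in general, this estimate differs from the one which was used in the Gaussian case (cf. Panaretos et al., 2010, and theorem 2). Let

\[ \hat{L}_{N,M}(k,k') = (1 - \Theta_{N,M}) \left\{ \frac{1}{N} \sum_{\ell=1}^{N} \hat{a}_\ell(i)\hat{a}_\ell(j)\hat{a}_\ell(i')\hat{a}_\ell(j') - \langle \hat{C}_N\hat\varphi_i, \hat\varphi_j \rangle \langle \hat{C}_N\hat\varphi_{i'}, \hat\varphi_{j'} \rangle \right\} \]
\[ \qquad + \Theta_{N,M} \left\{ \frac{1}{M} \sum_{\ell=1}^{M} \hat{a}_\ell^*(i)\hat{a}_\ell^*(j)\hat{a}_\ell^*(i')\hat{a}_\ell^*(j') - \langle \hat{C}_M^*\hat\varphi_i, \hat\varphi_j \rangle \langle \hat{C}_M^*\hat\varphi_{i'}, \hat\varphi_{j'} \rangle \right\}, \]

where $i, j, i', j'$ depend on $k, k'$ (see below), and $\hat{C}_N$ ($\hat{C}_M^*$) is interpreted as an operator with $\hat{C}_N$ defined as

\[ \hat{C}_N\hat\varphi_i = \int_0^1 \hat{C}_N(t,s)\hat\varphi_i(s)\,ds. \]

(An analogous definition holds for $\hat{C}_M^*$.) From this definition it follows that

\[ \langle \hat{C}_N\hat\varphi_i, \hat\varphi_j \rangle = \frac{1}{N} \sum_{\ell=1}^{N} \hat{a}_\ell(i)\hat{a}_\ell(j). \]

There are other ways to estimate the asymptotic covariance matrix. We note that one can use $\hat{L}_{N,M}^*(k,k')$ instead of $\hat{L}_{N,M}(k,k')$, where $\hat{L}_{N,M}^*(k,k')$ is defined like $\hat{L}_{N,M}(k,k')$, but $\langle \hat{C}_N\hat\varphi_i, \hat\varphi_j \rangle$ and $\langle \hat{C}_M^*\hat\varphi_i, \hat\varphi_j \rangle$ are replaced with $0$ if $i \ne j$ and $\hat\lambda_i$ if $i = j$. In the same spirit, $\langle \hat{C}_N\hat\varphi_{i'}, \hat\varphi_{j'} \rangle$ and $\langle \hat{C}_M^*\hat\varphi_{i'}, \hat\varphi_{j'} \rangle$ are replaced with $0$ for $i' \ne j'$ and $\hat\lambda_{i'}$ if $i' = j'$.

The index $(i,j)$ is computed from $k$ in the following way. Let

\[ k' = \frac{p(p+1)}{2} - k + 1, \quad i' = p - i + 1 \quad \text{and} \quad j' = p - j + 1. \tag{8} \]

We look at an upper triangle matrix $(a_{i',j'})$. Then, for column $j'$, we have that $(j'-1)j'/2 < k' \le j'(j'+1)/2$. Thus,

\[ j' = \left\lceil \sqrt{2k' + \tfrac{1}{4}} - \tfrac{1}{2} \right\rceil \quad \text{and} \quad i' = k' - \frac{(j'-1)j'}{2}, \]

where $\lceil r \rceil = \min\{k \in \mathbb{Z} : k \ge r\}$ for $r \in \mathbb{R}$. Consequently, the index $(i,j)$ can be computed from $k$ via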

\[ j = p + 1 - \left\lceil \sqrt{p(p+1) - 2k + \tfrac{9}{4}} - \tfrac{1}{2} \right\rceil \quad \text{and} \quad i = k + p - pj + \frac{j(j-1)}{2}. \tag{9} \]

With the above notation, we can formulate the main result of this paper in the non-Gaussian case. The latter case has briefly been mentioned (without any mathematical details) in the concluding remarks of Panaretos et al. (2010) (see also Panaretos et al., 2011).

Theorem 1. We assume that $H_0$, (1) and (3) hold, and

\[ \int_0^1 E(X_1(t))^4\,dt < \infty, \quad \int_0^1 E(X_1^*(t))^4\,dt < \infty. \tag{10} \]

Then,

\[ \frac{NM}{N+M}\, \hat\xi_{N,M}^T \hat{L}_{N,M}^{-1} \hat\xi_{N,M} \xrightarrow{D} \chi^2_{p(p+1)/2}, \quad \text{as } N, M \to \infty, \]

where $\chi^2_{p(p+1)/2}$ stands for a $\chi^2$ random variable with $p(p+1)/2$ degrees of freedom.

Theorem 1 implies that the null hypothesis is rejected if the test statistic

\[ \hat{T}_1 = \frac{NM}{N+M}\, \hat\xi_{N,M}^T \hat{L}_{N,M}^{-1} \hat\xi_{N,M} \]

exceeds a critical quantile of the chi-square distribution with $p(p+1)/2$ degrees of freedom. If both samples are Gaussian random processes, the quadratic form $\hat\xi_{N,M}^T \hat{L}_{N,M}^{-1} \hat\xi_{N,M}$ can be replaced with the normalized sum of the squares of $\hat\Delta_N(i,j) - \hat\Delta_M^*(i,j)$, as stated in theorem 2 (cf. Panaretos et al., 2010).

Theorem 2. If $X_1$, $X_1^*$ are Gaussian processes and the conditions of theorem 1 are satisfied, then, as $N, M \to \infty$,

\[ \hat{T}_2 = \frac{NM}{N+M}\, \frac{1}{2} \sum_{1 \le i,j \le p} \frac{(\hat\Delta_N(i,j) - \hat\Delta_M^*(i,j))^2}{\hat\lambda_i \hat\lambda_j} \xrightarrow{D} \chi^2_{p(p+1)/2}. \]

Observe that the statistic $\hat{T}_2$ can be written as

\[ \hat{T}_2 = \frac{NM}{N+M} \left\{ \sum_{1 \le i < j \le p} \frac{(\hat\Delta_N(i,j) - \hat\Delta_M^*(i,j))^2}{\hat\lambda_i \hat\lambda_j} + \frac{1}{2} \sum_{i=1}^{p} \frac{(\hat\Delta_N(i,i) - \hat\Delta_M^*(i,i))^2}{\hat\lambda_i^2} \right\}. \]

Next, we discuss the asymptotic consistency of the testing procedure based on theorem 1. Analogously to the definition of $\hat\xi_{N,M}$, we define the vector $\xi = (\xi(1), \ldots, \xi(p(p+1)/2))^T$ using the columns of the matrix

\[ D = \left( \int_0^1\!\!\int_0^1 (C(t,s) - C^*(t,s))\varphi_i(t)\varphi_j(s)\,dt\,ds \right)_{i,j=1,\ldots,p} \tag{11} \]

instead of $\hat\Delta_N - \hat\Delta_M^*$, i.e. $\xi = \mathrm{vech}(D)$.
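The correspondence between a position $k$ in the vech vector and the matrix index $(i,j)$ is easy to get wrong by one, so a small check helps. The sketch below is an illustration only: it implements vech as column-wise stacking of the lower triangle and a closed form for $(i,j)$ obtained by substituting the reflected indices $k'$, $i'$, $j'$ of (8) into the ceiling expression for $j'$. All indices are 1-based as in the text; the function names are hypothetical.

```python
import math

def vech(A):
    """Stack the lower-triangular part of a square matrix column by column."""
    p = len(A)
    return [A[i][j] for j in range(p) for i in range(j, p)]

def index_from_position(k, p):
    """Return the 1-based pair (i, j), i >= j, at 1-based position k of vech."""
    # Column index: reflect k, solve the triangular-number inequality, reflect back.
    j = p + 1 - math.ceil(math.sqrt(p * (p + 1) - 2 * k + 9 / 4) - 1 / 2)
    # Row index within that column.
    i = k + p - p * j + j * (j - 1) // 2
    return i, j

p = 3
pairs = [index_from_position(k, p) for k in range(1, p * (p + 1) // 2 + 1)]
```

For $p = 3$ the positions $k = 1, \ldots, 6$ map to $(1,1), (2,1), (3,1), (2,2), (3,2), (3,3)$, which is the order in which vech reads the lower triangle.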

Theorem 3. We assume that $H_A$, (1), (3) and (10) hold. Then, there exist random variables $\hat{h}_1 = \hat{h}_1(N,M), \ldots, \hat{h}_{p(p+1)/2} = \hat{h}_{p(p+1)/2}(N,M)$, taking values in $\{-1, 1\}$, such that, as $N, M \to \infty$,

\[ \max_{1 \le i \le p(p+1)/2} |\hat\xi_{N,M}(i) - \hat{h}_i \xi(i)| = o_P(1) \tag{12} \]

and therefore

\[ |\hat\xi_{N,M}| \xrightarrow{P} |\xi|, \tag{13} \]

where $|\cdot|$ denotes the Euclidean norm. If $\xi \ne 0$ and the $p$ largest eigenvalues of $C$ and $C^*$ are positive, we also have

\[ \hat{T}_1 \xrightarrow{P} \infty, \quad \text{as } N, M \to \infty. \tag{14} \]

The assumption that the $p$ largest eigenvalues of $C$ and $C^*$ are positive implies that the random functions $X_i$, $i = 1, \ldots, N$, and $X_j^*$, $j = 1, \ldots, M$, are not included in a $(p-1)$-dimensional subspace.

Remark 1. The application of the test requires the selection of the number $p$ of the empirical FPCs to be used. A rule of thumb is to choose $p$ so that the first $p$ empirical FPCs in each sample (i.e. those calculated as the eigenfunctions of $\hat{C}_N$ and $\hat{C}_M^*$) explain about 85 to 90 per cent of the variance in each sample. Choosing $p$ too large generally has a negative effect on the finite sample performance of tests of this type, and for this reason, we do not study asymptotics as $p$ tends to infinity. It is often illustrative to apply the test for a range of the values of $p$; each $p$ specifies a level of relevance of differences in the curves or kernels. A good practical approach is to look at the Karhunen-Loève approximations of the curves in both samples, and choose $p$ which gives approximation errors that can be considered unimportant. Cross validation has also been suggested in the literature without investigating its properties in detail. For a more formal discussion of this selection, confer also section 3.3 in Panaretos et al. (2010).

4. A simulation study and an application

We first describe the results of a simulation study designed to evaluate finite sample properties of the tests based on the statistics $\hat{T}_1$ and $\hat{T}_2$. The emphasis is on verifying the advantage of a non-parametric procedure, i.e. on assessing its robustness to the violation of the assumption of normality.
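The two statistics of section 3 can be sketched for curves observed on a common grid as follows. This is a hedged illustration, not the authors' code: it assumes curves stored as rows of arrays on an equidistant grid of $[0,1]$, approximates all integrals by Riemann sums, and estimates the covariance of the score products by their empirical covariance, which matches the definition of $\hat{L}_{N,M}$. The function name `covariance_tests` is hypothetical.

```python
import numpy as np

def covariance_tests(X, Xs, p):
    """Return (T1, T2, q) for samples X (N, n) and Xs (M, n) on a common grid."""
    N, n = X.shape
    M = Xs.shape[0]
    Xc = X - X.mean(axis=0)                   # centre each sample separately
    Xsc = Xs - Xs.mean(axis=0)
    # Pooled empirical covariance kernel on the grid.
    R = (Xc.T @ Xc + Xsc.T @ Xsc) / (N + M)
    vals, vecs = np.linalg.eigh(R / n)
    order = np.argsort(vals)[::-1][:p]
    lam = vals[order]
    phi = vecs[:, order] * np.sqrt(n)         # unit discrete L2 norm
    # Scores: inner products with the pooled eigenfunctions.
    a = Xc @ phi / n                          # shape (N, p)
    b = Xsc @ phi / n                         # shape (M, p)
    D = a.T @ a / N - b.T @ b / M             # difference of the two score matrices
    jj, ii = np.triu_indices(p)               # (i, j) in vech order, i >= j
    xi = D[ii, jj]
    pa = a[:, ii] * a[:, jj]                  # products a_l(i) a_l(j)
    pb = b[:, ii] * b[:, jj]
    theta = N / (N + M)
    La = pa.T @ pa / N - np.outer(pa.mean(axis=0), pa.mean(axis=0))
    Lb = pb.T @ pb / M - np.outer(pb.mean(axis=0), pb.mean(axis=0))
    L = (1 - theta) * La + theta * Lb
    scale = N * M / (N + M)
    T1 = scale * xi @ np.linalg.solve(L, xi)                  # non-parametric statistic
    T2 = scale * 0.5 * np.sum(D ** 2 / np.outer(lam, lam))    # Gaussian statistic
    return T1, T2, p * (p + 1) // 2
```

Each statistic is then compared with a quantile of the chi-square distribution with `q = p(p+1)/2` degrees of freedom; with SciPy available, `scipy.stats.chi2.sf(T1, q)` would give the corresponding p-value. The ratio `lam[:p].sum() / vals.sum()` computed inside such a routine is one way to obtain the explained-variance fraction used in the rule of thumb for choosing $p$.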
We simulated Gaussian curves as Brownian motions and Brownian bridges, and non-Gaussian curves via

\[ X(t) = A \sin(\pi t) + B \sin(2\pi t) + C \sin(4\pi t), \tag{15} \]

where $A = 5Y_1$, $B = 3Y_2$, $C = Y_3$ and $Y_1, Y_2, Y_3$ are independent $t_5$-distributed random variables (similarly $X^*(t)$ for the second sample). All curves were simulated at equidistant points in the interval $[0,1]$, and transformed into functional data objects using the Fourier basis with 49 basis functions. For each data generating process, we used one thousand replications.

Table 1 shows the empirical sizes for non-Gaussian data. The test based on $\hat{T}_2$ has severely inflated size, due to the violation of the assumption of normality. As documented in Panaretos et al. (2010), and confirmed by our own simulations, this test has very good empirical size when the data are Gaussian. The test based on $\hat{T}_1$ is conservative, especially for smaller sample sizes. This is true for both Gaussian and non-Gaussian data; there is not much difference in the empirical size of this test for different data-generating processes.
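The non-Gaussian design can be sketched as below. This is an illustration under assumptions rather than the simulation code used in the study: the frequencies $\pi$, $2\pi$, $4\pi$, the coefficients 5, 3, 1 and the 5 degrees of freedom are taken from the display for $X(t)$ as read here, while the grid size, the seed and the function name `simulate_curves` are hypothetical. The optional factor `c` produces the scaled second sample $X^*(t) = cX(t)$ used in the power study.

```python
import numpy as np

def simulate_curves(size, grid, rng, c=1.0, df=5):
    """Draw `size` curves c * (A sin(pi t) + B sin(2 pi t) + C sin(4 pi t)) on `grid`."""
    Y = rng.standard_t(df, size=(size, 3))        # independent t-distributed variables
    coef = Y * np.array([5.0, 3.0, 1.0])          # A = 5 Y1, B = 3 Y2, C = Y3
    basis = np.vstack([
        np.sin(np.pi * grid),
        np.sin(2 * np.pi * grid),
        np.sin(4 * np.pi * grid),
    ])
    return c * (coef @ basis)                     # one curve per row

grid = np.linspace(0.0, 1.0, 101)                 # hypothetical number of grid points
X = simulate_curves(200, grid, np.random.default_rng(1))
X_star = simulate_curves(200, grid, np.random.default_rng(2), c=1.0)
```

Because the curves lie in a three-dimensional space, this design is informative only for $p \le 3$. The smoothing step with a Fourier basis is omitted here, since the curves are already exact trigonometric combinations on the grid.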

Table 1. Empirical sizes of the tests based on statistics $\hat{T}_1$ and $\hat{T}_2$ for non-Gaussian data. The curves in each sample were generated according to (15). [Layout: for each statistic, one column per nominal level (1%, 5% and 10%); rows give three sample sizes $N = M$ for $p = 2$ and for $p = 3$. The numerical entries are not preserved in this transcription.]

Table 2 gives an example of the empirical power of the test based on statistic $\hat{T}_1$. The test was carried out for two equally sized samples, at three different sample sizes, with realizations of (15) for the first sample and scaled versions of (15), i.e. $X^*(t) = cX(t)$, for the second sample. The results are displayed for a selection of values for the scaling parameter $c$. It can be seen that in all cases the power increases with the sample size. As can be expected, the convergence of the power towards 1 improves for larger deviations of $c$ from its value under the null hypothesis. Due to the inflated size of the test based on $\hat{T}_2$ in the non-Gaussian case (cf. Table 1), its power is (misleadingly) higher than that of the test based on $\hat{T}_1$ and is therefore not displayed here. We also studied a Monte Carlo version of the test based on the statistic

\[ \hat{T}_3 = NM(N+M)^{-1}\, \hat\xi_{N,M}^T \hat\xi_{N,M} \]

and found that its finite sample properties were similar to those of the test based on $\hat{T}_1$.

Table 2. Power of the test based on statistic $\hat{T}_1$ for non-Gaussian data. The curves in the equally sized samples were generated according to (15) in the first sample and as a scaled version of (15) in the second sample, i.e. $X^*(t) = cX(t)$. [Layout: four values of $c$, among them $c = 0.8$ and $c = 0.9$; for each $c$, blocks for $p = 2$ and $p = 3$ with one column per nominal level (1%, 5% and 10%); rows give the sample sizes $N, M$. The numerical entries are not preserved in this transcription.]

We now describe the results of the application of both tests to an interesting data set consisting of egg-laying trajectories of Mediterranean fruit flies (medflies). The data were kindly made available to us by Hans Georg Müller. This data set has been extensively studied in the biological and statistical literature; see Müller & Stadtmüller (2005) and references therein. We consider 534 egg-laying curves of medflies who lived at least 34 days, but we only consider the egg-laying activities on the first 30 days. We examined two versions of these egg-laying curves. The curves are scaled such that the functions in either version are defined on the interval $[0,1]$. Version 1 curves (denoted $X_i(t)$) are the absolute counts of eggs laid by fly $i$ on day $\lfloor 30t \rfloor$. Version 2 curves (denoted $Y_i(t)$) are the counts of eggs laid by fly $i$ on day $\lfloor 30t \rfloor$ relative to the total number of eggs laid in the lifetime of fly $i$. The 534 flies are classified into long-lived, i.e. those who lived 44 days or longer, and short-lived, i.e. those who died before the end of the 43rd day after birth. In the data set, there are 256 short-lived and 278 long-lived flies. This classification naturally defines two samples:

Sample 1: the egg-laying curves $X_i(t)$ (resp. $Y_i(t)$), $0 < t \le 30$, $i = 1, 2, \ldots, 256$, of the short-lived flies.
Sample 2: the egg-laying curves $X_j^*(t)$ (resp. $Y_j^*(t)$), $0 < t \le 30$, $j = 1, 2, \ldots, 278$, of the long-lived flies.

The egg-laying curves are very irregular; Fig. 1 shows ten (smoothed) curves of short- and long-lived flies for version 1, and Fig. 2 shows ten (smoothed) curves for version 2 (both using a B-spline basis for the representation).

Fig. 1. Ten randomly selected smoothed egg-laying curves of short-lived medflies (left panel) and ten such curves for long-lived medflies (right panel).

Fig. 2. Ten randomly selected smoothed egg-laying curves of short-lived medflies (left panel) and ten such curves for long-lived medflies (right panel), relative to the number of eggs laid in the fly's lifetime.

Table 3 shows the p-values for the absolute egg-laying counts (version 1). For the statistic $\hat{T}_1$, the null hypothesis cannot be rejected irrespective of the choice of $p$. For the statistic $\hat{T}_2$, the result of the test varies depending on the choice of $p$. As explained in section 3, the usual recommendation is to use the values of $p$ which explain 85 to 90 per cent of the variance.

Table 3. The p-values (in per cent) of the tests based on statistics $\hat{T}_1$ and $\hat{T}_2$ applied to absolute medfly data. Here $f_p$ denotes the fraction of the sample variance explained by the first $p$ FPCs, i.e. $f_p = \left( \sum_{k=1}^{p} \hat\lambda_k \right) / \left( \sum_{k=1}^{N+M} \hat\lambda_k \right)$. [Columns: $p$, p-value for $\hat{T}_1$, p-value for $\hat{T}_2$, $f_p$. The numerical entries are not preserved in this transcription.]

For such values of $p$, $\hat{T}_2$ leads to a clear rejection. Since this test has inflated size, however, we conclude that there is little evidence that the covariance structures of version 1 curves for long- and short-lived flies are different.

For the version 2 curves, the statistic $\hat{T}_2$ yields p-values equal to zero (in machine precision), potentially indicating that the covariance structures for the short- and long-lived flies are different. The assumption of a normal distribution is questionable, however, as the QQ-plots in Fig. 3 show. These QQ-plots are constructed for the inner products $\langle Y_i, e_k \rangle$ and $\langle Y_i^*, e_k \rangle$, where the $Y_i$ are the curves from one of the samples (we cannot pool the data to construct QQ-plots because we test whether the stochastic structures are different), and $e_k$ is the $k$th element of the Fourier basis. The normality of a functional sample implies the normality of all projections onto a complete orthonormal system. For $\langle X_i, e_k \rangle$, the QQ-plots show a strong deviation from a straight line for some projections. Almost all projections $\langle Y_i, e_k \rangle$ have QQ-plots indicating a strong deviation from normality. It is therefore important to apply the non-parametric test based on the statistic $\hat{T}_1$.

Fig. 3. Normal QQ-plots for the scores of the version 2 medfly data with respect to the first two Fourier basis functions. Left: sample 1. Right: sample 2.

The corresponding p-values for version 2 are displayed in Table 4. For most values of $p$, these p-values

indicate the rejection of $H_0$. Many of them hover around the 5 per cent level, but since the test is conservative, we can with confidence view them as favouring $H_A$.

Table 4. The p-values (in per cent) of the test based on statistic $\hat{T}_1$ applied to relative medfly data; $f_p$ denotes the fraction of the sample variance explained by the first $p$ FPCs, i.e. $f_p = \left( \sum_{k=1}^{p} \hat\lambda_k \right) / \left( \sum_{k=1}^{N+M} \hat\lambda_k \right)$. [Columns: $p$, p-value for $\hat{T}_1$, $f_p$, repeated in two blocks. The numerical entries are not preserved in this transcription.]

The above application confirms the properties of the statistics established through the simulation study. It shows that while there is little evidence that the covariance structures for the absolute counts are different, there is strong evidence that they are different for relative counts.

5. Proofs of the results of section 3

The proof of theorem 1 follows from several lemmas, which we establish first. We can and will assume without loss of generality that $\mu(t) = \mu^*(t) = 0$ for all $t \in [0,1]$. We will use the identity

\[ N^{-1/2} \sum_{k=1}^{N} (X_k(t) - \bar{X}_N(t))(X_k(s) - \bar{X}_N(s)) = N^{-1/2} \sum_{k=1}^{N} X_k(t)X_k(s) - N^{1/2} \bar{X}_N(t)\bar{X}_N(s), \tag{16} \]

and an analogous identity for the second sample. Our first lemma establishes bounds in probability which will often be used in the proofs.

Lemma 1. Under the assumptions of theorem 1, as $N, M \to \infty$,

\[ \left\| N^{-1/2} \sum_{k=1}^{N} \{ X_k(t)X_k(s) - C(t,s) \} \right\| = O_P(1), \tag{17} \]

\[ \left\| N^{1/2} \bar{X}_N(t) \right\| = O_P(1), \tag{18} \]

and

\[ \left\| M^{-1/2} \sum_{k=1}^{M} \{ X_k^*(t)X_k^*(s) - C^*(t,s) \} \right\| = O_P(1), \tag{19} \]

\[ \left\| M^{1/2} \bar{X}_M^*(t) \right\| = O_P(1), \tag{20} \]

where here and in the sequel the notation $\|\cdot\|$ is also used for the corresponding norm in $L^2([0,1]^2)$.

Proof. These are classical estimates and can easily be obtained by a straightforward calculation of the second moments. Note, for example, that

\[ E \int_0^1\!\!\int_0^1 \left[ N^{-1/2} \sum_{k=1}^{N} \{ X_k(t)X_k(s) - C(t,s) \} \right]^2 dt\,ds = \int_0^1\!\!\int_0^1 E\{ X_1(t)X_1(s) - C(t,s) \}^2\,dt\,ds, \]

so, by Markov's inequality, we have

\[ \left\| N^{-1/2} \sum_{k=1}^{N} \{ X_k(t)X_k(s) - C(t,s) \} \right\| = O_P(1). \]

Similar arguments yield (18) to (20). Confer also Dauxois et al. (1982) for an early reference.

Lemma 2 shows that the estimation of the mean functions, cf. the definition of the projections $\hat{a}_k(i)$ and $\hat{a}_k^*(j)$ in (5) and (6), has an asymptotically negligible effect.

Lemma 2. Under the assumptions of theorem 1, for all $1 \le i, j \le p$, as $N, M \to \infty$,

\[ N^{1/2} \hat\Delta_N(i,j) = N^{-1/2} \sum_{k=1}^{N} \langle X_k, \hat\varphi_i \rangle \langle X_k, \hat\varphi_j \rangle + O_P(N^{-1/2}) \]

and

\[ M^{1/2} \hat\Delta_M^*(i,j) = M^{-1/2} \sum_{k=1}^{M} \langle X_k^*, \hat\varphi_i \rangle \langle X_k^*, \hat\varphi_j \rangle + O_P(M^{-1/2}). \]

Proof. Using (16) and (18), we have by the Cauchy-Schwarz inequality,

\[ \left| \int_0^1\!\!\int_0^1 N^{1/2} \bar{X}_N(t)\bar{X}_N(s)\hat\varphi_i(t)\hat\varphi_j(s)\,dt\,ds \right| = N^{-1/2} \left| \int_0^1 N^{1/2}\bar{X}_N(t)\hat\varphi_i(t)\,dt \right| \left| \int_0^1 N^{1/2}\bar{X}_N(s)\hat\varphi_j(s)\,ds \right| \]
\[ \le N^{-1/2} \left( \int_0^1 (N^{1/2}\bar{X}_N(t))^2\,dt \int_0^1 \hat\varphi_i^2(t)\,dt \right)^{1/2} \left( \int_0^1 (N^{1/2}\bar{X}_N(s))^2\,ds \int_0^1 \hat\varphi_j^2(s)\,ds \right)^{1/2} \]
\[ = N^{-1/2} \int_0^1 (N^{1/2}\bar{X}_N(t))^2\,dt = O_P(N^{-1/2}). \]

The second part can be proven in the same way.

We now state bounds on the distances between the estimated and the population eigenvalues and eigenfunctions. These bounds are true under the null hypothesis and extend the corresponding one sample bounds.

Lemma 3. If the conditions of theorem 1 are satisfied, then, as $N, M \to \infty$,

\[ \max_{1 \le i \le p} |\hat\lambda_i - \lambda_i| = O_P\left( (N+M)^{-1/2} \right) \]

and

\[ \max_{1 \le i \le p} \|\hat\varphi_i - \hat{c}_i\varphi_i\| = O_P\left( (N+M)^{-1/2} \right), \]

where $\hat{c}_i = \hat{c}_i(N,M) = \mathrm{sign}(\langle \hat\varphi_i, \varphi_i \rangle)$.

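Proof. These estimates are also well-known (cf., e.g. Bosq, 2000, lemma 4.3 and assertion (4.43), or Horváth & Kokoszka, 2012, lemmas 2.2 and 2.3). Note that the first rate above is independent of $p$, whereas the second one may actually depend on the projection dimension $p$.

Lemma 3 now allows us to replace the estimated eigenfunctions by their population counterparts. The random signs $\hat{c}_i$ must appear in the formulation of lemma 4, but they cancel in the subsequent results.

Lemma 4. If the conditions of theorem 1 are satisfied, then, for all $1 \le i, j \le p$, as $N, M \to \infty$,

\[ \left( \frac{NM}{N+M} \right)^{1/2} (\hat\Delta_N(i,j) - \hat\Delta_M^*(i,j)) = \left( \frac{NM}{N+M} \right)^{1/2} \left\{ \frac{1}{N} \sum_{k=1}^{N} \langle X_k, \hat{c}_i\varphi_i \rangle \langle X_k, \hat{c}_j\varphi_j \rangle - \frac{1}{M} \sum_{k=1}^{M} \langle X_k^*, \hat{c}_i\varphi_i \rangle \langle X_k^*, \hat{c}_j\varphi_j \rangle \right\} + o_P(1). \]

Proof. We write

\[ \frac{1}{N} \sum_{k=1}^{N} \langle X_k, \hat\varphi_i \rangle \langle X_k, \hat\varphi_j \rangle - \int_0^1\!\!\int_0^1 C(t,s)\hat\varphi_i(t)\hat\varphi_j(s)\,dt\,ds = N^{-1/2} \int_0^1\!\!\int_0^1 \left[ N^{-1/2} \sum_{k=1}^{N} (X_k(t)X_k(s) - C(t,s)) \right] \hat\varphi_i(t)\hat\varphi_j(s)\,dt\,ds. \]

Writing $Z_N(t,s) = N^{-1/2} \sum_{k=1}^{N} (X_k(t)X_k(s) - C(t,s))$ for brevity and using lemmas 1 to 3, we get

\[ \left| \int_0^1\!\!\int_0^1 Z_N(t,s) \left( \hat\varphi_i(t)\hat\varphi_j(s) - \hat{c}_i\varphi_i(t)\hat{c}_j\varphi_j(s) \right) dt\,ds \right| \]
\[ = \left| \int_0^1\!\!\int_0^1 Z_N(t,s) \left\{ (\hat\varphi_i(t) - \hat{c}_i\varphi_i(t))\hat\varphi_j(s) + \hat{c}_i\varphi_i(t)(\hat\varphi_j(s) - \hat{c}_j\varphi_j(s)) \right\} dt\,ds \right| \]
\[ \le \left( \int_0^1\!\!\int_0^1 Z_N^2(t,s)\,dt\,ds \right)^{1/2} \left( \int_0^1\!\!\int_0^1 \left[ (\hat\varphi_i(t) - \hat{c}_i\varphi_i(t))\hat\varphi_j(s) \right]^2 dt\,ds \right)^{1/2} \]
\[ \quad + \left( \int_0^1\!\!\int_0^1 Z_N^2(t,s)\,dt\,ds \right)^{1/2} \left( \int_0^1\!\!\int_0^1 \left[ \varphi_i(t)(\hat\varphi_j(s) - \hat{c}_j\varphi_j(s)) \right]^2 dt\,ds \right)^{1/2} \]
\[ = \|Z_N\| \left( \|\hat\varphi_i - \hat{c}_i\varphi_i\| + \|\hat\varphi_j - \hat{c}_j\varphi_j\| \right) = o_P(1). \]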

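Similar arguments give that

\[ \left| \int_0^1\!\!\int_0^1 \left[ M^{-1/2} \sum_{k=1}^{M} (X_k^*(t)X_k^*(s) - C^*(t,s)) \right] \left( \hat\varphi_i(t)\hat\varphi_j(s) - \hat{c}_i\varphi_i(t)\hat{c}_j\varphi_j(s) \right) dt\,ds \right| = o_P(1). \]

Since $C = C^*$, the lemma is proven.

The previous lemmas isolated the main terms in the differences $\hat\Delta_N(i,j) - \hat\Delta_M^*(i,j)$. The following lemma describes the limits of these main terms (without the random signs).

Lemma 5. If the conditions of theorem 1 are satisfied, then, as $N, M \to \infty$,

\[ \{ \Delta_{N,M}(i,j),\ 1 \le i, j \le p \} \xrightarrow{D} \{ \Delta(i,j),\ 1 \le i, j \le p \}, \]

where

\[ \Delta_{N,M}(i,j) = \left( \frac{NM}{N+M} \right)^{1/2} \left\{ \frac{1}{N} \sum_{k=1}^{N} \langle X_k, \varphi_i \rangle \langle X_k, \varphi_j \rangle - \frac{1}{M} \sum_{k=1}^{M} \langle X_k^*, \varphi_i \rangle \langle X_k^*, \varphi_j \rangle \right\}, \]

and $\{ \Delta(i,j),\ 1 \le i, j \le p \}$ is a Gaussian matrix with $E\Delta(i,j) = 0$ and

\[ E\Delta(i,j)\Delta(i',j') = (1-\Theta) \left\{ E(\langle X_1, \varphi_i \rangle \langle X_1, \varphi_j \rangle \langle X_1, \varphi_{i'} \rangle \langle X_1, \varphi_{j'} \rangle) - E(\langle X_1, \varphi_i \rangle \langle X_1, \varphi_j \rangle) E(\langle X_1, \varphi_{i'} \rangle \langle X_1, \varphi_{j'} \rangle) \right\} \]
\[ \quad + \Theta \left\{ E(\langle X_1^*, \varphi_i \rangle \langle X_1^*, \varphi_j \rangle \langle X_1^*, \varphi_{i'} \rangle \langle X_1^*, \varphi_{j'} \rangle) - E(\langle X_1^*, \varphi_i \rangle \langle X_1^*, \varphi_j \rangle) E(\langle X_1^*, \varphi_{i'} \rangle \langle X_1^*, \varphi_{j'} \rangle) \right\}. \]

Proof. First we note that

\[ E\langle X_1, \varphi_i \rangle \langle X_1, \varphi_j \rangle = E\langle X_1^*, \varphi_i \rangle \langle X_1^*, \varphi_j \rangle = \begin{cases} 0 & \text{if } i \ne j, \\ \lambda_i & \text{if } i = j. \end{cases} \]

Since $E(\langle X_1, \varphi_i \rangle \langle X_1, \varphi_j \rangle)^2 < \infty$ and $E(\langle X_1^*, \varphi_i \rangle \langle X_1^*, \varphi_j \rangle)^2 < \infty$, the multivariate central limit theorem implies the result.

Finally, we need an asymptotic approximation to the covariances $\hat{L}_{N,M}(k,k')$. Let

\[ L_{N,M}(k,k') = (1 - \Theta_{N,M}) \left\{ \frac{1}{N} \sum_{\ell=1}^{N} a_\ell(i)a_\ell(j)a_\ell(i')a_\ell(j') - \langle \hat{C}_N\hat\varphi_i, \hat\varphi_j \rangle \langle \hat{C}_N\hat\varphi_{i'}, \hat\varphi_{j'} \rangle \right\} \]
\[ \qquad + \Theta_{N,M} \left\{ \frac{1}{M} \sum_{\ell=1}^{M} a_\ell^*(i)a_\ell^*(j)a_\ell^*(i')a_\ell^*(j') - \langle \hat{C}_M^*\hat\varphi_i, \hat\varphi_j \rangle \langle \hat{C}_M^*\hat\varphi_{i'}, \hat\varphi_{j'} \rangle \right\}, \]

where

\[ a_\ell(i) = \langle X_\ell, \varphi_i \rangle \quad \text{and} \quad a_\ell^*(i) = \langle X_\ell^*, \varphi_i \rangle, \]

and $i, j, i', j'$ are determined from $k$ and $k'$ as in (8) and (9).

Lemma 6. If the conditions of theorem 1 are satisfied, then for all $1 \le k, k' \le p(p+1)/2$,

\[ \left| \hat{L}_{N,M}(k,k') - \hat{c}_i\hat{c}_j\hat{c}_{i'}\hat{c}_{j'} L_{N,M}(k,k') \right| = o_P(1) \quad \text{as } N, M \to \infty, \]

where $(i,j)$ and $(i',j')$ are determined from $k$ and $k'$ as in (8) and (9).

Proof. The result follows from lemma 3 along the lines of the proof of lemma 4.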
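For Gaussian samples the covariance structure of the limit in lemma 5 can be written out explicitly, which is the step connecting lemma 5 with the weights $\hat\lambda_i\hat\lambda_j$ in theorem 2. The following worked derivation is a sketch added for orientation; it assumes $H_0$, zero means, and uses the standard fourth-moment (Isserlis) formula for jointly Gaussian variables. The shorthand $a(i) = \langle X_1, \varphi_i \rangle$ and the Kronecker symbol $\delta$ are notation introduced here.

```latex
% Under H_0 the scores a(i) = <X_1, phi_i> are independent N(0, lambda_i).
% Isserlis' formula for centred jointly Gaussian variables:
\begin{align*}
E\{a(i)a(j)a(i')a(j')\}
  &= E\{a(i)a(j)\}\,E\{a(i')a(j')\}
   + E\{a(i)a(i')\}\,E\{a(j)a(j')\}
   + E\{a(i)a(j')\}\,E\{a(j)a(i')\}, \\
E\{a(i)a(j)a(i')a(j')\} - E\{a(i)a(j)\}\,E\{a(i')a(j')\}
  &= \lambda_i \lambda_j \left( \delta_{ii'}\delta_{jj'} + \delta_{ij'}\delta_{ji'} \right).
\end{align*}
% The same identity holds for the second sample, and (1 - Theta) + Theta = 1, so
\begin{align*}
E\,\Delta(i,j)\Delta(i',j')
  = \lambda_i \lambda_j \left( \delta_{ii'}\delta_{jj'} + \delta_{ij'}\delta_{ji'} \right).
\end{align*}
% For i <= j and i' <= j': the variance is lambda_i lambda_j if i < j,
% 2 lambda_i^2 if i = j, and distinct pairs are uncorrelated.
```

Since the limit is Gaussian, uncorrelated entries are independent, so the quadratic form reduces to a weighted sum of squares with $p(p+1)/2$ terms.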

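Proof of theorem 1. According to lemma 2 and lemmas 4 to 6, the asymptotic distribution of $\hat\xi_{N,M}^T \hat{L}_{N,M}^{-1} \hat\xi_{N,M}$ does not depend on the signs $\hat{c}_1, \ldots, \hat{c}_p$, so it is sufficient to prove the result for $\hat{c}_1 = \cdots = \hat{c}_p = 1$. The law of large numbers yields that

\[ L_{N,M}(k,k') \xrightarrow{P} L(k,k'), \tag{21} \]

where

\[ L(k,k') = (1-\Theta) \left\{ E\left( a_1(i)a_1(j)a_1(i')a_1(j') \right) - E\left( a_1(i)a_1(j) \right) E\left( a_1(i')a_1(j') \right) \right\} \]
\[ \qquad + \Theta \left\{ E\left( a_1^*(i)a_1^*(j)a_1^*(i')a_1^*(j') \right) - E\left( a_1^*(i)a_1^*(j) \right) E\left( a_1^*(i')a_1^*(j') \right) \right\}. \tag{22} \]

The result then follows from lemmas 2, 4 and 5.

Proof of theorem 2. In the case of Gaussian observations, $\Delta(i,j)$, $1 \le i \le j \le p$, are independent normal random variables with mean 0 and

\[ E\Delta^2(i,j) = \begin{cases} \lambda_i\lambda_j & \text{if } i \ne j, \\ 2\lambda_i^2 & \text{if } i = j. \end{cases} \]

Now the result follows from the preceding lemmas, up to and including lemma 5. For more details, we refer to Panaretos et al. (2010).

Proof of theorem 3. First, we observe that by the law of large numbers we have

\[ \int_0^1\!\!\int_0^1 \left( \hat{R}_{N,M}(t,s) - R(t,s) \right)^2 dt\,ds = o_P(1). \]

Hence, using a result in chapter VI of Gohberg et al. (1990), we get that

\[ \max_{1 \le i \le p} |\hat\lambda_i - \lambda_i| = o_P(1) \tag{23} \]

and

\[ \max_{1 \le i \le p} \|\hat\varphi_i - \hat{c}_i\varphi_i\| = o_P(1), \tag{24} \]

where $\hat{c}_i = \hat{c}_i(N,M) = \mathrm{sign}(\langle \hat\varphi_i, \varphi_i \rangle)$. Relations (23) and (24) show that lemma 3 remains true. It follows from the law of large numbers and (24) that for all $1 \le i, j \le p$

\[ \left| \hat\Delta_N(i,j) - \hat\Delta_M^*(i,j) - \hat{c}_i\hat{c}_j \int_0^1\!\!\int_0^1 (C(t,s) - C^*(t,s))\varphi_i(t)\varphi_j(s)\,dt\,ds \right| \]
\[ = \left| \int_0^1\!\!\int_0^1 \left( \hat{C}_N(t,s) - \hat{C}_M^*(t,s) \right) \hat\varphi_i(t)\hat\varphi_j(s)\,dt\,ds - \hat{c}_i\hat{c}_j \int_0^1\!\!\int_0^1 (C(t,s) - C^*(t,s))\varphi_i(t)\varphi_j(s)\,dt\,ds \right| \]
\[ \le \left| \int_0^1\!\!\int_0^1 \left\{ \left( \hat{C}_N(t,s) - C(t,s) \right) - \left( \hat{C}_M^*(t,s) - C^*(t,s) \right) \right\} \hat\varphi_i(t)\hat\varphi_j(s)\,dt\,ds \right| \]
\[ \quad + \left| \int_0^1\!\!\int_0^1 (C(t,s) - C^*(t,s)) \left( \hat\varphi_i(t)\hat\varphi_j(s) - \hat{c}_i\varphi_i(t)\hat{c}_j\varphi_j(s) \right) dt\,ds \right| \]
\[ \le \|\hat{C}_N - C\| + \|\hat{C}_M^* - C^*\| + \|C - C^*\|\, \|\hat\varphi_i\hat\varphi_j - \hat{c}_i\varphi_i\hat{c}_j\varphi_j\| = o_P(1), \]

where the fact that $\|\varphi_i\| = 1 = \|\hat\varphi_i\|$ was used. Hence, the proof of (12) is complete. It is also clear that (12) implies (13).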

Next, we observe that lemma 6 and (21) remain true under the alternative. Now, by some lengthy calculations, it can be verified that $L$ given in (22) is positive definite, so that (14) follows from (13).

Acknowledgements

Research partially supported by NSF grants DMS 954 at the University of Utah, DMS and DMS at Colorado State University and DFG grant STE 36/- at the University of Cologne.

References

Abadir, K. M. & Magnus, J. R. (2005). Matrix algebra. Cambridge University Press, New York.
Benko, M., Härdle, W. & Kneip, A. (2009). Common functional principal components. Ann. Statist. 37, 34.
Boente, G., Rodriguez, D. & Sued, M. (2011). Testing the equality of covariance operators. In Recent advances in functional data analysis and related topics (ed. F. Ferraty), Physica-Verlag, Heidelberg.
Bosq, D. (2000). Linear processes in function spaces. Springer, New York.
Dauxois, J., Pousse, A. & Romain, Y. (1982). Asymptotic theory for the principal component analysis of a vector random function: some applications to statistical inference. J. Multivariate Anal.
Ferraty, F. & Romain, Y. eds (2011). The Oxford handbook of functional data analysis. Oxford University Press, Oxford.
Ferraty, F. & Vieu, P. (2006). Nonparametric functional data analysis: theory and practice. Springer, New York.
Gabrys, R., Horváth, L. & Kokoszka, P. (2010). Tests for error correlation in the functional linear model. J. Amer. Statist. Assoc. 5, 3 5.
Gaines, G., Kaphle, K. & Ruymgaart, F. (2011). Application of a delta-method for random operators to testing equality of two covariance operators. Math. Meth. Statist.
Gervini, D. (2008). Robust functional estimation using the spatial median and spherical principal components. Biometrika 95.
Gohberg, I., Goldberg, S. & Kaashoek, M. A. (1990). Classes of linear operators. Operator theory: Advances and applications, 49. Birkhäuser, Basel.
Horváth, L. & Kokoszka, P. (2012). Inference for functional data with applications. Springer Series in Statistics. Springer, New York (in press).
Horváth, L., Kokoszka, P. & Reeder, R. (2012). Estimation of the mean of functional time series and a two sample problem. J. Roy. Statist. Soc., Ser. B (in press).
Horváth, L., Kokoszka, P. & Reimherr, M. (2009). Two sample inference in functional linear models. Canad. J. Statist. 37.
Müller, H. G. & Stadtmüller, U. (2005). Generalized functional linear models. Ann. Statist. 33.
Panaretos, V. M., Kraus, D. & Maddocks, J. H. (2010). Second-order comparison of Gaussian random functions and the geometry of DNA minicircles. J. Amer. Statist. Assoc. 5.
Panaretos, V. M., Kraus, D. & Maddocks, J. H. (2011). Second-order inference for functional data with application to DNA minicircles. In Recent advances in functional data analysis and related topics (ed. F. Ferraty), Physica-Verlag, Heidelberg.
Ramsay, J. O. & Silverman, B. W. (2005). Functional data analysis. Springer, New York.
Ramsay, J., Hooker, G. & Graves, S. (2009). Functional data analysis with R and MATLAB. Springer, New York.
Reiss, P. T. & Ogden, R. T. (2007). Functional principal component regression and functional partial least squares. J. Amer. Statist. Assoc.
Yao, F. & Müller, H. G. (2010). Functional quadratic regression. Biometrika 97.

Received February, in final form February

Josef G. Steinebach, Mathematical Institute, University of Cologne, Weyertal 86-9, 593 Köln, Germany. jost@math.uni-koeln.de


More information

Tests for error correlation in the functional linear model

Tests for error correlation in the functional linear model Tests for error correlation in the functional linear model Robertas Gabrys Utah State University Lajos Horváth University of Utah Piotr Kokoszka Utah State University Abstract The paper proposes two inferential

More information

Diagnostics for Linear Models With Functional Responses

Diagnostics for Linear Models With Functional Responses Diagnostics for Linear Models With Functional Responses Qing Shen Edmunds.com Inc. 2401 Colorado Ave., Suite 250 Santa Monica, CA 90404 (shenqing26@hotmail.com) Hongquan Xu Department of Statistics University

More information

arxiv: v1 [stat.me] 20 Jan 2017

arxiv: v1 [stat.me] 20 Jan 2017 Permutation tests for the equality of covariance operators of functional data with applications to evolutionary biology Alessandra Cabassi 1, Davide Pigoli 2, Piercesare Secchi 3, and Patrick A. Carter

More information

Chapter 13: Functional Autoregressive Models

Chapter 13: Functional Autoregressive Models Chapter 13: Functional Autoregressive Models Jakub Černý Department of Probability and Mathematical Statistics Stochastic Modelling in Economics and Finance December 9, 2013 1 / 25 Contents 1 Introduction

More information

Second-Order Comparison of Gaussian Random Functions and the Geometry of DNA Minicircles

Second-Order Comparison of Gaussian Random Functions and the Geometry of DNA Minicircles Supplementary materials for this article are available online. Please click the JASA link at http://pubs.amstat.org. Second-Order Comparison of Gaussian Random Functions and the Geometry of DNA Minicircles

More information

Detecting changes in the mean of functional observations

Detecting changes in the mean of functional observations J. R. Statist. Soc. B (2009) 71, Part 5, pp. 927 946 Detecting changes in the mean of functional observations István Berkes, Graz University of Technology, Austria Robertas Gabrys, Utah State University,

More information

REGRESSING LONGITUDINAL RESPONSE TRAJECTORIES ON A COVARIATE

REGRESSING LONGITUDINAL RESPONSE TRAJECTORIES ON A COVARIATE REGRESSING LONGITUDINAL RESPONSE TRAJECTORIES ON A COVARIATE Hans-Georg Müller 1 and Fang Yao 2 1 Department of Statistics, UC Davis, One Shields Ave., Davis, CA 95616 E-mail: mueller@wald.ucdavis.edu

More information

Focusing on structural assumptions in regression on functional variable.

Focusing on structural assumptions in regression on functional variable. Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS043) p.798 Focusing on structural assumptions in regression on functional variable. DELSOL, Laurent Université d

More information

Smooth Common Principal Component Analysis

Smooth Common Principal Component Analysis 1 Smooth Common Principal Component Analysis Michal Benko Wolfgang Härdle Center for Applied Statistics and Economics benko@wiwi.hu-berlin.de Humboldt-Universität zu Berlin Motivation 1-1 Volatility Surface

More information

Weak invariance principles for sums of dependent random functions

Weak invariance principles for sums of dependent random functions Available online at www.sciencedirect.com Stochastic Processes and their Applications 13 (013) 385 403 www.elsevier.com/locate/spa Weak invariance principles for sums of dependent random functions István

More information

An Introduction to Functional Data Analysis

An Introduction to Functional Data Analysis An Introduction to Functional Data Analysis Chongzhi Di Fred Hutchinson Cancer Research Center cdi@fredhutch.org Biotat 578A: Special Topics in (Genetic) Epidemiology November 10, 2015 Textbook Ramsay

More information

Empirical Dynamics for Longitudinal Data

Empirical Dynamics for Longitudinal Data Empirical Dynamics for Longitudinal Data December 2009 Short title: Empirical Dynamics Hans-Georg Müller 1 Department of Statistics University of California, Davis Davis, CA 95616 U.S.A. Email: mueller@wald.ucdavis.edu

More information

Derivative Principal Component Analysis for Representing the Time Dynamics of Longitudinal and Functional Data 1

Derivative Principal Component Analysis for Representing the Time Dynamics of Longitudinal and Functional Data 1 Derivative Principal Component Analysis for Representing the Time Dynamics of Longitudinal and Functional Data 1 Short title: Derivative Principal Component Analysis Xiongtao Dai, Hans-Georg Müller Department

More information

FUNCTIONAL DATA ANALYSIS

FUNCTIONAL DATA ANALYSIS FUNCTIONAL DATA ANALYSIS Hans-Georg Müller Department of Statistics University of California, Davis One Shields Ave., Davis, CA 95616, USA. e-mail: mueller@wald.ucdavis.edu KEY WORDS: Autocovariance Operator,

More information

Statistical Inference On the High-dimensional Gaussian Covarianc

Statistical Inference On the High-dimensional Gaussian Covarianc Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional

More information

Estimation of a quadratic regression functional using the sinc kernel

Estimation of a quadratic regression functional using the sinc kernel Estimation of a quadratic regression functional using the sinc kernel Nicolai Bissantz Hajo Holzmann Institute for Mathematical Stochastics, Georg-August-University Göttingen, Maschmühlenweg 8 10, D-37073

More information

Additive modelling of functional gradients

Additive modelling of functional gradients Biometrika (21), 97,4,pp. 791 8 C 21 Biometrika Trust Printed in Great Britain doi: 1.193/biomet/asq6 Advance Access publication 1 November 21 Additive modelling of functional gradients BY HANS-GEORG MÜLLER

More information

Modeling Multi-Way Functional Data With Weak Separability

Modeling Multi-Way Functional Data With Weak Separability Modeling Multi-Way Functional Data With Weak Separability Kehui Chen Department of Statistics University of Pittsburgh, USA @CMStatistics, Seville, Spain December 09, 2016 Outline Introduction. Multi-way

More information

Supplementary Material for Testing separability of space-time functional processes

Supplementary Material for Testing separability of space-time functional processes Biometrika (year), volume, number, pp. 20 Printed in Great Britain Advance Access publication on date Supplementary Material for Testing separability of space-time functional processes BY P. CONSTANTINOU

More information

Functional modeling of longitudinal data

Functional modeling of longitudinal data CHAPTER 1 Functional modeling of longitudinal data 1.1 Introduction Hans-Georg Müller Longitudinal studies are characterized by data records containing repeated measurements per subject, measured at various

More information

Two sample inference for the second-order property of temporally dependent functional data

Two sample inference for the second-order property of temporally dependent functional data Bernoulli 21(2), 2015, 909 929 DOI: 10.3150/13-BEJ592 Two sample inference for the second-order property of temporally dependent functional data XIANYANG ZHANG 1 and XIAOFENG SHAO 2 1 Department of Statistics,

More information

arxiv: v2 [math.pr] 27 Oct 2015

arxiv: v2 [math.pr] 27 Oct 2015 A brief note on the Karhunen-Loève expansion Alen Alexanderian arxiv:1509.07526v2 [math.pr] 27 Oct 2015 October 28, 2015 Abstract We provide a detailed derivation of the Karhunen Loève expansion of a stochastic

More information

On robust and efficient estimation of the center of. Symmetry.

On robust and efficient estimation of the center of. Symmetry. On robust and efficient estimation of the center of symmetry Howard D. Bondell Department of Statistics, North Carolina State University Raleigh, NC 27695-8203, U.S.A (email: bondell@stat.ncsu.edu) Abstract

More information

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach By Shiqing Ling Department of Mathematics Hong Kong University of Science and Technology Let {y t : t = 0, ±1, ±2,

More information

A randomness test for functional panels

A randomness test for functional panels A randomness test for functional panels Piotr Kokoszka 1, Matthew Reimherr 2, and Nikolas Wölfing 3 arxiv:1510.02594v3 [stat.me] 10 Jul 2016 1 Colorado State University 2 Pennsylvania State University

More information

Bivariate Splines for Spatial Functional Regression Models

Bivariate Splines for Spatial Functional Regression Models Bivariate Splines for Spatial Functional Regression Models Serge Guillas Department of Statistical Science, University College London, London, WC1E 6BTS, UK. serge@stats.ucl.ac.uk Ming-Jun Lai Department

More information

Modeling Repeated Functional Observations

Modeling Repeated Functional Observations Modeling Repeated Functional Observations Kehui Chen Department of Statistics, University of Pittsburgh, Hans-Georg Müller Department of Statistics, University of California, Davis Supplemental Material

More information

Independent component analysis for functional data

Independent component analysis for functional data Independent component analysis for functional data Hannu Oja Department of Mathematics and Statistics University of Turku Version 12.8.216 August 216 Oja (UTU) FICA Date bottom 1 / 38 Outline 1 Probability

More information

Fractal functional regression for classification of gene expression data by wavelets

Fractal functional regression for classification of gene expression data by wavelets Fractal functional regression for classification of gene expression data by wavelets Margarita María Rincón 1 and María Dolores Ruiz-Medina 2 1 University of Granada Campus Fuente Nueva 18071 Granada,

More information

The Mahalanobis distance for functional data with applications to classification

The Mahalanobis distance for functional data with applications to classification arxiv:1304.4786v1 [math.st] 17 Apr 2013 The Mahalanobis distance for functional data with applications to classification Esdras Joseph, Pedro Galeano and Rosa E. Lillo Departamento de Estadística Universidad

More information

AN F TEST FOR LINEAR MODELS WITH FUNCTIONAL RESPONSES

AN F TEST FOR LINEAR MODELS WITH FUNCTIONAL RESPONSES Statistica Sinica 14(2004), 1239-1257 AN F TEST FOR LINEAR MODELS WITH FUNCTIONAL RESPONSES Qing Shen and Julian Faraway Edmunds.com and University of Michigan Abstract: Linear models where the response

More information

On a Nonparametric Notion of Residual and its Applications

On a Nonparametric Notion of Residual and its Applications On a Nonparametric Notion of Residual and its Applications Bodhisattva Sen and Gábor Székely arxiv:1409.3886v1 [stat.me] 12 Sep 2014 Columbia University and National Science Foundation September 16, 2014

More information

Approximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work

Approximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work Hiroshima Statistical Research Group: Technical Report Approximate interval estimation for PMC for improved linear discriminant rule under high dimensional frame work Masashi Hyodo, Tomohiro Mitani, Tetsuto

More information

Detecting and dating structural breaks in functional data without dimension reduction

Detecting and dating structural breaks in functional data without dimension reduction Detecting and dating structural breaks in functional data without dimension reduction Alexander Aue Gregory Rice Ozan Sönmez August 28, 2017 Abstract Methodology is proposed to uncover structural breaks

More information

Functional Latent Feature Models. With Single-Index Interaction

Functional Latent Feature Models. With Single-Index Interaction Generalized With Single-Index Interaction Department of Statistics Center for Statistical Bioinformatics Institute for Applied Mathematics and Computational Science Texas A&M University Naisyin Wang and

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS

FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS Rituparna Sen Monday, July 31 10:45am-12:30pm Classroom 228 St-C5 Financial Models Joint work with Hans-Georg Müller and Ulrich Stadtmüller 1. INTRODUCTION

More information

Weakly dependent functional data. Piotr Kokoszka. Utah State University. Siegfried Hörmann. University of Utah

Weakly dependent functional data. Piotr Kokoszka. Utah State University. Siegfried Hörmann. University of Utah Weakly dependent functional data Piotr Kokoszka Utah State University Joint work with Siegfried Hörmann University of Utah Outline Examples of functional time series L 4 m approximability Convergence of

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

Cramér-von Mises Gaussianity test in Hilbert space

Cramér-von Mises Gaussianity test in Hilbert space Cramér-von Mises Gaussianity test in Hilbert space Gennady MARTYNOV Institute for Information Transmission Problems of the Russian Academy of Sciences Higher School of Economics, Russia, Moscow Statistique

More information

Kernel-based Approximation. Methods using MATLAB. Gregory Fasshauer. Interdisciplinary Mathematical Sciences. Michael McCourt.

Kernel-based Approximation. Methods using MATLAB. Gregory Fasshauer. Interdisciplinary Mathematical Sciences. Michael McCourt. SINGAPORE SHANGHAI Vol TAIPEI - Interdisciplinary Mathematical Sciences 19 Kernel-based Approximation Methods using MATLAB Gregory Fasshauer Illinois Institute of Technology, USA Michael McCourt University

More information

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order

More information

Functional principal component and factor analysis of spatially correlated data

Functional principal component and factor analysis of spatially correlated data Boston University OpenBU Theses & Dissertations http://open.bu.edu Boston University Theses & Dissertations 2014 Functional principal component and factor analysis of spatially correlated data Liu, Chong

More information

STT 843 Key to Homework 1 Spring 2018

STT 843 Key to Homework 1 Spring 2018 STT 843 Key to Homework Spring 208 Due date: Feb 4, 208 42 (a Because σ = 2, σ 22 = and ρ 2 = 05, we have σ 2 = ρ 2 σ σ22 = 2/2 Then, the mean and covariance of the bivariate normal is µ = ( 0 2 and Σ

More information

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction J. Appl. Math & Computing Vol. 13(2003), No. 1-2, pp. 457-470 ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS Myongsik Oh Abstract. The comparison of two or more Lorenz

More information

MIXTURE INNER PRODUCT SPACES AND THEIR APPLICATION TO FUNCTIONAL DATA ANALYSIS

MIXTURE INNER PRODUCT SPACES AND THEIR APPLICATION TO FUNCTIONAL DATA ANALYSIS MIXTURE INNER PRODUCT SPACES AND THEIR APPLICATION TO FUNCTIONAL DATA ANALYSIS Zhenhua Lin 1, Hans-Georg Müller 2 and Fang Yao 1,3 Abstract We introduce the concept of mixture inner product spaces associated

More information

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives TR-No. 14-06, Hiroshima Statistical Research Group, 1 11 Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives Mariko Yamamura 1, Keisuke Fukui

More information

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests Biometrika (2014),,, pp. 1 13 C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University

More information

Diagnostics for functional regression via residual processes

Diagnostics for functional regression via residual processes Diagnostics for functional regression via residual processes Jeng-Min Chiou Academia Sinica, Taiwan E-mail: jmchiou@stat.sinica.edu.tw Hans-Georg Müller University of California, Davis, USA E-mail: mueller@wald.ucdavis.edu

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Mixture of Gaussian Processes and its Applications

Mixture of Gaussian Processes and its Applications Mixture of Gaussian Processes and its Applications Mian Huang, Runze Li, Hansheng Wang, and Weixin Yao The Pennsylvania State University Technical Report Series #10-102 College of Health and Human Development

More information

Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis

Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis Yasunori Fujikoshi and Tetsuro Sakurai Department of Mathematics, Graduate School of Science,

More information

Diagnostics for functional regression via residual processes

Diagnostics for functional regression via residual processes Diagnostics for functional regression via residual processes Jeng-Min Chiou a, Hans-Georg Müller b, a Academia Sinica, 128 Academia Road Sec.2, Taipei 11529, Taiwan b University of California, Davis, One

More information

Defect Detection using Nonparametric Regression

Defect Detection using Nonparametric Regression Defect Detection using Nonparametric Regression Siana Halim Industrial Engineering Department-Petra Christian University Siwalankerto 121-131 Surabaya- Indonesia halim@petra.ac.id Abstract: To compare

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Testing relevant hypotheses in functional time series via self-normalization

Testing relevant hypotheses in functional time series via self-normalization esting relevant hypotheses in functional time series via self-normalization Holger Dette, Kevin Kokot Stanislav Volgushev arxiv:1809.06092v1 [stat.me] 17 Sep 2018 Ruhr-Universität Bochum Fakultät für Mathematik

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals Acta Applicandae Mathematicae 78: 145 154, 2003. 2003 Kluwer Academic Publishers. Printed in the Netherlands. 145 Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals M.

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Supervised Classification for Functional Data Using False Discovery Rate and Multivariate Functional Depth

Supervised Classification for Functional Data Using False Discovery Rate and Multivariate Functional Depth Supervised Classification for Functional Data Using False Discovery Rate and Multivariate Functional Depth Chong Ma 1 David B. Hitchcock 2 1 PhD Candidate University of South Carolina 2 Associate Professor

More information

Contribute 1 Multivariate functional data depth measure based on variance-covariance operators

Contribute 1 Multivariate functional data depth measure based on variance-covariance operators Contribute 1 Multivariate functional data depth measure based on variance-covariance operators Rachele Biasi, Francesca Ieva, Anna Maria Paganoni, Nicholas Tarabelloni Abstract We introduce a generalization

More information

Nonparametric Bayesian Methods - Lecture I

Nonparametric Bayesian Methods - Lecture I Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics

More information

Functional Data Analysis for Sparse Longitudinal Data

Functional Data Analysis for Sparse Longitudinal Data Fang YAO, Hans-Georg MÜLLER, and Jane-Ling WANG Functional Data Analysis for Sparse Longitudinal Data We propose a nonparametric method to perform functional principal components analysis for the case

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

On Selecting Tests for Equality of Two Normal Mean Vectors

On Selecting Tests for Equality of Two Normal Mean Vectors MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error Journal of Multivariate Analysis 00 (009) 305 3 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Sphericity test in a GMANOVA MANOVA

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Bahadur representations for bootstrap quantiles 1

Bahadur representations for bootstrap quantiles 1 Bahadur representations for bootstrap quantiles 1 Yijun Zuo Department of Statistics and Probability, Michigan State University East Lansing, MI 48824, USA zuo@msu.edu 1 Research partially supported by

More information

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017 Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.

More information

Fundamental concepts of functional data analysis

Fundamental concepts of functional data analysis Fundamental concepts of functional data analysis Department of Statistics, Colorado State University Examples of functional data 0 1440 2880 4320 5760 7200 8640 10080 Time in minutes The horizontal component

More information

A Stickiness Coefficient for Longitudinal Data

A Stickiness Coefficient for Longitudinal Data A Stickiness Coefficient for Longitudinal Data Revised Version November 2011 Andrea Gottlieb 1 Department of Statistics University of California, Davis 1 Shields Avenue Davis, CA 95616 U.S.A. Phone: 1

More information

Goodness-of-fit tests for the cure rate in a mixture cure model

Goodness-of-fit tests for the cure rate in a mixture cure model Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

A trigonometric orthogonality with respect to a nonnegative Borel measure

A trigonometric orthogonality with respect to a nonnegative Borel measure Filomat 6:4 01), 689 696 DOI 10.98/FIL104689M Published by Faculty of Sciences and Mathematics, University of Niš, Serbia Available at: http://www.pmf.ni.ac.rs/filomat A trigonometric orthogonality with

More information

Density estimators for the convolution of discrete and continuous random variables

Density estimators for the convolution of discrete and continuous random variables Density estimators for the convolution of discrete and continuous random variables Ursula U Müller Texas A&M University Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract

More information

Tutorial on Functional Data Analysis

Tutorial on Functional Data Analysis Tutorial on Functional Data Analysis Ana-Maria Staicu Department of Statistics, North Carolina State University astaicu@ncsu.edu SAMSI April 5, 2017 A-M Staicu Tutorial on Functional Data Analysis April

More information

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition Preface Preface to the First Edition xi xiii 1 Basic Probability Theory 1 1.1 Introduction 1 1.2 Sample Spaces and Events 3 1.3 The Axioms of Probability 7 1.4 Finite Sample Spaces and Combinatorics 15

More information

Estimating Mixture of Gaussian Processes by Kernel Smoothing

Estimating Mixture of Gaussian Processes by Kernel Smoothing This is the author s final, peer-reviewed manuscript as accepted for publication. The publisher-formatted version may be available through the publisher s web site or your institution s library. Estimating

More information

Two sample inference for the second-order property of temporally dependent functional data

Bernoulli 21(2), 2015, 909–929. DOI: 10.3150/13-BEJ592. arXiv:1506.00847v1 [math.ST], 2 Jun 2015. XIANYANG ZHANG

On prediction and density estimation Peter McCullagh University of Chicago December 2004

Summary: Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

University of Pennsylvania ScholarlyCommons, Statistics Papers, Wharton Faculty Research, 1998. Lawrence D. Brown

Methods for sparse analysis of high-dimensional data, II

Rachel Ward, May 26, 2011. High dimensional data with low-dimensional structure: 300 by 300 pixel images = 90,000 dimensions

Asymptotic inference for a nonstationary double ar(1) model

By SHIQING LING and DONG LI, Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong. maling@ust.hk, malidong@ust.hk

Lecture 3: Review of Linear Algebra

ECE 83 Fall 2 Statistical Signal Processing, instructor: R Nowak. Very often in this course we will represent signals as vectors and operators (e.g., filters, transforms,

Banach Journal of Mathematical Analysis ISSN: (electronic)

Banach J. Math. Anal. 6 (2012), no. 1, 139–146. ISSN: 1735-8787 (electronic). www.emis.de/journals/bjma/ AN EXTENSION OF KY FAN'S DOMINANCE THEOREM. RAHIM ALIZADEH

SOME INSIGHTS ABOUT THE SMALL BALL PROBABILITY FACTORIZATION FOR HILBERT RANDOM ELEMENTS

Statistica Sinica 27 (2017), 1949-1965. doi:https://doi.org/10.5705/ss.202016.0128. Enea G. Bongiorno and Aldo Goia

Bootstrap inference for the finite population total under complex sampling designs

Zhonglei Wang (joint work with Dr. Jae Kwang Kim), Center for Survey Statistics and Methodology, Iowa State University. Jan.

Testing for a unit root in an ar(1) model using three and four moment approximations: symmetric distributions

Hong Kong Baptist University, HKBU Institutional Repository, Department of Economics Journal Articles, Department of Economics, 1998

A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS

Statistica Sinica 20 (2010), 365-378. Liang Peng, Georgia Institute of Technology. Abstract: Estimating tail dependence functions is important for applications