Journal of Multivariate Analysis 102 (2011). Contents lists available at ScienceDirect.

Dimension estimation in sufficient dimension reduction: A unifying approach

E. Bura a,b,*, J. Yang c

a Department of Statistics, George Washington University, Washington, DC 20052, United States
b Vertex Pharmaceuticals, Inc., Cambridge, MA 02139, United States
c Rho, Inc., 199 Wells Avenue, Suite 302, Newton, MA 02459, United States

Article history: Received 8 July 2009; available online 20 August 2010.
AMS subject classifications: 62B05, 62H12, 62H15, 62H10, 62J99, 62G99.
Keywords: Random matrix; Chi-square and weighted chi-square tests; Dimension reduction; SIR; SAVE.

Abstract. Sufficient Dimension Reduction (SDR) in regression comprises the estimation of the dimension of the smallest (central) dimension reduction subspace and its basis elements. For SDR methods based on a kernel matrix, such as SIR and SAVE, dimension estimation is equivalent to estimation of the rank of a random matrix which is the sample-based estimate of the kernel. A test for the rank of a random matrix amounts to testing how many of its eigen- or singular values are equal to zero. We propose two tests based on the smallest eigen- or singular values of the estimated matrix: an asymptotic weighted chi-square test and a Wald-type asymptotic chi-square test. We also provide an asymptotic chi-square test for assessing whether elements of the left singular vectors of the random matrix are zero. These methods together constitute a unified approach for all SDR methods based on a kernel matrix that covers estimation of the central subspace and its dimension, as well as assessment of variable contribution to the lower-dimensional predictor projections, with variable selection a special case. A small power simulation study shows that the proposed and existing tests, specific to each SDR method, perform similarly with respect to power and achievement of the nominal level. Also, the importance of the choice of the number of slices as a tuning parameter is further exhibited. © 2010 Elsevier Inc. All rights reserved.

1. Introduction

This paper is concerned with providing a unifying approach to sufficient dimension reduction (SDR) methodology for estimating the dimension of a regression, even though our results have wider application. The estimation of the rank of a random matrix is the central problem in all SDR methods based on a kernel matrix. Table 1 lists several SDR methods and their respective kernel matrices. Under the assumption that a root-n consistent estimator exists for an unobservable random matrix, several tests for its rank have been proposed. Gill and Lewbel [19], and Cragg and Donald [15], used a rank test based on the Lower-Diagonal-Upper triangular (LDU) decomposition. Their test has the advantage of possessing a limiting chi-square distribution, but it tends to be overly conservative, with type I error close to zero, when the sample size is small [30]. Cragg and Donald [16] proposed another test based on a minimum chi-square criterion. The procedure needs to minimize an objective function numerically, which is often very difficult, and requires knowledge of the rank of the asymptotic variance of the estimator. Robin and Smith [31] obtained a weighted chi-square test for the rank without making such an assumption.
Their test statistic is a variant of Anderson's [1] likelihood ratio statistic for the rank of a regression coefficient matrix in a multivariate normal linear model, which is a functional of certain characteristic roots of a matrix quadratic form.

* Corresponding author at: Department of Statistics, George Washington University, Washington, DC 20052, United States. E-mail addresses: ebura@gwu.edu (E. Bura), Jiao_Yang@rhoworld.com (J. Yang).

Table 1
SDR kernel matrices M and their estimates M̂.
SIR: M = Cov(E(Z|Y)); M̂ = Σ_{h=1}^H p̂_h Z̄_h Z̄_hᵀ.
PIR: M is the parameter matrix B of the inverse regression model of Section 4.2; M̂ = B̂_n = (F_nᵀ F_n)^{-1} F_nᵀ Z_n.
SAVE: M = E(I_p − Var(Z|Y))²; M̂ = Σ_{h=1}^H p̂_h (I_p − V̂ar(Z_h))².
pHd: M = E((Y − E(Y)) Z Zᵀ); M̂ = (1/n) Σ_{i=1}^n (Y_i − Ȳ) Z_i Z_iᵀ.
SIR-II: M = E(Var(Z|Y) − E(Var(Z|Y)))²; M̂ = Σ_{h=1}^H p̂_h (V̂ar(Z_h) − Σ_{h'=1}^H p̂_{h'} V̂ar(Z_{h'}))².
SCR: M = Σ_x^{-1/2} H(c) Σ_x^{-1/2}; M̂ = Σ̂_x^{-1/2} Ĥ(c) Σ̂_x^{-1/2}.
CMS: M = (β_YZ, Σ_YZZ β_YZ, ..., Σ_YZZ^{p-1} β_YZ); M̂ = (β̂_YZ, Σ̂_YZZ β̂_YZ, ..., Σ̂_YZZ^{p-1} β̂_YZ).
DR: M = 2E[E²(ZZᵀ − I_p | Y)] + 2E²[E(Z|Y)E(Zᵀ|Y)] + 2E[E(Zᵀ|Y)E(Z|Y)] E[E(Z|Y)E(Zᵀ|Y)]; M̂ = 2Σ_{h=1}^H p̂_h E_n²(Z_h Z_hᵀ − I_p) + 2(Σ_{h=1}^H p̂_h E_n(Z_h)E_n(Z_hᵀ))² + 2Σ_{h=1}^H p̂_h E_n(Z_hᵀ)E_n(Z_h) Σ_{h=1}^H p̂_h E_n(Z_h)E_n(Z_hᵀ).
MAVE: M = E[(∂g(X)/∂x)(∂g(X)/∂x)ᵀ]; M̂ = (1/n) Σ_{j=1}^n b̂_j b̂_jᵀ.

The latter is a quadratic form of the estimated matrix and two positive definite weighting matrices. The major disadvantage of this approach is that each application requires the selection of the two weighting matrices, and the results of the test depend critically on this choice and on their interaction with the random matrix. Most, if not all, tests for the rank of a random matrix are based on the fact that the rank equals the number of its non-zero eigen- or singular values. The tests we propose here also fall within this class. The novelty of the proposed methods is that the only requirement is that the estimate of the random matrix be unbiased and asymptotically normal with finite asymptotic second moments. No other restrictions, such as on the multiplicity of the singular values of the random matrix, nor other external quantities, such as weighting matrices, are required. In this context, we propose two tests based on the smallest eigen- or singular values of the estimated matrix in Section 2. The first is an asymptotic weighted chi-square test based on a result by Eaton and Tyler [18] for the distribution of the singular values of a random matrix. From an application point of view, the second may be more important as it is an easy to compute Wald-type asymptotic chi-square test. We also adjust and apply the asymptotic chi-square test, developed by Bura and Pfeiffer [7], for testing whether components of the elements of the basis of the column space of the random matrix are zero, to the context of general SDR in Section 3. This leads to a test for variable contribution in linear projections of the predictor vector in Section 4.5. When the variables whose contribution is tested are the same in all linear projections, this is a test for variable selection. Dimension reduction falls within the realm of random matrix theory/analysis as its estimation target is typically a random matrix. For example, in a regression context with response Y, Sufficient Dimension Reduction (SDR, [9]) is based on the idea that the p-dimensional predictor vector X can be replaced by a smaller number of linear combinations of the predictors whose coefficients comprise basis elements of a dimension reduction subspace spanned by the columns of a kernel matrix. We use the results of Sections 2 and 3 to develop an umbrella theory for all kernel-matrix-based sufficient dimension reduction methods that generalizes and unifies previous results in Section 4. We discuss SDR kernel matrices and methodology in detail in Section 4. As an aside, we also derive a straightforward proof of the asymptotic normality of the SAVE [14] kernel matrix, which was lacking from the SDR literature. A power simulation study comparing the two proposed tests and the existing tests for SIR and SAVE is carried out in Section 5.
We conclude with a discussion in Section 6. All theorem and lemma proofs are given in Appendix A.

2. Estimating the rank of a random matrix

To estimate the rank k of a random p × q matrix M we consider the sequential testing of hypotheses of the form

H_0: rank(M) = j versus H_1: rank(M) > j,   (1)

starting with j = 0. The smallest value of j for which the null is not rejected, at a fixed α level, is the estimate of the rank k of M. Let k = rank(M) ≤ min(p, q). The singular value decomposition (SVD) of M is

M = U D Rᵀ,   (2)

where the orthogonal matrix U = (U_1, U_0) is p × p with U_1: p × k and U_0: p × (p − k); D is the p × q matrix whose upper-left k × k block is Λ = diag(λ_1, λ_2, ..., λ_k), the diagonal matrix of the non-zero singular values of M in descending order, λ_1 ≥ λ_2 ≥ ... ≥ λ_k > 0, and whose remaining entries are zero; and R = (R_1, R_0) is orthogonal with R_1: q × k and R_0: q × (q − k). The k left singular vectors U_1 = (u_1, ..., u_k) of M, which correspond to its k non-zero singular values, span S(M).
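As a concrete illustration of the sequential scheme in (1), the short sketch below assumes a hypothetical helper pvalue_fn(M_hat, j) that returns the p-value of one of the rank tests developed in Sections 2.1 and 2.2; the function name and interface are illustrative, not part of the paper.

```python
def estimate_rank(M_hat, alpha, pvalue_fn):
    """Sequentially test H0: rank(M) = j against H1: rank(M) > j for j = 0, 1, ...
    and return the smallest j at which H0 is not rejected at level alpha.
    `pvalue_fn(M_hat, j)` is a user-supplied (hypothetical) function returning the
    p-value of a rank test such as those of Theorems 1 and 2."""
    p, q = M_hat.shape
    for j in range(min(p, q)):
        if pvalue_fn(M_hat, j) > alpha:   # fail to reject H0: rank(M) = j
            return j
    return min(p, q)                      # every null rejected: estimate full rank
```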

Let M̂ denote an estimate of M based on a random sample of size n, and assume that the estimator M̂ satisfies

√n vec(M̂ − M) → N_pq(0, V).   (3)

The SVD of M̂ is

M̂ = Û D̂ R̂ᵀ,   (4)

with Û = (Û_1, Û_0) and R̂ = (R̂_1, R̂_0), where the partitions conform to those in the SVD of M in (2). In terms of the singular values of M̂, λ̂_1 ≥ ... ≥ λ̂_{min(p,q)}, the singular value decomposition in (4) can be written as follows:

Λ̂_1 = Û_1ᵀ M̂ R̂_1 = diag(λ̂_1, λ̂_2, ..., λ̂_k): k × k,
Λ̂_0 = Û_0ᵀ M̂ R̂_0 = (d̂^(0)_ij): (p − k) × (q − k),   (5)

where d̂^(0)_ij = 0 for i ≠ j and d̂^(0)_ii = λ̂_{k+i} for i = 1, ..., m − k, m = min(p, q). When k = rank(M), Λ̂_0 tends to zero as the sample size increases. The development of the proposed two tests for the rank of M is based on this fact.

2.1. Weighted chi-square test

Let

Λ_1(k) = n trace(Λ̂_0ᵀ Λ̂_0) = n vec(Λ̂_0)ᵀ vec(Λ̂_0) = n Σ_{i=k+1}^{min(p,q)} λ̂_i²,   (6)

where λ̂_1 ≥ λ̂_2 ≥ ... ≥ λ̂_{min(p,q)} ≥ 0 are the singular values of M̂.

Theorem 1. Assume rank(M) = k and that M̂ satisfies (3). Then

Λ_1(k) → Σ_{i=1}^s w_i χ²_i,   (7)

where the χ²_i are independent chi-square random variables, each with 1 degree of freedom, and w_1 ≥ w_2 ≥ ... ≥ w_s are the ordered eigenvalues of Q = (R_0 ⊗ U_0)ᵀ V (R_0 ⊗ U_0), with s = min(rank(V), (p − k)(q − k)).

Let V̂ be a consistent estimate of V. Also, let ŵ_i, i = 1, 2, ..., s, with s = min(rank(V), (p − k)(q − k)) = min(rank(V̂), (p − k)(q − k)), be the eigenvalues of Q̂ = (R̂_0 ⊗ Û_0)ᵀ V̂ (R̂_0 ⊗ Û_0) in descending order. If rank(M) = k, Σ_{i=1}^s ŵ_i χ²_i is a consistent estimate of Σ_{i=1}^s w_i χ²_i, and the limiting distribution of Λ_1(k) is consistently estimated by Σ_{i=1}^s ŵ_i χ²_i. To approximate a linear combination of chi-square random variables, one may use Wood's [34] statistic. In practice, the computationally less expensive scaled and adjusted chi-square approximations of Satorra and Bentler [32] to the weighted chi-square distribution are frequently used.

2.2. Chi-square test

The estimated kernel matrix can be expressed as

M̂ = M + (M̂ − M) = M + ϵB,

where ϵB is the perturbation of the matrix M [22]. Using (3) we obtain that the perturbation matrix is asymptotically normal with zero mean and standard deviation of order n^{-1/2}; that is, ϵB = O_p(n^{-1/2}). Let

Λ_2(k) = n vec(Λ̂_0)ᵀ Q̂⁺ vec(Λ̂_0),   (8)

where Λ̂_0 is defined in (5) and Q̂ = (R̂_0 ⊗ Û_0)ᵀ V̂ (R̂_0 ⊗ Û_0). The notation A⁺ signifies the inverse of a matrix A if it is nonsingular, or its Moore–Penrose generalized inverse otherwise. This is a Wald-type test statistic [28] for testing (1). It has the attractive feature of being asymptotically chi-square, in contrast to (7), as shown next.

Theorem 2. Assume rank(M) = k and that M̂ satisfies (3). Then

Λ_2(k) → χ²_s   (9)

for Λ_2 defined in (8), where s = min(rank(V), (p − k)(q − k)).

Remark. When the rank k random matrix M is symmetric, U_0 = R_0 and Û_0 = R̂_0, so that Û_0ᵀ M̂ R̂_0 is (p − k) × (p − k) symmetric. Hence, its variance, Q, has at most s = (p − k)(p − k + 1)/2 non-zero eigenvalues, which is the value of s in both (7) and (9).
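A minimal numerical sketch of the two statistics follows. It assumes column-major vec ordering throughout and that V_hat is a consistent estimate of the covariance V in (3); the weighted chi-square reference distribution of Theorem 1 still has to be approximated from the returned weights (e.g. as discussed in Section 5).

```python
import numpy as np
from scipy.stats import chi2
from scipy.linalg import svd, pinv

def rank_test_statistics(M_hat, V_hat, k, n):
    """Compute Lambda_1(k) of Theorem 1 and the Wald-type Lambda_2(k) of Theorem 2
    for H0: rank(M) = k.  M_hat is the p x q estimate; V_hat is the pq x pq estimate
    of the asymptotic covariance of sqrt(n) vec(M_hat - M) (column-major vec)."""
    p, q = M_hat.shape
    U, sv, Rt = svd(M_hat)                 # M_hat = U diag(sv) Rt
    R = Rt.T
    U0, R0 = U[:, k:], R[:, k:]            # singular vectors for the smallest singular values
    Lam1 = n * np.sum(sv[k:] ** 2)         # statistic (6)
    K = np.kron(R0, U0)                    # (R0 kron U0); vec(U0' X R0) = K' vec(X)
    Q_hat = K.T @ V_hat @ K
    w = np.sort(np.linalg.eigvalsh(Q_hat))[::-1]   # estimated weights, descending
    v0 = (U0.T @ M_hat @ R0).flatten(order='F')    # vec(Lambda_0_hat)
    Lam2 = n * v0 @ pinv(Q_hat) @ v0               # statistic (8)
    s = np.linalg.matrix_rank(Q_hat)
    return Lam1, w, Lam2, chi2.sf(Lam2, df=s)      # Theorem 2 p-value
```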

3. A chi-square test for assessing component significance

If the rank of M is k, S(M) = S(U_1) = span(u_1, ..., u_k), where the u_i are the p × 1 left singular vectors of M. In general, we can simultaneously test whether any component or set of components of any basis element of S(M) equals zero by selecting an appropriate matrix C and testing

H_0: C vec(U_1) = 0 vs. H_1: C vec(U_1) ≠ 0,   (10)

where the matrix C is a pre-specified matrix of zeros and ones of dimension r × pk and rank r, r is the number of components whose contribution is being tested, p is the dimension of the basis elements, and k = rank(M). The unit entry in each row of the matrix C corresponds to the component(s) of the basis elements u_1, ..., u_k tested for being zero. The test we propose requires the computation of the asymptotic distribution of the k (= rank(M)) left singular vectors of M̂. Bura and Pfeiffer [7] showed that for any random matrix M with an asymptotically normal sample estimate M̂, as in (3), the asymptotic distribution of Û_1 is given by

n^{1/2} vec(Û_1 − U_1) → N_pk(0, (Λ^{-1} R_1ᵀ ⊗ I) V (R_1 Λ^{-1} ⊗ I)),   (11)

where Û_1 is defined in (4) and Λ and R_1 in (2). A general Wald-type test for (10) is given in the next theorem. Its proof can be found in [7, Theorem 2].

Theorem 3. Let C be a matrix of order r × pk and rank r, θ = C vec(U_1) and θ̂ = C vec(Û_1), both r × 1 vectors. Also, let A = C(Λ^{-1} R_1ᵀ ⊗ I) V (R_1 Λ^{-1} ⊗ I) Cᵀ.
a. If V is positive definite, then when θ = 0,

L = n θ̂ᵀ Â^{-1} θ̂ → χ²(r),   (12)

where Â is a consistent estimate of A.
b. If V is positive semi-definite with rank(V) ≥ r, then when θ = 0,

L = n θ̂ᵀ Â⁺ θ̂ → χ²(r),   (13)

where Â is a consistent estimate of A.

Bura and Pfeiffer [7] showed that the sample moment based estimate Â = C(Λ̂^{-1} R̂_1ᵀ ⊗ I) V̂ (R̂_1 Λ̂^{-1} ⊗ I) Cᵀ is consistent for A, and that Â^{-1} is consistent for A^{-1} if V is positive definite. If V is positive semi-definite, then Â⁺ is consistent for A⁺, where ⁺ indicates the Moore–Penrose inverse.

4. Unifying sufficient dimension reduction

4.1. Kernel based SDR

Let X = (X_1, ..., X_p)ᵀ denote a predictor vector and Y a response variable. Sufficient dimension reduction (SDR) is based on the idea that X can be replaced with a lower-dimensional projection P_S X without loss of information about the conditional distribution of Y | X, where P_S is the orthogonal projection onto the vector space S in the usual inner product. No pre-specified model for Y | X is required. The intersection of all subspaces S ⊆ R^p with F(Y | X) = F(Y | P_S X), where F(· | ·) is the conditional distribution function of the response Y given the second argument, is the central subspace, S_{Y|X} [10,9]. The dimension k = dim(S_{Y|X}) is called the structural dimension of the regression of Y on X and can take on any value in the set {0, 1, ..., p}. When k < p, the structural dimension of the regression is smaller than the number of predictors. If η = (η_1, ..., η_k) is a basis for S_{Y|X}, then P_η X, or equivalently the k linear combinations ηᵀX = (η_1ᵀX, ..., η_kᵀX), contain all the information in X about Y. If Σ_x denotes the covariance matrix of X, Z = Σ_x^{-1/2}(X − E(X)) is its standardized version. There is no loss of generality in working in the Z-scale since S_{Y|X} = Σ_x^{-1/2} S_{Y|Z}. The estimation of the central subspace in most sufficient dimension reduction techniques is based on finding a kernel matrix M so that

S(M) ⊆ S_{Y|Z}.   (14)

Suppose the kernel matrix M in (14) is of order p × q.
Let k = rank(m) and λ 1 λ k be the non-zero singular values of M, and u 1,..., u k be its corresponding left singular vectors. he estimation of the possibly lower-dimensional subspace S(M) in (14) can be formulated as an eigen-decomposition problem where estimating the dimension of S(M) amounts to estimating the rank of the kernel matrix M, k, and estimation of the subspace itself amounts to estimating the k left singular vectors of M, u 1,..., u k, since span(u 1,..., u k ) = S(M). he SR predictors (Z,..., 1 Z ) = r (u Z,..., 1 u Z) r are the projections of Z onto S(M). he SR predictors in the X scale are (X,..., 1 X ) = r ( 1/2 x u 1X,..., 1/2 x u r X). hey replace the original X predictor vector in modeling the response as a function of X. (14)

For the span of a kernel matrix to be a subspace of S_{Y|Z}, at least one of two conditions on the marginal moments of the predictors must hold. For first moment based kernel methods, such as Sliced Inverse Regression (SIR) [23] and Parametric Inverse Regression (PIR) [5], the following linearity condition is needed:

E(Z | P_{S_{Y|Z}} Z) = P_{S_{Y|Z}} Z.   (15)

For second moment based kernel methods, such as Sliced Average Variance Estimation (SAVE) [14] and principal Hessian directions (pHd) [23,10], condition (15) and also the constant variance condition

Var(Z | P_{S_{Y|Z}} Z) = I − P_{S_{Y|Z}}   (16)

are required. To estimate k = dim(S_{Y|Z}), the test statistic for dimension is generally of the form L_k = n Σ_i f(λ̂_i), where the λ̂_i are the singular or eigenvalues of M̂ in decreasing order and f is a smooth non-negative function. The dimension is usually estimated via sequential testing of H_0: k = r against H_a: k > r, starting at r = 0, which corresponds to independence of Y and Z. Assessment of the accuracy of the estimation requires knowledge of the asymptotic distribution of the test statistic, the computation of which comprises an important aspect of all SDR techniques.

4.2. Estimation methods

Suppose a random sample of size n is available on (Y, X), resulting in the n × 1 vector Y_n of responses and the n × p matrix X_n of observations on the predictors. The standardized version of the X_n matrix is Z_n = (X_n − X̄_n) Σ̂_x^{-1/2}, where X̄_n = Σ_{i=1}^n X_i/n and Σ̂_x = Σ_{i=1}^n (X_i − X̄_n)(X_i − X̄_n)ᵀ/n. Table 1 lists the kernel matrices of several SDR methods and their respective estimates. The two most popular kernel based SDR methods, SIR and SAVE, are discussed in detail in Section 4.3. The notation used in the sample estimates of their respective kernel matrices is as follows. Throughout, H denotes the number of slices used, n_h is the number of observations in slice h, Z_h denotes the n_h × p matrix of the standardized predictors in slice h, Z̄_h = Σ_{i=1}^{n_h} Z_{ih}/n_h denotes their p × 1 intra-slice mean, and p̂_h = n_h/n is the fraction of observations falling in slice h.

For PIR, M̂ = B̂_n is the least squares estimate of the q × p parameter matrix B = (β_lj) in the linear model Z_n | Y = F_n B + E_n, where F_n = (f̃_il) is an n × q fixed matrix with f̃_il = f_l(Y_i) − Σ_{i=1}^n f_{il}/n, the centered version of f_il = f_l(Y_i).

Simple Contour Regression (SCR), introduced by Li et al. [26], uses the matrix H(c), defined by E[(X̃ − X)(X̃ − X)ᵀ I(|Ỹ − Y| ≤ c)] for c > 0 and (X̃, Ỹ) an independent copy of (X, Y). The sample based estimate of H(c) is

Ĥ(c) = (n(n − 1)/2)^{-1} Σ_{(i,j) ∈ N} (X_j − X_i)(X_j − X_i)ᵀ I(|Y_j − Y_i| ≤ c),

where N = {(i, j): i = 2, ..., n, j = 1, ..., i − 1}.

Li and Wang [25] developed Directional Regression (DR), which builds upon and substantially improves the accuracy of contour regressions and decreases computing time. Moreover, Li and Wang [25] showed that S_DR = S_SAVE, yet DR is computationally more accurate than SAVE. In Table 1, the notation E_n(·) = Σ_{i=1}^n (·)/n is used for DR. In Cook and Li's [13] Central Mean Subspace (CMS) method, β_YZ = E(YZ) and Σ_YZZ = E[(Y − E(Y))ZZᵀ]; their estimates are the corresponding sample moment estimates.

In MAVE, g stands for the unknown link function in Y = g(ηᵀX) + ϵ, where η is a p × k orthogonal matrix so that S(η) is a dimension reduction subspace. The estimated kernel matrix uses the minimizers b̂_j of

min_{a_j, b_j} Σ_{i=1}^n (y_i − (a_j + b_jᵀ(X_i − X_j)))² w_ij,

where w_ij = K_h(X_i − X_j)/Σ_{i=1}^n K_h(X_i − X_j), K_h is a multidimensional kernel function and h is the bandwidth [35,8].
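A brief computational sketch of the slicing quantities and of the SIR and SAVE kernel estimates of Table 1 follows. Equal-frequency slicing is an assumption of the sketch, and the function names are illustrative.

```python
import numpy as np

def slice_statistics(Z, y, H):
    """Slice the standardized predictors Z (n x p) into H slices along the ordered
    response y; return the slice proportions p_hat_h, the slice means Z_bar_h and
    the intra-slice covariance matrices Var_hat(Z_h) used in Table 1.  Nearly
    equal-sized slices are used; ties in y receive no special treatment."""
    n = len(y)
    slices = np.array_split(np.argsort(y), H)
    p_hat = np.array([len(s) / n for s in slices])
    zbar = np.vstack([Z[s].mean(axis=0) for s in slices])                     # H x p
    covs = np.stack([np.cov(Z[s], rowvar=False, bias=True) for s in slices])  # H x p x p
    return p_hat, zbar, covs

def sir_save_kernels(p_hat, zbar, covs):
    """Moment estimates of the SIR and SAVE kernel matrices from Table 1."""
    p = zbar.shape[1]
    M_sir = sum(ph * np.outer(zb, zb) for ph, zb in zip(p_hat, zbar))
    M_save = sum(ph * (np.eye(p) - C) @ (np.eye(p) - C) for ph, C in zip(p_hat, covs))
    return M_sir, M_save
```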
Let M̂ denote an estimate of M based on a random sample of size n. For most SDR methods, the asymptotic normality of M̂ has already been established. This is intuitively true since the kernels are moments or functions of moments of the conditional distribution of the predictors given the response. The reader is referred to the provided references for each method. Moreover, a general paradigm for obtaining the asymptotic normality of functions of means and variances is provided in the proof of the asymptotic normality of the SIR and SAVE kernel matrices in Section 4.3. To the best of our knowledge, the asymptotic distribution of M̂_MAVE has not been computed yet. The only two conditions required for both Theorems 1 and 2 are (a) S(M) ⊆ S_{Y|X} and (b) the existence of the fourth moments of X. Condition (a) is satisfied when either (15) and/or (16) hold, depending on the SDR method used to estimate M. We will illustrate the application of our general results of Sections 2 and 3 using the two, arguably, most popular SDR methods, SIR and SAVE, in Section 4.3 and in the simulation section.

4.3. SIR and SAVE

So far in the SDR literature, researchers have focused on developing tests for dimension tailored to the kernel matrix specific to each method. As the platform for arguing in favor of the proposed unified approach in SDR methodology, we focus on the two well-known and understood SDR methods, SIR [23] and SAVE [14].

For SIR, the test statistic for dimension is Λ_SIR = n Σ_{i=k+1}^p λ̂_i, where λ̂_1 ≥ ... ≥ λ̂_p are the eigenvalues of the moment estimate of Cov(E(Z|Y)). Bura [4] and Bura and Cook [6] showed that Λ_SIR is asymptotically distributed as a sum of weighted independent chi-square random variables, each with one degree of freedom. This result was later extended to other SDR methods. Cook and Lee [12] obtained the same result for the SAVE [14] test statistic for dimension in the context of binary regression. Shao et al. [33] proposed a test statistic for dimension in the SAVE context and showed that it is also a sum of weighted chi-square random variables. To obtain this result they required that the variance of the Kronecker product of the projection of the predictors onto the null space of the SAVE kernel matrix be constant given the projection of the predictors onto the central dimension reduction subspace (see [33, Th. 3]); a condition which may be rather challenging to check in practice. Others have shown similar results for their proposed SDR methods.

4.4. The unified approach

For SIR [23], the range of the response Y is divided into H slices and M̂_SIR = Σ_{h=1}^H p̂_h Z̄_h Z̄_hᵀ is a p × p symmetric random matrix, the moment estimate of Cov(E(Z|Y)). If we let Z̃_n = (Z̄_1 √p̂_1, ..., Z̄_H √p̂_H), we can write

M̂_SIR = p̂_1 Z̄_1 Z̄_1ᵀ + p̂_2 Z̄_2 Z̄_2ᵀ + ... + p̂_H Z̄_H Z̄_Hᵀ = Z̃_n Z̃_nᵀ.   (17)

The multivariate central limit theorem (see, for example, [29, p. 15]) and the multivariate version of Slutsky's theorem yield

n^{1/2} vec(Z̃_n − µ) → N_pH(0, V),   (18)

where µ = (µ_1, µ_2, ..., µ_H), µ_h = E(Z_ih), i = 1, ..., n_h. The asymptotic covariance V is given by (B.3) and (B.4) in Appendix B. We will use M̂_SIR = Z̃_n as our SIR kernel matrix since E(Z|Y) ∈ span(Cov(E(Z|Y))) with probability 1 (see [17, Prop. 2.7, p. 75]). We can now apply the theory developed in Sections 2 and 3. The SIR weighted chi-square test statistic for dimension in (6) is

Λ_1^SIR(k) = n Σ_{i=k+1}^p λ̂_i²,   (19)

where the λ̂_i are the ordered singular values of M̂_SIR = Z̃_n or, equivalently, the λ̂_i² are the ordered eigenvalues of Z̃_n Z̃_nᵀ in (17). The weights in (7) are estimated by the ordered eigenvalues of Q̂ = (R̂_0 ⊗ Û_0)ᵀ V̂ (R̂_0 ⊗ Û_0), with s = (p − k)(H − k). The columns of Û_0 are the p − k left singular vectors and the columns of R̂_0 are the H − k right singular vectors of M̂_SIR corresponding to its smallest singular values, and V̂ is the sample moment based estimate of V. In Appendix B we also prove, using the multivariate delta method, that

√n vec(M̂_SIR − M_SIR) → N_{p²}(0, V_SIR).   (20)

The p² × p² asymptotic covariance matrix V_SIR is given by (B.5), also in Appendix B. The asymptotic normality of M̂_SIR is known (see [23] for X normal and [6] for general X). The interest here lies in the derivation of the asymptotic covariance V_SIR via the delta method and the use of the gradient of a matrix-valued function (see Appendix B). As will also be seen for SAVE, this can lead to a general paradigm for computing the asymptotic covariance of kernel matrix estimates of other SDR methods. The SIR chi-square test statistic for dimension in (8) is

Λ_2^SIR(k) = n vec(Û_0ᵀ M̂_SIR R̂_0)ᵀ Q̂⁺ vec(Û_0ᵀ M̂_SIR R̂_0),   (21)

with Q̂⁺ = (R̂_0 ⊗ Û_0)ᵀ V̂⁺ (R̂_0 ⊗ Û_0). The chi-square degrees of freedom in (9) are s = (p − k)(H − k). It is worth mentioning that, in the context of SIR, Bai and He [2] also obtained an asymptotic chi-square test for dimension without requiring normality. Yet, the response has to be second-order uncorrelated (see [2] for the definition) with a subset of the predictor vector for the result to hold.
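As a toy illustration (not from the paper), the sketch below computes M̂_SIR in the form (17) and the statistic (19) for simulated data from a simple single-index model with t(5) predictors, mirroring the setup of Section 5; the estimate V̂ of (18), needed for the reference weights in (7), is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, H, k = 400, 4, 5, 1
X = rng.standard_t(5, size=(n, p))                 # t(5) predictors
y = X[:, 0] + 0.1 * rng.standard_normal(n)         # single-index model

Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False, bias=True))
Z = Xc @ evecs @ np.diag(evals ** -0.5) @ evecs.T  # standardized predictors

slices = np.array_split(np.argsort(y), H)          # H slices along the ordered response
p_hat = np.array([len(s) / n for s in slices])
zbar = np.vstack([Z[s].mean(axis=0) for s in slices])     # H x p slice means
Z_tilde = (np.sqrt(p_hat)[:, None] * zbar).T              # p x H matrix in (17)

lam = np.linalg.svd(Z_tilde, compute_uv=False)     # singular values of M_SIR_hat = Z_tilde
Lambda1_SIR = n * np.sum(lam[k:] ** 2)             # weighted chi-square statistic (19)
```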
For SAVE [14], for h = 1, ..., H and i = 1, ..., n_h, we let Cov(Z_ih) = Δ_h, Δ̂_h = Σ_{i=1}^{n_h} (Z_ih − Z̄_h)(Z_ih − Z̄_h)ᵀ/n_h, and let K̂_n = ((I_p − Δ̂_1)√p̂_1, ..., (I_p − Δ̂_H)√p̂_H) denote a p × (pH) matrix. Then,

M̂_SAVE = K̂_n K̂_nᵀ = p̂_1 (I_p − Δ̂_1)² + ... + p̂_H (I_p − Δ̂_H)².   (22)

Applying the corollary in [29, p. 19], we obtain n_h^{1/2} vec(Δ̂_h − Δ_h) → N_{p²}(0, Q_h), where Q_h = Cov(vec((Z_1h − µ_h)(Z_1h − µ_h)ᵀ)). Let g: A ∈ R^{p×p} → (I_p − A)² ∈ R^{p×p}. Using the multivariate delta method yields

n_h^{1/2} vec(g(Δ̂_h) − g(Δ_h)) → N_{p²}(0, Q̃_h).   (23)

The p² × p² asymptotic covariance matrix in (23) is Q̃_h = ∇g(Δ_h) Q_h ∇g(Δ_h)ᵀ.

From (22) and (23) we have √n vec(K̂_n K̂_nᵀ − K Kᵀ) = √n vec(M̂_SAVE − M_SAVE) → N(0, V_SAVE) with V_SAVE = Σ_{h=1}^H p_h Q̃_h. The computation of the gradient of g at Δ_h, necessary for calculating Q̃_h, is given in Appendix B. It is interesting to observe that for SAVE, in contrast to SIR, we can work at the slice level, since the summands in (22) are independent of one another. Also, we note that our proof of the asymptotic normality of the SAVE kernel matrix only requires the existence of fourth moments of the predictor vector. The SAVE weighted chi-square test statistic for dimension in (6) is

Λ_1^SAVE(k) = n Σ_{i=k+1}^p λ̂_i²,   (24)

where the λ̂_i are the ordered eigenvalues of M̂_SAVE. The weights in (7) are estimated by the ordered eigenvalues of Q̂ = (Û_0 ⊗ Û_0)ᵀ V̂_SAVE (Û_0 ⊗ Û_0), with s = (p − k)(p − k + 1)/2. The columns of Û_0 are the p − k singular vectors of M̂_SAVE corresponding to its smallest singular values (left and right coincide by symmetry), and V̂_SAVE is the sample moment based estimate of V_SAVE. The SAVE chi-square test statistic for dimension in (8) is

Λ_2^SAVE(k) = n vec(Û_0ᵀ M̂_SAVE Û_0)ᵀ Q̂⁺ vec(Û_0ᵀ M̂_SAVE Û_0),   (25)

with Q̂⁺ = (Û_0 ⊗ Û_0)ᵀ V̂_SAVE⁺ (Û_0 ⊗ Û_0). The chi-square degrees of freedom in (9) are s = (p − k)(p − k + 1)/2. It should be pointed out that the only other test for dimension for SAVE, aside from a permutation test, is given in [33] and is not directly based on M̂_SAVE. Their test statistic is asymptotically weighted chi-square unless the predictors are normal, in which case it is chi-square. The SIR and SAVE test statistics defined in this section will be used in the simulation section to estimate the dimension of the regression.

4.5. A chi-square test for variable contribution or selection

Estimating the rank of a random matrix in multivariate analysis or dimension reduction regression problems provides information about the dimension of the data, but does not shed any light on which variables have significant contributions. By removing variables with insignificant contributions, the complexity of the original data set is further reduced. In dimension reduction methods for regression based on a kernel matrix, the rank of the kernel matrix is the dimension of the regression and its left singular vectors are the coefficients of the linear combinations of the predictor vector that are sufficient for modeling the response. That is, the original p variables X = (X_1, ..., X_p)ᵀ are replaced by the k < p linear combinations X̃_i = u_iᵀX = u_i1 X_1 + u_i2 X_2 + ... + u_ip X_p, i = 1, ..., k. The coefficients u_ij, j = 1, ..., p, of the individual variables X_1, ..., X_p can be thought of as measuring the contribution of each variable to the ith linear combination X̃_i, i = 1, ..., k. Some coefficients in X̃_i may not be statistically significantly different from zero, and the corresponding variables can be removed from the linear combination. Thus, testing for variable contribution to the k linear projections of the original predictors is equivalent to testing for component significance in the k left singular vectors of the kernel matrix, as in Section 3. We can simultaneously test the effect of any variable or set of variables in any linear combination or any set of linear combinations by selecting an appropriate matrix C and testing (10), where the matrix C is a pre-specified matrix of zeros and ones of dimension r × pk and rank r, with r the number of variables whose contribution is being tested, p the number of variables in the data set, and k = rank(M).
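A compact sketch of M̂_SAVE in (22) and the statistic (24) follows; the function name is illustrative, and the covariance estimate V̂_SAVE needed for the reference distribution is not computed here.

```python
import numpy as np

def save_kernel_and_statistic(Z, y, H, k):
    """Return M_SAVE_hat of (22) and the weighted chi-square statistic (24).
    Z: n x p standardized predictors, y: response, H: number of slices,
    k: hypothesized dimension.  Nearly equal-sized slices are assumed."""
    n, p = Z.shape
    slices = np.array_split(np.argsort(y), H)
    M = np.zeros((p, p))
    for s in slices:
        ph = len(s) / n
        C = np.cov(Z[s], rowvar=False, bias=True)   # intra-slice covariance Var_hat(Z_h)
        D = np.eye(p) - C
        M += ph * D @ D                              # p_hat_h (I_p - Var_hat(Z_h))^2
    lam = np.linalg.eigvalsh(M)[::-1]                # ordered eigenvalues of M_SAVE_hat
    return M, n * np.sum(lam[k:] ** 2)               # statistic (24)
```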
The unit entry in each row of the matrix C corresponds to the coefficient of the variable whose importance is being tested. For example, if variable X_2 does not contribute to the first linear combination X̃_1, then u_12, the coefficient of X_2 in X̃_1 and the second element of u_1, should not be significantly different from zero. If we let C = (0, 1, 0, ..., 0) be a 1 × pk matrix, then C vec(U_1) = u_12, and testing H_0: C vec(U_1) = 0 vs. H_1: C vec(U_1) ≠ 0 is equivalent to assessing whether X_2 has a significant contribution to X̃_1, as in the sketch below. The test statistic is the asymptotic chi-square statistic given in (12) when the asymptotic covariance matrix of the kernel matrix is full rank, or in (13) when it is not. In particular, the special case where the effect of the same variable(s) across all k linear combinations is assessed is equivalent to variable selection. That is, the chi-square test for variable contribution can also be applied to select variables important in modeling the response.
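A sketch of the Wald-type computation in (12)/(13) is given below; column-major vec ordering, the function name, and the argument Gamma_hat (an estimate of the pk x pk asymptotic covariance in (11), whose construction from V̂, Λ̂ and R̂_1 is not shown) are assumptions of the sketch.

```python
import numpy as np
from scipy.stats import chi2
from scipy.linalg import pinv

def variable_contribution_test(U1_hat, Gamma_hat, rows, n):
    """Wald-type test (12)/(13) of H0: C vec(U1) = 0.  `rows` lists the positions in
    vec(U1) (column-major) of the coefficients being tested; C is the corresponding
    0/1 selection matrix of Section 3."""
    p, k = U1_hat.shape
    r = len(rows)
    C = np.zeros((r, p * k))
    C[np.arange(r), rows] = 1.0                      # one unit entry per tested coefficient
    theta_hat = C @ U1_hat.flatten(order='F')        # C vec(U1_hat)
    A_hat = C @ Gamma_hat @ C.T                      # estimate of A in Theorem 3
    stat = n * theta_hat @ pinv(A_hat) @ theta_hat
    return stat, chi2.sf(stat, df=r)

# Example: testing whether X_2 contributes to the first SDR predictor (u_12 = 0);
# with column-major vec, that coefficient is entry 1 (0-based) of vec(U1_hat):
# stat, pval = variable_contribution_test(U1_hat, Gamma_hat, rows=[1], n=n)
```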

5. Simulation study

The following two models were considered by Shao et al. [33] in their simulation study:

Y = X_1 + ϵ,   (26)
Y = X_1 X_2 + ϵ.   (27)

For both models, ϵ is an independent and normally distributed random variable with mean zero and standard deviation 0.1. The predictor vector X = (X_1, X_2, ..., X_p)ᵀ has dimension p = 4 or 10 in the simulations. The first model in (26) is a one-dimensional model with S_{Y|X} = span((1, 0, ..., 0)ᵀ); the second model in (27) is a two-dimensional model with S_{Y|X} = span((1, 0, ..., 0)ᵀ, (0, 1, 0, ..., 0)ᵀ). We report simulation results only for non-normal predictors, X_i iid t(5), i = 1, ..., p, to compare our respective results. We set the number of slices to 5, as used in [33], and to 10, to also examine the effect of the number of slices on the tests. The nominal level for all tests is 0.05.

For the weighted chi-square test in Theorem 1, we use Wood's [34] numerical approximation to the exact distribution of a linear combination of independent chi-square variates, as implemented in the dr package in R, as well as two simple approximations based on the chi-square distribution [3]. The re-scaled version of T = Σ_{i=1}^s w_i χ²_i in (7) is T_sc = T/c̄ ~ χ²_s, where c̄ = Σ_{i=1}^s w_i/s. The adjusted version of T is T_adj = T/a ~ χ²_b, where a = Σ_{i=1}^s w_i²/Σ_{i=1}^s w_i and b = (Σ_{i=1}^s w_i)²/Σ_{i=1}^s w_i². T_sc is a mean corrected version of T, whereas T_adj matches the first two moments of T with those of aχ²_b (both approximations are sketched below). Shao et al. [33] used T_sc in place of their weighted chi-square test statistic in their simulation study.
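A small sketch of these two approximations, assuming the estimated weights ŵ_i (e.g. the eigenvalues of Q̂) are supplied:

```python
import numpy as np
from scipy.stats import chi2

def weighted_chisq_pvalues(stat, w):
    """Scaled and adjusted chi-square approximations [3,32] to the tail probability of
    T = sum_i w_i * chi2_1, given the observed statistic `stat` and the weights `w`."""
    w = np.asarray(w, dtype=float)
    s = len(w)
    c = w.sum() / s                        # scaled:   T / c  ~  chi2_s
    a = (w ** 2).sum() / w.sum()           # adjusted: T / a  ~  chi2_b
    b = w.sum() ** 2 / (w ** 2).sum()      # b need not be an integer
    return chi2.sf(stat / c, df=s), chi2.sf(stat / a, df=b)
```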
In Table 2 we report the estimated level and power for several tests of dimension based on SIR, applied only to model (26). For model (27), SIR will estimate the central dimension reduction subspace to be of dimension 0, since E(X_1 | Y) = 0. The column headings denote the following tests: SIR is the original chi-square test statistic proposed by Li [23]; SIR weighted is the weighted chi-square test statistic and SIR Wald is the chi-square test statistic, both derived in this paper, as described in Section 4.4; SIR scaled is the scaled and SIR adj the adjusted version of the weighted chi-square test statistic, as described above. In this table, the line headed by d = 0 vs. d ≥ 1 corresponds to the power of the corresponding test, whereas the row headed by d = 1 vs. d ≥ 2 reports the estimated level.

Table 2: Estimated levels and power of nominal 5% tests based on model (26), for p = 4, X_i ~ t(5), H = 5 and H = 10, at several sample sizes n; tests: SIR, SIR weighted, SIR adj, SIR scaled, SIR Wald; rows: d = 0 vs. d ≥ 1 (power) and d = 1 vs. d ≥ 2 (level).

The power of all tests is always uniformly best, i.e. 1, but one can immediately observe that the level of the tests depends on the combination of sample size and number of slices. In this simple model, when the number of slices is 5, the weighted chi-square test and its adjusted chi-square version, derived in this paper, have levels smaller than the nominal. The scaled version of the weighted chi-square test achieves the nominal level. Li's SIR test has level fairly close to nominal. When the number of slices is 10, the sample size needs to be at least 200 for the scaled version to achieve the nominal level. It also appears that the Wald-type chi-square test is the least conservative and requires higher sample sizes to achieve the nominal level. The effect on estimation of the interplay between the number of slices and the sample size will also be observed in the tables reporting results for the SAVE based tests of dimension for both models.

In Tables 3–8 we report the estimated level and power for several tests of dimension based on SAVE, applied to models (26) and (27). The column headings denote the following tests: S_N and S_G denote the sliced average variance estimation tests assuming normality (a chi-square test) and in general (a weighted chi-square test), respectively, proposed by Shao et al. [33]; SAVE weighted denotes the weighted chi-square test statistic and SAVE Wald the chi-square test statistic, both derived in this paper, as described in Section 4.4; SAVE scaled is the scaled and SAVE adj the adjusted version of the weighted chi-square test statistic, as described above.

Table 3: Estimated levels and power of nominal 5% tests based on model (26), for p = 4, X_i ~ t(5); tests: S_N, S_G, SAVE weighted, SAVE adj, SAVE scaled, SAVE Wald; rows: d = 0 vs. d ≥ 1 and d = 1 vs. d ≥ 2 at several sample sizes n.

In Tables 3 and 4, the line headed by d = 0 vs. d ≥ 1 corresponds to the power of the corresponding test, whereas the row headed by d = 1 vs. d ≥ 2 reports the estimated level. One can readily see that S_N fails to estimate the correct dimension 1, as expected, since it assumes that the predictors are normal. Among the other tests there is no clear winner, as both power and level depend on the combination of sample size and number of slices. In general, the tests derived in this paper are more conservative than those proposed by Shao et al. [33]. Yet, a pattern is emerging.

The scaled version of the weighted chi-square test statistic has higher power and its level is closer to the nominal, as compared to either the weighted or the adjusted chi-square test statistics. Moreover, the simple Wald-type chi-square test we propose in this paper has similar performance, with level slightly closer to the nominal. We also observe that the combination of the sample size and the number of slices has a big impact on the results across tests. In Tables 5–8, the line headed by d = 1 vs. d ≥ 2 corresponds to the power of the corresponding test, whereas the row headed by d = 2 vs. d ≥ 3 reports the estimated level. The conclusions are similar to those for the one-dimensional model in Tables 3 and 4. Of course, in this case the sample size required across all tests is larger, since the model is more complex.

Table 4: Estimated levels and power of nominal 5% tests based on model (26), for p = 4, X_i ~ t(5); tests: S_N, S_G, SAVE weighted, SAVE adj, SAVE scaled, SAVE Wald; rows: d = 0 vs. d ≥ 1 and d = 1 vs. d ≥ 2 at several sample sizes n.

Table 5: Estimated levels and power of nominal 5% tests based on model (27), for p = 4, X_i ~ t(5); same tests; rows: d = 0 vs. d ≥ 1, d = 1 vs. d ≥ 2 and d = 2 vs. d ≥ 3 at several sample sizes n.

Table 6: Estimated levels and power of nominal 5% tests based on model (27), for p = 4, X_i ~ t(5); same tests and rows as Table 5.

Table 7: Estimated levels and power of nominal 5% tests based on model (27), for p = 10, X_i ~ t(5); same tests and rows as Table 5.

Table 8: Estimated levels and power of nominal 5% tests based on model (27), for p = 10, X_i ~ t(5); tests: S_N, S_G, SAVE weighted, SAVE adj, SAVE scaled, SAVE Wald; rows: d = 0 vs. d ≥ 1, d = 1 vs. d ≥ 2 and d = 2 vs. d ≥ 3 at several sample sizes n.

In summary, these simulation results indicate that the scaled chi-square approximation to the proposed weighted chi-square test for dimension has performance similar to that of Shao et al.'s [33] competitor. What is more important, however, is that the simple Wald-type chi-square test we propose in this paper is a very good, and many times even better, competitor to the weighted chi-square tests for larger sample sizes.

In Table 9 we report the level (under the column headed by H_0) and power (under the columns headed by H_1) for testing variable importance in model (26). We assess the importance of variables X_2, X_3, X_4 in the SAVE predictor versus the specific alternatives that none of the variables is important (u_11 = u_12 = u_13 = u_14 = 0) and that X_1 and X_2 are not important (u_11 = u_12 = 0). The power is practically 1 across sample sizes. The level is close to nominal when the sample size is about 120 for H = 5, and 250 for H = 10. Again, the importance of the choice of the number of slices is noted.

Table 9: Estimated level and power of nominal 5% variable importance tests in SAVE predictors based on model (26), for p = 4, X_i ~ t(5), H = 5 and H = 10, at several sample sizes n; columns: H_0: u_12 = u_13 = u_14 = 0 (level), H_1: u_11 = u_12 = u_13 = u_14 = 0 and H_1: u_11 = u_12 = 0 (power).

For the more complex model (27), we test for the importance of X_3 and X_4; that is, whether they contribute to both SAVE predictors simultaneously (H_0: u_ij = 0, i = 1, 2, j = 3, 4), versus the alternative that none of the variables is important in either SAVE predictor (H_1: u_ij = 0, i = 1, 2, j = 1, 2, 3, 4). This test is equivalent to variable selection. The power is 1 even at a sample size of 50 when H = 5, but the level fluctuates across different sample size/number of slices combinations. The nominal level of 5% is achieved at roughly 100 observations when H = 5 and at 300 when H = 10 (Table 10).

Table 10: Estimated level and power of nominal 5% variable importance tests in SAVE predictors based on model (27), for p = 4, X_i ~ t(5), H = 5 and H = 10, at several sample sizes n; columns: H_0: u_ij = 0, i = 1, 2, j = 3, 4 (level) and H_1: u_ij = 0, i = 1, 2, j = 1, 2, 3, 4 (power).

6. Discussion

We present two tests for the rank of a random matrix that is asymptotically normal. As an application of this general result, we provide a general theory that encompasses all sufficient dimension reduction methods based on kernel matrices. The two tests for dimension we propose can be applied to all such SDR methods. Moreover, the asymptotic chi-square test we developed only requires the existence of predictor fourth moments and can be used in all SDR kernel-matrix-based methods, in lieu of the currently used weighted chi-square tests that require numerical approximation of the weighted chi-square distribution quantiles, when large samples are available. In contrast, the existing chi-square tests for dimension in SDR require normal predictors (e.g., [23,33]). We also note that the proposed rank or, equivalently, dimension estimate based on sequential testing is consistent for the true rank. This can be shown using arguments similar to those of Robin and Smith [31, Thms. 5.1 and 5.2].

We also propose an asymptotic chi-square test for assessing which components of the basis elements of a random matrix are statistically significant. In the context of dimension reduction in regression, this is a general test for variable contribution in the lower-dimensional projections of the predictor vector. Variable selection is a special case of this test. In SDR, the first formal statistical test for the significance of a subset of predictors was developed by Cook [11] (marginal coordinate hypotheses) for SIR. The asymptotic distribution of the test statistics he proposed was weighted chi-square under the assumption of linearity. Also, the tests developed there are variable selection tests in that they test for concurrent variable importance in all SIR projections of the predictor vector. That is, the marginal coordinate hypothesis procedure does not allow testing whether a variable is not significant in the first SDR predictor while allowing for it to be retained in another SDR predictor. This is also true for the gridded chi-square test [24], a heuristic method for assessing variable importance in SIR predictors based on residuals, and also for the marginal coordinate weighted chi-square test of Shao et al. [33] for SAVE. Our test not only can be used for variable selection but also allows testing the contribution of any variable or set of variables in the SDR projections of the predictor vector, either separately or simultaneously. In addition, only finite fourth moments are required, and the method can be used to assess variable contribution to any linear combination of random variables. For example, this test can be used to assess variable contribution to principal components, as they are simply linear combinations of the elements of a random vector with coefficients the elements of the left singular vectors of the covariance matrix of the random vector.

Acknowledgment

The authors would like to thank Prof. Liliana Forzani for her help in computing the gradient in the SAVE asymptotic covariance and for her comments on this paper.

Appendix A

Proof of Theorem 1. Observe that √n (R_0 ⊗ U_0)ᵀ vec(M̂ − M) = √n vec[U_0ᵀ(M̂ − M)R_0] = √n vec(U_0ᵀ M̂ R_0), since from (2) we have U_0ᵀ M R_0 = 0. Hence, from (3),

√n vec(U_0ᵀ M̂ R_0) →_d N(0, (R_0 ⊗ U_0)ᵀ V (R_0 ⊗ U_0)).   (A.1)

Also from (3), M̂ is root-n consistent for M, which yields that Û_0 and R̂_0 are also root-n consistent for U_0 and R_0, respectively [7,30].
That is, Û_0 = U_0 + O_p(n^{-1/2}) and R̂_0 = R_0 + O_p(n^{-1/2}), and

√n vec(Û_0ᵀ M̂ R̂_0) = √n vec[(U_0 + O_p(n^{-1/2}))ᵀ M̂ (R_0 + O_p(n^{-1/2}))]
= vec[√n U_0ᵀ M̂ R_0 + U_0ᵀ M̂ O_p(1) + O_p(1) M̂ R_0 + M̂ O_p(n^{-1/2})].

Observe that U_0ᵀ M̂ →_p U_0ᵀ M and M̂ R_0 →_p M R_0. Also, M̂ O_p(n^{-1/2}) = [M + O_p(n^{-1/2})] O_p(n^{-1/2}) →_p 0; U_0ᵀ M = 0 implies U_0ᵀ M O_p(1) = 0, and M R_0 = 0 implies O_p(1) M R_0 = 0. Hence, U_0ᵀ M̂ O_p(1) →_p 0, O_p(1) M̂ R_0 →_p 0, and M̂ O_p(n^{-1/2}) →_p 0. These results together with (A.1) imply

√n vec(Λ̂_0) = √n vec(Û_0ᵀ M̂ R̂_0) → N(0, (R_0 ⊗ U_0)ᵀ V (R_0 ⊗ U_0)),   (A.2)

which in turn yields (7). The weights w_i, i = 1, 2, ..., s, are the eigenvalues of Q = (R_0 ⊗ U_0)ᵀ V (R_0 ⊗ U_0) in descending order (see, for example, [20]).

Proof of Theorem 2. From (A.2), Q = (R_0 ⊗ U_0)ᵀ V (R_0 ⊗ U_0) is the asymptotic covariance matrix of √n vec(Λ̂_0) = √n vec(Û_0ᵀ M̂ R̂_0). Using the consistency of Û_0 and R̂_0 [7,30], we obtain that the estimate Q̂ = (R̂_0 ⊗ Û_0)ᵀ V̂ (R̂_0 ⊗ Û_0) of Q is also consistent. Inversion is a continuous function, so that Q̂^{-1} →_p Q^{-1} when Q is full rank. Also, the Moore–Penrose inverse of a matrix is unique and its entries are continuous functions of the entries of the original matrix; thus Q̂⁺ →_p Q⁺ when Q is not full rank. Hence, by (A.2), Λ_2(k) = n vec(Λ̂_0)ᵀ Q̂⁺ vec(Λ̂_0) → χ²_s [28].

Appendix B

The asymptotic covariance V in (18). The asymptotic covariance V is an H × H array of p × p matrices V_hs = n Cov(Z̄_h, Z̄_s), h, s = 1, ..., H. Bura and Cook [6] computed these matrices. For h = s,

V_hh = p_h I_p + (1 − 2p_h) Σ_{z_h},   (B.3)

where Σ_{z_h} = Σ_x^{-1/2} Σ_{x_h} Σ_x^{-1/2} and Σ_{x_h} = Cov(X_i | Y_i falls in slice h). Also, for h ≠ s,

V_hs = √(p_h p_s) (I_p − Σ_{z_h} − Σ_{z_s}).   (B.4)

Proof of (20) and computation of V_SIR. By the multivariate version of the delta method, we have

n^{1/2} vec(Z̃_n Z̃_nᵀ − µµᵀ) = n^{1/2} vec(M̂_SIR − M_SIR) → N_{p²}(0, ∇f_µ V ∇f_µᵀ),

where f(x) = xxᵀ, with x a p × H matrix. Let H_p be the elimination matrix, defined by vech A = H_p vec A for any matrix A, and G_p be the duplication matrix, defined by vec A = G_p vech A for any symmetric matrix A [21, p. 352]. Then,

∇f = d vec f(x)/d vec x = G_p H_p [d vec f(x)/d vec x],

where the derivative on the right-hand side is the usual derivative without taking any symmetry into account and equals (x ⊗ I_p) + (I_p ⊗ x) K_(p,H), with K_(p,H) the commutation matrix of order pH × pH that transforms vec(A) into vec(Aᵀ) for any p × H matrix A [27]. Then,

∇f_µ = G_p H_p ((µ ⊗ I_p) + (I_p ⊗ µ) K_(p,H)).

Hence,

V_SIR = G_p H_p ((µ ⊗ I_p) + (I_p ⊗ µ) K_(p,H)) V (G_p H_p ((µ ⊗ I_p) + (I_p ⊗ µ) K_(p,H)))ᵀ.   (B.5)

Computation of ∇g(Δ_h) in Q̃_h. The function g(A) = (I − A)² = (I − A)(I − A) is symmetric and is applied to a symmetric matrix A ∈ R^{p×p}. The derivative of the symmetric g at the symmetric A is

∇g(A) = [∂ vec g(A)/∂ vec(A)] [∂ vec(A)/∂ vech(A)].   (B.6)

Now,

∂ vec g(A)/∂ vec(A) = −((I − A) ⊗ I) − (I ⊗ (I − A)),

and

∂ vec(A)/∂ vech(A) = G_p ∂ vech(A)/∂ vech(A) = G_p.

Plugging these into (B.6) gives ∇g(A) = −((I − A) ⊗ I) G_p − (I ⊗ (I − A)) G_p, so that

∇g(Δ_h) = −((I − Δ_h) ⊗ I) G_p − (I ⊗ (I − Δ_h)) G_p,   (B.7)

which is a p² × p(p + 1)/2 matrix whose rows are indexed by the p² components of g and whose columns by the p(p + 1)/2 distinct entries of Δ_h.
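Appendix B relies on the duplication (G_p), elimination (H_p) and commutation (K_(p,q)) matrices of [21,27]. The following small sketch gives explicit constructions under column-major vec ordering; it is an illustration, not a statement about the authors' computations.

```python
import numpy as np

def vech_index(i, j, p):
    """Position of A[i, j] (with i >= j) in vech(A), stacking lower-triangular columns."""
    return j * p - j * (j - 1) // 2 + (i - j)

def commutation(p, q):
    """K_(p,q): the pq x pq matrix with K vec(A) = vec(A.T) for any p x q matrix A."""
    K = np.zeros((p * q, p * q))
    for i in range(p):
        for j in range(q):
            K[i * q + j, j * p + i] = 1.0
    return K

def duplication(p):
    """G_p: vec(A) = G_p vech(A) for any symmetric p x p matrix A."""
    G = np.zeros((p * p, p * (p + 1) // 2))
    for j in range(p):
        for i in range(p):
            G[j * p + i, vech_index(max(i, j), min(i, j), p)] = 1.0
    return G

def elimination(p):
    """H_p: vech(A) = H_p vec(A) for any p x p matrix A."""
    Hm = np.zeros((p * (p + 1) // 2, p * p))
    for j in range(p):
        for i in range(j, p):
            Hm[vech_index(i, j, p), j * p + i] = 1.0
    return Hm
```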

References

[1] T.W. Anderson, The asymptotic distribution of certain characteristic roots and vectors, in: J. Neyman (Ed.), Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1951.
[2] Z.D. Bai, Z. He, A chi-square test for dimensionality with non-Gaussian data, Journal of Multivariate Analysis 88 (2004).
[3] P.M. Bentler, J. Xie, Corrections to test statistics in principal Hessian directions, Statistics and Probability Letters 47 (2000).
[4] E. Bura, Dimension reduction via inverse regression, Ph.D. Thesis, Department of Statistics, University of Minnesota.
[5] E. Bura, R.D. Cook, Estimating the structural dimension of regressions via parametric inverse regression, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (2001).
[6] E. Bura, R.D. Cook, Extending SIR: the weighted chi-square test, Journal of the American Statistical Association 96 (2001).
[7] E. Bura, R. Pfeiffer, On the distribution of the left singular vectors of a random matrix and its applications, Statistics and Probability Letters 78 (2008).
[8] P. Cizek, W. Hardle, Robust estimation of dimension reduction space, Computational Statistics and Data Analysis 51 (2006).
[9] R.D. Cook, Regression Graphics: Ideas for Studying Regressions through Graphics, Wiley, New York.
[10] R.D. Cook, Principal Hessian directions revisited, Journal of the American Statistical Association 93 (1998).
[11] R.D. Cook, Testing predictor contributions in sufficient dimension reduction, The Annals of Statistics 32 (2004).
[12] R.D. Cook, H. Lee, Dimension reduction in regressions with a binary response, Journal of the American Statistical Association 94 (1999).
[13] R.D. Cook, B. Li, Dimension reduction for the conditional mean in regression, The Annals of Statistics 30 (2002).
[14] R.D. Cook, S. Weisberg, Discussion of Li [23], Journal of the American Statistical Association 86 (1991).
[15] J.G. Cragg, S.G. Donald, On the asymptotic properties of LDU-based tests of the rank of a matrix, Journal of the American Statistical Association 91 (1996).
[16] J.G. Cragg, S.G. Donald, Inferring the rank of a matrix, Journal of Econometrics 76 (1997).
[17] M.L. Eaton, Multivariate Statistics: A Vector Space Approach, Wiley, New York.
[18] M.L. Eaton, D.E. Tyler, Asymptotic distributions of singular values with applications to canonical correlations and correspondence analysis, Journal of Multivariate Analysis 50 (1994).
[19] L. Gill, A. Lewbel, Testing the rank and definiteness of estimated matrices with applications to factor, state-space, and ARMA models, Journal of the American Statistical Association 87 (1992).
[20] I. Guttman, Linear Models: An Introduction, Wiley, New York.
[21] D.A. Harville, Matrix Algebra from a Statistician's Perspective, Springer-Verlag, New York.
[22] T. Kato, Perturbation Theory for Linear Operators, Springer-Verlag, Berlin.
[23] K.-C. Li, Sliced inverse regression for dimension reduction (with discussion), Journal of the American Statistical Association 86 (1991).
[24] L. Li, R.D. Cook, C.J. Nachtsheim, Model-free variable selection, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2005).
[25] B. Li, S. Wang, On directional regression for dimension reduction, Journal of the American Statistical Association 102 (2007).
[26] B. Li, H. Zha, F. Chiaromonte, Contour regression: a general approach to dimension reduction, The Annals of Statistics 33 (2005).
[27] J.R. Magnus, H. Neudecker, The commutation matrix: some properties and applications, Annals of Statistics 7 (1979).
[28] D.S. Moore, Generalized inverse, Wald's method, and the construction of chi-squared tests of fit, Journal of the American Statistical Association 72 (1977).
[29] R.J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley, New York.
[30] Z. Ratsimalahelo, Rank test based on matrix perturbation theory.
[31] J.M. Robin, R.J. Smith, Tests of rank, Econometric Theory 16 (2000).
[32] A. Satorra, P.M. Bentler, Corrections to test statistics and standard errors in covariance structure analysis, in: A. von Eye, C.C. Clogg (Eds.), Latent Variables Analysis: Applications for Developmental Research, Sage, Newbury Park, CA, 1994.
[33] Y. Shao, R.D. Cook, S. Weisberg, Marginal tests with sliced average variance estimation, Biometrika 94 (2007).
[34] A.T.A. Wood, An F approximation to the distribution of a linear combination of chi-squared variables, Communications in Statistics: Simulation 18 (1989).
[35] Y. Xia, H. Tong, W.K. Li, L.-X. Zhu, An adaptive estimation of dimension reduction space, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (2002).

On Expected Gaussian Random Determinants

On Expected Gaussian Random Determinants On Expected Gaussian Random Determinants Moo K. Chung 1 Department of Statistics University of Wisconsin-Madison 1210 West Dayton St. Madison, WI 53706 Abstract The expectation of random determinants whose

More information

Combining eigenvalues and variation of eigenvectors for order determination

Combining eigenvalues and variation of eigenvectors for order determination Combining eigenvalues and variation of eigenvectors for order determination Wei Luo and Bing Li City University of New York and Penn State University wei.luo@baruch.cuny.edu bing@stat.psu.edu 1 1 Introduction

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research education use, including for instruction at the authors institution

More information

Marginal tests with sliced average variance estimation

Marginal tests with sliced average variance estimation Biometrika Advance Access published February 28, 2007 Biometrika (2007), pp. 1 12 2007 Biometrika Trust Printed in Great Britain doi:10.1093/biomet/asm021 Marginal tests with sliced average variance estimation

More information

Sliced Inverse Regression

Sliced Inverse Regression Sliced Inverse Regression Ge Zhao gzz13@psu.edu Department of Statistics The Pennsylvania State University Outline Background of Sliced Inverse Regression (SIR) Dimension Reduction Definition of SIR Inversed

More information

Regression Graphics. 1 Introduction. 2 The Central Subspace. R. D. Cook Department of Applied Statistics University of Minnesota St.

Regression Graphics. 1 Introduction. 2 The Central Subspace. R. D. Cook Department of Applied Statistics University of Minnesota St. Regression Graphics R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108 Abstract This article, which is based on an Interface tutorial, presents an overview of regression

More information

Sliced Inverse Moment Regression Using Weighted Chi-Squared Tests for Dimension Reduction

Sliced Inverse Moment Regression Using Weighted Chi-Squared Tests for Dimension Reduction Sliced Inverse Moment Regression Using Weighted Chi-Squared Tests for Dimension Reduction Zhishen Ye a, Jie Yang,b,1 a Amgen Inc., Thousand Oaks, CA 91320-1799, USA b Department of Mathematics, Statistics,

More information

Estimation and Testing for Common Cycles
