Effects of data dimension on empirical likelihood


Biometrika Advance Access published August 2009

Biometrika (2009). © 2009 Biometrika Trust. Printed in Great Britain. doi: 10.1093/biomet/asp037

Effects of data dimension on empirical likelihood

BY SONG XI CHEN
Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A. songchen@iastate.edu

LIANG PENG
School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia 30332, U.S.A. peng@math.gatech.edu

AND YING-LI QIN
Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A. qinyl@iastate.edu

SUMMARY

We evaluate the effects of data dimension on the asymptotic normality of the empirical likelihood ratio for high-dimensional data under a general multivariate model. Data dimension and dependence among components of the multivariate random vector affect the empirical likelihood directly through the trace and the eigenvalues of the covariance matrix. The growth rates to infinity we obtain for the data dimension improve the rates of Hjort et al. (2008).

Some key words: Asymptotic normality; Data dimension; Empirical likelihood; High-dimensional data.

1. INTRODUCTION

Since Owen (1988, 1990) introduced empirical likelihood, it has been extended to a wide range of settings as a tool for nonparametric and semiparametric inference. Its most attractive property is that it permits likelihood-like inference in nonparametric or semiparametric settings, largely because it shares two key properties with the conventional likelihood: Wilks' theorem and Bartlett correction (Hall & La Scala, 1990; DiCiccio et al., 1991; Chen & Cui, 2006). High-dimensional data are increasingly common; for instance, in DNA and genetic microarray analysis, marketing research and financial applications.
There is a rapidly expanding literature on multivariate analysis in which the data dimension p depends on the sample size n and grows to infinity as n → ∞; see, for example, Portnoy (1984, 1985) in the context of M-estimation, Bai & Saranadasa (1996) for a two-sample test for means, Ledoit & Wolf (2002) and Schott (2005) for testing a specific covariance structure, and Schott (2007) for tests with more than two samples. Given the interest in both high-dimensional data and empirical likelihood, there is a need to evaluate the behaviour of the latter when the data dimension and the sample size increase simultaneously. In this paper, we evaluate the effects of the data dimension and dependence on the asymptotic normality of the empirical likelihood ratio statistic for the mean.

Let X_1, …, X_n be independent and identically distributed p-dimensional random vectors in R^p with mean vector µ = (µ_1, …, µ_p)^T and nonsingular variance matrix Σ. Let

L_n(µ) = sup{ ∏_{i=1}^n π_i : π_i ≥ 0, ∑_{i=1}^n π_i = 1, ∑_{i=1}^n π_i X_i = µ }   (1)

be the empirical likelihood for µ and let w_n(µ) = −2 log{n^n L_n(µ)} be the empirical likelihood ratio statistic. When p is fixed, Owen (1988, 1990) showed that

w_n(µ) → χ²_p   (2)

in distribution as n → ∞, which mimics Wilks' theorem for parametric likelihood ratios. An extension of the above result to parameters defined by general estimating equations is given in Qin & Lawless (1994). As p → ∞ for high-dimensional data, the natural substitute for (2) is

(2p)^{−1/2}{w_n(µ) − p} → N(0, 1)   (3)

in distribution as n → ∞, since χ²_p is asymptotically normal with mean p and variance 2p. A key question is how large the dimension p can be while (3) remains valid. In a recent study, Hjort et al. (2008) established that p = o(n^{1/3}) suffices under the following assumptions:

Assumption 1. The eigenvalues of Σ are uniformly bounded away from zero and infinity.

Assumption 2. All components of X_i are uniformly bounded random variables.

When Assumption 2 is relaxed, we have:

Assumption 3. E(‖p^{−1/2} X_i‖^q) and p^{−1} ∑_{j=1}^p E(|X_i^{(j)} − µ_j|^q) are bounded for some q ≥ 4, where ‖·‖ is the Euclidean norm.

Hjort et al. (2008) showed that (3) is valid if p^{3+6/(q−2)}/n → 0. When q = 4 in Assumption 3, this means p = o(n^{1/6}). Hence, there is a significant slowing-down in the rate of p when Assumption 2 is weakened. Tsao (2004) found that, when p is moderately large but fixed, the distribution of w_n(µ) has an atom at infinity for fixed n: the probability of w_n(µ) = ∞ is nonzero. Tsao showed that, if p and n increase at the same rate so that p/n ≥ 0.5, the probability of w_n(µ) = ∞ converges to 1, since the probability of µ being contained in the convex hull of the sample converges to 0. These results reveal the effects of p on the empirical likelihood from another perspective.
In this paper, we analyse the empirical likelihood for high-dimensional data under a general multivariate model, which facilitates a more detailed analysis than Hjort et al. (2008) and allows less restrictive conditions. The analysis requires neither the largest eigenvalue of Σ nor E(‖p^{−1/2} X_i‖^q) to be bounded, and hence accommodates a wider range of dependence among components of X_i. Our main finding is that the effects of the dimensionality and of the dependence among components of X_i on the empirical likelihood are leveraged through tr(Σ), the trace of the covariance matrix, and its largest eigenvalue γ_p. We provide a general rate for the dimension p, which is shown to depend on tr(Σ) and γ_p. In particular, under Assumptions 1 and 2, p = o(n^{1/2}), which improves the rate p = o(n^{1/3}) of Hjort et al. (2008). This is likely to be the best rate for p in the context of the empirical likelihood, as p = o(n^{1/2}) is the sufficient and necessary condition for the

convergence of the sample covariance matrix to Σ under the trace norm when all the eigenvalues of Σ are bounded.

2. PRELIMINARIES

Suppose that each of the independent and identically distributed observations X_i ∈ R^p is specified by X_i = ΓZ_i + µ, where Γ is a p × m matrix, m ≥ p, and Z_i = (Z_i1, …, Z_im)^T is a random vector such that E(Z_i) = 0, var(Z_i) = I_m,

E(Z_il^{4k}) = m_{4k} < ∞ and E(Z_{il_1}^{α_1} ⋯ Z_{il_q}^{α_q}) = E(Z_{il_1}^{α_1}) ⋯ E(Z_{il_q}^{α_q})   (4)

whenever ∑_{l=1}^q α_l ≤ 4k and l_1 ≠ l_2 ≠ ⋯ ≠ l_q. Here k is some positive integer and I_m is the m-dimensional identity matrix. The above multivariate model, employed in Bai & Saranadasa (1996), means that each X_i is a linear transformation of some m-variate random vector Z_i. An important feature is that m, the dimension of Z_i, is arbitrary provided m ≥ p and ΓΓ^T = Σ, which can generate a rich collection of X_i from Z_i with the given covariance. It also requires only that power transformations of different components of Z_i be uncorrelated, which is weaker than assuming that they are independent. The model (4) encompasses many multivariate models. It includes the elliptically contoured distributions with Z_i = RU^{(m)}, where R is a nonnegative random variable and U^{(m)} is the uniform random vector on the unit sphere (Fang & Zhang, 1990). The multivariate normal and t-distributions are elliptically contoured, and so is a mixture of normal distributions whose density is defined by ∫ n(x | µ, vΣ) dw(v), where n(x | µ, Σ) is the density of N(µ, Σ) and w(v) is the distribution function of a nonnegative univariate random variable (Anderson, 2003). Both the moment conditions and the correlation are imposed on Z_i rather than X_i. This model structure allows the moments of ‖X_i − µ‖^{2k} to be derived, and allows us to conduct a more detailed analysis than is possible in Hjort et al. (2008). The integer k determines the number of finite moments for Z_il. As k ≥ 1, each Z_il has at least finite fourth moments. This is the minimal moment condition to ensure the convergence of the largest eigenvalue of the sample covariance matrix to the largest eigenvalue of Σ (Yin et al., 1988; Bai et al., 1998), and hence the convergence of the sample covariance matrix to Σ under the matrix norm based on the largest eigenvalue. By inspecting the proofs given in the Appendix, we see that a divergent sample covariance matrix would dramatically alter the asymptotic mean and variance of the empirical likelihood ratio. Hence, it is unclear if (3) would remain true.

From the standard empirical likelihood solutions (Owen, 1988, 1990), which are valid for any p, fixed or growing, the optimal weights π_i for the optimization problem (1) are π_i = n^{−1}{1 + λ^T(X_i − µ)}^{−1}, where λ ∈ R^p is a Lagrange multiplier satisfying

g(λ) = n^{−1} ∑_{i=1}^n (X_i − µ) / {1 + λ^T(X_i − µ)} = 0.   (5)

Hence, the empirical likelihood L_n(µ) equals n^{−n} ∏_{i=1}^n {1 + λ^T(X_i − µ)}^{−1}. As the maximum empirical likelihood is attained at π_i = n^{−1} (i = 1, …, n), the empirical likelihood ratio

statistic is

w_n(µ) = −2 log{n^n L_n(µ)} = 2 ∑_{i=1}^n log{1 + λ^T(X_i − µ)}.

Throughout the paper we let γ_1(A) ≤ ⋯ ≤ γ_p(A) denote the eigenvalues and tr(A) denote the trace of a matrix A. When A = Σ, we write γ_j(Σ) as γ_j (j = 1, …, p). It is assumed throughout the paper that γ_1 ≥ C for some positive constant C.

3. EFFECTS OF HIGH DIMENSION

The Lagrange multiplier λ defined in (5) is a key element in any empirical likelihood formulation, and reflects the implicit nature of the methodology. When p is fixed, Owen (1990) showed that

‖λ‖ = O_p(n^{−1/2}).   (6)

This has been the prevailing order for λ except in nonparametric curve estimation, where n is replaced by the effective sample size (Chen, 1996). When p grows with n, (6) is no longer valid.

THEOREM 1. If {tr(Σ)}^{4k−1} γ_p = O(n^{2k−1}) and γ_p² p² = o(n), then ‖λ‖ = O_p[{tr(Σ)/n}^{1/2}].

Theorem 1 implies that the effect of the dimension and of the dependence among components of X_i on the Lagrange multiplier is directly determined through tr(Σ) and γ_p. The rate for ‖λ‖ can be regarded as a generalization of (6) for a fixed p, since O_p[{tr(Σ)/n}^{1/2}] degenerates to O_p(n^{−1/2}) in that case.

We first study the effects of dimension on the asymptotic normality of w_n(µ), assuming existence of the minimal fourth moment for each Z_il. Later, we will increase the number of moments. We assume for the time being that k = 1 in (4) and tr⁵(Σ)γ_p⁵ = o(np). Since pγ_1 ≤ tr(Σ) ≤ pγ_p, this implies the conditions of Theorem 1. We wish to establish an expansion for w_n(µ). Put W_i = λ^T(X_i − µ). From (A7) of the Appendix, max_{i=1,…,n} |W_i| = o_p(1), which allows

log{1 + λ^T(X_i − µ)} = W_i − W_i²/2 + W_i³/(1 + ξ_i)⁴,   (7)

where |ξ_i| ≤ |λ^T(X_i − µ)|. Expand (5) so that 0 = g(λ) = X̄ − µ − S_n λ + β̃_n, where β̃_n = n^{−1} ∑_{i=1}^n (X_i − µ) W_i²/(1 + ξ̃_i) for some |ξ̃_i| ≤ |λ^T(X_i − µ)| and S_n = n^{−1} ∑_{i=1}^n (X_i − µ)(X_i − µ)^T. Hence,

λ = S_n^{−1}(X̄ − µ) + S_n^{−1} β̃_n.   (8)

From (7) and (8), we obtain an expansion for w_n(µ):

w_n(µ) = n(X̄ − µ)^T S_n^{−1} (X̄ − µ) − n β̃_n^T S_n^{−1} β̃_n + 2 ∑_{i=1}^n {λ^T(X_i − µ)}³/(1 + ξ_i)⁴
       = n(X̄ − µ)^T Σ^{−1} (X̄ − µ) + n(X̄ − µ)^T (S_n^{−1} − Σ^{−1})(X̄ − µ) − n β̃_n^T S_n^{−1} β̃_n + R_n{1 + o_p(1)},   (9)

where R_n = 2 ∑_{i=1}^n {λ^T(X_i − µ)}³. This expansion looks similar to that given in Owen (1990) for a fixed p, but the stochastic order of each term requires careful evaluation as p grows with n.
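The expansion above is driven by the multiplier λ solving (5), which in practice must be computed iteratively. The following is a minimal numerical sketch, not the authors' code: it solves (5) by Newton's method and evaluates w_n(µ), assuming numpy is available and that µ lies inside the convex hull of the data (no safeguards otherwise); all names are our own.

```python
import numpy as np

def el_log_ratio(X, mu, n_iter=50, tol=1e-10):
    """Empirical likelihood ratio w_n(mu) = -2 log{n^n L_n(mu)} for a mean,
    computed by Newton's method on the Lagrange multiplier of equation (5)."""
    n, p = X.shape
    Z = X - mu                                   # centred observations X_i - mu
    lam = np.zeros(p)                            # Lagrange multiplier lambda
    for _ in range(n_iter):
        w = 1.0 + Z @ lam                        # 1 + lambda^T (X_i - mu)
        g = Z.T @ (1.0 / w) / n                  # g(lambda) of equation (5)
        J = -(Z / w[:, None] ** 2).T @ Z / n     # Jacobian of g
        step = np.linalg.solve(J, -g)            # Newton step
        lam += step
        if np.linalg.norm(step) < tol:
            break
    return 2.0 * np.sum(np.log1p(Z @ lam))       # 2 sum log{1 + lambda^T(X_i - mu)}

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w = el_log_ratio(X, np.zeros(5))   # one draw; roughly chi-squared with 5 degrees of freedom
```

For fixed p = 5 and n = 200 the value behaves like a χ²_5 draw, in line with (2); letting p grow with n is exactly the regime the theorems above address.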

From Lemma 5 in the Appendix, we have

(2p)^{−1/2}{n(X̄ − µ)^T Σ^{−1} (X̄ − µ) − p} → N(0, 1)   (10)

in distribution as n → ∞, which is true under much weaker conditions, for instance p/n → c, by applying the martingale central limit theorem. Derivations given in the Appendix show that the other two terms on the right-hand side of (9) are both o_p(p^{1/2}). These lead us to establish (3), as summarized in the following theorem.

THEOREM 2. If k = 1 in (4) and tr⁵(Σ)γ_p⁵ = o(np), then (3) is valid.

Theorem 2 indicates that, when γ_p is bounded, (3) is true if p = o(n^{1/4}), which improves the order p = o(n^{1/6}) obtained by Hjort et al. (2008) under the finite fourth moment condition on X_i, which we do not need in our study. The conditions assumed in Theorem 2 are liberal compared to Assumptions 1 and 3, and there is no explicit restriction on γ_p, which may diverge to infinity as n → ∞.

Next we show that the dimension p can increase more rapidly if Z_il possesses more than four moments. Assuming higher-order moments allows us to evaluate the terms in (9) more accurately. Specifically, we will assume that Z_il has at least a finite eighth moment, that is, k ≥ 2 in model (4); the case k = 1 is covered by Theorem 2. The following theorem, whose proof is given in an Iowa State University technical report available from the authors, shows that p = o(n^{1/2}) is approachable.

THEOREM 3. If k ≥ 2 in (4), {tr(Σ)}^{4k−1} γ_p = O(n^{2k−1}) and p² γ_p⁵ = o{n^{(4k−2)/(4k−1)}}, then (3) is valid.

When γ_p is bounded, Theorem 3 implies that w_n(µ) is asymptotically normally distributed if p = o(n^{1/2 − 1/(8k−2)}), which is close to o(n^{1/2}) for large k and improves the earlier rate o(n^{1/3}) attained in Hjort et al. (2008). By reviewing the proof of Theorem 3, we can see that if the Z_ij are all bounded random variables the dimensionality p can reach o(n^{1/2}).
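The o(n^{1/2}) barrier is tied to how well the sample covariance matrix S_n approximates Σ in the norm ‖A‖ = {tr(A^T A)}^{1/2}. A small Monte Carlo sketch, our own construction rather than the paper's, with Σ = I_p and normal data (numpy assumed), illustrates that this error grows at the rate p n^{−1/2}:

```python
import numpy as np

def trace_norm_error(n, p, reps=100, seed=0):
    """Monte Carlo estimate of E ||S_n - Sigma|| for Sigma = I_p,
    where ||A|| = {tr(A^T A)}^{1/2}; the error behaves like p / n^{1/2}."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        D = X.T @ X / n - np.eye(p)        # S_n - Sigma
        total += np.sqrt((D * D).sum())    # {tr(D^T D)}^{1/2}
    return total / reps

err_small = trace_norm_error(2000, 10)
err_large = trace_norm_error(2000, 20)
```

Doubling p at fixed n roughly doubles the estimated error, so the error vanishes only when p grows more slowly than n^{1/2}.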
We believe that p = o(n^{1/2}) is the best rate for the asymptotic normality of the empirical likelihood ratio with the normalizing constants p and (2p)^{1/2}, based on the following considerations. Lemma 4 in the Appendix implies that, when the largest eigenvalue of Σ is bounded, ‖S_n − Σ‖_tr → 0 in probability if and only if p = o(n^{1/2}). Here ‖A‖_tr = {tr(A^T A)}^{1/2} is the trace norm. Bai & Yin (1993) established the convergence of S_n to Σ with probability one if p = o(n) under the matrix norm based on the largest eigenvalue, by assuming that the Z_il are independent and identically distributed. However, it can be seen from our proofs in the technical report that the convergence of S_n to Σ under the trace norm is the one used in establishing the various results for the empirical likelihood. As shown by Theorems 2 and 3, when (3) is valid, the asymptotic mean and variance of the empirical likelihood ratio are respectively p and 2p, which are known. This means that the empirical likelihood carries out internal studentizing even when p increases along with n. However, it is apparent that the internal studentization prevents p from growing faster, as it brings in those higher-order terms.

The least-squares empirical likelihood is a simplified version of the empirical likelihood. The least-squares empirical likelihood ratio for µ is q_n(µ) = min ∑_{i=1}^n (nπ_i − 1)² subject to ∑_{i=1}^n π_i = 1 and ∑_{i=1}^n π_i(X_i − µ) = 0. The least-squares empirical likelihood uses ∑_{i=1}^n (nπ_i − 1)² to approximate −2 ∑_{i=1}^n log(nπ_i). As shown in Brown & Chen (1998), the optimal weights π_i admit closed-form solutions, so that

q_n(µ) = n(X̄ − µ)^T H_n^{−1} (X̄ − µ),   (11)

where H_n = S_n − (X̄ − µ)(X̄ − µ)^T. Thus, q_n(µ) can be readily computed without solving the nonlinear equation (5), as for the full empirical likelihood. The least-squares empirical likelihood ratio is a first-order approximation to the full empirical likelihood ratio, and q_n(µ) → χ²_p in distribution when p is fixed. The least-squares empirical likelihood is less affected by high dimension. In particular, if k ≥ 1 in (4), then

(2p)^{−1/2}{q_n(µ) − p} → N(0, 1)   (12)

in distribution as n → ∞ when p = o(n^{1/2}), which improves the rates given by Theorems 2 and 3 for the full empirical likelihood ratio w_n(µ). To appreciate (12), we note from (11) that

q_n(µ) = n(X̄ − µ)^T Σ^{−1} (X̄ − µ) + n(X̄ − µ)^T (H_n^{−1} − Σ^{−1})(X̄ − µ).   (13)

Then, following a similar line to the proof of Lemma 6, n(X̄ − µ)^T (H_n^{−1} − Σ^{−1})(X̄ − µ) = O_p(p²/n) = o_p(p^{1/2}). As the first term on the right-hand side of (13) is asymptotically normal with mean p and variance 2p, as conveyed in (10), (12) is valid. If we confine ourselves to specific distributions, faster rates for p can be established. For example, if the data are normally distributed, the least-squares empirical likelihood ratio is the Hotelling T² statistic, which is shown in Bai & Saranadasa (1996) to be asymptotically normal if p/n → c ∈ [0, 1).

4. NUMERICAL RESULTS

We report results from a simulation study designed to evaluate the asymptotic normality of the empirical likelihood ratio. The p-dimensional independent and identically distributed data vectors {X_i}_{i=1}^n were generated from a moving average model,

X_ij = Z_ij + ρ Z_{i(j+1)}   (i = 1, …, n; j = 1, …, p),

where, for each i, the innovations {Z_ij}_{j=1}^{p+1} were independent random variables with zero mean and unit variance. We considered two distributions for the innovation Z_ij. One is the standard normal distribution, and the other is a standardized version of a Pareto distribution with distribution function (1 − x^{−4.5}) I(x ≥ 1). We standardized the Pareto random variables so that they had zero mean and unit variance.
As the Pareto distribution has only four finite moments, we had k = 1 in (4), whereas all moments are finite for the normally distributed innovations. For both distributions, X_i is a multivariate random vector with zero mean and covariance Σ = (σ_ij)_{p×p}, where σ_ii = 1 + ρ², σ_{i,i±1} = ρ and σ_ij = 0 for |i − j| > 1. We set ρ to be 0.5 throughout the simulation. To make p and n increase simultaneously, we considered two growth rates for p with respect to n: (i) p = c_1 n^{0.4} and (ii) p = c_2 n^{0.25}. We chose the sample sizes n = 200, 400 and 800. By assigning c_1 = 3, 4 and 5 in the faster growth rate setting (i), we obtained three dimensions for each sample size. For the slower growth rate setting (ii), to maintain a certain amount of increase between successive dimensions when n was increased, we assigned larger values c_2 = 4, 6 and 8, which again yielded three dimensions for each sample size. We carried out 5000 simulations for each of the (p, n)-combinations and for each of the two innovation distributions. Figure 1 displays Q-Q plots of the standardized empirical likelihood ratio

Fig. 1. Normal Q-Q plots with the faster growth rate p = c_1 n^{0.4} for the normal (upper panels) and the Pareto (lower panels) innovations and n = 200, 400 and 800: c_1 = 3 (solid lines), 4 (dotted lines) and 5 (dashed lines).

statistics for the faster growth rate (i). Those for the slower growth rate (ii) are presented in Fig. 2. As n and p were increased simultaneously, there was a general convergence of the standardized empirical likelihood ratio to N(0, 1). We also observed that the convergence in Fig. 2 for the slower growth rate setting (ii) was faster than that in Fig. 1 for the faster growth rate setting. This is expected, as setting (i) produced much higher dimensionality. The convergence for the normal innovations was faster than that for the Pareto case when p = c_1 n^{0.4} in Fig. 1. This may be explained by the fact that the Pareto distribution has only finite fourth moments, which corresponds to k = 1, whereas the normal innovations have all moments finite. According to Theorems 2 and 3, the growth rate for p depends on the value of k: the larger the k, the higher the rate. For the lower growth rate in setting (ii), Fig. 2 shows that there was substantial improvement in the convergence in the Q-Q plots as p was increased at the slower rate, for both distributions of innovations. It is observed that most of the lack-of-fit in the N(0, 1) Q-Q plots in Figs 1 and 2 appeared at the lower and upper quantiles. This could be attributed to the lack-of-fit between χ²_p and N(0, 1), as χ²_p may be viewed as the intermediate distribution of the empirical likelihood ratio. To verify this, we carried out further simulations by inverting settings (i) and (ii), so that for a given dimension p three sample sizes were generated according to (iii) n = (p/c_1)^{2.5} and (iv) n = (p/c_2)^{4}, with c_1 = 3, 4 and 5 and c_2 = 4, 5 and 6, respectively.
We chose p = 35, 45 and 55 for setting (iii) and three smaller dimensions for setting (iv). Two figures of χ²_p-based Q-Q plots for (iii) and (iv), given in the Iowa State University technical report, show that there was a substantial improvement in the overall fit of the Q-Q plots, and that the lack-of-fit in the N(0, 1)-based Q-Q plots largely disappeared.
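The data-generating mechanism above, together with the closed-form least-squares ratio q_n(µ) of (11), can be sketched as follows. This is our own illustrative code, not the authors' simulation program; numpy is assumed.

```python
import numpy as np

def generate_ma1(n, p, rho=0.5, innov="normal", seed=0):
    """X_ij = Z_ij + rho * Z_{i,j+1} with i.i.d. standardized innovations, so
    var(X_ij) = 1 + rho^2, cov(X_ij, X_{i,j+1}) = rho and zeros further out."""
    rng = np.random.default_rng(seed)
    if innov == "normal":
        Z = rng.standard_normal((n, p + 1))
    else:
        # Pareto with distribution function 1 - x^{-4.5} on x >= 1, standardized
        a = 4.5
        Z = rng.pareto(a, size=(n, p + 1)) + 1.0   # shift Lomax draws to x >= 1
        Z = (Z - a / (a - 1)) / np.sqrt(a / ((a - 1) ** 2 * (a - 2)))
    return Z[:, :p] + rho * Z[:, 1:]

def ls_el_ratio(X, mu):
    """Least-squares empirical likelihood ratio q_n(mu) of (11):
    n (Xbar - mu)^T H_n^{-1} (Xbar - mu), H_n = S_n - (Xbar - mu)(Xbar - mu)^T."""
    n = X.shape[0]
    d = X.mean(axis=0) - mu
    Zc = X - mu
    S = Zc.T @ Zc / n                  # S_n
    H = S - np.outer(d, d)             # H_n
    return float(n * d @ np.linalg.solve(H, d))

X = generate_ma1(400, 20)
q = ls_el_ratio(X, np.zeros(20))       # roughly chi-squared with 20 degrees of freedom
```

Unlike the full empirical likelihood, no nonlinear equation in λ is solved here, which is why q_n(µ) is convenient for repeated evaluation in a simulation of this size.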

Fig. 2. Normal Q-Q plots with the slower growth rate p = c_2 n^{0.25} for the normal (upper panels) and the Pareto (lower panels) innovations: c_2 = 4 (solid lines), 6 (dotted lines) and 8 (dashed lines).

ACKNOWLEDGEMENT

The authors thank the editor, Professor D. M. Titterington, and two referees for valuable suggestions that have improved the presentation of the paper. The authors acknowledge grants from the U.S. National Science Foundation.

APPENDIX

Technical details

We first establish some lemmas.

LEMMA 1. If m_{4k} < ∞ for some k ≥ 1, then E(‖X_i − µ‖^{2k}) = O{tr^k(Σ)} and var(‖X_i − µ‖^{2k}) = O{tr^{2k−1}(Σ) γ_p}.

Proof. We only show the case k = 1 since the other cases are similar. It is easy to check that E(‖X_i − µ‖²) = tr{E(X_i − µ)(X_i − µ)^T} = tr(Σ) and

E(‖X_i − µ‖⁴) = E(‖ΓZ_i‖⁴) = E(Z_i^T Γ^T Γ Z_i Z_i^T Γ^T Γ Z_i) = tr{Γ^T Γ E(Z_i Z_i^T Γ^T Γ Z_i Z_i^T)}.

Write Γ^T Γ = (ν_sl)_{s,l ≤ m}. Then Z_i Z_i^T Γ^T Γ Z_i Z_i^T = (∑_{j=1}^m ∑_{l=1}^m Z_{ik_1} Z_{il} ν_{lj} Z_{ij} Z_{ik_2})_{k_1,k_2 ≤ m}. When k_1 = k_2 = s,

E(∑_{j=1}^m ∑_{l=1}^m Z_{is} Z_{il} ν_{lj} Z_{ij} Z_{is}) = ν_ss E(Z_{is}⁴) + ∑_{l ≠ s} ν_ll.   (A1)

When k_1 ≠ k_2, E(∑_{j=1}^m ∑_{l=1}^m Z_{ik_1} Z_{il} ν_{lj} Z_{ij} Z_{ik_2}) = 2ν_{k_1 k_2}. Hence,

E(‖X_i − µ‖⁴) = (m_4 − 3) ∑_{s=1}^m ν_ss² + 2 tr(Σ²) + tr²(Σ).   (A2)

Note that ∑_{s=1}^m ν_ss² ≤ ∑_{j,s=1}^m ν_js² = tr{(Γ^T Γ)²} = tr(Σ²). This together with (A1) and (A2) implies that var(‖X_i − µ‖²) = (m_4 − 3) ∑_{s=1}^m ν_ss² + 2 tr(Σ²) = O{tr(Σ) γ_p}.

LEMMA 2. If m_{4k} < ∞ for some k ≥ 1, then, with probability one,

max_{i=1,…,n} ‖X_i − µ‖ = o[{tr(Σ)}^{(2k−1)/(4k)} γ_p^{1/(4k)} n^{1/(4k)}] + O{tr^{1/2}(Σ)}.

Proof. We note that

max_{i=1,…,n} ‖X_i − µ‖ ≤ {max_{i=1,…,n} |‖X_i − µ‖^{2k} − E(‖X_i − µ‖^{2k})| + E(‖X_i − µ‖^{2k})}^{1/(2k)}

and

max_{i=1,…,n} [|‖X_i − µ‖^{2k} − E(‖X_i − µ‖^{2k})| {var(‖X_i − µ‖^{2k})}^{−1/2}] = o(n^{1/2})

with probability one as n → ∞. The lemma is proved by applying Lemma A of Owen (1990) and Lemma 1.

From now on, we let Y_i = Σ^{−1/2}(X_i − µ), V_n = n^{−1} ∑_{i=1}^n Y_i Y_i^T, Ȳ = n^{−1} ∑_{i=1}^n Y_i and D_n = V_n − I_p = (d_sl)_{s,l = 1,…,p}.

LEMMA 3. Under the conditions of Theorem 2, tr(D_n²) = O_p(p²/n).

Proof. We only need to show that E{tr(D_n²)} = O(p²/n). Note that V_n = Σ^{−1/2} Γ S_z Γ^T Σ^{−1/2}, where S_z = n^{−1} ∑_{i=1}^n Z_i Z_i^T, and write Δ = Γ^T Σ^{−1} Γ = (σ̃_jl)_{j,l = 1,…,m}, say. Then

tr(D_n²) = tr(S_z Δ S_z Δ) − 2 tr(S_z Δ) + p   (A3)

and

E{tr(S_z Δ)} = E(n^{−1} ∑_{i=1}^n ∑_{j,l=1}^m Z_{ij} Z_{il} σ̃_lj) = ∑_{j,l=1}^m δ_jl σ̃_lj = ∑_{j=1}^m σ̃_jj = p   (A4)

since tr(Δ) = tr(I_p) = p. By utilizing the information on Z_i in (4),

E[tr{(S_z Δ)²}] = (m_4 − 3) n^{−1} ∑_{j=1}^m σ̃_jj² + (1 + n^{−1}) ∑_{j,l=1}^m σ̃_jl² + n^{−1} ∑_{j,l=1}^m σ̃_jj σ̃_ll.

It is easy to check that ∑_{j,l=1}^m σ̃_jl² = tr(Δ²) = p, ∑_{j=1}^m σ̃_jj² ≤ ∑_{j,l=1}^m σ̃_jl² = p, ∑_{j≠l} σ̃_jl² ≤ p and ∑_{j≠l} σ̃_jj σ̃_ll ≤ (∑_{j=1}^m σ̃_jj)² = p². These together with (A3) and (A4) imply E{tr(D_n²)} = O(p²/n).

LEMMA 4. Under condition (4), max_{i=1,…,p} |γ_i(S_n) − γ_i(Σ)| = O_p(γ_p p n^{−1/2}).

Proof. Note that

max_{i=1,…,p} |γ_i(S_n) − γ_i(Σ)|² ≤ ∑_{i=1}^p {γ_i(S_n) − γ_i(Σ)}² = tr(S_n²) + tr(Σ²) − 2 ∑_{i=1}^p γ_i(S_n) γ_i(Σ).

By von Neumann's inequality, ∑_{i=1}^p γ_i(S_n) γ_i(Σ) ≥ tr(S_n Σ). Hence

max_{i=1,…,p} |γ_i(S_n) − γ_i(Σ)| ≤ [tr{(S_n − Σ)²}]^{1/2}.

Now

tr{(S_n − Σ)²} = tr(Σ D_n Σ D_n) ≤ γ_p²(Σ) tr(D_n²) = O_p{γ_p²(Σ) p²/n}

by applying Lemma 3. This lemma implies that all the eigenvalues of S_n converge to those of Σ uniformly at the rate O_p(γ_p p n^{−1/2}).

Proof of Theorem 1. By (5), λ ∈ R^p satisfies

0 = n^{−1} ∑_{i=1}^n (X_i − µ) / {1 + λ^T(X_i − µ)} = g(λ).   (A5)

Write λ = ρθ, where ρ ≥ 0 and ‖θ‖ = 1. Hence,

0 = θ^T g(ρθ) = n^{−1} ∑_{i=1}^n θ^T(X_i − µ) − ρ n^{−1} ∑_{i=1}^n {θ^T(X_i − µ)}² / {1 + ρθ^T(X_i − µ)},

so that

ρ θ^T S_n θ / {1 + ρ max_{i=1,…,n} ‖X_i − µ‖} ≤ |θ^T(X̄ − µ)|

and therefore

ρ {θ^T S_n θ − max_{i=1,…,n} ‖X_i − µ‖ |θ^T(X̄ − µ)|} ≤ |θ^T(X̄ − µ)|.   (A6)

Since |θ^T(X̄ − µ)| ≤ ‖X̄ − µ‖ = O_p[{tr(Σ)/n}^{1/2}], it follows from Lemma 2 that

max_{i=1,…,n} ‖X_i − µ‖ |θ^T(X̄ − µ)| = o_p[{tr(Σ)}^{(4k−1)/(4k)} γ_p^{1/(4k)} n^{1/(4k)−1/2}] + O_p{tr(Σ) n^{−1/2}} = o_p(1).

By Lemma 4, for a positive constant C_1, P(θ^T S_n θ ≥ C_1) → 1 as n → ∞. Hence ‖λ‖ = ρ = O_p[{tr(Σ)/n}^{1/2}].

By repeating (A6) in the proof of the above theorem and Lemma 2, we have

max_{i=1,…,n} |λ^T(X_i − µ)| ≤ ‖λ‖ max_{i=1,…,n} ‖X_i − µ‖ = o_p(1).   (A7)

We need the following lemmas to prove Theorem 2.

LEMMA 5. If p/n → c, then (2p)^{−1/2}{n(X̄ − µ)^T Σ^{−1} (X̄ − µ) − p} → N(0, 1) in distribution as n → ∞.

Proof. The proof entails applying the martingale central limit theorem (Hall & Heyde, 1980). Bai & Saranadasa (1996) used this approach to establish the asymptotic normality of a two-sample test statistic for high-dimensional data.

LEMMA 6. Under the conditions of Theorem 2, n(X̄ − µ)^T (S_n^{−1} − Σ^{−1})(X̄ − µ) = o_p(p^{1/2}).

Proof. Recall that D_n = V_n − I_p = (d_sl)_{1 ≤ s,l ≤ p}. It follows from Lemma 3 that

P(max_{k_1,k_2 = 1,…,p} |d_{k_1 k_2}| > ε) ≤ ∑_{k_1=1}^p ∑_{k_2=1}^p P(|d_{k_1 k_2}| > ε) ≤ ε^{−2} E{tr(D_n²)} = O(p²/n).

Hence, d_jl = O_p(p n^{−1/2}) = o_p(1) uniformly in 1 ≤ j, l ≤ p. It is easy to check that

V_n^{−1} − I_p = −D_n + D_n² − D_n³ V_n^{−1}

and n(X̄ − µ)^T (S_n^{−1} − Σ^{−1})(X̄ − µ) = n Ȳ^T (V_n^{−1} − I_p) Ȳ. From Lemma 1, E(‖Ȳ‖²) = n^{−1} E(‖Y_1‖²) = p/n. Since |Ȳ^T A Ȳ| ≤ ‖Ȳ‖² {tr(A²)}^{1/2} for any symmetric matrix A, it follows from Lemma 3 and the condition p = o(n^{1/2}) that

n |Ȳ^T D_n Ȳ| ≤ n ‖Ȳ‖² {tr(D_n²)}^{1/2} = O_p(p² n^{−1/2}) = o_p(p^{1/2}).

Similarly, n Ȳ^T D_n² Ȳ ≤ n ‖Ȳ‖² tr(D_n²) = O_p(p³/n) = o_p(p^{1/2}). Furthermore, we note the following facts: first,

|Ȳ^T D_n³ Ȳ| ≤ max_{i=1,…,p} |γ_i(D_n)| Ȳ^T D_n² Ȳ = o_p(Ȳ^T D_n² Ȳ),

since max_{i=1,…,p} |γ_i(D_n)| ≤ {tr(D_n²)}^{1/2} = o_p(1) when p = o(n^{1/2}); second, for any positive integer l,

(Ȳ^T D_n^{2+l} Ȳ)² ≤ Ȳ^T D_n² Ȳ · Ȳ^T D_n^{2+2l} Ȳ = o_p{(Ȳ^T D_n² Ȳ)²}.

In general, Ȳ^T D_n^{2+l} Ȳ = o_p(Ȳ^T D_n² Ȳ). The lemma follows from summarizing the above results.

Proof of Theorem 2. Put W_i = λ^T(X_i − µ). Then (A7) implies that max_{i=1,…,n} |W_i| = o_p(1). Expand equation (A5):

0 = g(λ) = X̄ − µ − S_n λ + β̃_n,   (A8)

where β̃_n = n^{−1} ∑_{i=1}^n (X_i − µ) W_i² / (1 + ξ̃_i) and |ξ̃_i| ≤ |λ^T(X_i − µ)|. As max_{i=1,…,n} |W_i| = o_p(1), max_{i=1,…,n} |ξ̃_i| = o_p(1) as well. Hence β̃_n = β_n{1 + o_p(1)}, where β_n = n^{−1} ∑_{i=1}^n (X_i − µ) W_i². Applying Theorem 1 and Lemma 2 with k = 1, we have, if tr(Σ) = O{γ_p^{−5/3} n^{1/3}},

‖β_n‖ ≤ max_{i=1,…,n} ‖X_i − µ‖ ‖λ‖² O_p{γ_p(Σ)} = o_p(‖λ‖).   (A9)

It follows from (A8) that

λ = S_n^{−1}(X̄ − µ) + S_n^{−1} β̃_n   (A10)

and log(1 + W_i) = W_i − W_i²/2 + W_i³/(1 + ξ_i)⁴ for some ξ_i such that |ξ_i| ≤ |W_i|. Therefore,

w_n(µ) = n(X̄ − µ)^T S_n^{−1} (X̄ − µ) − n β̃_n^T S_n^{−1} β̃_n + 2 ∑_{i=1}^n {λ^T(X_i − µ)}³ / (1 + ξ_i)⁴
       = n(X̄ − µ)^T Σ^{−1} (X̄ − µ) + n(X̄ − µ)^T (S_n^{−1} − Σ^{−1})(X̄ − µ) − n β̃_n^T S_n^{−1} β̃_n + R_n{1 + o_p(1)},

where R_n = 2 ∑_{i=1}^n {λ^T(X_i − µ)}³. By (A9), (A10) and Lemma 4,

n |β̃_n^T S_n^{−1} β̃_n| ≤ n ‖β̃_n‖² / γ_1(S_n) = O_p{γ_p² tr²(Σ) n^{−1}} + o_p{γ_p^{5/2} tr^{5/2}(Σ) n^{−1/2}} = o_p(p^{1/2}).

We also note that

|R_n| ≤ 2(n λ^T S_n λ)^{1/2} (∑_{i=1}^n ‖λ‖⁴ ‖X_i − µ‖⁴)^{1/2} = o_p(p^{1/2}).

Hence the theorem follows from Lemmas 5 and 6.

REFERENCES

ANDERSON, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. New York: Wiley.
BAI, Z. & SARANADASA, H. (1996). Effect of high dimension: by an example of a two sample problem. Statist. Sinica 6, 311–329.
BAI, Z. D., SILVERSTEIN, J. W. & YIN, Y. Q. (1988). A note on the largest eigenvalue of a large-dimensional sample covariance matrix. J. Mult. Anal. 26, 166–168.
BAI, Z. & YIN, Y. Q. (1993). Limit of the smallest eigenvalue of a large dimensional sample covariance matrix. Ann. Prob. 21, 1275–1294.
BROWN, B. M. & CHEN, S. X. (1998). Combined and least squares empirical likelihood. Ann. Inst. Statist. Math. 50, 697–714.
CHEN, S. X. (1996). Empirical likelihood confidence intervals for nonparametric density estimation. Biometrika 83, 329–341.
CHEN, S. X. & CUI, H. J. (2006). On Bartlett correction of empirical likelihood in the presence of nuisance parameters. Biometrika 93, 215–220.
DICICCIO, T. J., HALL, P. & ROMANO, J. P. (1991). Empirical likelihood is Bartlett-correctable. Ann. Statist. 19, 1053–1061.
FANG, K. T. & ZHANG, Y. T. (1990). Generalized Multivariate Analysis. Beijing: Science Press.
HALL, P. G. & HEYDE, C. C. (1980). Martingale Limit Theory and Its Application. New York: Academic Press.
HALL, P. & LA SCALA, B. (1990). Methodology and algorithms of empirical likelihood. Int. Statist. Rev. 58, 109–127.
HJORT, N. L., MCKEAGUE, I. W. & VAN KEILEGOM, I. (2008). Extending the scope of empirical likelihood. Ann. Statist. 37, 1079–1111.
LEDOIT, O. & WOLF, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Ann. Statist. 30, 1081–1102.
OWEN, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249.
OWEN, A. B. (1990). Empirical likelihood ratio confidence regions. Ann. Statist. 18, 90–120.
PORTNOY, S. (1984). Asymptotic behavior of M-estimators of p regression parameters when p²/n is large. I. Consistency. Ann. Statist. 12, 1298–1309.
PORTNOY, S. (1985). Asymptotic behavior of M-estimators of p regression parameters when p²/n is large. II. Normal approximation. Ann. Statist. 13, 1403–1417.
QIN, J. & LAWLESS, J. (1994). Empirical likelihood and general estimating functions. Ann. Statist. 22, 300–325.
SCHOTT, J. R. (2005). Testing for complete independence in high dimensions. Biometrika 92, 951–956.
SCHOTT, J. R. (2007). Some high-dimensional tests for a one-way MANOVA. J. Mult. Anal. 98, 1825–1839.
TSAO, M. (2004). Bounds on coverage probabilities of the empirical likelihood ratio confidence regions. Ann. Statist. 32, 1215–1221.
YIN, Y. Q., BAI, Z. D. & KRISHNAIAH, P. R. (1988). On the limit of the largest eigenvalue of the large-dimensional sample covariance matrix. Prob. Theory Rel. Fields 78, 509–521.

[Received July 2007. Revised November 2008]


More information

Empirical likelihood for linear models in the presence of nuisance parameters

Empirical likelihood for linear models in the presence of nuisance parameters Empirical likelihood for linear models in the presence of nuisance parameters Mi-Ok Kim, Mai Zhou To cite this version: Mi-Ok Kim, Mai Zhou. Empirical likelihood for linear models in the presence of nuisance

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Stochastic Design Criteria in Linear Models

Stochastic Design Criteria in Linear Models AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework

More information

Miscellanea Multivariate sign-based high-dimensional tests for sphericity

Miscellanea Multivariate sign-based high-dimensional tests for sphericity Biometrika (2014), 101,1,pp. 229 236 doi: 10.1093/biomet/ast040 Printed in Great Britain Advance Access publication 13 November 2013 Miscellanea Multivariate sign-based high-dimensional tests for sphericity

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

Empirical Likelihood Tests for High-dimensional Data

Empirical Likelihood Tests for High-dimensional Data Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

High-dimensional two-sample tests under strongly spiked eigenvalue models

High-dimensional two-sample tests under strongly spiked eigenvalue models 1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data

More information

On testing the equality of mean vectors in high dimension

On testing the equality of mean vectors in high dimension ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.

More information

Extending the two-sample empirical likelihood method

Extending the two-sample empirical likelihood method Extending the two-sample empirical likelihood method Jānis Valeinis University of Latvia Edmunds Cers University of Latvia August 28, 2011 Abstract In this paper we establish the empirical likelihood method

More information

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j. Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That

More information

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error Journal of Multivariate Analysis 00 (009) 305 3 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Sphericity test in a GMANOVA MANOVA

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

Empirical Likelihood Inference for Two-Sample Problems

Empirical Likelihood Inference for Two-Sample Problems Empirical Likelihood Inference for Two-Sample Problems by Ying Yan A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Statistics

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

Statistica Sinica Preprint No: SS R1

Statistica Sinica Preprint No: SS R1 Statistica Sinica Preprint No: SS-2017-0059.R1 Title Peter Hall's Contribution to Empirical Likelihood Manuscript ID SS-2017-0059.R1 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0059

More information

Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood

Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood Advances in Decision Sciences Volume 2012, Article ID 973173, 8 pages doi:10.1155/2012/973173 Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood

More information

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices by Jack W. Silverstein* Department of Mathematics Box 8205 North Carolina State University Raleigh,

More information

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR Introduction a two sample problem Marčenko-Pastur distributions and one-sample problems Random Fisher matrices and two-sample problems Testing cova On corrections of classical multivariate tests for high-dimensional

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS

A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS Statistica Sinica 20 2010, 365-378 A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS Liang Peng Georgia Institute of Technology Abstract: Estimating tail dependence functions is important for applications

More information

Statistical Inference On the High-dimensional Gaussian Covarianc

Statistical Inference On the High-dimensional Gaussian Covarianc Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional

More information

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

Calibration of the Empirical Likelihood Method for a Vector Mean

Calibration of the Empirical Likelihood Method for a Vector Mean Calibration of the Empirical Likelihood Method for a Vector Mean Sarah C. Emerson and Art B. Owen Department of Statistics Stanford University Abstract The empirical likelihood method is a versatile approach

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

Inference on reliability in two-parameter exponential stress strength model

Inference on reliability in two-parameter exponential stress strength model Metrika DOI 10.1007/s00184-006-0074-7 Inference on reliability in two-parameter exponential stress strength model K. Krishnamoorthy Shubhabrata Mukherjee Huizhen Guo Received: 19 January 2005 Springer-Verlag

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

CONSTRUCTING NONPARAMETRIC LIKELIHOOD CONFIDENCE REGIONS WITH HIGH ORDER PRECISIONS

CONSTRUCTING NONPARAMETRIC LIKELIHOOD CONFIDENCE REGIONS WITH HIGH ORDER PRECISIONS Statistica Sinica 21 (2011), 1767-1783 doi:http://dx.doi.org/10.5705/ss.2009.117 CONSTRUCTING NONPARAMETRIC LIKELIHOOD CONFIDENCE REGIONS WITH HIGH ORDER PRECISIONS Xiao Li 1, Jiahua Chen 2, Yaohua Wu

More information

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho M. de Carvalho and F. J. Marques Abstract We discuss jackknife Euclidean likelihood-based inference methods, with a special focus on the

More information

TESTING SERIAL CORRELATION IN SEMIPARAMETRIC VARYING COEFFICIENT PARTIALLY LINEAR ERRORS-IN-VARIABLES MODEL

TESTING SERIAL CORRELATION IN SEMIPARAMETRIC VARYING COEFFICIENT PARTIALLY LINEAR ERRORS-IN-VARIABLES MODEL Jrl Syst Sci & Complexity (2009) 22: 483 494 TESTIG SERIAL CORRELATIO I SEMIPARAMETRIC VARYIG COEFFICIET PARTIALLY LIEAR ERRORS-I-VARIABLES MODEL Xuemei HU Feng LIU Zhizhong WAG Received: 19 September

More information

Carl N. Morris. University of Texas

Carl N. Morris. University of Texas EMPIRICAL BAYES: A FREQUENCY-BAYES COMPROMISE Carl N. Morris University of Texas Empirical Bayes research has expanded significantly since the ground-breaking paper (1956) of Herbert Robbins, and its province

More information

Tests for High Dimensional Generalized Linear Models

Tests for High Dimensional Generalized Linear Models MPRA Munich Personal RePEc Archive Tests for High Dimensional Generalized Linear Models Song Xi Chen and Bin Guo Peking Univeristy and Iowa State University, Peking Univeristy 2014 Online at http://mpra.ub.uni-muenchen.de/59816/

More information

Asymptotic inference for a nonstationary double ar(1) model

Asymptotic inference for a nonstationary double ar(1) model Asymptotic inference for a nonstationary double ar() model By SHIQING LING and DONG LI Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong maling@ust.hk malidong@ust.hk

More information

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku,

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

Interval Estimation for AR(1) and GARCH(1,1) Models

Interval Estimation for AR(1) and GARCH(1,1) Models for AR(1) and GARCH(1,1) Models School of Mathematics Georgia Institute of Technology 2010 This talk is based on the following papers: Ngai Hang Chan, Deyuan Li and (2010). Toward a Unified of Autoregressions.

More information

A Review on Empirical Likelihood Methods for Regression

A Review on Empirical Likelihood Methods for Regression Noname manuscript No. (will be inserted by the editor) A Review on Empirical Likelihood Methods for Regression Song Xi Chen Ingrid Van Keilegom Received: date / Accepted: date Abstract We provide a review

More information

Chapter 7, continued: MANOVA

Chapter 7, continued: MANOVA Chapter 7, continued: MANOVA The Multivariate Analysis of Variance (MANOVA) technique extends Hotelling T 2 test that compares two mean vectors to the setting in which there are m 2 groups. We wish to

More information

Empirical likelihood based modal regression

Empirical likelihood based modal regression Stat Papers (2015) 56:411 430 DOI 10.1007/s00362-014-0588-4 REGULAR ARTICLE Empirical likelihood based modal regression Weihua Zhao Riquan Zhang Yukun Liu Jicai Liu Received: 24 December 2012 / Revised:

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1. Problem 1 (21 points) An economist runs the regression y i = β 0 + x 1i β 1 + x 2i β 2 + x 3i β 3 + ε i (1) The results are summarized in the following table: Equation 1. Variable Coefficient Std. Error

More information

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8 Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case The largest eigenvalues of the sample covariance matrix 1 in the heavy-tail case Thomas Mikosch University of Copenhagen Joint work with Richard A. Davis (Columbia NY), Johannes Heiny (Aarhus University)

More information

What s New in Econometrics. Lecture 13

What s New in Econometrics. Lecture 13 What s New in Econometrics Lecture 13 Weak Instruments and Many Instruments Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Motivation 3. Weak Instruments 4. Many Weak) Instruments

More information

Goodness-of-fit tests for the cure rate in a mixture cure model

Goodness-of-fit tests for the cure rate in a mixture cure model Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,

More information

Multivariate Normal-Laplace Distribution and Processes

Multivariate Normal-Laplace Distribution and Processes CHAPTER 4 Multivariate Normal-Laplace Distribution and Processes The normal-laplace distribution, which results from the convolution of independent normal and Laplace random variables is introduced by

More information

Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions

Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Central Limit Theorems for Classical Likelihood Ratio Tests for High-Dimensional Normal Distributions Tiefeng Jiang 1 and Fan Yang 1, University of Minnesota Abstract For random samples of size n obtained

More information

Analysis of variance, multivariate (MANOVA)

Analysis of variance, multivariate (MANOVA) Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables

More information

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Fumiya Akashi Research Associate Department of Applied Mathematics Waseda University

More information

ASSESSING A VECTOR PARAMETER

ASSESSING A VECTOR PARAMETER SUMMARY ASSESSING A VECTOR PARAMETER By D.A.S. Fraser and N. Reid Department of Statistics, University of Toronto St. George Street, Toronto, Canada M5S 3G3 dfraser@utstat.toronto.edu Some key words. Ancillary;

More information

Applied Multivariate and Longitudinal Data Analysis

Applied Multivariate and Longitudinal Data Analysis Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) II Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 1 Compare Means from More Than Two

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Empirical likelihood-based methods for the difference of two trimmed means

Empirical likelihood-based methods for the difference of two trimmed means Empirical likelihood-based methods for the difference of two trimmed means 24.09.2012. Latvijas Universitate Contents 1 Introduction 2 Trimmed mean 3 Empirical likelihood 4 Empirical likelihood for the

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

An Information Criteria for Order-restricted Inference

An Information Criteria for Order-restricted Inference An Information Criteria for Order-restricted Inference Nan Lin a, Tianqing Liu 2,b, and Baoxue Zhang,2,b a Department of Mathematics, Washington University in Saint Louis, Saint Louis, MO 633, U.S.A. b

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Weighted empirical likelihood estimates and their robustness properties

Weighted empirical likelihood estimates and their robustness properties Computational Statistics & Data Analysis ( ) www.elsevier.com/locate/csda Weighted empirical likelihood estimates and their robustness properties N.L. Glenn a,, Yichuan Zhao b a Department of Statistics,

More information

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University SMAC, November 6, 2015 E. Christou, M. G. Akritas (PSU) SIQR

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Variance Function Estimation in Multivariate Nonparametric Regression

Variance Function Estimation in Multivariate Nonparametric Regression Variance Function Estimation in Multivariate Nonparametric Regression T. Tony Cai 1, Michael Levine Lie Wang 1 Abstract Variance function estimation in multivariate nonparametric regression is considered

More information

A Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report -

A Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report - A Box-Type Approximation for General Two-Sample Repeated Measures - Technical Report - Edgar Brunner and Marius Placzek University of Göttingen, Germany 3. August 0 . Statistical Model and Hypotheses Throughout

More information

The Hybrid Likelihood: Combining Parametric and Empirical Likelihoods

The Hybrid Likelihood: Combining Parametric and Empirical Likelihoods 1/25 The Hybrid Likelihood: Combining Parametric and Empirical Likelihoods Nils Lid Hjort (with Ingrid Van Keilegom and Ian McKeague) Department of Mathematics, University of Oslo Building Bridges (at

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School

More information

Notes on Random Vectors and Multivariate Normal

Notes on Random Vectors and Multivariate Normal MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution

More information

Likelihood and p-value functions in the composite likelihood context

Likelihood and p-value functions in the composite likelihood context Likelihood and p-value functions in the composite likelihood context D.A.S. Fraser and N. Reid Department of Statistical Sciences University of Toronto November 19, 2016 Abstract The need for combining

More information

Applied Multivariate and Longitudinal Data Analysis

Applied Multivariate and Longitudinal Data Analysis Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference

More information

Statistical inference on Lévy processes

Statistical inference on Lévy processes Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline

More information

Improving linear quantile regression for

Improving linear quantile regression for Improving linear quantile regression for replicated data arxiv:1901.0369v1 [stat.ap] 16 Jan 2019 Kaushik Jana 1 and Debasis Sengupta 2 1 Imperial College London, UK 2 Indian Statistical Institute, Kolkata,

More information

Introduction to the Mathematical and Statistical Foundations of Econometrics Herman J. Bierens Pennsylvania State University

Introduction to the Mathematical and Statistical Foundations of Econometrics Herman J. Bierens Pennsylvania State University Introduction to the Mathematical and Statistical Foundations of Econometrics 1 Herman J. Bierens Pennsylvania State University November 13, 2003 Revised: March 15, 2004 2 Contents Preface Chapter 1: Probability

More information

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model 1. Introduction Varying-coefficient partially linear model (Zhang, Lee, and Song, 2002; Xia, Zhang, and Tong, 2004;

More information

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX Submitted to the Annals of Statistics HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX By Shurong Zheng, Zhao Chen, Hengjian Cui and Runze Li Northeast Normal University, Fudan

More information

Likelihood ratio testing for zero variance components in linear mixed models

Likelihood ratio testing for zero variance components in linear mixed models Likelihood ratio testing for zero variance components in linear mixed models Sonja Greven 1,3, Ciprian Crainiceanu 2, Annette Peters 3 and Helmut Küchenhoff 1 1 Department of Statistics, LMU Munich University,

More information

Marginal tests with sliced average variance estimation

Marginal tests with sliced average variance estimation Biometrika Advance Access published February 28, 2007 Biometrika (2007), pp. 1 12 2007 Biometrika Trust Printed in Great Britain doi:10.1093/biomet/asm021 Marginal tests with sliced average variance estimation

More information

Inferences about a Mean Vector

Inferences about a Mean Vector Inferences about a Mean Vector Edps/Soc 584, Psych 594 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees, University

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Empirical Likelihood Methods for Pretest-Posttest Studies

Empirical Likelihood Methods for Pretest-Posttest Studies Empirical Likelihood Methods for Pretest-Posttest Studies by Min Chen A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in

More information

The Delta Method and Applications

The Delta Method and Applications Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order

More information

Nuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation

Nuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation Biometrika Advance Access published October 24, 202 Biometrika (202), pp. 8 C 202 Biometrika rust Printed in Great Britain doi: 0.093/biomet/ass056 Nuisance parameter elimination for proportional likelihood

More information

Using R in Undergraduate and Graduate Probability and Mathematical Statistics Courses*

Using R in Undergraduate and Graduate Probability and Mathematical Statistics Courses* Using R in Undergraduate and Graduate Probability and Mathematical Statistics Courses* Amy G. Froelich Michael D. Larsen Iowa State University *The work presented in this talk was partially supported by

More information

Lecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices

Lecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices Lecture 3: Simple Linear Regression in Matrix Format To move beyond simple regression we need to use matrix algebra We ll start by re-expressing simple linear regression in matrix form Linear algebra is

More information