Restricted Likelihood Ratio Lack of Fit Tests using Mixed Spline Models


Gerda Claeskens
Texas A&M University, College Station, USA and Université catholique de Louvain, Louvain-la-Neuve, Belgium

Summary. Penalised regression spline models afford a simple mixed model representation in which variance components control the degree of non-linearity in the smooth function estimates. This motivates the study of lack of fit tests based on the restricted maximum likelihood ratio statistic, which tests whether variance components are zero versus the alternative of taking on positive values. For this one-sided testing problem a further complication is that the variance component belongs to the boundary of the parameter space under the null hypothesis. Conditions are obtained on the design of the regression spline models under which asymptotic distribution theory applies, and finite sample approximations to the asymptotic distribution are provided. Test statistics are studied for simple as well as multiple regression models.

Keywords: Boundary hypothesis, Lack of fit, Likelihood ratio test, Mixed model, Regression spline model, Restricted maximum likelihood.

Address for correspondence: Gerda Claeskens, Institute of Statistics, Université catholique de Louvain, Voie du Roman Pays 20, B-1348 Louvain-la-Neuve, Belgium. claeskens@stat.ucl.ac.be

1 Introduction

We construct test statistics based on splines for testing a parametric null model versus a nonparametric alternative. Adaptive tests are obtained using penalised spline regression modelling (Eilers and Marx, 1996; Aerts, Claeskens and Wand, 2002), where a relatively small set of spline basis functions is used, but with a penalty for each spline coefficient. As an example, suppose we observe data (y_i, x_i), i = 1,...,n, for which we wish to test whether the mean of Y_i, with corresponding covariate x_i, is a particular parametric function of the covariate, for example a polynomial of degree q:

H_0: E(Y) = β_0 + β_1 x + ... + β_q x^q.    (1)

A nonparametric lack of fit test does not assume any particular form of the alternative model; that is, the response variable Y_i (i = 1,...,n) depends on the covariate x_i through a semiparametric model involving an arbitrary univariate function g,

Y_i = β_0 + β_1 x_i + ... + β_q x_i^q + g(x_i) + ε_i.    (2)

The function g is estimated by a regression spline estimator with smoothing parameter λ. In particular, we fit a semiparametric model

Y_i = β_0 + β_1 x_i + ... + β_q x_i^q + Σ_{k=1}^{K_n} u_k ψ_k(x_i) + ε_i,

where ψ_k, k = 1,...,K_n, are spline basis functions, for example the truncated polynomial basis consisting of piecewise continuous qth degree polynomials with knots at values κ_k. The estimator can also be formulated in terms of other bases such as the Demmler–Reinsch basis (Nychka and Cummins, 1996) and the B-spline basis (Eilers and Marx, 1996). In regression spline modelling, the knots are specified by the user. See for example Ruppert and Carroll (2000), who suggest choosing the number of knots based on quantiles of the data. Penalised spline estimation restricts the influence of the knots on the fitted model, for example by bounding the squared L_2 norm of the spline coefficients.
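The truncated polynomial basis described above is straightforward to construct. The following is a minimal sketch in Python/NumPy (the function names are mine; the example uses the linear case q = 1, ψ_k(x) = max(x − κ_k, 0)):

```python
import numpy as np

def truncated_poly_basis(x, knots, q=1):
    """Truncated polynomial spline basis: psi_k(x) = max(x - kappa_k, 0)^q."""
    x = np.asarray(x, dtype=float)
    return np.maximum(x[:, None] - np.asarray(knots, dtype=float)[None, :], 0.0) ** q

def design_matrices(x, knots, q=1):
    """Fixed-effect design X = [1, x, ..., x^q] and random-effect design Z."""
    X = np.vander(np.asarray(x, dtype=float), N=q + 1, increasing=True)
    Z = truncated_poly_basis(x, knots, q)
    return X, Z

# Small example: 5 equispaced points, 3 interior knots.
x = np.linspace(0.0, 1.0, 5)
X, Z = design_matrices(x, knots=[0.25, 0.5, 0.75], q=1)
```

Each column of Z corresponds to one spline coefficient u_k; placing the knots at sample quantiles, as suggested by Ruppert and Carroll (2000), only changes the `knots` argument.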
In a mixed regression spline model we explicitly assume that the spline coefficients are independent and identically distributed random variables; see Brumback, Ruppert and Wand (1999). Mixed regression spline models have proven useful for a variety of estimation settings, including hazard rate estimation (Cai, Hyndman and Wand, 2002) and geoadditive modelling (Kammann and Wand, 2003). The construction of such a mixed effects model, where the β coefficients remain fixed, has as a particular advantage that the hypothesis test above reduces to testing whether the single variance component of the random effects, say σ_u^2, is equal to zero; the alternative

hypothesis states that σ_u^2 > 0. The variance component plays the role of the smoothing parameter. The introduction of random spline coefficients dramatically reduces the dimensionality of the testing problem. In a purely non-random model, a nonparametric test in the same setting requires testing whether all K_n spline coefficients are equal to zero. From a practical point of view, another advantage is that a test on the variance component is easily obtained using standard statistical software. The focus of this paper is on the use of the restricted likelihood ratio statistic for testing in a random spline model. Using kernel smoothing methods, Staniswalis and Severini (1991) likewise assess the fit of regression models using likelihood-based diagnostics. Orthogonal series based tests of Eubank and Hart (1992), and extensions thereof by Aerts, Claeskens and Hart (1999, 2000) using both likelihood ratio and score statistics, form another class of omnibus nonparametric tests, explicitly using the smoothing parameter, in this case the number of terms in the series, as a test statistic. Cantoni and Hastie (2002) construct a likelihood ratio-type statistic in a setting of mixed smoothing splines, thereby reformulating the hypotheses in terms of degrees of freedom. Guo (2002) uses similar tests in smoothing spline analysis of variance. Earlier work on the usage of smoothing splines for testing includes Cox, Koh, Wahba and Yandell (1988) and Cox and Koh (1989). The testing problem is nonstandard for several reasons. First of all, we wish to use the restricted likelihood ratio test for a one-sided testing problem. In our application the parameter of interest is a variance component which has a meaningful interpretation only if its value is non-negative, implying that the parameter space for σ_u^2 is the half-line of non-negative numbers [0, ∞), with the value of σ_u^2 under the null hypothesis at its boundary.
Chernoff (1954) gave a rigorous treatment of the use of (full) likelihood ratio statistics for one-sided testing, under the assumptions that data are independent and identically distributed, and that the true value belongs to the interior of the parameter space. Applications of likelihood ratio testing for one-sided alternatives in linear models can be found in Gouriéroux, Holly and Monfort (1982). Self and Liang (1987) obtain the asymptotic distribution of the (full) likelihood ratio statistic for a situation of independent and identically distributed data when the true parameter value under the null hypothesis is on the boundary of the parameter space. This result has further been applied by Stram and Lee (1994) for testing in a longitudinal mixed effects model. It is important to note that the results of Self and Liang (1987) cannot be applied to regression spline models, where the regression setting causes the distribution of the response variable Y_i to depend on the value of the covariates x_i, and in addition the random effects induce dependence between the response values. Extending the results of Geyer (1994), Vu and Zhou (1997) obtain the asymptotic distribution of likelihood ratio statistics dealing with this type of regression data. For an overview of existing methods, see Sen and Silvapulle (2002).

A novel aspect of this paper is that we particularly focus attention on the use of the profile restricted likelihood ratio test (RLRT), as opposed to likelihood ratio testing, and this within a setting of nonparametric, mixed, regression spline fitting with a growing number of knots. In Section 3 we obtain design conditions under which the results of Vu and Zhou (1997) are applicable for testing with restricted maximum likelihood statistics, and we obtain the asymptotic null distribution of the accompanying test statistic, where the probability of estimating a zero variance component plays a key role. To address the issue of the asymptotic mixing proportions in the limiting distribution, Section 4 explains how to obtain exact finite sample mixing proportions, involving the probability of obtaining zero valued variance components. We first restrict the discussion to tests involving only a single smoothing parameter; an extension to lack of fit testing in models with several smoothing parameters is presented in Section 5. Results of a simulation study are presented in Section 6.

2 Regression splines as mixed effect models

The design matrices of fixed and random effects are, for sample size equal to n, given by

X = ( 1 x_1 ... x_1^q ; ... ; 1 x_n ... x_n^q ),   Z = ( ψ_1(x_1) ... ψ_{K_n}(x_1) ; ... ; ψ_1(x_n) ... ψ_{K_n}(x_n) ),

and a penalised least squares criterion leads to the estimators

(β̂, û) = argmin_{β,u} { ||Y − Xβ − Zu||^2 + λ^{−1} u^t u },

which can also be obtained via ridge regression (Eilers and Marx, 1996; Aerts, Claeskens and Wand, 2002). The penalisation constant in the ridge regression framework plays the role of the smoothing parameter. The introduction of random effects on the u_j's results in equivalence of best linear unbiased predictors and estimators obtained via the penalised least squares criterion. More specifically, we consider the following linear mixed model, in matrix notation:

Y = Xβ + Zu + ε,

where u and ε are independent, Var(u) = σ_u^2 R and Var(ε) = σ_ε^2 I_n.
Note that Var(Y) = σ_ε^2 I_n + σ_u^2 ZRZ^t = σ_ε^2 V, with V = I_n + λ ZRZ^t and λ = σ_u^2/σ_ε^2. For truncated polynomial basis functions R = I_{K_n}. For B-splines, the matrix R is defined via the transformation which expresses B-splines in terms of truncated polynomials. We denote the number of β components by q_1 = q + 1 and the number of u components, which is related to the number of knots, by K_n. Testing hypothesis (1) versus the two-sided alternative that the conditional mean response has any different structure is, in the mixed model representation, equivalent to testing

H_0: σ_u^2 = 0 versus H_a: σ_u^2 > 0.    (3)

For the test to be genuinely nonparametric, we let the number of spline basis functions K_n, that is, the number of knots, grow with the sample size. In multiple regression, more than one random effect can be included. For random effects u_1,...,u_a, the linear mixed model is, in vector notation, Y = Xβ + Z_1 u_1 + ... + Z_a u_a + ε; see also Section 5.

2.1 Profile restricted maximum likelihood

We assume that the error terms ε follow a normal distribution, independent of the normal random effects u. It is anticipated that similar results can be obtained for different error distributions. The log likelihood of the data is

L_ml(β, λ, σ_ε^2) = −(1/2){ n log(2πσ_ε^2) + log|V| + σ_ε^{−2} (Y − Xβ)^t V^{−1} (Y − Xβ) }.

Restricted maximum likelihood estimation of the variance components explicitly takes the degrees of freedom associated with estimation of the fixed effects into account. The restricted likelihood function is the likelihood of a linear combination A^t Y, where the n × (n − q_1) matrix A has full column rank and is constructed such that A^t X = 0. The resulting function does not depend on the precise choice of A; it is

L_reml(λ, σ_ε^2) = −(1/2){ (n − q_1) log(2πσ_ε^2) + log|A^t V A| + σ_ε^{−2} Y^t A(A^t V A)^{−1} A^t Y }.    (4)

Define the projection matrix P(λ) = I_n − X(X^t V^{−1} X)^{−1} X^t V^{−1}. Also, let P_λ = V^{−1} P(λ); as shown in Searle, Casella and McCulloch (1992), P_λ = A(A^t V A)^{−1} A^t. Since our main focus is on testing σ_u^2 = 0, or equivalently λ = 0, an important ingredient of the test statistic is the profile log REML function L(λ), obtained by substituting σ_ε^2 in the restricted likelihood function by its REML estimator σ̂_ε^2 = Y^t P(λ)^t V^{−1} P(λ)Y/(n − q_1). Specifically,

L(λ) = −(1/2) log|V| − (1/2) log|X^t V^{−1} X| − ((n − q_1)/2) log{ Y^t P(λ)^t V^{−1} P(λ)Y }.    (5)

Next we state an eigenvalue representation of the profile log REML function, which may also be of interest in general linear mixed models.
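The profile criterion (5) can be evaluated directly from the matrix formula. The sketch below is mine (variable names and the small simulated example are assumptions, not the paper's code); it returns L(λ) up to a constant not depending on λ, with R = I as holds for the truncated polynomial basis:

```python
import numpy as np

def profile_reml(lam, Y, X, Z, R=None):
    """Profile restricted log likelihood L(lambda) of equation (5),
    up to an additive constant not depending on lambda."""
    n, q1 = X.shape
    K = Z.shape[1]
    R = np.eye(K) if R is None else R            # R = I_K for truncated polynomials
    V = np.eye(n) + lam * Z @ R @ Z.T            # V = I + lambda * Z R Z'
    Vinv = np.linalg.inv(V)
    P = np.eye(n) - X @ np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv)  # P(lambda)
    quad = Y @ P.T @ Vinv @ P @ Y                # Y' P' V^{-1} P Y
    return (-0.5 * np.linalg.slogdet(V)[1]
            - 0.5 * np.linalg.slogdet(X.T @ Vinv @ X)[1]
            - 0.5 * (n - q1) * np.log(quad))

# Illustrative data: linear mean, truncated linear basis with 4 knots.
rng = np.random.default_rng(0)
x = rng.random(40)
X = np.column_stack([np.ones(40), x])
Z = np.maximum(x[:, None] - np.array([0.2, 0.4, 0.6, 0.8])[None, :], 0.0)
Y = 1.0 + x + 0.1 * rng.standard_normal(40)
L0 = profile_reml(0.0, Y, X, Z)
```

At λ = 0 the matrix V reduces to I_n and P(0) is the ordinary least squares projection, which is the null value used throughout the paper.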
Lemma 1. In the normal linear mixed effects model Y = Xβ + Z_1 u_1 + ... + Z_a u_a + ε, the profile restricted log likelihood function evaluated at (λ_1,...,λ_a), where λ_k = σ_{u_k}^2/σ_ε^2, with corresponding true values λ_k^0, and where σ_ε^2 is the residual variance, has the eigenvalue representation

L(λ_1,...,λ_a) = −(1/2) Σ_{i=1}^{n−q_1} log[1 + ξ_{n,i}(λ_1,...,λ_a)] − ((n − q_1)/2) log{ σ_ε^2 Σ_{i=1}^{n−q_1} [1 + ξ_{n,i}(λ_1^0,...,λ_a^0)]/[1 + ξ_{n,i}(λ_1,...,λ_a)] U_i^2 },

where ξ_{n,k}(λ_1,...,λ_a), k = 1,...,K_n, are the non-zero eigenvalues of the matrix Σ_{k=1}^a λ_k W^t Z_k R_k Z_k^t W, and the matrix W is such that W^t W = I_{n−q_1}, WW^t = P(0) and W^t V W = diag[1 + ξ_{n,j}(λ_1,...,λ_a)]. A proof is given in the appendix.

A few special cases are worth mentioning. When there is only one random effect, in the sense that λ_1 = ... = λ_a = λ, or trivially when a = 1, the eigenvalues depend in a multiplicative way on λ, that is, ξ_{n,i}(λ) = λ ξ̃_{n,i}, where the ξ̃_{n,i} are the non-zero eigenvalues of the matrix Σ_{k=1}^a W^t Z_k R_k Z_k^t W. Explicit dependence of the eigenvalues on the smoothing parameters λ_k is also arrived at when the matrices Q_k = W^t Z_k R_k Z_k^t W, k = 1,...,a, can be simultaneously diagonalised, a necessary and sufficient condition for which is pairwise commutativity of the matrices Q_k, more explicitly, when for k ≠ l:

P(0)Z_k R_k Z_k^t P(0)Z_l R_l Z_l^t P(0) = P(0)Z_l R_l Z_l^t P(0)Z_k R_k Z_k^t P(0).

In this situation ξ_{n,i}(λ_1,...,λ_a) = λ_1 ξ̃_{n,1,i} + ... + λ_a ξ̃_{n,a,i}, where the ξ̃_{n,k,i} are the non-zero eigenvalues of A_k = Z_k^t P(0)Z_k R_k.

3 Tests in simple regression models

For testing hypothesis (3), which equivalently can be formulated in terms of the parameter λ, we study the restricted profile likelihood ratio statistic

R_n = 2{ L(λ̂) − L(0) },    (6)

where λ̂ is the maximiser of L(λ) in (5). As explained earlier, an additional hurdle in obtaining the distribution of likelihood ratio type test statistics is the one-sided alternative. Vu and Zhou (1997) extend the work of Chernoff (1954) to parameter values on a boundary and non-identically distributed data. In the next theorem we obtain conditions on the random spline models under which those results apply. For ease of interpretation, the conditions are formulated using both the eigenvalue representation and the original matrix notation.

(A1) The number of knots K_n, which equals the number of columns of the matrix Z, grows to infinity at a slower rate than n: K_n → ∞, K_n = o(n).
(A2) The non-zero eigenvalues ξ_{n,1},...,ξ_{n,K_n} of Z^t P(0)ZR = Z^t(I_n − X(X^t X)^{−1} X^t)ZR satisfy the following two conditions. As n → ∞,

Σ_{k=1}^{K_n} ξ_{n,k}^2 = tr{(Z^t P(0)ZR)^2} → ∞  and  Σ_{k=1}^{K_n} ξ_{n,k}^4 / (Σ_{k=1}^{K_n} ξ_{n,k}^2)^2 = tr{(Z^t P(0)ZR)^4} / (tr{(Z^t P(0)ZR)^2})^2 → 0.

The assumption of a growing number of spline basis functions is natural in nonparametric lack of fit testing. In order to construct a test which is powerful against a large class of smooth

models, the number of knots K_n should, first of all, be sample size dependent, and secondly, grow with the sample size to provide enough flexibility to capture non-trivial structural features. Condition (A2) has a more technical origin. The first assumption guarantees that the smallest eigenvalue of the Fisher information matrix tends to infinity, which guarantees that this matrix is positive definite for n large. The second condition is necessary and sufficient for the standardised score statistic, whose major component is the quadratic form Y^t P(0)^t ZRZ^t P(0)Y, to converge to a standard normal random variable (de Jong, 1987). Under (A1), a sufficient condition for (A2) is that the non-zero eigenvalues of Z^t P(0)ZR are O(n^ζ) with ζ ≥ 0. In Section 6 we investigate these conditions for two types of spline basis functions.

Theorem 1. Under conditions (A1) and (A2), the restricted likelihood ratio statistic R_n for testing H_0 in (1) versus the nonparametric alternative (2) has asymptotically, as the sample size n tends to infinity, a distribution which is an equal mixture of a point mass at zero and a chi-squared distribution with one degree of freedom, abbreviated as (1/2)χ_0^2 + (1/2)χ_1^2.

In the proof of this result (see the appendix for details), we verify that for the random spline model the conditions stated above are sufficient to apply the results of Vu and Zhou (1997). The first set of conditions in Vu and Zhou (1997) requires Chernoff regularity, and extends the original results of Chernoff (1954) to this more complicated setting. Chernoff regularity ensures the existence of an asymptotic distribution of a consistent sequence of global maximisers of the likelihood. Under the stronger assumption of Clarke regularity, Geyer (1994) proves, in the setting of M-estimation, that a root-n consistent sequence of local maximisers has that same asymptotic distribution.
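The two quantities in condition (A2) can be computed numerically for a given design; a sketch (function name mine, using R = I as for truncated polynomials) is:

```python
import numpy as np

def a2_quantities(X, Z, R=None):
    """Return tr{(Z'P(0)ZR)^2} and the ratio tr{(Z'P(0)ZR)^4}/(tr{(Z'P(0)ZR)^2})^2
    appearing in condition (A2)."""
    n = X.shape[0]
    K = Z.shape[1]
    R = np.eye(K) if R is None else R
    P0 = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # OLS projection P(0)
    M = Z.T @ P0 @ Z @ R    # K x K matrix with the same non-zero eigenvalues
    M2 = M @ M
    tr2 = np.trace(M2)
    return tr2, np.trace(M2 @ M2) / tr2 ** 2

# Example: truncated linear basis, n = 200, 20 equispaced knots.
n = 200
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
Z = np.maximum(x[:, None] - np.linspace(0.05, 0.95, 20)[None, :], 0.0)
tr2, ratio = a2_quantities(X, Z)
```

The first returned quantity should grow and the second should shrink as n and K_n grow; Section 6 reports that for the truncated polynomial basis the second condition fails, so in practice this check is worth running for the basis at hand.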
Since a convex set is Clarke regular at each of its points, convexity is a simple argument to prove Clarke regularity, which in its turn implies Chernoff regularity. Further details are given in the appendix.

In addition to a test on the value of λ, it might be of interest to construct a confidence interval for λ. Cressie and Lahiri (1993) and Richardson and Welsh (1994) obtain, under some regularity conditions, the asymptotic normality of REML estimators of variance components, without explicitly taking the possibility of zero variance components into account. Based on the normality result a Wald-type confidence interval can be constructed. Using the asymptotic distribution obtained in Theorem 1, a restricted likelihood based confidence interval for λ, the variance ratio σ_u^2/σ_ε^2, can be obtained directly. Denote by r_α the (1 − α) quantile of the asymptotic distribution of R_n(λ), that is, of the mixture distribution (1/2)χ_0^2 + (1/2)χ_1^2. A likelihood based confidence interval C_α is the set of all values λ for which R_n(λ) is below the critical value r_α,

C_α = { λ ∈ Ω : R_n(λ) ≤ r_α }.

Note that, by definition of L(λ), R_n(λ) does not depend on σ_ε^2. Advantages of this likelihood based method include that the confidence interval only covers values belonging to the
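The critical value r_α of the (1/2)χ_0^2 + (1/2)χ_1^2 mixture has a closed form: since P(R_n > r) = (1/2)P(χ_1^2 > r) and χ_1^2 is the square of a standard normal, r_α = z_{1−α}^2 for α < 1/2. A stdlib-only sketch (function name mine):

```python
from statistics import NormalDist

def mixture_quantile(alpha):
    """(1 - alpha) quantile of the 0.5*chi2_0 + 0.5*chi2_1 mixture (alpha < 0.5).
    Solves 0.5 * P(chi2_1 > r) = alpha, i.e. r = z_{1-alpha}^2."""
    return NormalDist().inv_cdf(1.0 - alpha) ** 2
```

For α = 0.05 this gives r_α ≈ 2.71, smaller than the two-sided χ_1^2 critical value 3.84, illustrating the remark below that the point mass at zero shrinks the critical values.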

parameter space, hence giving no negative values for variances, and the use of the (profile restricted) likelihood function avoids the construction of a symmetric interval around the estimated parameter value, an aspect which is automatically imposed by Wald-type intervals. The point mass at zero contributes to critical values which are smaller than the standard ones obtained from a full chi-squared distribution with one degree of freedom.

4 Calculation of the mixing proportions

In practice the asymptotic approximation of a 50:50 mixture between a point mass at zero and a χ_1^2 distributed random variable can be poor, and Pinheiro and Bates (2000) suggest using a 65:35 mixture of the same random variables. A maximum of the restricted likelihood occurs at the boundary value λ = 0 when the right derivative lim_{λ→0+} L′(λ) ≤ 0. The notation λ → 0+ denotes convergence of a positive sequence, approaching the value 0 only from the right. Hence, the probability of obtaining an extremum at the boundary equals P(lim_{λ→0+} L′(λ) ≤ 0). Using the algorithm of Davies (1980), the finite sample probability of a local extremum at the boundary can be calculated; this is the subject of Section 4.1. In Section 4.2 an approximation is derived using the Wilson–Hilferty (1931) transformation. The finite sample mixing probability is an alternative to the asymptotic value 1/2.

4.1 Exact finite sample calculation

Starting from the profile restricted likelihood function (5), using some results on matrix algebra, the probability of a local extremum at zero is equal to

P(lim_{λ→0+} L′(λ) ≤ 0) = P{ Y^t P(0)^t ZRZ^t P(0)Y / (Y^t P(0)^t P(0)Y) ≤ tr(ZRZ^t P(0))/(n − q_1) }.

This requires the calculation of the quantile function of a ratio of two quadratic forms in normal random variables. An eigenvalue representation of both numerator and denominator leads to the following approximation for p_Q = P(lim_{λ→0+} L′(λ) ≤ 0).
Approximation 1 (Finite sample mixing proportions). Let U_1,...,U_{n−q_1} be independent and identically distributed standard normal random variables, and let ξ_{n,k}, k = 1,...,K_n, be the non-zero eigenvalues of Z^t P(0)ZR. If (A1) and (A2) hold, the distribution of the restricted likelihood ratio test for testing hypothesis (3) can be approximated by p_Q χ_0^2 + (1 − p_Q) χ_1^2, where p_Q is the exact probability of obtaining a local maximum at the boundary,

p_Q = P{ Σ_{k=1}^{K_n} ξ_{n,k} U_k^2 / Σ_{k=1}^{n−q_1} U_k^2 ≤ Σ_{k=1}^{K_n} ξ_{n,k} / (n − q_1) }.
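Davies's AS 155 algorithm evaluates this probability exactly; a simple Monte Carlo stand-in, which only needs the eigenvalues, is sketched below (function name and defaults are mine):

```python
import numpy as np

def p_q_monte_carlo(xi, n_minus_q1, reps=200_000, seed=1):
    """Monte Carlo version of the boundary probability p_Q:
    P( sum_k xi_k U_k^2 / sum_{k<=n-q1} U_k^2 <= sum_k xi_k / (n - q1) ).
    Davies's exact algorithm (AS 155) would replace the sampling step."""
    rng = np.random.default_rng(seed)
    xi = np.asarray(xi, dtype=float)
    U2 = rng.standard_normal((reps, n_minus_q1)) ** 2   # chi-squared(1) draws
    num = U2[:, :len(xi)] @ xi                          # sum_k xi_k U_k^2
    den = U2.sum(axis=1)                                # sum over all n - q1 terms
    return float(np.mean(num / den <= xi.sum() / n_minus_q1))
```

For instance, with a single eigenvalue ξ = 1 and n − q_1 = 50 the ratio is a Beta(1/2, 49/2) variable compared against its mean, giving a probability near 0.68 rather than the asymptotic 1/2, which illustrates why finite sample mixing proportions matter.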

From the above expression it is immediately obtained that p_Q converges to 1/2 as n → ∞. To appreciate this, let

Q̃ = Σ_{k=1}^{K_n} ξ_{n,k} U_k^2 − { Σ_{k=1}^{K_n} ξ_{n,k} / (n − q_1) } Σ_{k=1}^{n−q_1} U_k^2.

The expected value E(Q̃) equals zero, for each n, and under the assumptions on the eigenvalues both quadratic forms converge to a Gaussian limit distribution. This implies that p_Q converges to the probability that a zero-mean Gaussian random variable is smaller than its mean, which is equal to 1/2. Davies's (1980) algorithm, available as Applied Statistics algorithm AS 155, can be used to obtain the exact mixing probability.

4.2 Wilson–Hilferty approximation

We start with the eigenvalue representation of the restricted likelihood. Under the null hypothesis the true value of λ equals zero, therefore

L(λ) = −(1/2) Σ_{i=1}^{n−q_1} log(1 + λξ_{n,i}) − ((n − q_1)/2) log( σ_ε^2 Σ_{i=1}^{n−q_1} U_i^2/(1 + λξ_{n,i}) ),

where ξ_{n,k}, k = 1,...,K_n, are the eigenvalues of Z^t P(0)ZR. Taking the derivative of L(λ) with respect to λ and using the fact that the REML estimator σ̂_ε^2 is consistent for σ_ε^2 (see also the proof of Theorem 1),

L′(λ) = −(1/2) Σ_{i=1}^{K_n} ξ_{n,i}/(1 + λξ_{n,i}) + (1/2) Σ_{i=1}^{K_n} ξ_{n,i}/(1 + λξ_{n,i})^2 U_i^2 + o_P(1).

Hence, p_Q is approximated by

p_n = P( Σ_{i=1}^{K_n} ξ_{n,i} U_i^2 < Σ_{i=1}^{K_n} ξ_{n,i} ) = P( Q < E(Q) ),

where the quadratic form Q = Σ_{i=1}^{K_n} ξ_{n,i} U_i^2 converges only slowly to a normal random variable. To accelerate the convergence, Mathai and Provost (1992, Sec. 4.6) present a normalising transformation based on the Wilson–Hilferty Gaussian approximation for the chi-squared distribution. For r = 1, 2, 3, let θ_r = Σ_{k=1}^{K_n} ξ_{n,k}^r. Then for h_0 = 1 − 2θ_1θ_3/(3θ_2^2), the Wilson–Hilferty approximation states that when n tends to infinity,

(Q/θ_1)^{h_0} ≈ N( µ_Q = 1 + θ_2 h_0 (h_0 − 1)/θ_1^2 ; σ_Q^2 = 2θ_2 h_0^2/θ_1^2 ).

This implies that

P( lim_{λ→0+} L′(λ) ≤ 0 ) ≈ p̂_Q = Φ( (1 − µ_Q)/σ_Q ),

where Φ denotes the standard normal cumulative distribution function.
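The Wilson–Hilferty mixing probability only needs the first three power sums of the eigenvalues, so it is a few lines of stdlib code (function name mine, formulas as displayed above):

```python
import math
from statistics import NormalDist

def wilson_hilferty_pq(xi):
    """Approximate p_Q = P(Q < E(Q)) for Q = sum_i xi_i U_i^2 via the
    Wilson-Hilferty power transformation (Mathai and Provost, 1992, Sec. 4.6)."""
    t1 = sum(xi)                          # theta_1
    t2 = sum(v ** 2 for v in xi)          # theta_2
    t3 = sum(v ** 3 for v in xi)          # theta_3
    h0 = 1.0 - 2.0 * t1 * t3 / (3.0 * t2 ** 2)
    mu = 1.0 + t2 * h0 * (h0 - 1.0) / t1 ** 2
    sd = math.sqrt(2.0 * t2 * h0 ** 2) / t1
    return NormalDist().cdf((1.0 - mu) / sd)
```

As a sanity check, with a single unit eigenvalue Q is χ_1^2 and P(Q < 1) ≈ 0.683, which the approximation reproduces to two decimals; with many equal eigenvalues the value approaches the asymptotic 1/2.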

Approximation 2 (Wilson–Hilferty). If conditions (A1) and (A2) hold, the distribution of the restricted likelihood ratio statistic R_n can be approximated by p̂_Q χ_0^2 + (1 − p̂_Q) χ_1^2.

The estimated mixing probability p̂_Q depends solely on the eigenvalues of the quadratic form Q, which can be obtained directly from the data. Terrell (2003) connected the Wilson–Hilferty transformation to a local saddlepoint approximation; see also Kuonen (1999). This method requires a positive definite quadratic form, which here is not guaranteed since there are only K_n non-zero eigenvalues while the quadratic form contains all n − q_1 random variables U_k.

5 Tests in multiple regression

For a vector of covariates x = (x_1,...,x_a)^t we wish to test a parametric linear null hypothesis of the form

H_0: µ(x_1,...,x_a) = xβ    (7)

against the nonparametric alternative that µ is a smooth function of (x_1,...,x_a). In this situation several testing strategies are possible, presented here according to the number of different smoothing parameters.

5.1 Single smoothing parameter tests

If there is only one smoothing parameter involved, the asymptotic distribution of the test statistic is obtained by Theorem 1. Semiparametric models are built using an a-dimensional basis, that is, Y = Xβ + Zu + ε, where the i-th row of the n × K_n dimensional matrix Z consists of (ψ_1(x_{1i},...,x_{ai}),...,ψ_{K_n}(x_{1i},...,x_{ai})) and u = (u_1,...,u_{K_n})^t ~ N(0, σ_u^2 R). Examples of multivariate spline basis functions include tensor products of univariate basis functions; in two-dimensional problems thin-plate splines are often used for estimation of spatial structures, and for three dimensions radial splines can be used to fit response surfaces. If conditions (A1) and (A2) hold, the asymptotic distribution of the profile REML statistic for testing hypothesis (7) is given in Theorem 1. With high dimensional data the problem of needing a large sample size remains. Additive spline modelling provides an interesting alternative.
Often it is reasonable to assume additive effects, and then an alternative model is built as

Y = X_1 β_1 + ... + X_a β_a + Z_1 u_1 + ... + Z_{a_1} u_{a_1} + ε,

with a_1 possibly smaller than a, to allow some of the effects to be modelled parametrically only. Under the assumption that all additive components possess the same degree of smoothness, a single smoothing parameter λ can be used. In the linear mixed model this translates to all random vectors u_j being independent with the same variance structure σ_u^2 R. Following from Lemma 1, the relevant eigenvalues ξ_{n,i} are the non-zero eigenvalues of the matrix Z_1^t P(0)Z_1 R_1 + ... + Z_{a_1}^t P(0)Z_{a_1} R_{a_1}. The equal mixture of a point mass at zero and a chi-squared distribution with one degree of freedom holds asymptotically as in Theorem 1, where we now take X = (X_1,...,X_a). For both tests, finite sample corrections as in Section 4 can be used in practice.

5.2 Two smoothing parameter tests

In this section the situation is considered where the linear mixed model contains two variance components in addition to the residual variance. Without loss of generality we can write the mixed regression spline model under the alternative as

Y = X_1 β_1 + X_2 β_2 + Z_1 u_1 + Z_2 u_2 + ε,

where the design matrices Z_1 and Z_2 contain the spline basis functions for covariates x_1 and x_2 respectively, and u_j ~ N(0, σ_{u_j}^2 R_j) for j = 1, 2. The number of spline basis functions K_{nj} is allowed to be different for the different additive components. We assume that u_1, u_2, ε are independent. The natural parameter space for (σ_{u_1}^2, σ_{u_2}^2, σ_ε^2) equals [0, ∞)^2 × (0, ∞). If the null hypothesis constrains only one of the variance components to zero, the asymptotic distribution of R_n is again as given in Theorem 1. Testing whether both variance components are at the boundary of the parameter space yields a test with a different asymptotic distribution.
The hypotheses are

H_0: σ_{u_1}^2 = σ_{u_2}^2 = 0 versus H_a: σ_{u_1}^2 > 0 or σ_{u_2}^2 > 0.    (8)

The function L_reml(λ_1, λ_2, σ_ε^2) has a matrix representation as in (4), with

σ_ε^2 V = σ_ε^2 I_n + σ_{u_1}^2 Z_1 R_1 Z_1^t + σ_{u_2}^2 Z_2 R_2 Z_2^t,

where V = V(λ_1, λ_2) = I_n + λ_1 Z_1 R_1 Z_1^t + λ_2 Z_2 R_2 Z_2^t and λ_j = σ_{u_j}^2/σ_ε^2. Similarly, P(λ_1, λ_2) also depends on both smoothing parameters. Define the 2 × 2 matrix G_n with entries

G_{n,kl} = tr{ (Z_k^t P(0,0)Z_k R_k)(Z_l^t P(0,0)Z_l R_l) }.

Further, let r_n = cos^{−1}( G_{n,12}/√(G_{n,11} G_{n,22}) )/(2π) and s_n = G_{n,12}/√(G_{n,11} G_{n,22} − G_{n,12}^2). Hence, r_n = cos^{−1}( s_n/√(1 + s_n^2) )/(2π).

Theorem 2. Assume conditions (A1) and (A2) hold for both A_1 and A_2. Let (N_1, N_2) ~ N(0, I_2), and denote s = lim_{n→∞} s_n, r = lim_{n→∞} r_n. The restricted likelihood ratio statistic

R_n for testing the hypothesis in (8) has asymptotically, as the sample size n tends to infinity, the following mixture distribution:

R_n →_d  0 with probability 1/2 − r;  N_1^2 with probability 1/4;  (N_1 − sN_2)^2/(1 + s^2) with probability 1/4;  N_1^2 + N_2^2 with probability r.

A proof is given in the appendix. The method of proof differs from that in earlier work on one-sided testing problems. In particular, the explicit expression of the observed log likelihood as a sum of individual contributions, as directly used by Vu and Zhou (1997), is not immediately available for REML estimation. Instead, the obtained results rely on the eigenvalue representation stated in Lemma 1. This provides an alternative method of proof for (full) likelihood ratio testing. In the limiting distribution above, the finite sample versions r_n and s_n can be calculated exactly, and do not depend on any unknown parameters. A special case is s_n = 0, which implies that r_n = 1/4; it then follows that

R_n →_d (1/4)χ_0^2 + (1/2)χ_1^2 + (1/4)χ_2^2.

The situation s_n = 0 occurs if and only if S_1 Z_1^t P(0,0)Z_2 S_2^t = 0. Here, the matrices S_j are such that the covariance matrix can be written as R_j = S_j^t S_j. For example, s_n = 0 when the spline basis functions are orthogonal, that is Z_1^t Z_2 = 0, and at least one of the spline design matrices is orthogonal with respect to the parametric design matrix: Z_j^t X = 0 for at least one j.

5.3 More than two smoothing parameters equal to zero

If more than two variance components are set to zero under H_0, for orthogonal sets of spline basis functions such that Z_j^t Z_k = 0 (j ≠ k), the asymptotic distribution of the likelihood ratio statistic is a mixture of χ_0^2,...,χ_a^2 distributed random variables, where a denotes the number of variance components set to zero.
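The constants s_n and r_n of Theorem 2 are computable directly from G_n, and the limiting mixture can be sampled to obtain critical values. A sketch (function name and sampling scheme are mine):

```python
import math
import numpy as np

def theorem2_mixture_sample(G, reps=100_000, seed=2):
    """Compute s_n and r_n from the 2x2 matrix G_n and draw from the limiting
    mixture of Theorem 2: 0 w.p. 1/2 - r, N1^2 w.p. 1/4,
    (N1 - s*N2)^2/(1 + s^2) w.p. 1/4, N1^2 + N2^2 w.p. r."""
    g11, g12, g22 = G[0][0], G[0][1], G[1][1]
    s = g12 / math.sqrt(g11 * g22 - g12 ** 2)
    r = math.acos(s / math.sqrt(1.0 + s ** 2)) / (2.0 * math.pi)
    rng = np.random.default_rng(seed)
    n1, n2 = rng.standard_normal(reps), rng.standard_normal(reps)
    u = rng.random(reps)
    out = np.zeros(reps)                     # component 1: point mass at zero
    c1, c2, c3 = 0.5 - r, 0.75 - r, 1.0 - r  # cumulative mixing probabilities
    m1 = (u >= c1) & (u < c2)
    m2 = (u >= c2) & (u < c3)
    m3 = u >= c3
    out[m1] = n1[m1] ** 2
    out[m2] = (n1[m2] - s * n2[m2]) ** 2 / (1.0 + s ** 2)
    out[m3] = n1[m3] ** 2 + n2[m3] ** 2
    return out, s, r
```

With G_n diagonal (the orthogonal case), s_n = 0 and r_n = cos^{−1}(0)/(2π) = 1/4, recovering the (1/4)χ_0^2 + (1/2)χ_1^2 + (1/4)χ_2^2 special case noted above.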
The limiting mixing proportions are determined by the lower Cholesky square root of the Fisher information matrix, which is defined as the expected value of minus the second derivatives of the log REML function with respect to the parameters in the model. When exact expressions for the mixing proportions are unknown or difficult to obtain, the use of bounds on the P-value provides an alternative. Silvapulle (1994) and Kodde and Palm (1986) explain that the P-value can be bounded as

(1/2){ P(χ_0^2 > R̂_n) + P(χ_1^2 > R̂_n) } ≤ P-value ≤ (1/2){ P(χ_{a−1}^2 > R̂_n) + P(χ_a^2 > R̂_n) },
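These bounds are cheap to evaluate; the sketch below implements them with a stdlib-only chi-squared survival function (function names are mine; a library routine such as a statistics package's chi-squared tail would do the same job):

```python
import math

def chi2_sf(x, df):
    """P(chi2_df > x) via a series for the regularised lower incomplete gamma.
    df = 0 denotes the point mass at zero, so its tail probability is 0 for x > 0."""
    if df == 0:
        return 0.0 if x > 0 else 1.0
    a, z = 0.5 * df, 0.5 * x
    if z <= 0:
        return 1.0
    total, term = 1.0, 1.0
    for n in range(1, 500):
        term *= z / (a + n)
        total += term
        if term < 1e-16:
            break
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    return 1.0 - lower

def kodde_palm_bounds(r_obs, a):
    """Lower and upper bounds on the P-value of the observed statistic R_n
    when a variance components are set to zero (Kodde and Palm, 1986)."""
    lo = 0.5 * (chi2_sf(r_obs, 0) + chi2_sf(r_obs, 1))
    hi = 0.5 * (chi2_sf(r_obs, a - 1) + chi2_sf(r_obs, a))
    return lo, hi
```

For a = 1 the two bounds coincide with the (1/2)χ_0^2 + (1/2)χ_1^2 P-value, consistent with Theorem 1.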

where R̂_n denotes the observed value of R_n. The fewer variance components equal to zero, the more accurate the bounds are. In most realistic applications tests would only be performed for a relatively small number of variance components.

An alternative idea for performing tests on the smoothing parameters in multiple regression models is to look at maximal deviations in one direction only. More specifically, for each j = 1,...,a, test the parametric null hypothesis H_0: E(Y) = µ(x_1,...,x_a) against each of the following alternative hypotheses:

H_{a,j}: E(Y) = µ(x_1,...,x_a) + Σ_{k=1}^{K_n} u_k ψ_k(x_j), (j = 1,...,a),

using the REML test statistic R_{nj}. The final test statistic is R_n = max_{1≤j≤a} R_{nj}. The level of the test can, for example, be controlled by application of Bonferroni's inequality. Alternatively, a bootstrap method can be used where data are generated under the parametric null model. Instead of focussing on one-dimensional departures only, low order interaction splines can provide a more powerful testing approach. These tests are constructed by considering alternative models of the type

H_{a,ij}: E(Y) = µ(x_1,...,x_a) + Σ_{k=1}^{K_n} u_k ψ_k(x_i, x_j), (i ≠ j),

and taking the maximum over pairs of indices (i, j). A similar max test statistic using orthogonal series estimators in multivariate regression models has been constructed by Aerts, Claeskens and Hart (2000).

6 Numerical results

One of the main advantages of performing a lack of fit test using mixed regression splines is that the test statistic can be computed easily using statistical software packages which provide tools for fitting linear mixed models, such as proc MIXED in SAS or the function lme() in S-Plus and R. We first address the issue of calculating the exact mixing proportions in the asymptotic distribution of a test for linearity in the one-smoothing parameter case.
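Outside of a mixed-model package, the statistic R_n = 2{L(λ̂) − L(0)} can be computed directly from the profile criterion (5) by maximising over a grid of λ values. A self-contained sketch (function name, grid, and simulated example are my assumptions; a production analysis would use lme() or proc MIXED as noted above):

```python
import numpy as np

def rlrt_statistic(Y, X, Z, grid=None):
    """Restricted likelihood ratio statistic R_n = 2{L(lam_hat) - L(0)},
    maximising the profile REML criterion (5) over a grid of lambda values
    (R = I, as for the truncated polynomial basis)."""
    def L(lam):
        n, q1 = X.shape
        V = np.eye(n) + lam * Z @ Z.T
        Vinv = np.linalg.inv(V)
        P = np.eye(n) - X @ np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv)
        quad = Y @ P.T @ Vinv @ P @ Y
        return (-0.5 * np.linalg.slogdet(V)[1]
                - 0.5 * np.linalg.slogdet(X.T @ Vinv @ X)[1]
                - 0.5 * (n - q1) * np.log(quad))
    grid = np.concatenate(([0.0], np.logspace(-4, 4, 81))) if grid is None else grid
    vals = np.array([L(l) for l in grid])
    return 2.0 * (vals.max() - L(0.0))

# Linear-null data versus a clearly nonlinear alternative, linear splines, 5 knots.
rng = np.random.default_rng(3)
n = 60
x = np.sort(rng.random(n))
X = np.column_stack([np.ones(n), x])
Z = np.maximum(x[:, None] - np.linspace(0.1, 0.9, 5)[None, :], 0.0)
Y_null = 1.0 + x + 0.3 * rng.standard_normal(n)
Y_alt = 1.0 + x + np.sin(4.0 * np.pi * x) + 0.1 * rng.standard_normal(n)
```

Because the grid contains λ = 0, R_n is non-negative by construction; a pronounced departure from linearity inflates the statistic, which is then compared with the mixture critical values discussed in Sections 3 and 4.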
If the probability of estimating a zero value for λ is higher than the expected value, this leads to oversmoothing, and hence to a lower power of the test. Figure 1 presents boxplots summarising the results for the calculation of the probability of a zero smoothing parameter in 1000 simulated sets of data generated from a normal linear regression model Y = 1 + x + ε, with the error variance set to 0.1 and x generated from a uniform distribution on (0, 1). Four sample sizes are chosen: for n = 100 and 200 we take 25 knots at sample quantiles, while for n = 300 and 500, 45 knots are used. For the calculation we

Figure 1: Boxplots showing the exact finite sample probability of a zero smoothing parameter for simulated data under the linear null hypothesis, for sample sizes n = 100, 200 (using 25 knots), 300 and 500 (using 45 knots), using a truncated polynomial basis (left panel) and a B-spline basis (right panel).

used either a truncated linear basis, where ψ_j(x) = max{x − κ_j, 0} for knots κ_j (left panel), or a B-spline basis (results are in the right panel). For truncated polynomial basis functions, even for larger sample sizes the probabilities do not tend to the value 0.5. Calculated values stay above 0.67, because assumption (A2) on the eigenvalues of the matrix Z^t P(0)ZR does not hold for this type of basis function, while (A1) is satisfied. For the B-spline functions both conditions are fulfilled and convergence to 0.5 holds. Note that especially when using the truncated polynomial basis there are many large outliers in the calculated probabilities.

A simulation study is performed to investigate the power properties of the test using R_n. Data are generated from a model Y_i = f(x_i) + σε_i with independent standard normal error terms ε_i. We test for linearity, H_0: f(x) = β_0 + β_1 x, and obtain simulated rejection probabilities under a sequence of alternative models with higher frequency terms:

Y_i = 1 + x_i + cos(πjx_i) + σε_i, for j = 2,...,9.

All tests are performed at the 0.05 level, using linear splines and sample size equal to 100. Simulated power curves, based on 2000 simulated datasets, are depicted in Figure 2. Curves are shown using the asymptotic critical value as obtained from the mixture distribution (see Theorem 1), as well as from a simulated distribution under the null hypothesis (based on 5000 simulated datasets) and from two approximated distributions where, for each dataset, the mixing proportion is calculated using the exact finite sample calculations according to the algorithm of Davies and the approximation method of Wilson and Hilferty.
In the figure we clearly distinguish two groups of tests: those using a truncated polynomial basis, which have high power at low frequency alternatives but lose power quickly for high frequencies, and the tests using B-spline basis functions, which have reasonably high power for all frequencies considered in the simulation study. The approximations result in slightly larger power of the tests than when using the

Figure 2: Simulated power curves using statistic R_n for a test of linearity, using asymptotic critical values (solid line), values from a simulated distribution (dot-dashed line), as well as values from approximated distributions (dashed lines, Davies; dotted lines, Wilson–Hilferty). The group of tests with the rapid decrease uses truncated polynomial basis functions; the others use a B-spline basis.

asymptotic distribution. For B-splines the asymptotic approximation agrees closely with the results from the simulated distribution and corrections are not necessary. For the truncated polynomial basis, simulated power values of the approximation methods follow closely those using the simulated distribution; for this basis function the corrections do seem necessary. As a comparison, the order selection test of Aerts, Claeskens and Hart (1999) has been calculated. This test (not shown) has power behaviour similar to the truncated polynomial based tests. Using the asymptotic distribution, the simulated probability of a type I error for the test using a truncated polynomial basis deviates from the nominal level, whereas approximations 1 and 2 give values close to 0.05. A B-spline basis leads to corresponding values close to the nominal level under all three distributions, again showing that for these basis functions finite sample corrections are not necessary. The simulated level of the order selection test is similar to that of the truncated polynomial based tests.
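Under assumptions (A1) and (A2) the asymptotic null distribution is the equal-weight mixture 0.5 χ²₀ + 0.5 χ²₁, so the asymptotic critical value solves P(R_n > c) = α. A small sketch (not from the paper; the function name is illustrative) computes this value:

```python
from scipy.stats import chi2

def rlrt_critical_value(alpha, p0=0.5):
    """Critical value c solving P(R > c) = alpha when
    R ~ p0 * chi2_0 + (1 - p0) * chi2_1 (point mass at zero plus chi-square)."""
    if alpha >= 1 - p0:
        return 0.0                      # the point mass already covers the tail
    # for c > 0:  P(R > c) = (1 - p0) * P(chi2_1 > c)
    return chi2.ppf(1 - alpha / (1 - p0), df=1)

c = rlrt_critical_value(0.05)
print(round(c, 4))   # 2.7055
```

Passing the finite sample mixing proportion for `p0` (for example, a value computed by Davies' algorithm or the Wilson–Hilferty approximation) gives the corrected critical values used for the dashed and dotted curves in Figure 2.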

7 Discussion

Difficulties with boundary values can often be circumvented by embedding the parameter space. Theorem 8.3 of Harville (1997, p. 433) ensures the existence of a strictly positive value c such that V_λ is symmetric positive definite for λ > −c. This guarantees that it is possible to reparametrise the likelihood and embed the current parameter space in a bigger space, namely Ω = (−c, ∞), for which the value of λ under the null hypothesis belongs to the interior region. Feng and McCulloch (1992) use the idea of extending the parameter space for the construction of confidence intervals for a parameter possibly on the boundary, in the setting of independent and identically distributed data. While the embedding takes away the technical difficulties with directional derivatives, the complications of performing one-sided likelihood ratio based tests remain. As a consequence, the asymptotic distribution of the test statistic does not change under the embedding approach. There is a difference in the distribution of the test statistic when a two-sided alternative hypothesis is considered: a limiting chi-squared distribution results, but the power of such a two-sided test will be smaller than the power of one-sided testing. While the focus of this paper is on the restricted likelihood ratio test, other test statistics can be considered. Silvapulle and Silvapulle (1995) construct a score statistic for testing one-sided hypotheses which only requires estimation under the null hypothesis, and show that its asymptotic distribution is equivalent to that of a likelihood ratio statistic. For a two-sided score test in variance component models see Lin (1997). Full likelihood ratio statistics can also be used, and have the advantage of allowing the possibility of restricting fixed effects under the null hypothesis.
A motivation for the use of restricted likelihood methods is that the degrees of freedom associated with the estimation of the fixed effects in the model are more effectively taken into account in the estimation of the smoothing parameter, which is of importance in the construction of the test. Another motivation comes from the finite sample calculation of the mixing proportions, which are probabilities of a variance component being estimated as zero. Crainiceanu, Ruppert and Vogelsang (2002) calculate that for maximum likelihood estimation these probabilities can be much larger than for restricted maximum likelihood, which is not advantageous in a hypothesis testing setting.

Acknowledgements

Part of this research was performed while the author was with the Department of Statistics at Texas A&M University. The author is thankful to M. Wand, M. Aerts, P. Janssen and D. Ruppert for interesting discussions related to the topic of this paper, and to the editor and reviewers of this paper for helpful comments. This research was partly supported by NSF grant DMS and by the Belgian Federal Science Policy Office.

8 Appendix

Proof of Lemma 1. Since P(0) is a symmetric and idempotent matrix, and by symmetry of V_λ, there exists a matrix W such that W^t W = I_{n−q}, W W^t = P(0) and W^t V_λ W = diag(ξ). Since

W^t V_λ W = I_{n−q} + Σ_{k=1}^a λ_k W^t Z_k R_k Z_k^t W,

this motivates us to define ξ_{n,j} such that ξ_j = 1 + ξ_{n,j}(λ_1, ..., λ_a), where the ξ_{n,j} are the (non-zero) eigenvalues of Σ_{k=1}^a λ_k W^t Z_k R_k Z_k^t W. The number of non-zero eigenvalues is equal to the rank of the matrix, which in this case is at most K_n = Σ_{k=1}^a K_{nk}. For regular spline bases where there is no linear dependence between the different basis functions, the rank equals K_n. Denote λ = (λ_1, ..., λ_a). From Patterson and Thompson (1971),

Y^t P(λ) V_λ^{−1} P(λ) Y = Y^t W { I + diag(ξ_{n,i}(λ)) }^{−1} W^t Y.

With λ⁰ denoting the true value of λ, W^t Y ∼ N(0, σ_ε² (I_{n−q} + Σ_{k=1}^a λ_k⁰ W^t Z_k R_k Z_k^t W)). As a consequence, the quadratic form

Y^t P(λ) V_λ^{−1} P(λ) Y = σ_ε² Σ_{i=1}^{n−q} { (1 + ξ_{n,i}(λ⁰)) / (1 + ξ_{n,i}(λ)) } U_i²,

with U_1, ..., U_{n−q} independent and identically distributed N(0, 1) random variables. It can be shown (Kuo, 1999) that, with C denoting a finite constant,

log|V_λ| + log|X^t V_λ^{−1} X| = Σ_{i=1}^{n−q} log(1 + ξ_{n,i}(λ)) + C.

This gives us an expression for the REML likelihood in terms of eigenvalues. We have, up to some constant not depending on any parameter value,

L_reml(λ, σ_ε²) = −(1/2) Σ_{i=1}^{n−q} log(1 + ξ_{n,i}(λ)) − ((n−q)/2) log( (σ_ε²/(n−q)) Σ_{i=1}^{n−q} { (1 + ξ_{n,i}(λ⁰)) / (1 + ξ_{n,i}(λ)) } U_i² ).

Proof of Theorem 1. The proof consists in showing that Theorem 2.2 of Vu and Zhou (1997) is applicable to the profile REML statistic. We verify their conditions. There exists a neighbourhood N of the true value λ = 0 where the profile REML function L(λ) is continuous, and first and second derivatives exist and are continuous on N ∩ Ω. Derivatives at λ = 0 are understood to be right-derivatives only. With this one-dimensional problem, Chernoff regularity is satisfied. To verify the remaining conditions, we use the eigenvalue representation from Lemma 1. Under the null hypothesis, the true value of the parameter equals zero.
First and second derivatives with respect to λ are obtained directly from this eigenvalue form:

L′(λ) = −(1/2) Σ_{i=1}^{K_n} ξ_{n,i}/(1 + λξ_{n,i}) + ((n−q)/2) { Σ_{i=1}^{K_n} ξ_{n,i} U_i² / (1 + λξ_{n,i})² } / { Σ_{i=1}^{n−q} U_i² / (1 + λξ_{n,i}) }.

The second derivative is

L″(λ) = (1/2) Σ_{i=1}^{K_n} ξ_{n,i}² / (1 + λξ_{n,i})² − (n−q) { Σ_{i=1}^{K_n} ξ_{n,i}² U_i² / (1 + λξ_{n,i})³ } / { Σ_{i=1}^{n−q} U_i² / (1 + λξ_{n,i}) } + ((n−q)/2) { Σ_{i=1}^{K_n} ξ_{n,i} U_i² / (1 + λξ_{n,i})² }² / { Σ_{i=1}^{n−q} U_i² / (1 + λξ_{n,i}) }².

The above expressions can be simplified after showing that (n−q)^{−1} Σ_{i=1}^{n−q} U_i²/(1 + λξ_{n,i}) converges to one in probability. For this, Chebyshev's inequality is used (Kuo, 1999):

P( | (n−q)^{−1} Σ_{i=1}^{n−q} U_i²/(1 + λξ_{ni}) − (n−q)^{−1} Σ_{i=1}^{n−q} U_i² | > a ) ≤ a^{−1} (n−q)^{−1} Σ_{i=1}^{n−q} λξ_{ni}/(1 + λξ_{ni}) = O(K_n/n).

Since K_n = o(n), the denominator is o_P(1) away from the average of χ_1² variables, which by the law of large numbers converges in probability to one. This is equivalent to showing that the REML estimator is consistent for the true residual variance σ_ε². Denote now by L̄′ and L̄″ the simplified versions of the derivatives above, that is,

L̄′(λ) = −(1/2) Σ_{i=1}^{K_n} ξ_{ni}/(1 + λξ_{ni}) + (1/2) Σ_{i=1}^{K_n} ξ_{ni} U_i² / (1 + λξ_{ni})².

The second derivative is

L̄″(λ) = (1/2) Σ_{i=1}^{K_n} ξ_{ni}²/(1 + λξ_{ni})² − Σ_{i=1}^{K_n} ξ_{ni}² U_i² / (1 + λξ_{ni})³ + (2(n−q))^{−1} { Σ_{i=1}^{K_n} ξ_{ni} U_i² / (1 + λξ_{ni})² }².

At the null value, E{L̄′_reml(0)} = 0, E{L̄′_reml(0)²} = (1/2) Σ_{i=1}^{K_n} ξ_{ni}², and

E{ −L̄″_reml(0) } = (1/2) Σ_{i=1}^{K_n} ξ_{ni}² − (2(n−q))^{−1} ( Σ_{i=1}^{K_n} ξ_{ni} )² − (n−q)^{−1} Σ_{i=1}^{K_n} ξ_{ni}²,

which tends to infinity as n grows. Note that for these profile likelihood based mixed models, E{L̄′_reml(0)²} ≠ E{−L̄″_reml(0)}. We now study the variance of the normal random variable to be used in the projections on the tangent cones, as in Theorem 2.2 of Vu and Zhou (1997). The quantity V is the limit expression of V_n = E{L̄′_reml(0)²}/E{−L̄″_reml(0)}. Expressed in terms of eigenvalues,

V_n^{−1} = 1 − 3(n−q)^{−1} − (n−q)^{−1} Σ_{j≠i} ξ_{ni}ξ_{nj} / Σ_{i=1}^{K_n} ξ_{ni}² = 1 + O(K_n/n).

Since K_n = o(n), we may take V = 1 in the limit. Condition (A2) is sufficient to imply normality of the score value. The result now follows from Vu and Zhou (1997).

Proof of Proposition 2. The profile score vector has two components, given by

∂L/∂λ_j = −(1/2) tr(Z_j Z_j^t V_λ^{−1} P(λ)) + ((n−q)/2) { Y^t P(λ)^t V_λ^{−1} Z_j Z_j^t V_λ^{−1} P(λ) Y } / { Y^t P(λ)^t V_λ^{−1} P(λ) Y }, j = 1, 2,

and for j, k = 1, 2 the partial second derivatives are obtained as

∂²L/∂λ_j∂λ_k = (1/2) tr{ Z_j Z_j^t P(λ)^t V_λ^{−1} Z_k Z_k^t P(λ)^t V_λ^{−1} } + ((n−q)/2) { − [Y^t P(λ)^t V_λ^{−1} Z_j Z_j^t V_λ^{−1} P(λ) Z_k Z_k^t V_λ^{−1} P(λ) Y] / [Y^t P(λ)^t V_λ^{−1} P(λ) Y] − [Y^t P(λ)^t V_λ^{−1} Z_k Z_k^t V_λ^{−1} P(λ) Z_j Z_j^t V_λ^{−1} P(λ) Y] / [Y^t P(λ)^t V_λ^{−1} P(λ) Y] + [Y^t P(λ)^t V_λ^{−1} Z_k Z_k^t V_λ^{−1} P(λ) Y][Y^t P(λ)^t V_λ^{−1} Z_j Z_j^t V_λ^{−1} P(λ) Y] / [Y^t P(λ)^t V_λ^{−1} P(λ) Y]² }.

Since Y^t P(λ)^t V_λ^{−1} P(λ) Y/(n−q) is a consistent estimator of the residual variance σ_ε², we define the following quantities, which are o_P(1) away from the derivatives given above:

L̄_j = −(1/2) tr(Z_j Z_j^t V_λ^{−1} P(λ)) + (2σ_ε²)^{−1} Y^t P(λ)^t V_λ^{−1} Z_j Z_j^t V_λ^{−1} P(λ) Y;

L̄_{jk} = (1/2) tr{ Z_j Z_j^t P(λ)^t V_λ^{−1} Z_k Z_k^t P(λ)^t V_λ^{−1} } − (2σ_ε²)^{−1} Y^t P(λ)^t V_λ^{−1} { Z_j Z_j^t V_λ^{−1} P(λ) Z_k Z_k^t + Z_k Z_k^t V_λ^{−1} P(λ) Z_j Z_j^t } V_λ^{−1} P(λ) Y + (2σ_ε⁴(n−q))^{−1} { Y^t P(λ)^t V_λ^{−1} Z_k Z_k^t V_λ^{−1} P(λ) Y }{ Y^t P(λ)^t V_λ^{−1} Z_j Z_j^t V_λ^{−1} P(λ) Y }.

Using properties of expected values of quadratic forms, at the null value E{L̄_j(0, 0)} = 0, and

G_{n,jk} = E{ −L̄_{jk}(0, 0) } = (1/2 − (n−q)^{−1}) tr{ Z_j Z_j^t P(0) Z_k Z_k^t P(0) } − (2(n−q))^{−1} tr{ Z_j Z_j^t P(0) } tr{ Z_k Z_k^t P(0) },

while

D_{n,jk} = E{ L̄_j(0, 0) L̄_k(0, 0) } = (1/2) tr{ (Z_j^t P(0) Z_k)(Z_j^t P(0) Z_k)^t }.

The off-diagonal entries of D_n are zero if and only if Z_1 and Z_2 are orthogonal in the sense that Z_j^t P(0) Z_k = 0 for j ≠ k. Define the matrices A_k = Z_k Z_k^t P(0). The Cauchy–Schwarz inequality gives that {tr(A_1 A_2)}² < tr(A_1²) tr(A_2²), where the strict inequality holds because we may rule out the situation where A_1 is a multiple of A_2. This shows that D_n is positive definite. Since we are working with profile restricted likelihood ratio tests, the matrices G_n and D_n are generally not expected to be identical. From the expressions above,

G_n = D_n − (n−q)^{−1} { 2 D_n + (1/2) C_n } = D_n ( 1 + O{ min(K_{1n}, K_{2n})/n } ),   (9)

where the 2 × 2 matrix C_n has (j, k)th entry given by tr(A_j) tr(A_k). Hence, for n large enough, also G_n is positive definite. The real symmetric matrix D_n has two eigenvalues

ζ_1(D_n) < ζ_2(D_n), given by (1/2){ tr(D_n) ± ( tr(D_n)² − 4|D_n| )^{1/2} }. By assumption (A1) on the eigenvalues of A_1 and A_2, tr(D_n) → ∞, and since ζ_1(D_n)ζ_2(D_n) = |D_n| is of the order tr(D_n)², also ζ_1(D_n) → ∞ as n → ∞. By (9), ‖ G_n^{−1/2} D_n (G_n^{−1/2})^t − I_2 ‖ = o(1), where G_n^{1/2} represents the lower (left) Cholesky square root of G_n, and ‖·‖ is the sum of the absolute values of the matrix entries. This shows that the asymptotic covariance matrix of the bivariate random variable to be used in the projection described below is equal to the identity matrix. By results of de Jong (1987) and condition (A2) on the eigenvalues of both matrices A_1 and A_2, the score vector has a limiting normal distribution. For the set Ω = [0, ∞) × [0, ∞), define the cone

C_{Ω,n} = { (λ̃_1, λ̃_2)^t = G_n^{t/2} (λ_1, λ_2)^t : (λ_1, λ_2)^t ∈ Ω }.

Inserting the Cholesky decomposition matrix G_n^{t/2} and letting n tend to infinity defines the limiting cone C_Ω = { (λ_1, λ_2)^t : λ_1 − sλ_2 ≥ 0, λ_2 ≥ 0 }. Under the results obtained above, the asymptotic distribution of the REML ratio statistic R_n is now given by the distance of (N_1, N_2) ∼ N(0, I_2) to the set C_Ω. This divides the plane into four regions; the orthogonal projection on C_Ω of values in the region defined by { (λ_1, λ_2)^t : λ_1 − sλ_2 < 0, sλ_1 + λ_2 ≥ 0 } results in the component (sN_1 + N_2)²/(1 + s²). In this region, the component of the likelihood ratio type statistic R_n is given by

N_1² + N_2² − (sN_1 + N_2)²/(1 + s²) = (N_1 − sN_2)²/(1 + s²).

Since N_1 and N_2 are uncorrelated, this component follows a χ_1² distribution. The other components are obtained in a similar way.

References

Aerts, M., Claeskens, G. and Hart, J.D. (1999). Testing the fit of a parametric function. J. Am. Statist. Assoc., 94.
Aerts, M., Claeskens, G. and Hart, J.D. (2000). Testing lack of fit in multiple regression. Biometrika, 87.
Aerts, M., Claeskens, G. and Wand, M.P. (2002). Some theory for penalized spline additive models. J. Statist. Plann. Inference, 103.
Brumback, B., Ruppert, D. and Wand, M.P. (1999). Comment on "Variable selection and function estimation in additive nonparametric regression using a data-based prior" by Shively, T.S., Kohn, R. and Wood, S. J. Am. Statist. Assoc., 94.
Cai, T., Hyndman, R.J. and Wand, M.P. (2002). Mixed model-based hazard estimation. J. Comp. Graph. Statist., 11.
Cantoni, E. and Hastie, T. (2002). Degrees-of-freedom tests for smoothing splines. Biometrika, 89.

Chernoff, H. (1954). On the distribution of the likelihood ratio. Ann. Math. Statist., 25.
Cox, D. and Koh, E. (1989). A smoothing spline based test of model adequacy in polynomial regression. Ann. Inst. Statist. Math., 41.
Cox, D., Koh, E., Wahba, G. and Yandell, B.S. (1988). Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models. Ann. Statist., 16, 113-119.
Crainiceanu, C.M., Ruppert, D. and Vogelsang, T.J. (2002). Probability of estimating zero variance of random effects in linear mixed models. Manuscript.
Cressie, N. and Lahiri, S.N. (1993). The asymptotic distribution of REML estimators. J. Multiv. Anal., 45.
Davies, R.B. (1980). Algorithm AS 155: The distribution of a linear combination of χ² random variables. Appl. Statist., 29.
de Jong, P. (1987). A central limit theorem for generalized quadratic forms. Prob. Th. Rel. Fields, 75.
Eilers, P.H.C. and Marx, B.D. (1996). Flexible smoothing with B-splines and penalties (with discussion). Statist. Sci., 11, 89-121.
Eubank, R.L. and Hart, J.D. (1992). Testing goodness-of-fit in regression via order selection criteria. Ann. Statist., 20.
Feng, Z. and McCulloch, C.E. (1992). Statistical inference using maximum likelihood estimation and the generalized likelihood ratio when the true parameter is on the boundary of the parameter space. Statist. Prob. Lett., 13.
Geyer, C.J. (1994). On the asymptotics of constrained M-estimation. Ann. Statist., 22.
Gouriéroux, C., Holly, A. and Monfort, A. (1982). Likelihood ratio test, Wald test, and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica, 50.
Guo, W. (2002). Inference in smoothing spline analysis of variance. J. R. Statist. Soc. B, 64.
Harville, D.A. (1997). Matrix Algebra from a Statistician's Perspective. Springer-Verlag, New York.
Kammann, E.E. and Wand, M.P. (2003). Geoadditive models. Appl. Statist., 52, 1-18.
Kodde, D.A. and Palm, F.C. (1986). Wald criteria for jointly testing equality and inequality restrictions. Econometrica, 54.
Kuo, B.-S. (1999). Asymptotics of ML estimator for regression models with a stochastic trend component. Econom. Th., 15.

Kuonen, D. (1999). Saddlepoint approximations for distributions of quadratic forms in normal variables. Biometrika, 86.
Lin, X. (1997). Variance component testing in generalised linear models with random effects. Biometrika, 84.
Mathai, A.M. and Provost, S.B. (1992). Quadratic Forms in Random Variables. M. Dekker, New York.
Nychka, D. and Cummins, D. (1996). Comment on "Flexible smoothing with B-splines and penalties" by P.H.C. Eilers and B.D. Marx. Statist. Sci., 11.
Pinheiro, J.C. and Bates, D.M. (2000). Mixed-Effects Models in S and S-PLUS. Springer-Verlag, New York.
Richardson, A.M. and Welsh, A.H. (1994). Asymptotic properties of restricted maximum likelihood (REML) estimates for hierarchical mixed linear models. Austr. J. Statist., 36.
Ruppert, D. and Carroll, R.J. (2000). Spatially-adaptive penalties for spline fitting. Austr. N.-Zeal. J. Statist., 42.
Searle, S.R., Casella, G. and McCulloch, C.E. (1992). Variance Components. John Wiley & Sons, New York.
Self, S.G. and Liang, K.-Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Statist. Assoc., 82.
Sen, P.K. and Silvapulle, M.J. (2002). An appraisal of some aspects of statistical inference under inequality constraints. J. Statist. Plann. Inference, 107.
Silvapulle, M.J. (1994). Likelihood ratio test of one-sided hypothesis in some generalized linear model. Biometrics, 50.
Silvapulle, M.J. and Silvapulle, P. (1995). A score test against one-sided alternatives. J. Am. Statist. Assoc., 90.
Staniswalis, J.G. and Severini, T.A. (1991). Diagnostics for assessing regression models. J. Am. Statist. Assoc., 86.
Stram, D.O. and Lee, J.W. (1994). Variance component testing in the longitudinal mixed effects model. Biometrics, 50.
Terrell, G.R. (2003). The Wilson-Hilferty transformation is locally saddlepoint. Biometrika, 90.
Vu, H.T.V. and Zhou, S. (1997). Generalization of likelihood ratio tests under nonstandard conditions. Ann. Statist., 25.
Wilson, E.B. and Hilferty, M.M. (1931). The distribution of chi-square. Proc. Nat. Acad. Sci., 17.


RLRsim: Testing for Random Effects or Nonparametric Regression Functions in Additive Mixed Models RLRsim: Testing for Random Effects or Nonparametric Regression Functions in Additive Mixed Models Fabian Scheipl 1 joint work with Sonja Greven 1,2 and Helmut Küchenhoff 1 1 Department of Statistics, LMU

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Binary choice 3.3 Maximum likelihood estimation

Binary choice 3.3 Maximum likelihood estimation Binary choice 3.3 Maximum likelihood estimation Michel Bierlaire Output of the estimation We explain here the various outputs from the maximum likelihood estimation procedure. Solution of the maximum likelihood

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"

Kneib, Fahrmeir: Supplement to Structured additive regression for categorical space-time data: A mixed model approach Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/

More information

Regularization in Cox Frailty Models

Regularization in Cox Frailty Models Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University

More information

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests Biometrika (2014),,, pp. 1 13 C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

1 Appendix A: Matrix Algebra

1 Appendix A: Matrix Algebra Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Interaction effects for continuous predictors in regression modeling

Interaction effects for continuous predictors in regression modeling Interaction effects for continuous predictors in regression modeling Testing for interactions The linear regression model is undoubtedly the most commonly-used statistical model, and has the advantage

More information

Diagnostics for Linear Models With Functional Responses

Diagnostics for Linear Models With Functional Responses Diagnostics for Linear Models With Functional Responses Qing Shen Edmunds.com Inc. 2401 Colorado Ave., Suite 250 Santa Monica, CA 90404 (shenqing26@hotmail.com) Hongquan Xu Department of Statistics University

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Approximating the Covariance Matrix with Low-rank Perturbations

Approximating the Covariance Matrix with Low-rank Perturbations Approximating the Covariance Matrix with Low-rank Perturbations Malik Magdon-Ismail and Jonathan T. Purnell Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 12180 {magdon,purnej}@cs.rpi.edu

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Spatial Process Estimates as Smoothers: A Review

Spatial Process Estimates as Smoothers: A Review Spatial Process Estimates as Smoothers: A Review Soutir Bandyopadhyay 1 Basic Model The observational model considered here has the form Y i = f(x i ) + ɛ i, for 1 i n. (1.1) where Y i is the observed

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Estimating Variances and Covariances in a Non-stationary Multivariate Time Series Using the K-matrix

Estimating Variances and Covariances in a Non-stationary Multivariate Time Series Using the K-matrix Estimating Variances and Covariances in a Non-stationary Multivariate ime Series Using the K-matrix Stephen P Smith, January 019 Abstract. A second order time series model is described, and generalized

More information

Ma 3/103: Lecture 24 Linear Regression I: Estimation

Ma 3/103: Lecture 24 Linear Regression I: Estimation Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions Chapter 3 Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions 3.1 Scattered Data Interpolation with Polynomial Precision Sometimes the assumption on the

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

A test for improved forecasting performance at higher lead times

A test for improved forecasting performance at higher lead times A test for improved forecasting performance at higher lead times John Haywood and Granville Tunnicliffe Wilson September 3 Abstract Tiao and Xu (1993) proposed a test of whether a time series model, estimated

More information

Penalized Splines, Mixed Models, and Recent Large-Sample Results

Penalized Splines, Mixed Models, and Recent Large-Sample Results Penalized Splines, Mixed Models, and Recent Large-Sample Results David Ruppert Operations Research & Information Engineering, Cornell University Feb 4, 2011 Collaborators Matt Wand, University of Wollongong

More information

The linear model is the most fundamental of all serious statistical models encompassing:

The linear model is the most fundamental of all serious statistical models encompassing: Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x

More information

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics

More information

This model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that

This model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that Linear Regression For (X, Y ) a pair of random variables with values in R p R we assume that E(Y X) = β 0 + with β R p+1. p X j β j = (1, X T )β j=1 This model of the conditional expectation is linear

More information

Empirical Likelihood Tests for High-dimensional Data

Empirical Likelihood Tests for High-dimensional Data Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

Linear Model Under General Variance

Linear Model Under General Variance Linear Model Under General Variance We have a sample of T random variables y 1, y 2,, y T, satisfying the linear model Y = X β + e, where Y = (y 1,, y T )' is a (T 1) vector of random variables, X = (T

More information

A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common

A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common Journal of Statistical Theory and Applications Volume 11, Number 1, 2012, pp. 23-45 ISSN 1538-7887 A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when

More information

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Bull. Korean Math. Soc. 5 (24), No. 3, pp. 7 76 http://dx.doi.org/34/bkms.24.5.3.7 KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Yicheng Hong and Sungchul Lee Abstract. The limiting

More information

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis STAT 6350 Analysis of Lifetime Data Failure-time Regression Analysis Explanatory Variables for Failure Times Usually explanatory variables explain/predict why some units fail quickly and some units survive

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

On robust and efficient estimation of the center of. Symmetry.

On robust and efficient estimation of the center of. Symmetry. On robust and efficient estimation of the center of symmetry Howard D. Bondell Department of Statistics, North Carolina State University Raleigh, NC 27695-8203, U.S.A (email: bondell@stat.ncsu.edu) Abstract

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

ERRATA. for Semiparametric Regression. Last Updated: 30th September, 2014

ERRATA. for Semiparametric Regression. Last Updated: 30th September, 2014 1 ERRATA for Semiparametric Regression by D. Ruppert, M. P. Wand and R. J. Carroll Last Updated: 30th September, 2014 p.6. In the vertical axis Figure 1.7 the lower 1, 2 and 3 should have minus signs.

More information

DA Freedman Notes on the MLE Fall 2003

DA Freedman Notes on the MLE Fall 2003 DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of

More information

Applied Multivariate and Longitudinal Data Analysis

Applied Multivariate and Longitudinal Data Analysis Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 In this chapter we will discuss inference

More information

Nonparametric Small Area Estimation via M-quantile Regression using Penalized Splines

Nonparametric Small Area Estimation via M-quantile Regression using Penalized Splines Nonparametric Small Estimation via M-quantile Regression using Penalized Splines Monica Pratesi 10 August 2008 Abstract The demand of reliable statistics for small areas, when only reduced sizes of the

More information

Sample size determination for logistic regression: A simulation study

Sample size determination for logistic regression: A simulation study Sample size determination for logistic regression: A simulation study Stephen Bush School of Mathematical Sciences, University of Technology Sydney, PO Box 123 Broadway NSW 2007, Australia Abstract This

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

ANALYSIS OF VARIANCE AND QUADRATIC FORMS

ANALYSIS OF VARIANCE AND QUADRATIC FORMS 4 ANALYSIS OF VARIANCE AND QUADRATIC FORMS The previous chapter developed the regression results involving linear functions of the dependent variable, β, Ŷ, and e. All were shown to be normally distributed

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information