ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE-COVARIANCE COMPONENTS

Similar documents
Likelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science

Some properties of Likelihood Ratio Tests in Linear Mixed Models

A MODIFICATION OF THE HARTUNG KNAPP CONFIDENCE INTERVAL ON THE VARIANCE COMPONENT IN TWO VARIANCE COMPONENT MODELS

Fiducial Generalized Pivots for a Variance Component vs. an Approximate Confidence Interval

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Canonical Correlation Analysis of Longitudinal Data

arxiv: v2 [math.st] 22 Oct 2016

Likelihood ratio testing for zero variance components in linear mixed models

ON VARIANCE COVARIANCE COMPONENTS ESTIMATION IN LINEAR MODELS WITH AR(1) DISTURBANCES. 1. Introduction

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Stat 5101 Lecture Notes

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC

[y i α βx i ] 2 (2) Q = i=1

Lecture 11: Regression Methods I (Linear Regression)

Linear Models in Statistics

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

Orthogonal decompositions in growth curve models

An Introduction to Multivariate Statistical Analysis

A note on the equality of the BLUPs for new observations under two linear models

Lecture 11: Regression Methods I (Linear Regression)

On V-orthogonal projectors associated with a semi-norm

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

Linear Models 1. Isfahan University of Technology Fall Semester, 2014

5.) For each of the given sets of vectors, determine whether or not the set spans R 3. Give reasons for your answers.

Restricted Likelihood Ratio Tests in Nonparametric Longitudinal Models

ANALYSIS OF VARIANCE AND QUADRATIC FORMS

Definition (T -invariant subspace) Example. Example

Mean. Pranab K. Mitra and Bimal K. Sinha. Department of Mathematics and Statistics, University Of Maryland, Baltimore County

SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM

HANDBOOK OF APPLICABLE MATHEMATICS

Exact Likelihood Ratio Tests for Penalized Splines

New insights into best linear unbiased estimation and the optimality of least-squares

Political Science 236 Hypothesis Testing: Review and Bootstrapping

1 Mixed effect models and longitudinal data analysis

Math 423/533: The Main Theoretical Topics

Ratio of Linear Function of Parameters and Testing Hypothesis of the Combination Two Split Plot Designs

Multivariate Regression

Assignment 1 Math 5341 Linear Algebra Review. Give complete answers to each of the following questions. Show all of your work.

~ g-inverses are indeed an integral part of linear algebra and should be treated as such even at an elementary level.

MIVQUE and Maximum Likelihood Estimation for Multivariate Linear Models with Incomplete Observations

Chapter 1. Matrix Algebra

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1

The equivalence of the Maximum Likelihood and a modified Least Squares for a case of Generalized Linear Model

Fixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility

MATH 1120 (LINEAR ALGEBRA 1), FINAL EXAM FALL 2011 SOLUTIONS TO PRACTICE VERSION

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

1 Data Arrays and Decompositions

Measure for measure: exact F tests and the mixed models controversy

Properties of Matrices and Operations on Matrices

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Linear Regression and Its Applications

Analysis of variance, multivariate (MANOVA)

Chapter 3. Matrices. 3.1 Matrices

Pattern correlation matrices and their properties

Mathematical foundations - linear algebra

CHAOS AND STABILITY IN SOME RANDOM DYNAMICAL SYSTEMS. 1. Introduction

A CLASS OF THE BEST LINEAR UNBIASED ESTIMATORS IN THE GENERAL LINEAR MODEL

NOTES ON MATRICES OF FULL COLUMN (ROW) RANK. Shayle R. Searle ABSTRACT

Lecture 3. Inference about multivariate normal distribution

Matrix Differential Calculus with Applications in Statistics and Econometrics

Chapter 7: Symmetric Matrices and Quadratic Forms

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Introduction to Matrix Algebra

Testing Statistical Hypotheses

Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data

11 Hypothesis Testing

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Statistical Inference On the High-dimensional Gaussian Covarianc

Chapter 3 Transformations

Binary choice 3.3 Maximum likelihood estimation

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R.

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

RLRsim: Testing for Random Effects or Nonparametric Regression Functions in Additive Mixed Models

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983),

Mathematical foundations - linear algebra

MATH 2331 Linear Algebra. Section 2.1 Matrix Operations. Definition: A : m n, B : n p. Example: Compute AB, if possible.

Chapter 10: Inferences based on two samples

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

Efficient Estimation for the Partially Linear Models with Random Effects

Vector Spaces and Linear Transformations

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series

Recall the convention that, for us, all vectors are column vectors.

List of Symbols, Notations and Data

Lecture 3: Linear Algebra Review, Part II

MATH5745 Multivariate Methods Lecture 07

Lecture 26: Likelihood ratio tests

Inference on reliability in two-parameter exponential stress strength model

GLM Repeated Measures

Chapter 12 REML and ML Estimation

Math 494: Mathematical Statistics

Ma 3/103: Lecture 24 Linear Regression I: Estimation

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

SUMMARY OF MATH 1600

Multivariate Gaussian Analysis

Regression: Lecture 2

Asymptotic Statistics-VI. Changliang Zou

Transcription:

Ø Ñ Å Ø Ñ Ø Ð ÈÙ Ð Ø ÓÒ DOI: 10.2478/v10127-012-0017-9 Tatra Mt. Math. Publ. 51 (2012), 173 181 ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE-COVARIANCE COMPONENTS Júlia Volaufová Viktor Witkovský ABSTRACT. Linear models with variance-covariance components are used in a wide variety of applications. In most situations it is possible to partition the response vector into a set of independent subvectors, such as in longitudinal models where the response is observed repeatedly on a set of sampling units (see, e.g., Laird & Ware 1982). Often the objective of inference is either a test of linear hypotheses about the mean or both, the mean and the variance components. Confidence intervals for parameters of interest can be constructed as an alternative to a test. These questions have kept many statisticians busy for several decades. Even under the assumption that the response can be modeled by a multivariate normal distribution, it is not clear what test to recommend except in a few settings such as balanced or orthogonal designs. Here we investigate statistical properties, such as accuracy of p-values and powers of exact (Crainiceanu & Ruppert 2004) tests and compare with properties of approximate asymptotic tests. Simultaneous exact confidence regions for variance components and mean parameters are constructed as well. 1. Introduction The maximum-likelihood-based computational procedures for inference in mixed models have been around since the seminal paper by H a r t l e y and Rao in 1967 (see [3]). Since the mid 1990s procedures have been available in statistical computing packages that make analysis of mixed models appear to be a routine. The following notation will be used: for a matrix A, A denotes the transpose of A, A 1 denotes the inverse of A if it exists, A and A + denote the generalized c 2012 Mathematical Institute, Slovak Academy of Sciences. 2010 Mathematics Subject Classification: Primary 62J05; Secondary 62F03, 62F10, 62F30. Keywords: linear mixed model, variance components, likelihood ratio test, exact test. Research of Witkovský was supported by the projects VEGA-2-0038-12, VEGA-2-0019-10, APVV-0096-10. 173

JÚLIA VOLAUFOVÁ VIKTOR WITKOVSKÝ inverse and the Moore-Penrose of A, respectively (see, e.g., [9]), R(A) denotes the linear subspace spanned by the columns ofa, andp A denotesthe orthogonal projection matrix onto R(A) (see [4]). If A is square, tr(a) denotes the trace of A. r(a) denotes the rank (the number of linearly independent columns) of A. A linear model with two variance-covariance components can be formulated as Y N n ( Xβ,σ 2 V(λ) ), where σ 2 > 0 and λ Λ are the variance-covariance components, β R k is the unknown mean parameter and X is a known model matrix, columns of which are population indicators or additional model covariates. Here we shall consider models with V(λ) = I + λv, where V is a given nonnegative definite matrix with r(v) 1. The parameter space Λ is defined by the requirement that V(λ) be a positive definite matrix. In other words, if ξ 1 ξ 2 ξ r(v ) > 0 are the nonzero eigenvalues of V, then Λ = {λ R : λ > 1/ξ 1 }. However, the most frequently considered parameter space Λ is {λ R : λ 0}. In applications, the variance-covariance components σ 2 and λ are not known, and most of the time the focus is inference on fixed effects vector parameters. The variance components in such case are nuisance parameters. Even in simple cases, when the design is unbalanced, it is not quite straightforward what the best method for inference is in a final sample setting. An alternative approach is to formulate the questions of interest simultaneously in terms of the function of the mean vector parameter and the function of variance components. The inference is often testing statistical hypotheses and/or construct confidence intervals on functions of model parameters. A suggestive method is the likelihood ratio approach (LR). Here we adopt the development presented by Crainiceanu and Ruppert in 2004 (see [2]) and try to extend/investigate their approach. Primarily, our interest is in testing a simultaneous linear hypothesis H 0 : H β = H β 0 and λ = λ 0 vs. H A : H β H β 0 or λ λ 0, (1) where H is a given matrix, for which R(H) R(X ), i.e., the hypothesis is estimable (see, e.g., [10]). The vector H β 0 = θ 0 and the constant λ 0 are known. Alternatively, we may be interested in simultaneous confidence regions for somecombinationsoftheparametersθ,λ, andσ 2,whereθ = H β.the construction of the exact confidence regions is based on inverting the exact (restricted) likelihood ratio tests of following null hypotheses: 174 H 0 : θ = θ 0, (2) H 0 : θ = θ 0 and λ = λ 0, (3) H 0 : θ = θ 0 and λ = λ 0 and σ 2 = σ 2 0, (4) H 0 : λ = λ 0, (5) H 0 : λ = λ 0 and σ 2 = σ 2 0. (6)

ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE COMPONENTS For developing the exact likelihood ratio test for testing hypothesis (1) (i.e., (3)) we start with investigating the profile likelihood L λ (β,σ 2 Y) for fixed λ Λ. Then the 2logL λ takes the form 2logL λ (β,σ 2 Y) = nlog(2π)+nlog(σ 2 )+log V(λ) +(1/σ 2 )(Y Xβ) V(λ) 1 (Y Xβ). (7) Hence the profile maximum likelihood estimators of σ 2 is ˆσ 2 λ = (1/n)Y ( M X V(λ)M X ) +Y, (8) where M X = I P X. Since H β is estimable, H = X G, R(G) R(X), for some matrix G. Then if we denote X = (I P G )X, the affine set of expectation vectors of Y is a shifted linear subspace parametrized by an unconstrained vector parameter γ, i.e., (see, e.g., [1] or [6]). {Xβ : H β = H β 0 } = Xβ 0 +{X γ : γ R k }, (9) The MLE of σ 2 under H 0 is then σ 2 λ 0 = (1/n)Y ( M X V(λ 0 )M X ) +Y, where Y = Y Xβ 0. Since Y ( M X V(λ)M X ) +Y = Y ( M X V(λ)M X ) +Y, we get the test statistic in the form 2logLRT = nlog ( +sup λ Λ 1+ Y ( (M X V(λ 0 )M X ) + (M X V(λ 0 )M X ) +) Y Y ( M X V(λ 0 )M X ) +Y { ( ( ) Y +Y ) M X V(λ 0 )M X nlog Y ( ) +Y +log M X V(λ)M X ) ( ) } V(λ0 ). V(λ) (10) Let B be a n (n p)- matrix, n p = tr(m X ), for which BB = M X, B B = I. Let the spectral decomposition of B VB result in B VB = r µ iq i. Here 0 = µ 0 < µ 1 < < µ r denote the distinct eigenvalues of B VB with the multiplicity ν i for µ i. Q i are mutually orthogonal projection matrices. Following the lines in Olsen, Seely, and Birkes (see [7]), we note that Y BQ i B Y form a set of r + 1 mutually independent random variables, Y BQ i B Y H 0 (1+λ 0 µ i )σ 2 χ 2 (ν i ).Byχ 2 (ν i )wedenoteachi-square distributed randomvariable with ν i degrees of freedom. Hence Y ( ) +Y H M X V(λ)M 0 r X σ 2 (1+λ 0 µ i )/(1+λµ i )χ 2 (ν i ). (11) 175

JÚLIA VOLAUFOVÁ VIKTOR WITKOVSKÝ Further, since R(G) R(X), it is straightforward to derive that Y ( ( ) + ( ) + MX V(λ 0 )M X MX V(λ 0 )M X )Y H 0 σ 2 χ 2 (h), (12) where h = r(h), χ 2 (h) independent with all χ 2 (ν i ). Hence, under null hypothesis (1), the exact distribution of the LRT statistic is expressed as 2logLRT H 0 nlog 1+ χ 2 (h) r χ 2 i(ν i ) r (λ λ 0 )µ i χ 2 + sup nlog λ Λ 1+ 1+λµ i(ν i ) r(v ) i r 1+λ 0 µ i χ 2 + ( ) 1+λ0 ξ j log 1+λξ j j=1 1+λµ i(ν i ) i r χ 2 i(ν i )+χ 2 (h) r(v ) sup nlog λ Λ r 1+λ 0 µ i χ 2 + ( ) 1+λ0 ξ j log. (13) 1+λξ j j=1 i(ν i ) 1+λµ i This result coincideswith the oneobtainedby Crainiceanu and Ruppert in [2]. In their paper, they provide a detailed algorithm obtaining the LRT and its distribution. On the other hand, one may consider an asymptotic approach to approximate the distribution if the sample size is large enough. For the hypothesized λ 0 in the interior of Λ, under H 0, 2logLRT D χ 2 (h+1). For λ 0 on the border of the parameter space Λ, Stram and Lee in [11] suggest that under not very restrictive conditions, which are usually met in the model we investigate here, 2logLRT D 0.5χ 2 (h) : 0.5χ 2 (h+1). Pinheiro and Bates in [8] suggest the asymptotic distribution of the LRT to be 2logLRT D 0.65χ 2 (h) : 0.35χ 2 (h + 1). In this paper we compare the finite sample behavior of the approximate and exact distribution of the LRT by simulation specifically for small sample sizes in a simple longitudinal model. 2. Simulations We set up a simulation study to investigate how well the asymptotic distribution approximates the exact one. We compare here the empirical quantiles of LRT compared to asymptotic χ 2 distribution under H 0. It is obvious 176

ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE COMPONENTS that the distribution depends on the hypothesized value of λ but not of β. Hence a secondary question that we want to address is the sensitivity (or insensitivity) of the LRT test distribution under H 0 with respect to λ 0. Figure 1. Quantile plots: left column (5, 7), central column (10, 15), right column (30,40), r(h) = 1. 177

JÚLIA VOLAUFOVÁ VIKTOR WITKOVSKÝ The simulation study is set up with independent samples from two populations. Three configurations with different number of sampling units (m 1,m 2 ) are investigated: (5,7),(10,15), and (30,40). For each sampling unit unequal repeated observations are generated. First the number of observations on the ith sampling unit, T i, is generated using Poisson distribution with parameter θ = 4. Then time points are generated as discrete uniform with support being a subset of {1,2,...,20} of the size T i that resulted from Poisson distribution. The response vector Y is modeled as Y = (X 1,X 2 )(β 1,β 2), where Y is n = m i=1 T i dimensional, with m = m 1 +m 2, X 1 is a matrix containing covariates (functions oftime) andthe columns ofx 2 arethe dummy variablesfor the twopopulations. ThreeconfigurationsformatrixX 1 wereconsideredlinear,linear+quadratic,and linear+quadratic+cubic polynomials of time. Hence the dimension of β 2 is 2 and the dimension of β 1 depends on the configuration: it is either a scalar, or two-, or three-dimensional vector. The response is modeled by a random intercept model, which results in a two-variance components model with the parameter space for λ, Λ, being the set of all nonnegative real numbers. The null hypothesis is formulated as H 0 : β 1 = 0 and λ = λ 0. The hypothesized values λ 0 are chosen to be the values 0, 0.2, 0.4, 0.6, 0.8, and 1.0. Notice that the hypothesized value λ 0 = 0 falls on the border of the parameter set Λ. Hence 54 different configurations were investigated. For each configuration 10,000 simulations were carried out, the empirical quantiles recorded and compared to asymptotic(approximate) distribution. The selected results are presented in the form of graphs. Figure 1 corresponds to one-column setting for matrix X 1, which means that the hypothesized parameter β 1 is a scalar. We notice that if the hypothesized λ 0 is in the interiorofthe parameterspace,i.e., λ 0 0,withincreasing samplesize theχ 2 (2) distribution provides a pretty good approximation to the exact distribution (the first row of panels). For all λ 0 = 0 the χ 2 (2) approximation is not good even for higher sample sizes. However, the 0.65χ 2 (2) : 0.35χ 2 (3) mixture provides a very accurate approximation to probabilities (bottom row of panels).we observe the same phenomenon for higher rank of H, Figure 2 shows the results for r(h) = 2. Based on our study, as a good approximation for the exact CDF at λ 0 = 0 we suggest to use a mixture distribution with estimated value of the optimum weight w based on small sample generated from the exact distribution wχ 2 (h) : (1 w)χ 2 (h+1). For hypothesized values of λ 0 in the interior of the parameter space, the approximation does not seem to depend too strongly on λ 0 (all lines are almost on top of each other). That would suggest that the underlying exact distribution ofthe LRT is close to be invariant with respect to λ 0 (for sufficiently largevalues). In particular, forlarge valuesofλ 0 the exactdistribution converges quickly to fixed distribution. For large λ 0 we suggest to use an approximation based on non-central chi-square distribution, χ 2 (h+1,δ), with the noncentrality parameter δ, to be estimated based on small sample generated from the exact distribution, see Figure 3. 178

ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE COMPONENTS Figure 2. Quantile plots: left column (5, 7), central column (10, 15), right column (30,40), r(h) = 2. 3. Conclusion The main goal in this paper was to investigate the approximate (asymptotic) characteristics of the distribution of the likelihood ratio test for small sample size setting in regard to exact distribution. We found that even for very small total number of observations, the asymptotic χ 2 approximation is adequate and complex exact calculations can be easily avoided. 179

JÚLIA VOLAUFOVÁ VIKTOR WITKOVSKÝ 1 0.9 0.8 0.7 0.6 F(x) 0.5 0.4 0.3 0.2 0.75χ 2 (h) : 0.25χ 2 (h+1) 0.65χ 2 (h) : 0.35χ 2 (h+1) 0.1 χ 2 (h+1) χ 2 (h+1,δ = 0.45) 0 0 2 4 6 8 10 12 x Figure 3. Exact and approximate CDFs of the LRT for model configuration with sample sizes (5,7) and h = r(h) = 1, plotted for different values of λ 0. The thin lines present exact CDFs for λ 0 = 0 : 0.05 : 1 (from left to right). The thick lines present approximations: (i) mixture distribution wχ 2 (h) : (1 w)χ 2 (h+1), with the estimated weight w = 0.75, (ii) mixture distribution 0.65χ 2 (h) : 0.35χ 2 (h+1), (iii) χ 2 (h+1), and (iv) noncentral chi-square distribution χ 2 (h + 1,δ) with the estimated non-centrality parameter δ = 0.45. REFERENCES [1] CHRISTENSEN, R.: Plane Answers to Complex Questions: The Theory of Linear Models. Springer-Verlag, New York, 1987. [2] CRAINICEANU, C. M. RUPPERT, D.: Likelihood ratio tests in linear mixed models with one variance component, J. R. Stat. Soc. Ser. B Stat. Methodol. 66 (2004), 165 185. [3] HARTLEY, H. O. RAO, J. N. K.: Maximum-likelihood estimation for the mixed alysis of variance model, Biometrika 54 (1967), 93 108. [4] HARVILLE, D. A.: Matrix Algebra from a Statistician s Perspective. Springer, New York, 1997. [5] LAIRD, N. M. WARE, J. H.: Random-effects models for longitudinal data, Biometrics, 38 (1982), 963 974. [6] LAMOTTE, L. R.: On non-estimable conditions and restricted least squares in linear regression models, in: Proceedings of the Conference in Honor of Shayle R. Searle, Aug. 9 10, 1996, Biometrics Unit, Cornell University, 1998, pp. 145 160. [7] OLSEN, A. SEELY, J. BIRKES, D.: Invariant quadratic unbiased estimation for two variance components, Ann. Statist. 4 (1976), 878 890. 180

ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE COMPONENTS [8] PINHEIRO, J. C. BATES, D. M.: Mixed-effects Models in S and S-Plus. Springer, New York, 2000. [9] RAO, C. R. MITRA, S. K.: Generalized Inverse of Matrices and Its Applications. John Wiley & Sons, New York, 1971. [10] SEELY, J.: Estimability and linear hypotheses, Amer. Statist. 31 (1977), 121 123. [11] STRAM, D. O. LEE, J. W.: Variance components testing in the longitudinal mixed effects model, Biometrics 50 (1994), 1171 1177. Received May 31, 2012 Júlia Volaufová School of Public Health LSU Health Sciences Center 2020 Gravier Street New Orleans, LA 70112 USA E-mail: jvolau@lsuhsc.edu Viktor Witkovský Institute of Measurement Science Slovak Academy of Sciences Dúbravská cesta 9 SK 841-04 Bratislava SLOVAKIA E-mail: witkovsky@savba.sk 181