Mixed Models for Assessing Correlation in the Presence of Replication
Journal of the Air & Waste Management Association

To cite this article: Anthony Hamlett, Louise Ryan, Paulina Serrano-Trespalacios & Russ Wolfinger (2003) Mixed Models for Assessing Correlation in the Presence of Replication, Journal of the Air & Waste Management Association, 53:4.
TECHNICAL PAPER
Journal of the Air & Waste Management Association, Volume 53, April 2003
Copyright 2003 Air & Waste Management Association

Anthony Hamlett and Louise Ryan
Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts

Paulina Serrano-Trespalacios
Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts

Russ Wolfinger
SAS Institute, Inc., Cary, North Carolina

ABSTRACT
The need to assess correlation in settings where multiple measurements are available on each of the variables of interest often arises in environmental science. However, this topic is not covered in introductory statistics texts. Although several ad hoc approaches can be used, they can easily lead to invalid conclusions and to a difficult choice of an appropriate measure of the correlation. Lam et al. approached this problem by using maximum likelihood estimation in cases where the replicate measurements are linked over time, but the method requires specialized software. We reanalyze the data of Lam et al. using PROC MIXED in SAS and show how to obtain the parameter estimates of interest with just a few lines of code. We then extend Lam et al.'s method to settings where the replicate measurements are not linked. Analysis of the unlinked case is illustrated with data from a study designed to assess correlations between indoor and outdoor measurements of benzene concentration in the air.

IMPLICATIONS
While replicate measurements are commonly taken in environmental science research settings, it is unclear how to use these replicates to assess correlations. When the number of replicates varies by subject, use of ad hoc approaches to correlation results in an efficiency loss and, hence, in unreliable correlation estimates. Formulating the problem as a mixed model leads to results that are more reliable and that overcome the problems of the ad hoc approaches. In addition, the SAS approach is very user-friendly and lends itself to extensions for more complex settings.

INTRODUCTION
An important first step in any environmental science research project is assessing the accuracy and reliability of the measurement tools to be used. For example, researchers may wish to compare the results of air pollution measurements based on the use of two different types of sampling or analytical techniques. Correlation analysis is the primary statistical tool used in this context. As discussed in almost every introductory statistics book, the Pearson correlation coefficient is the appropriate measure of association between two variables when these two variables are jointly normally distributed. The Spearman correlation coefficient provides a nonparametric alternative based on ranks. As discussed by Rosner,¹ the more specialized intraclass correlation coefficient is appropriate in settings where the two variables of interest are expected to have the same means and variances. Pearson and Spearman correlations are easily and directly obtained from most statistical packages. While some packages (e.g., Stata) also provide commands to compute intraclass correlations directly, this quantity is easily obtained in packages such as SAS by formulating the problem as a mixed model (see the SAS manual²).

In this paper, we address a question not covered in introductory texts, namely, the assessment of correlation in settings where multiple measurements are available on each of the variables of interest. Two settings, linked and unlinked, are considered. In the linked setting, the repeated measurements are linked together in some way; for example, repeats are taken together on different days. Table 1 (taken from Bland and Altman³) shows a linked data set of repeated measurements of intramural pH and PaCO2 in a study designed to assess within-subject correlations of clinical information gained from blood gas analysis and from gastric pH of critically ill patients. Each pair of measurements was taken on a different day. In the unlinked setting, repeated measures are not linked
together. Table 2 shows such a data set, corresponding to replicate measurements of benzene concentration in indoor and outdoor air, measured on 35 Mexican families. Note that while some families have a single replicate measurement on indoor and outdoor air (e.g., family 1), others (e.g., family 6) have two replicate measurements on each.

Table 1. Repeated measures of intramural pH (X) and PaCO2 (Y) for eight critically ill patients (PATI). Columns: PATI, pH, PaCO2.

Several ad hoc approaches can be taken to compute correlations in the presence of replication. One naive approach is to ignore the repeated measurements, treat the data as if they were a simple random sample, and compute the standard Pearson correlation coefficient. Another choice is to compute the mean response for each variable for each subject, and then compute the standard Pearson correlation coefficient using the subject-specific averages for each variable. Yet another choice is to compute a weighted correlation coefficient,⁴ using the subject-specific averages for each variable and the number of repeated measurements for each subject as weights.

There are inherent problems with each of these approaches. The simple correlation coefficient treats the total number of observations, rather than the number of subjects, as the sample size. This erroneously increases the degrees of freedom, which can lead to overly frequent rejection of the null hypothesis when it is in fact true (i.e., an inflated type I error rate⁵). The simple correlation coefficient based on subject means avoids this problem but does not take into account the different numbers of replicate measurements per subject. In addition, it tends to underestimate the true between-subject correlation.⁶ The weighted correlation coefficient does take into account the different numbers of replicate measurements per subject; however, the number of replicate measurements per variable for a given subject must be the same. In addition, because the subject means are used in the computation, it too tends to underestimate the true between-subject correlation.

Several authors have proposed more technical solutions to the problem of measuring the correlation between two variables in the presence of replication. For example, Bland and Altman³ proposed using a partial correlation coefficient, which requires removing differences between subjects. The partial correlation coefficient is useful if we want to know whether an increase (decrease) in one variable within a subject is associated with an increase (decrease) in the other variable. However, if there are many subjects, there is a loss in power, caused by the increased number of parameters to be estimated. Chinchilli et al.⁷ proposed the use of a weighted correlation coefficient, using the sample variances and covariances to compute the weights. While they considered both the unlinked and the linked case, their method is complicated. Furthermore, it is empirically based and does not naturally arise from an underlying statistical model. Lam et al.⁶ used maximum likelihood (ML) estimation to estimate the true correlation between the variables when the repeated measurements are linked over time. They derived their estimates through formulation of the problem as a mixed-effects model. Unfortunately, their approach is rather technical and requires the use of specialized software.

An important purpose of this paper is to show how to reproduce the parameter estimates of Lam et al.,⁶ using PROC MIXED in SAS. We show that the analysis can be easily achieved with just a few lines of code. We then extend Lam et al.'s approach to the nonlinked setting and also obtain parameter estimates using PROC MIXED in SAS. In addition to reanalyzing the data of Lam et al., the analysis of data in the nonlinked setting is illustrated with data from a study designed to assess the correlation between indoor and
outdoor measurements of benzene concentration in the air.

STATISTICAL MODELS

Linked Repeated Measurements of Two Variates
To proceed, some notation must be introduced. We begin with the linked case, in which the repeated observations on the variables of interest, X and Y, are linked, for example, by being taken at the same point in time. Let $(X_{ij}, Y_{ij})$ be the jth repeated observation ($j = 1, \ldots, n_i$) of the X, Y variables taken on the ith subject ($i = 1, \ldots, n$), in a sample of n individuals, and define N to be the total number of observations. Suppose that the pair $(X_{ij}, Y_{ij})$ has a bivariate normal distribution with mean $(\mu_X, \mu_Y)$ and variance-covariance matrix $\Sigma$. The parameters $\mu_X$ and $\mu_Y$ represent the overall mean values of the variables of interest. Assumptions about $\Sigma$ are important, because this is where the correlations of interest are defined. Following Lam et al., it is assumed that

\[
\Sigma = \begin{pmatrix} \sigma_X^2 & \rho_{XY}\sigma_X\sigma_Y \\ \rho_{XY}\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix} \tag{1}
\]

where $\sigma_X^2$ and $\sigma_Y^2$ are the variances of X and Y, respectively, and $\rho_{XY}$ is their correlation. Note that $\rho_{XY}$ is our main parameter of interest. For notational convenience later, we define the covariance $\sigma_{XY} = \rho_{XY}\sigma_X\sigma_Y$.

Full specification of the model also requires assumptions regarding the relationships between X's and Y's measured at different times. Like Lam et al., we assume that correlations between measurements taken at two different times, $j$ and $j'$, $j \neq j'$, are given by

\[
\mathrm{Corr}(X_{ij}, X_{ij'}) = \rho_X, \qquad
\mathrm{Corr}(Y_{ij}, Y_{ij'}) = \rho_Y, \qquad
\mathrm{Corr}(X_{ij}, Y_{ij'}) = \delta\rho_{XY} \tag{2}
\]

Heuristically, we would expect the term $\delta$ to generally be less than 1, indicating that correlations between variables measured at different times are lower in magnitude than those taken at the same time. The assumed correlation structure is depicted in Figure 1.

Figure 1. Correlation structure.

Table 2. Repeated measures of benzene concentration (μg/m³) in indoor (X) and outdoor (Y) air taken at the homes of 35 Mexican families. Columns: Family, Benzene, Location (In/Out).

To better visualize the covariance structure, it is helpful to write out the full covariance matrix for the entire set of $n_i$ repeated measurements for the ith subject:

\[
C_i = \mathrm{Cov}\begin{pmatrix} X_{i1} \\ Y_{i1} \\ X_{i2} \\ Y_{i2} \\ \vdots \\ X_{in_i} \\ Y_{in_i} \end{pmatrix}
= \begin{pmatrix}
\sigma_X^2 & \sigma_{XY} & \rho_X\sigma_X^2 & \delta\sigma_{XY} & \cdots \\
\sigma_{XY} & \sigma_Y^2 & \delta\sigma_{XY} & \rho_Y\sigma_Y^2 & \cdots \\
\rho_X\sigma_X^2 & \delta\sigma_{XY} & \sigma_X^2 & \sigma_{XY} & \cdots \\
\delta\sigma_{XY} & \rho_Y\sigma_Y^2 & \sigma_{XY} & \sigma_Y^2 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{3}
\]

Note that to allow for a more parsimonious expression, we are using the covariance term $\sigma_{XY}$. Note also the block structure of this matrix, with submatrices corresponding to $\Sigma$ down the main diagonal. The covariance matrix will have the same structure for each subject, except that the dimension will vary. For example, Table 1 shows that person 1 has eight observations, 4 on each variable; hence, the covariance matrix for person 1 has eight rows and eight columns (8 × 8) with the previously given structure.

Further insight into the structure of $C_i$ is seen if the data are reordered. Instead of setting up the covariance matrix in terms of successive X, Y pairs (i.e., $X_{i1}, Y_{i1}, X_{i2}, Y_{i2}, \ldots, X_{in_i}, Y_{in_i}$), suppose the $n_i$ X values are written first, followed by the $n_i$ Y values. With this rearrangement, the covariance matrix becomes

\[
\mathrm{Cov}\begin{pmatrix} X_{i1} \\ X_{i2} \\ \vdots \\ X_{in_i} \\ Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in_i} \end{pmatrix}
= \begin{pmatrix}
\sigma_X^2 & \rho_X\sigma_X^2 & \cdots & \sigma_{XY} & \delta\sigma_{XY} & \cdots \\
\rho_X\sigma_X^2 & \sigma_X^2 & \cdots & \delta\sigma_{XY} & \sigma_{XY} & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
\sigma_{XY} & \delta\sigma_{XY} & \cdots & \sigma_Y^2 & \rho_Y\sigma_Y^2 & \cdots \\
\delta\sigma_{XY} & \sigma_{XY} & \cdots & \rho_Y\sigma_Y^2 & \sigma_Y^2 & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{4}
\]

The covariance matrix can now be seen to fall into four distinct blocks. The upper left block shows a constant covariance $\rho_X\sigma_X^2$ between the $n_i$ repeated X values taken on the ith subject. Similarly, the lower right block shows a constant covariance $\rho_Y\sigma_Y^2$ between the $n_i$ repeated Y values taken on the ith subject. The off-diagonal blocks show a compound symmetric covariance structure between the $n_i$ X and Y values, with $\sigma_{XY}$ on the main diagonal and $\delta\sigma_{XY}$ on the off-diagonal.

Unlinked Repeated Measurements of Two Variates
An appropriate model for the unlinked repeated measures design is easily obtained by a simple alteration to the model corresponding to the linked repeated measures design. The fundamental difference between the linked and the unlinked settings is that the $X_{ij}$ and $Y_{ij}$ are no longer linked together. That is, there is no time effect in the problem, and hence, the correlation between any two X and Y measurements should be the same, regardless of when they are taken. The correlation structure thus becomes

\[
\mathrm{Corr}(X_{ij}, X_{ij'}) = \rho_X, \qquad
\mathrm{Corr}(Y_{ij}, Y_{ij'}) = \rho_Y, \qquad
\mathrm{Corr}(X_{ij}, Y_{ij}) = \rho_{XY} = \mathrm{Corr}(X_{ij}, Y_{ij'}) \tag{5}
\]

It follows that one can think of the unlinked case as a special case of the linked setting, with $\delta$ set equal to 1. The covariance matrix for the ith subject in this setting, denoted here by $C_i^{*}$, is given by

\[
C_i^{*} = \mathrm{Cov}\begin{pmatrix} X_{i1} \\ Y_{i1} \\ X_{i2} \\ Y_{i2} \\ \vdots \\ X_{in_i} \\ Y_{in_i} \end{pmatrix}
= \begin{pmatrix}
\sigma_X^2 & \sigma_{XY} & \rho_X\sigma_X^2 & \sigma_{XY} & \cdots \\
\sigma_{XY} & \sigma_Y^2 & \sigma_{XY} & \rho_Y\sigma_Y^2 & \cdots \\
\rho_X\sigma_X^2 & \sigma_{XY} & \sigma_X^2 & \sigma_{XY} & \cdots \\
\sigma_{XY} & \rho_Y\sigma_Y^2 & \sigma_{XY} & \sigma_Y^2 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{6}
\]

$C_i^{*}$ can also be written with the $n_i$ X values first, followed by the $n_i$ Y values. Note that in this unlinked version of the covariance matrix, the difference between $C_i$ and $C_i^{*}$ is that there are no terms in $C_i^{*}$ involving $\delta$ and, hence, the blocks on the off-diagonal are now constant.

MODEL FITTING IN SAS
Models for both the linked and unlinked settings can be easily fit using PROC MIXED in SAS. To use PROC MIXED, the data must be entered in univariate form; that is, each row of data must correspond to a different measurement. A variable needs to be defined that indicates whether each line of data corresponds to an X or Y observation. This variable is called Vtype. A Replicate variable is used to keep track of the repeated measurements within subjects. Note that the Replicate variable will be nested within subjects. Appropriate SAS data format is illustrated below by Example 1, for the data in Table 1, where pH is chosen as Vtype 1 and PaCO2 is chosen as Vtype 2. Response is the value of Vtype 1 or Vtype 2 and Persnum is the subject number. It is of no significance that pH is chosen as Vtype 1 and PaCO2 as Vtype 2, because the coding scheme is arbitrary.

Example 1

    input Persnum Vtype Response Replicate;
    cards;
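Before turning to the PROC MIXED specification, the covariance structures in equations (3) and (6) can be checked numerically. The following is a minimal Python sketch (Python is used here only as a quick check; the analyses in this paper use SAS). It builds the subject-level covariance matrix in the pair ordering $(X_{i1}, Y_{i1}, \ldots, X_{in_i}, Y_{in_i})$. The function name `pair_covariance` and all parameter values are hypothetical, chosen only for illustration.

```python
import math


def pair_covariance(n_i, var_x, var_y, rho_xy, rho_x, rho_y, delta=1.0):
    """Covariance matrix of (X_i1, Y_i1, ..., X_in, Y_in) for one subject.

    delta < 1 corresponds to the linked model; delta = 1 (the default)
    gives the unlinked structure, whose X-Y covariances are all equal.
    """
    cov_xy = rho_xy * math.sqrt(var_x * var_y)  # sigma_XY
    # 2 x 2 block for X and Y taken in the same replicate (Sigma).
    same = [[var_x, cov_xy], [cov_xy, var_y]]
    # 2 x 2 block for X and Y taken in different replicates.
    diff = [[rho_x * var_x, delta * cov_xy],
            [delta * cov_xy, rho_y * var_y]]
    size = 2 * n_i
    cov = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            # Rows/columns 2k and 2k+1 belong to replicate k.
            block = same if row // 2 == col // 2 else diff
            cov[row][col] = block[row % 2][col % 2]
    return cov


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    linked = pair_covariance(2, 4.0, 9.0, 0.5, 0.6, 0.7, delta=0.8)
    unlinked = pair_covariance(2, 4.0, 9.0, 0.5, 0.6, 0.7)
    for name, matrix in (("linked", linked), ("unlinked", unlinked)):
        print(name)
        for row in matrix:
            print("  ".join(f"{value:6.2f}" for value in row))
```

Printing both matrices side by side makes the one difference visible: in the unlinked case every X-Y entry equals the same covariance, whereas in the linked case the X-Y entries for different replicates are shrunk by the factor delta.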
The appropriate formulation of the PROC MIXED code, however, is not immediately obvious, because of the relative complexity of the covariance matrices $C_i$ and $C_i^{*}$. As described in the SAS documentation, PROC MIXED allows the fitting of regression models where the covariance of the response involves the sum of two components: a matrix G involving the random effects in the model and specified through the random command, and a matrix R corresponding to the error term in the model and specified through the repeated command. While most familiar mixed models use either the random or the repeated command, the models described in the previous sections require the use of both.

We begin with the linked case. To see how the SAS code should be written, it is useful to note that for each subject, the covariance matrix $C_i$ can be written as the sum of two matrices: one a matrix of constants whose values depend on whether the corresponding pair is two X's, two Y's, or an X, Y pair, and the other a block diagonal matrix, with blocks corresponding to X, Y pairs measured at the same time. Hence, $C_i$ can be written as

\[
C_i = \begin{pmatrix}
\rho_X\sigma_X^2 & \delta\sigma_{XY} & \rho_X\sigma_X^2 & \delta\sigma_{XY} & \cdots \\
\delta\sigma_{XY} & \rho_Y\sigma_Y^2 & \delta\sigma_{XY} & \rho_Y\sigma_Y^2 & \cdots \\
\rho_X\sigma_X^2 & \delta\sigma_{XY} & \rho_X\sigma_X^2 & \delta\sigma_{XY} & \cdots \\
\delta\sigma_{XY} & \rho_Y\sigma_Y^2 & \delta\sigma_{XY} & \rho_Y\sigma_Y^2 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
+ \begin{pmatrix}
\tilde\sigma_X^2 & \tilde\sigma_{XY} & 0 & 0 & \cdots \\
\tilde\sigma_{XY} & \tilde\sigma_Y^2 & 0 & 0 & \cdots \\
0 & 0 & \tilde\sigma_X^2 & \tilde\sigma_{XY} & \cdots \\
0 & 0 & \tilde\sigma_{XY} & \tilde\sigma_Y^2 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{7}
\]

where $\tilde\sigma_X^2 = \sigma_X^2(1 - \rho_X)$, $\tilde\sigma_Y^2 = \sigma_Y^2(1 - \rho_Y)$, and $\tilde\sigma_{XY} = \sigma_{XY}(1 - \delta)$. These two matrices can be set up through judicious use of the random and repeated statements in PROC MIXED. Consider the first (left-hand) matrix in the sum. Careful scrutiny indicates that the matrix can be constructed by assigning X- and Y-specific random effects to individual i, and allowing these random effects to be correlated. This can be achieved by declaring the variable Vtype (i.e., the indicator of whether a particular observation is an X or a Y) to be random across individual subjects. Covariance between the X- and Y-specific random effects can be achieved by specifying an unstructured covariance matrix. Now consider the second (right-hand) matrix in the sum. This structure is relatively straightforward and can be achieved by declaring the variable Vtype to be repeated within each individual-specific replicate (i.e., declaring the subject to be replicate nested within individual) and using an unstructured covariance. In the case of linked repeated measurements, the SAS code to obtain the parameter estimates is given by

    data dataname;
      input persnum vtype response replicate;
      datalines;
    ;
    proc mixed;
      class persnum vtype replicate;
      model response = vtype / solution ddfm=kr;
      /* G matrix: correlated X- and Y-specific random effects per subject */
      random vtype / type=un subject=persnum g gcorr v vcorr;
      /* R matrix: X, Y pair measured in the same replicate */
      repeated vtype / type=un subject=replicate(persnum) r rcorr;
    run;

where Persnum corresponds to subject number; Vtype refers to the two variables, which are coded as 1 and 2; Response corresponds to the values of the two variables; and Replicate indexes the repeated measurements for each subject, the number of which need not be the same across subjects. The CLASS statement specifies Persnum, Vtype, and Replicate as classification (categorical) effects, and the MODEL statement specifies the mean (regression) model for the data. SOLUTION requests that the estimates of the fixed effects (specified on the right side of the equal sign in the MODEL statement, before /) be printed, and DDFM=KR specifies the Kenward-Roger⁸ method for computing the denominator degrees of freedom for the fixed effects. Note that while this latter option is not necessary, it tends to yield more reliable results in general (see the SAS manual² for more details).

As indicated earlier, the RANDOM and REPEATED statements are used to set up the structure of the G and R matrices. Declaring SUBJECT=Persnum after the specification of Vtype as random instructs PROC MIXED to make the N × N variance-covariance matrix for the entire data vector block diagonal, with blocks corresponding to subjects. The size of the blocks depends on the number of measurements each subject has. These subject blocks are in themselves block diagonal of size 2 × 2 with structure specified by the TYPE option. For example, from
the data in Table 1, the first person has a total of eight measurements; hence, the size of the block for the first person is 8 × 8, while the third person has 16 measurements and, thus, the size of the block for the third person is 16 × 16. TYPE=UN specifies a general variance-covariance matrix and makes the subject-specific X and Y random effects correlated. On the REPEATED statement line, SUBJECT=Replicate(Persnum) instructs PROC MIXED to make the N × N variance-covariance matrix for the data vector a diagonal matrix of blocks. Each of these blocks has the structure specified by the TYPE option. In this case, TYPE=UN specifies a general variance-covariance matrix.

G and GCORR request that the estimated random effect variance-covariance and correlation be printed, respectively. V and VCORR request that the estimated response variance-covariance and correlation be printed, respectively. R and RCORR request that the variance-covariance and correlation between the within-subject replicate X, Y pairs be printed, respectively. The V matrix is a combination of the G and R matrices. By default, for R, RCORR, V, and VCORR, the first block, determined by the SUBJECT effect, is printed. However, the default can be changed by specifying a specific value for R, RCORR, V, and VCORR (see the SAS manual²).

In the PROC MIXED statement, a METHOD option can be given to specify the method of estimation for the covariance parameters. If no METHOD option is given in the PROC MIXED statement, the covariance parameters are estimated using restricted maximum likelihood (REML) estimation, the default option. Similarly, in the MODEL statement, the method of computation for the denominator degrees of freedom can be specified by using the DDFM option. If no DDFM option is given in the MODEL statement, for the SAS code given here, the CONTAINMENT option is used. For further details on the METHOD option and DDFM option, see the SAS manual.²

For the unlinked case, the code is the same as that described previously, except that the repeated statement is replaced by

    repeated vtype / type=un(1) subject=replicate(persnum) r rcorr;

where TYPE=UN(1) specifies a variance-covariance matrix whose off-diagonal element is zero. Equivalently, one can use the following code for the repeated statement:

    repeated / group=vtype r rcorr;

where GROUP=vtype specifies heterogeneity of variances between observations with vtype 1 and vtype 2 (i.e., for X and Y).

EXAMPLES

Linked Data
Table 1, provided by Bland and Altman³ and reproduced in Lam et al.,⁶ shows linked repeated measurements of intramural pH and PaCO2 for eight subjects. Table 3 gives the simple Pearson correlation, the simple Pearson correlation based on subject means, the weighted correlation (Bland and Altman⁴), and the 95% bootstrapped confidence interval (CI) for this data set. It is important to note here that bootstrapping was accomplished by resampling individuals, thus maintaining the appropriate correlation structure of the data. Inspection of Table 3 reveals that these correlation measures are of different magnitudes and signs. Thus, one is faced with the dilemma of choosing one of these measures as the appropriate measure of the true correlation. The values presented here differ from those of Bland and Altman⁴ because of rounding. Of the three correlation measures, the naive Pearson correlation measure has the shortest interval.

Table 3. Simple correlations between pH (X) and PaCO2 (Y) and 95% bootstrap confidence interval for the data in Table 1. Columns: Correlation, Value, 95% CI. Rows: Naive Pearson correlation; Pearson correlation based on means; Weighted correlation ( , 0813).

Table 4. Parameter estimates from Lam et al.⁶ for the pH (X)-PaCO2 (Y) data in Table 1 with 95% bootstrap confidence interval for $\rho_{XY}$. Columns: Parameter, Estimate, 95% CI. Rows: $\mu_X$; $\mu_Y$ (5008); $\sigma_X^2$; $\sigma_Y^2$; $\rho_X$; $\rho_Y$ (0654); $\rho_{XY}$.

Lam et al.⁶ obtained parameter estimates (Table 4) for the data, using an ML estimation program. These results can be reproduced using the SAS code, by specifying METHOD=ML in the PROC MIXED statement. The main difference between ML and REML (the default option) is that ML estimates of the covariance parameters are biased because they ignore the degrees of freedom used in estimating the fixed effects, whereas REML corrects for this. For comparison with the naive estimates reported in Table 3, we provide bootstrap confidence intervals for the correlation parameter estimates obtained using SAS's PROC MIXED. Selected
portions of the SAS output are given in Tables 5-8, where labels have been added for clarity. Table 5 gives the results obtained from the SAS code for the R and G matrices. From Table 5, [ ] and 0547 are the estimated variances ($\hat\sigma_{XR}^2$; $\hat\sigma_{YR}^2$) of X and Y, respectively, obtained from the R matrix. Similarly, [ ] and 045 are the estimated variances ($\hat\sigma_{XG}^2$; $\hat\sigma_{YG}^2$) of X and Y, respectively, obtained from the G matrix. The respective covariances from the R and G matrices are [ ] and 005. The correlations derived from the R ($\rho_R$) and G ($\rho_G$) matrices are 0509 and 01416, respectively.

The results in Table 4 are obtained from Tables 6, 7, and 8. Table 6 gives the results obtained from the SAS code for the V matrix and Table 7 gives the corresponding correlations associated with the V matrix. From Table 6, [ ] and [ ] are the overall estimated variances ($\hat\sigma_X^2$; $\hat\sigma_Y^2$) of X and Y, respectively. Note that $\hat\sigma_X^2 = \hat\sigma_{XR}^2 + \hat\sigma_{XG}^2$ and $\hat\sigma_Y^2 = \hat\sigma_{YR}^2 + \hat\sigma_{YG}^2$. Note also that the elements in G appear in V. From Table 7, the estimated correlation between X and Y ($\hat\rho_{XY}$) is [ ]. For $j \neq j'$, the estimated correlation between $X_{ij}$ and $X_{ij'}$ ($\hat\rho_X$) is [ ] and the estimated correlation between $Y_{ij}$ and $Y_{ij'}$ ($\hat\rho_Y$) is 0654. The estimated correlation between $X_{ij}$ and $Y_{ij'}$ ($\hat\delta\hat\rho_{XY}$) is 0104 and, thus, the estimate of $\delta$ is [ ] (0104/000995).

The means $\mu_X$ and $\mu_Y$ are obtained from Table 8. In SAS, when the variables are categorical, the highest value is taken as the point of reference, which in this case is the variable labeled as 2 [PaCO2 (Y)]. The estimate for $\mu_X$ is 71151, which is the sum of the estimated values for the intercept and pH (X). The estimate for $\mu_Y$ is 5008, the intercept value.

Table 5. Estimated R and G matrices obtained for the data in Table 1 using SAS PROC MIXED procedure. Rows: pH, PaCO2. Columns: R Matrix (pH, PaCO2), G Matrix (pH, PaCO2).

Table 6. Estimated variance-covariance matrix for the pH (X)-PaCO2 (Y) data in Table 1, for PATI 1. Rows and columns: X1, Y1, X2, Y2, X3, Y3, X4, Y4.

Table 7. Estimated correlation matrix between pH (X) and PaCO2 (Y) data in Table 1, for PATI 1. Rows and columns: X1, Y1, X2, Y2, X3, Y3, X4, Y4.

Unlinked Data
The data in Table 2 are from an environmental study that focused on measuring the benzene concentration (in μg/m³) in the air inside and outside the homes of several Mexican families. The data are entered into SAS similarly as was done for Example 1. Table 9 gives the simple correlation coefficients along with the 95% bootstrapped confidence intervals. Of the two correlation measures, the naive Pearson correlation measure has the shorter confidence interval. Inspection of Table 9 indicates that it is much more difficult to choose an appropriate measure of the true correlation in this setting because the number of observations is not the same in the two cases. In addition, not all of the data (95 observations) are used in computing these correlations. The reason for the discrepancy in the number of observations is that, for some subjects, there are measurements missing. Consequently, because of missing measurements, a weighted correlation would be difficult to compute. These problems do not occur if the SAS code is used to obtain the correlation.

Table 10 gives the results obtained from the SAS code for the R and G matrices. From Table 10, 8 and [ ] are the estimated variances ($\hat\sigma_{XR}^2$; $\hat\sigma_{YR}^2$) of X and Y, respectively, obtained from the R matrix. Similarly, 1196 and [ ] are the estimated variances ($\hat\sigma_{XG}^2$; $\hat\sigma_{YG}^2$) of X and Y, respectively, obtained from the G matrix. The covariance from the G matrix is [ ] and the correlation ($\rho_G$) is 0655. Note here that for the R matrix there is no covariance and, hence, no correlation,
because of the TYPE=UN(1) specified in the REPEATED statement. The results in Tables 11, 12, and 13 are used to obtain the results in Table 14. Table 11 gives the results obtained from the SAS code for the V matrix and Table 12 gives the corresponding correlations associated with the V matrix. From Table 11, [ ] and 4493 are the overall estimated variances ($\hat\sigma_X^2$; $\hat\sigma_Y^2$) of X and Y, respectively. Note that $\hat\sigma_X^2 = \hat\sigma_{XR}^2 + \hat\sigma_{XG}^2$ and $\hat\sigma_Y^2 = \hat\sigma_{YR}^2 + \hat\sigma_{YG}^2$. Note also that the elements in G appear in V. From Table 12, the estimated correlation between X and Y ($\hat\rho_{XY}$) is [ ]. For $j \neq j'$, the estimated correlation between $X_{ij}$ and $X_{ij'}$ ($\hat\rho_X$) is 088 and the estimated correlation between $Y_{ij}$ and $Y_{ij'}$ ($\hat\rho_Y$) is [ ]. From Table 13, $\hat\mu_X$ = [ ] ( ) and $\hat\mu_Y$ = [ ]. Table 14 also provides a bootstrap confidence interval for the parameter of interest, $\rho_{XY}$.

Table 8. Regression results for the pH (X)-PaCO2 (Y) data in Table 1. Columns: Effect, Estimate, SE, DF, t Value, Pr > |t|. Rows: Intercept; pH; PaCO2 (estimate 0).

Table 9. Simple correlations between indoor (X) and outdoor (Y) air, and 95% bootstrap confidence interval for the benzene data in Table 2. Columns: Correlation, # of Obs, Value, 95% CI. Rows: Naive Pearson correlation; Pearson correlation based on means.

Table 10. Estimated R and G matrices obtained for the data in Table 2 using SAS PROC MIXED procedure. Rows: Indoor, Outdoor. Columns: R Matrix (Indoor, Outdoor), G Matrix (Indoor, Outdoor).

Table 11. Estimated variance-covariance matrix for the indoor (X)-outdoor (Y) benzene data in Table 2 for family 6. Rows and columns: X1, Y1, X2, Y2.

Table 12. Estimated correlation matrix between indoor (X) and outdoor (Y) air for the benzene data in Table 2 for family 6. Rows and columns: X1, Y1, X2, Y2.

Table 13. Regression results for the indoor (X)-outdoor (Y) benzene data in Table 2. Columns: Effect, Estimate, SE, DF, t Value, Pr > |t|. Rows: Intercept; Indoor; Outdoor (estimate 0).

Table 14. Parameter estimates for the indoor (X)-outdoor (Y) benzene data in Table 2 with 95% bootstrap confidence interval for $\rho_{XY}$. Columns: Parameter, Estimate, 95% CI. Rows: $\mu_X$; $\mu_Y$; $\sigma_X^2$; $\sigma_Y^2$; $\rho_X$ (088); $\rho_Y$; $\rho_{XY}$.

DISCUSSION
In this paper, we investigated methods to assess the correlation between two variates, X and Y, in the presence of repeated measures or replicates. Both linked and unlinked settings were considered, in both cases under the assumption that the two variates follow a multivariate normal distribution. Ad hoc approaches as well as PROC MIXED in SAS were used to estimate the correlation for two examples. Of the ad hoc approaches, the bootstrapped confidence interval was shortest for the naive Pearson approach. Bootstrapped confidence intervals for the mixed-model formulation were approximately equal to or shorter than the bootstrapped confidence intervals for the ad hoc approaches. This is consistent with the mixed-model approach using the data in a more efficient manner. The mixed-model formulation overcomes some of the inherent problems with the ad hoc approaches and is very easy to apply using PROC MIXED in SAS.

Although not of direct relevance to the topic of this paper, our data examples revealed some interesting features in relation to the effects of outliers. For both the ad hoc approaches and the SAS PROC MIXED approach, the estimates were sensitive to the exclusion of an extreme
observation. For example, in the benzene analysis, the correlation coefficient was reduced by 69% when an influential observation was removed. This finding suggests that users should check that the results are not driven by extreme values before interpreting them. If the data are skewed, thereby violating the normality assumption, a transformation, such as the log, might be appropriate before applying the SAS PROC MIXED procedure. On the other hand, one can compute Spearman's correlation⁹ in the simple case (no repeats). However, it is not clear how one would generalize our method to compute a Spearman correlation in the presence of replication.

We have treated X and Y as being distinct variables, each having its own mean and variance. However, in many instances, one may not be able to distinguish between X and Y. For example, consider two different devices used to measure the lung capacity of a subject. In this situation, one is more interested in the agreement of measurement of the two devices. A measure of this agreement is the concordance correlation.¹,⁷ On the other hand, interest may focus on the degree to which a single measure of an event describes the mean of repeated measurements of that event. In this case, an intraclass correlation¹ can be computed. For both of the data sets presented here, intraclass correlations can be computed. Finally, one can compute a correlation for each subject and then use the subject correlations in the computation of an overall correlation.⁷ This procedure would work well if there were several repeated measurements per subject per variable.

ACKNOWLEDGMENTS
This work was supported by NIH grants ES0000, ES0714, and ES05947.

REFERENCES
1. Rosner, B. Fundamentals of Biostatistics, 5th ed.; Duxbury: Pacific Grove, CA, 2000.
2. SAS Institute Inc. SAS/STAT User's Guide: Version 8, Volume [ ]; SAS Institute, Inc.: Cary, NC.
3. Bland, J.M.; Altman, D.G. Calculating Correlation Coefficients with Repeated Observations: Part 1. Correlation within Subjects; Brit. Med. J. 1995, 310.
4. Bland, J.M.; Altman, D.G. Calculating Correlation Coefficients with Repeated Observations: Part 2. Correlation between Subjects; Brit. Med. J. 1995, 310.
5. Bland, J.M.; Altman, D.G. Correlation, Regression and Repeated Data; Brit. Med. J. 1994, 308.
6. Lam, M.; Webb, C.A.; O'Donnell, D.E. Correlation between Two Variables in Repeated Measures. In American Statistical Association, Proceedings of the Biometric Section; American Statistical Association: Alexandria, VA, 1999.
7. Chinchilli, V.M.; Martel, J.K.; Kumanyika, S.; Lloyd, T. A Weighted Concordance Correlation Coefficient for Repeated Measures Designs; Biometrics 1996, 52.
8. Kenward, M.G.; Roger, J.H. Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood; Biometrics 1997, 53.
9. Zar, J.H. Biostatistical Analysis, 4th ed.; Prentice Hall: Upper Saddle River, NJ, 1999.

About the Authors
Anthony Hamlett is a research fellow and Louise Ryan is a professor of biostatistics in the Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115. Paulina Serrano-Trespalacios is a doctoral student in the Department of Environmental Science, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115. Russ Wolfinger is the director of genomics at SAS Institute Inc., SAS Campus Drive, Cary, NC.