Analysis of 24-Hour Ambulatory Blood Pressure Monitoring Data using Orthonormal Polynomials in the Linear Mixed Model

Lloyd J. Edwards (1,*) and Sean L. Simpson (2)
(1) Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
(2) Department of Biostatistical Sciences, Wake Forest University School of Medicine, Winston-Salem, NC
(*) edwards@bios.unc.edu

SUMMARY. The use of 24-hour ambulatory blood pressure monitoring (ABPM) in clinical practice and observational epidemiological studies has grown considerably in the past 25 years. ABPM is a very effective technique for assessing biological, environmental, and drug effects on blood pressure. In order to enhance the effectiveness of ABPM for clinical and observational studies via graphical and analytical results, developing a unified data analysis approach is paramount. The linear mixed model for the analysis of longitudinal data is particularly well-suited for the estimation of, inference about, and interpretation of both population and subject-specific profiles for ABPM data. Subject-specific profiles are of great importance in ABPM studies, especially in clinical practice, but little emphasis has been placed on this dimension of the problem. We propose using a linear mixed model with orthonormal polynomials across time in both the fixed and random effects to analyze ABPM data. Our method provides a powerful approach to the analysis of these data. The results can be used as the basis for standardizing analytical approaches and model-based graphical presentations of ABPM data. We demonstrate the proposed technique using data from the Dietary Approaches to Stop Hypertension (DASH) study, a multicenter, randomized, parallel arm feeding study that tested the effects of dietary patterns on blood pressure. We contrast our approach with the restricted cubic spline linear mixed model proposed by Lambert et al. (2001) and demonstrate a better model fit, better predictions for individual subject profiles, and improved computational stability.

KEY WORDS: Cubic Spline, DASH Study, Graphical Display, Hypertension, Longitudinal, Model Selection, Orthogonal Polynomials

1. Introduction - Motivation and Background

Ambulatory blood pressure monitoring (ABPM) is a powerful research tool for examining blood pressure (BP) variability and the physiologic and environmental factors that affect BP (Stason and Appel 1993). The evidence that ABPM gives information over and above conventional blood pressure measurement (CBPM) has been growing steadily over the past 25 years and 24-hour ABPM has seen an increasing role in clinical practice (O'Brien 2003, 2007, 2008). ABPM is a non-invasive technique in which a standard cuff, attached to a lightweight, portable data recording unit, is placed around the upper arm and inflated at regular preset intervals during a 24-hour time period. Traditionally, clinicians have used office blood pressure measurements as the preferred method of monitoring blood pressure and consequently the method of diagnosing hypertension. Unfortunately, blood pressure measurements taken in a physician's office can often give rise to what is termed "white-coat hypertension", that is, an artificially high blood pressure reading (Owens et al. 1999). ABPM provides a profile of blood pressure away from the medical environment, thereby allowing identification of individuals with a white-coat response (O'Brien 2003). It also provides several other important advantages discussed by O'Brien (2003): ABPM can demonstrate the efficacy of antihypertensive medication over a 24-hour period rather than making a decision based on one or a few CBPMs confined to a short period of the diurnal cycle; ABPM can identify patients whose blood pressure does not reduce at night-time (the non-dippers) who are probably at high risk for a number of conditions; and the technique can demonstrate a number of patterns of blood pressure behavior that may be relevant to clinical management (isolated systolic hypertension, hypotension, dipping and non-dipping, etc.).
As noted by O'Brien (2003), recent longitudinal studies have shown that ABPM is a much stronger predictor of cardiovascular morbidity and mortality than CBPM. The importance of the technique is further evidenced by the fact that the Centers for Medicare and Medicaid Services (CMS) in the US have approved ABPM for reimbursement. O'Brien (2003) concluded that

ABPM should be available to all hypertensive patients given its international acceptance as an indispensable tool for patients with established and suspected hypertension. This recommendation has important implications for clinical practice. Among the questions that O'Brien (2003) posed were: 1. how should the data be presented?; 2. how best can doctors and nurses unfamiliar with the technique be educated in its use and interpretation of the data? Lambert et al. (2001) noted that there has been relatively little work done on the longitudinal analysis of 24-hour ABPM data. Though some authors have attempted to address the issue, there is not a generally accepted 'standard' method of analyzing 24-hour ABPM. Some analyses of ABPM data have used means and/or medians, either obtaining means over the entire 24-hour period or obtaining means over defined intervals of the 24-hour period, for example daytime and nighttime intervals (Weber et al. 1988, Ferguson et al. 1992, Appel et al. 1997). Other analyses have stressed the use of traditional simple or extended cosinor models (Halberg et al. 1984, Marler et al. 1988, Gaffney et al. 1993). Some authors have proposed smoothing splines or other smoothing techniques for the analysis of 24-hour ABPM data (Streitberg et al. 1989, Streitberg et al. 1990, Dickson et al. 1992). Jaccard and Wan (1993) proposed using cross-sectional pooled time series designs for the analysis of 24-hour ABPM data. Schwartz et al. (1994) proposed a very limited variance-components model which is equivalent to a mixed model using very simple explanatory factors (time is not explicitly used as a factor). Selwyn and DiFranco (1993) demonstrated the use of the general linear mixed model as an approach to analyzing 24-hour ABPM data. Few studies have directly compared alternative approaches and most such comparisons used small sample sizes (Selwyn and DiFranco 1993, Marler et al. 1988, Gaffney et al. 1993, Streitberg et al. 1989).
Streitberg and Mayer-Sabellek (1993) made comparisons between the spline model and the traditional extended cosinor model. Selwyn and DiFranco (1993) made comparisons between a traditional extended cosinor model and a mixed model that was a combination of an extended cosinor model and fourth-degree natural polynomials (the simple cosinor and spline

models were eliminated from consideration based on placebo period results). Streitberg and Mayer-Sabellek (1993) used 17 subjects and Selwyn and DiFranco (1993) used 35 subjects. Lambert et al. (2001) used 206 subjects in fitting a restricted cubic spline model. We use 357 subjects from the DASH study for our example. The development of robust and flexible statistical methods for analyzing 24-hour ABPM data will be important in defining a clinical role for 24-hour ABPM. Therefore, we are motivated to develop appropriate statistical models for analyzing 24-hour ABPM data which will overcome the limitations of conventional analyses and can lead to a unifying procedure for the way 24-hour ABPM data are analyzed and graphically presented. The general linear mixed model (Harville 1976, Laird and Ware 1982) has become a very powerful statistical tool in the analysis of longitudinal data with continuous outcomes in both clinical and non-clinical studies (Edwards 2000, Gurka and Edwards 2008, Cheng et al. 2010). The linear mixed model for the analysis of longitudinal data is particularly well-suited for the estimation of, inference about, and interpretation of both population and subject-specific profiles for ABPM data. Subject-specific profiles are of great importance in ABPM studies, especially in clinical practice, but little emphasis has been placed on this dimension of the problem. We propose using a linear mixed model with orthonormal polynomials in both the fixed and random effects as the basis for standardizing analytical approaches and graphical presentations that include predicted subject-specific profiles for ABPM data. We demonstrate the proposed technique using data from the Dietary Approaches to Stop Hypertension (DASH) study, a multicenter, randomized, parallel arm feeding study that tested the effects of dietary patterns on blood pressure.
We contrast our proposed technique with the restricted cubic spline linear mixed model proposed by Lambert et al (2001) and demonstrate a better model fit, better predictions for subject-specific profiles, and improved computational stability. Following the more detailed guidelines discussed by O'Brien (2008), we provide methods for developing model-based graphical presentations of 24-hour ABPM based on linear mixed model

results that can be used to help make 24-hour ABPM a much more common tool for diagnosing and treating cardiovascular disease. The model-based graphical presentations include displays of the 90% prediction interval for normals and user-selected subject-specific profiles. The 90% prediction interval corresponds to the outcomes-based mean ± 2 standard deviations for each time point. In Section 2, a description of the DASH trial is given. In Section 3, the basic linear mixed model formulation and the details for the orthonormal polynomial and restricted cubic spline statistical methods are presented. We discuss the results of applying the models to the DASH data in Section 4. Section 5 discusses computational issues arising from the applications. Section 6 discusses the new graphical displays created, and a summary and conclusions are given in Section 7.

2. Description of DASH Clinical Trial

The Dietary Approaches to Stop Hypertension (DASH) trial was a multicenter, randomized, parallel arm feeding study that tested the effects of dietary patterns on blood pressure (Appel et al. 1997). The three diets were a control diet (low in fruits, vegetables, and dairy products, with a fat content typical of the average diet in the United States), a diet rich in fruits and vegetables (a diet similar to the control except it provided more fruits and vegetables and fewer snacks and sweets), and a combination diet rich in fruits, vegetables, and low-fat dairy foods and reduced in saturated fat, total fat, and cholesterol. The combination diet will be subsequently referred to as the DASH diet. Participants were healthy, community-dwelling adults 22 years of age or older who were not taking antihypertensive medication. Each subject had an average systolic blood pressure of less than 160 mm Hg and a diastolic blood pressure of 80 to 95 mm Hg (mean of six measurements across three screening visits).
Study subjects were enrolled sequentially in groups; the first group began the run-in phase of the trial in September 1994, and the fifth and last group started in January 1996.

For each group, data were collected during three phases (screening, run-in, and intervention). Run-in was a three-week period in which all participants were fed the control diet. Toward the end of run-in, 24-hour ambulatory blood pressure monitoring was obtained once. This constituted the "baseline" ABPM reading. During the third week, participants were randomized to one of three diets. Intervention was an eight-week period in which participants were fed their assigned diets. During the last two weeks, one 24-hour ambulatory blood pressure monitoring was obtained (end-of-intervention ABPM). ABPM was attempted on the 362 participants enrolled in groups 2-5. We use 357 subjects for our example. For each subject 24 measurements were constructed at hourly intervals.

3. The Linear Mixed Model

As discussed in Section 1.2, several statistical methods have been proposed for the analysis of 24-hour ABPM data. In this paper, two linear mixed model methods will be used for comparison: the linear mixed model with orthonormal polynomials and the restricted cubic spline linear mixed model. With $N$ independent sampling units (often persons in practice), the linear mixed model for person $i$ may be written (Muller and Stewart 2006 notation, ch. 5)

$$y_i = X_i \beta + Z_i d_i + e_i. \tag{1}$$

Here, $y_i$ is a $p_i \times 1$ vector of observations on person $i$; $X_i$ is a $p_i \times q$ known, constant design matrix for person $i$, with full column rank $q$; while $\beta$ is a $q \times 1$ vector of unknown, constant, population parameters. Also $Z_i$ is a $p_i \times m$ known, constant design matrix with rank $m$ for person $i$ corresponding to the $m \times 1$ vector of unknown random effects, $d_i$, while $e_i$ is a $p_i \times 1$ vector of unknown random errors. Gaussian $d_i$ and $e_i$ are independent with mean $0$ and

$$\mathcal{V}\left(\begin{bmatrix} d_i \\ e_i \end{bmatrix}\right) = \begin{bmatrix} \Sigma_{di}(\tau_d) & 0 \\ 0 & \Sigma_{ei}(\tau_e) \end{bmatrix}. \tag{2}$$

Here $\mathcal{V}(\cdot)$ is the covariance operator, while both $\Sigma_{di}(\tau_d)$ and $\Sigma_{ei}(\tau_e)$ are positive-definite, symmetric covariance matrices. Therefore $\mathcal{V}(y_i)$ may be written $\Sigma_i = Z_i \Sigma_{di}(\tau_d) Z_i' + \Sigma_{ei}(\tau_e)$.
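As a minimal numerical sketch of the marginal covariance implied by models (1) and (2), $\Sigma_i = Z_i \Sigma_{di}(\tau_d) Z_i' + \Sigma_{ei}(\tau_e)$, the following Python/NumPy fragment builds it for one sampling unit. The function name, the four observation times, and all variance values are hypothetical and chosen only for illustration; they are not taken from the DASH analysis.

```python
import numpy as np

def marginal_covariance(Z, Sigma_d, Sigma_e):
    """Return Sigma_i = Z Sigma_d Z' + Sigma_e for one sampling unit."""
    Z = np.asarray(Z, dtype=float)
    return Z @ Sigma_d @ Z.T + Sigma_e

# Hypothetical unit: 4 observations, random intercept and slope (m = 2).
times = np.array([1.0, 2.0, 3.0, 4.0])
Z = np.column_stack([np.ones_like(times), times])   # p_i x m design matrix
Sigma_d = np.array([[4.0, 0.5],
                    [0.5, 1.0]])                     # random effects covariance
Sigma_e = 2.0 * np.eye(len(times))                   # sigma^2 * I error covariance

Sigma_i = marginal_covariance(Z, Sigma_d, Sigma_e)
print(Sigma_i.shape)
```

Because both component matrices are positive definite here, the resulting $\Sigma_i$ is symmetric and positive definite, which is what estimation of the model requires.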

We assume that $\Sigma_i$ can be characterized by a finite set of parameters represented by an $r \times 1$ vector $\tau$ which consists of the unique parameters in $\tau_d$ and $\tau_e$. Throughout, $n = \sum_{i=1}^{N} p_i$. We may also need to refer to a stacked data formulation of model (1) given by

$$y_s = X_s \beta + Z_s d_s + e_s, \tag{3}$$

with $y_s = [y_1' \cdots y_N']'$, $X_s = [X_1' \cdots X_N']'$, $Z_s = \mathrm{diag}(Z_1, \ldots, Z_N)$, $d_s = [d_1' \cdots d_N']'$, and $e_s = [e_1' \cdots e_N']'$. Here $d_s \sim \mathcal{N}_{Nm}[0, I_N \otimes \Sigma_{di}(\tau_d)]$ and $e_s \sim \mathcal{N}_n(0, \Sigma_{es})$ for $\Sigma_{es} = \mathrm{diag}[\Sigma_{e1}(\tau_e), \ldots, \Sigma_{eN}(\tau_e)]$. In turn $y_s \sim \mathcal{N}_n(X_s \beta, \Sigma_s)$ with $\Sigma_s = \mathcal{V}(y_s) = \mathrm{diag}(\Sigma_1, \ldots, \Sigma_N)$.

The advantage of reducing bias in covariance estimation has made restricted maximum likelihood (REML) estimation very popular for the linear mixed model. Given our focus on variance estimates, all parameter estimates in this paper are obtained using REML. However, the formulations also apply to computations based on maximum likelihood estimates.

3.1 Linear Mixed Model with Orthonormal Polynomials

Polynomials are commonly used to describe curved relationships in statistical models where a model must be developed empirically. The use of natural polynomials for the analysis of 24-hour ABPM recordings poses several problems. A plot of an individual's 24-hour ABPM recordings clearly shows a nonlinear curve suggesting higher order polynomials (greater than 3) are necessary (Lambert et al. 2001). For example, suppose a quartic natural polynomial model is to be used to model the 24-hour ABP recordings.
This means that the design matrix will consist of very large numbers such as $24^3 = 13{,}824$ and $24^4 = 331{,}776$. In this paper we use orthogonal polynomials up to degree 9, which would correspond to natural polynomials in the design matrix with values as high as $24^9 = 2.64 \times 10^{12}$. The existence of very large numbers in the design

matrix can cause computational problems and adversely affect statistical estimation and inference. In addition, the use of natural polynomials inherently leads to problems of multicollinearity since $x, x^2, \ldots, x^q$ are correlated. Multicollinearity in the linear mixed model can have similar bias effects (estimation, inference, and statistical power) as in univariate regression models (Stinnett 1993). Orthonormal polynomial predictors in linear regression are transformations of the natural polynomials that provide a new set of predictors that meet the following criteria (Muller and Fetterman 2002). (1) The new predictors contain the same information as the original set. (2) The new predictors are linear combinations of the original natural polynomials. (3) The new columns of predictors all have mean zero (except for the constant term). (4) The new columns of predictors are all mutually orthogonal. Combining (3) and (4) implies that the new predictors (except the constant) are mutually uncorrelated. (5) The first new column captures the information in the intercept, the second captures all of the linear term information adjusted for the intercept, the third captures quadratic information above and beyond the linear and intercept information, and so forth. With the advance in computer technology over the past two decades, generating orthonormal polynomials and using them in regression problems is relatively easy to do. For example, in SAS Proc IML, the ORPOL matrix function generates orthonormal polynomials. The main interest is in the shape of the profiles that the orthonormal polynomials produce collectively and not in any one polynomial. Some of the advantages of using orthonormal polynomials are that a) the problem of multicollinearity is greatly reduced; b) the orthonormal polynomials are "normalized", which, in effect, is a form of "scaling" what would otherwise be large values, with each orthonormal polynomial value in the interval $(-1, 1)$.
Thus, orthonormal polynomials have few problems with rounding error or very large (small) regression coefficients; and c) orthonormal polynomial coefficients can be directly compared since they are normalized to the

same units. This means that the magnitude and sign of the regression coefficients can provide a way of assessing the relative contribution of each of the polynomials. Using the linear mixed model with orthonormal polynomials, individual 24-hour ABPM profiles can be represented and parameterized, within- and between-treatment comparisons based on individual profiles can be made, and discrimination between different profiles can be obtained. The linear mixed model can account for circadian rhythms, subject effects, and treatment effects. The mixed model also accommodates missing observations and irregularly timed observations.

3.2 The Case of Complete, Balanced Data with Equally Spaced Observations

For the analysis of 24-hour ABPM data, we advocate the use of a linear mixed model with orthonormal polynomials used in both the fixed and random effects. We assume that the degree of the polynomials used in the fixed effects is equal to that used in the random effects. For demonstration purposes, we first start with a linear mixed model with only time effects and intercept, but no additional covariates. We assume that time is equally spaced in 1-hour intervals ranging from 1 to 24 and that each subject has all 24 observations (complete and balanced). We have

$$y_i = S_i \beta + U_i d_i + e_i, \tag{5}$$

where $S_i$ is a fixed effects design matrix of orthonormal polynomials of degree $a$ and $S_i = U_i$ (see Web Appendix for constructing orthonormal polynomials). Note that the same orthonormal polynomials used in the fixed effects are also used in the random effects. Here $\beta = (\beta_0, \beta_1, \ldots, \beta_a)'$, so that $\beta$ is $q \times 1$ where $q = a + 1$, and $d_i$ is $m \times 1$ where $m = a + 1$. Because the polynomials are orthogonal, we can assume $\Sigma_{di}(\tau_d)$ is diagonal with heterogeneous diagonal elements. This greatly simplifies the complexity of the random effects covariance. The number of parameters in $\Sigma_{di}(\tau_d)$ is $m$, the number of diagonal elements. The maximum degree orthonormal polynomial that we consider for the DASH data is a 9th degree.
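The orthonormal polynomial design matrix described in Section 3.2 can be sketched numerically. The Python/NumPy fragment below is a minimal illustration, assuming 24 equally spaced hourly times and using a QR factorization of the natural polynomial (Vandermonde) matrix as the orthogonalization step. It is not the SAS ORPOL routine or the Web Appendix procedure, and the function name `orthonormal_poly_basis` is hypothetical.

```python
import numpy as np

def orthonormal_poly_basis(times, degree):
    """Orthonormal polynomial columns of degree 0..degree evaluated at `times`."""
    t = np.asarray(times, dtype=float)
    # Center and scale time first so the powers stay numerically well behaved.
    t = (t - t.mean()) / (t.max() - t.min())
    natural = np.vander(t, degree + 1, increasing=True)   # columns 1, t, t^2, ...
    # QR factorization orthonormalizes the columns in order (Gram-Schmidt equivalent).
    q, r = np.linalg.qr(natural)
    # Flip signs so every column has a positive leading coefficient.
    return q * np.sign(np.diag(r))

hours = np.arange(1, 25)                  # complete, balanced hourly grid
basis = orthonormal_poly_basis(hours, degree=9)
gram = basis.T @ basis                    # should be the 10 x 10 identity
print(basis.shape, np.abs(basis).max())
```

The columns are mutually orthogonal with unit length, every column after the constant sums to zero, and all entries lie strictly between -1 and 1, in line with the properties listed in Section 3.1.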
In the 9th degree model, we estimate 10 fixed effects (intercept and 9 orthonormal polynomials), 10

random effects covariance parameters, and $\sigma^2$, since we assume $\Sigma_{ei}(\tau_e) = \sigma^2 I_{p_i}$. However, if we were using natural polynomials with an unstructured $\Sigma_{di}(\tau_d)$, then the number of unique random effects covariance parameters would be $10(10 + 1)/2 = 55$. The linear mixed model with orthonormal polynomials would have 45 fewer random effects covariance parameters to estimate.

3.3 The Case of Incomplete, Unbalanced Data

Though 24-hour ABPM is designed to be collected at regular intervals, the reality is that all too often there are missing ABPM data for each individual, which may be attributable to one of several factors including invalid ABPM readings and/or patient noncompliance (e.g., removing the monitor to shower or the patient not remaining sufficiently still while the blood pressure measurement was being taken). The degree of missing data in individuals is a complicating feature in the analysis of 24-hour ABPM data. Also, actual times at which ABPM data are sampled can be irregular due in part to the same mechanisms which may account for missing data. In such cases, approximate orthonormal polynomials can be generated using a two-step process based on the technique discussed in Emerson (1968). See the Web Appendix for constructing approximate orthogonal polynomials for the DASH data. We can then use the model given by equation (5).

3.4 The Restricted Cubic Spline Linear Mixed Model

Of particular interest for comparison in this paper is the restricted cubic spline linear mixed model proposed by Lambert et al. (2001). Regression splines are piecewise polynomials joined together at points called knots. Consider $k$ knots (in time $t$) $t_1 < t_2 < \cdots < t_k$ which define $k + 1$ intervals $[A, t_1), [t_1, t_2), \ldots, [t_k, B]$ partitioning the range of the data, $[A, B]$. For the fixed effects the spline segment in interval $j$ is a polynomial of degree $g_j$.
The piecewise nature of spline functions results in a better description of the data locally without affecting global properties of the approximation. Optional side conditions for continuity and some degree of smoothness are imposed at each knot. These constraints on the model are expressed as a set of

linear equations which the fixed effects and/or the random effects must satisfy. The resulting linearly constrained model has a reduced number of fixed-effect regression coefficients that must be estimated and/or a reduced number of covariance parameters that must be estimated for the random effects. A restricted cubic spline model can be fitted by including terms for the intercept, knots, and a further $k - 2$ variables, $x_1, x_2, \ldots, x_{k-2}$, defined by

$$x_i = (t - t_i)_+^3 - (t - t_{k-1})_+^3 \frac{t_k - t_i}{t_k - t_{k-1}} + (t - t_k)_+^3 \frac{t_{k-1} - t_i}{t_k - t_{k-1}}, \quad i = 1, \ldots, k - 2,$$

where $u_+ = u$ if $u > 0$ and $u_+ = 0$ if $u \le 0$ (Smith 1979). Because the derived covariates are highly correlated, which can lead to software convergence problems, Lambert et al. (2001) advocate transforming the derived restricted cubic spline covariates using Gram-Schmidt orthogonalization. Though knot number and location can be optimized, a fixed number of knots and locations were chosen by Lambert et al. (2001) for the restricted cubic spline model. The initial choice was 9 knots at hours (24-hour clock) 13.00, 15.00, 18.00, 21.00, 00.00, 03.00, 06.00, and [...]. In the basic model involving only time effects and intercept, Lambert et al. (2001) used restricted cubic splines in the fixed effects and natural cubic polynomials in the random effects:

$$y_{ij} = \sum_{k=0}^{8} \beta_k s_{kij} + \sum_{m=0}^{3} d_{mi} t_{ij}^m + e_{ij}. \tag{6}$$

Gaussian $d_i$ ($4 \times 1$) and $e_i$ are independent with mean $0$ and

$$\mathcal{V}\left(\begin{bmatrix} d_i \\ e_i \end{bmatrix}\right) = \begin{bmatrix} \Sigma_{di}(\tau_d) & 0 \\ 0 & \sigma^2 I_{p_i} \end{bmatrix}, \quad \Sigma_{di}(\tau_d) \text{ unstructured}. \tag{7}$$

The $\beta_k$'s are nine coefficients associated with the Gram-Schmidt transformed restricted cubic spline covariates, $s_{kij}$, $t_{ij}$ is the time of the $j$th blood pressure measurement on the $i$th subject, and $d_{mi}$ are random effects corresponding to intercept, linear, quadratic, and cubic natural polynomials. Lambert et al. (2001) discussed using 9 random effects corresponding to the 9

derived fixed effects. They observed that this would lead to $9(9 + 1)/2 = 45$ parameters in the random effects covariance matrix. Hence, Lambert et al. (2001) decided to use a cubic polynomial in the random effects as a reasonable compromise. The restricted cubic spline model uses 9 fixed effects parameters, including the intercept, $4(4 + 1)/2 = 10$ unique random effects covariance parameters, and $\sigma^2$ from assuming $\Sigma_{ei}(\tau_e) = \sigma^2 I_{p_i}$. Observe that the restricted cubic spline model here has the same number of unique random effects covariance parameters to estimate (from using a cubic model in the random effects) as the orthonormal polynomial model of degree 9. Lambert et al. (2001) applied the model to a study that had 206 subjects. As stated previously, predicted subject-specific profiles are of great importance in ABPM studies, especially in clinical practice. Lambert et al. (2001) transformed the restricted cubic splines derived for the fixed effects using Gram-Schmidt orthogonalization. However, they did not apply the Gram-Schmidt orthogonalization to the natural polynomials used in the random effects. We note several concerns regarding this treatment of random effects in their model. First, the random and fixed effects have different units since natural polynomials are used for the former but not the latter. Second, the cubic polynomials may not be flexible enough to predict the subject-specific trajectories adequately. Third, natural polynomials have very undesirable properties when used with ABPM data: collinearity and large numbers, such as $24^3 = 13{,}824$, in the design matrix. From our experience, it is very likely that the software implementing the linear mixed model, like SAS Proc Mixed, will not converge when higher order natural polynomials are used in the random effects.

4. Results - Estimation, Inference and Goodness-of-fit

4.1 Baseline

We use only baseline 24-hour ABPM to fit our models.
The linear mixed model easily accommodates additional covariates, both static variables like diet group indicators and age of subject, and time-varying covariates (in this case the polynomials). All computations were done

using SAS v9.2. Restricted maximum likelihood estimation (REML) was used for estimation and the Kenward-Roger F and adjusted denominator degrees of freedom were used for all fixed effect inference. Table 1 provides estimates, standard errors (SE), p-values, and semi-partial $R^2$ (Edwards et al. 2008) values for the 9th degree orthonormal polynomial fit to the DASH data. Table 2 provides estimates, standard errors (SE), and p-values for random effects and residual variances. From Table 1, we can see that the sizes of the absolute values of the orthonormal polynomial regression coefficients indicate that orthonormal polynomials 1-6 have the largest effect on BP, changing from being greater than 4 to less than 2. The semi-partial $R^2$ values also provide an ordering of the orthonormal polynomials that matches the absolute values of the orthonormal polynomial regression coefficients. Similar to the sizes of the absolute values of the orthonormal polynomial regression coefficients, the semi-partial $R^2$ values also suggest that a stronger association exists for orthonormal polynomials 1-6. Table 3 provides model selection criteria results for the Akaike Information Criterion (AIC, Akaike 1974), Bayesian Information Criterion (BIC, Schwarz 1978) and model $R^2$ (Edwards et al. 2008) for orthonormal polynomial models using 4th-9th degree polynomials in both the fixed and random effects and for the restricted cubic spline model proposed by Lambert et al. (2001). Morrell et al. (2009) commented that the best way to select among linear mixed-effects models based on various information criteria is still not clearly determined. The conclusion from both the AIC and BIC is that the 9th degree orthonormal polynomial model is the best model for DBP and SBP since it yields the smallest AIC and BIC values when compared to all other orthonormal polynomial models and compared to the restricted cubic spline model.
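For reference, the two information criteria compared here take the standard penalized-likelihood form. The expressions below assume the usual definitions, with $\ell$ the maximized (restricted) log-likelihood, $d$ the number of parameters counted in the penalty, and $n$ the sample size used in the BIC penalty. The exact conventions for $d$ and $n$ under REML differ between software packages and are not stated here, so they should be read as assumptions.

```latex
\mathrm{AIC} = -2\,\ell + 2d,
\qquad
\mathrm{BIC} = -2\,\ell + d \log n
```

Under both criteria a smaller value indicates the preferred model, and BIC penalizes additional parameters more heavily than AIC whenever $\log n > 2$.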
The AIC and/or BIC reveal that the restricted cubic spline model is only better than the 4th degree orthonormal polynomial model. The model $R^2$ clearly shows that there is a relatively large difference between the variability explained by the fixed effects for each of the orthonormal polynomial models compared to the restricted cubic spline model. However, the model $R^2$ values suggest a slightly

different conclusion than the AIC and BIC. They suggest that the 8th degree orthonormal polynomial model is better than the 9th degree for both DBP and SBP, though the difference is in the 3rd decimal place. The model $R^2$ also supports the notion that for all practical purposes, the 6th degree orthonormal polynomial is as good as the 7th-9th degree for DBP ($R^2$ changes from 0.84 to 0.83). For SBP, the model $R^2$ suggests that a 5th degree orthonormal polynomial model may be good enough. If the most parsimonious orthonormal polynomial model is important, then the model $R^2$ and the semi-partial $R^2$ values support using a 6th degree orthonormal polynomial model for both DBP and SBP. The restricted cubic spline regression model (Lambert et al. 2001) provided a worse fit than the orthonormal polynomial models according to the AIC, BIC, and the model $R^2$. The normality assumption for residual error and random effects was met for both the orthonormal polynomial model and the restricted cubic spline model.

4.2 Post-baseline

The linear mixed model easily accommodates additional explanatory variables. For example, a post-baseline analysis added the effects of the 3 diet groups (two indicator variables with control diet as the reference group) and the interactions of diet group and time (additional 18 variables) using the 9th degree orthonormal polynomial and restricted cubic spline models. First testing for parallelism (no interaction), we were able to conclude that the average group profiles were parallel in both models. The model was reduced by removing all interactions with time so that the model included diet effects in the intercepts and time. Both the orthonormal polynomial model and the restricted cubic spline model yielded very similar results for diet effects (see Table 4).

5. Computational Issues

Computational difficulties using SAS Proc Mixed did arise when fitting the orthonormal polynomial models.
First, for orthonormal polynomial models of degree 5 or higher, Proc Mixed could not provide the subject-specific predictions due to computer memory limitations (no model convergence problems). In order to compute the subject-specific profiles, we wrote a separate

Proc IML program that used the parameter estimates from the linear mixed model and the input data to compute the estimates of random effects using the well-known formula

$$\hat{d}_i = \hat{\Sigma}_{di} Z_i' \hat{\Sigma}_i^{-1} (y_i - X_i \hat{\beta}).$$

We then computed the subject-specific predicted values to produce the subject-specific profiles using the equation $\hat{y}_i = X_i \hat{\beta} + Z_i \hat{d}_i$. Secondly, the KR denominator degrees of freedom (ddf) for the joint null hypothesis corresponding to the model $R^2$ (numerator df > 1) was surprisingly different from the Satterthwaite approximated ddf. For example, for the 9th degree orthonormal polynomial model, the KR ddf was 1286 and 1282 for SBP and DBP, respectively. However, the corresponding Satterthwaite approximation ddf yielded 363 and 359. In other words, for balanced, (near) complete case data, the KR ddf were 3-4 times larger than the Satterthwaite approximation. We decided to use the Satterthwaite approximation for ddf in the calculation of $R^2$, as motivated by alignment concerns of the model $R^2$ and partial $R^2$ for this problem. The authors are investigating why such a discrepancy exists between the Satterthwaite and KR ddf for a model that should provide similar results. We also ran into computational problems implementing the restricted cubic spline model. Using a cubic natural polynomial model in the random effects would not converge. Hence, we had to orthogonalize the random effects natural polynomials using the Gram-Schmidt process in order to achieve convergence. We discussed in Section 3.4 the undesirable properties of the natural polynomials, particularly when using large values. We note that in the paper by Lambert et al. (2001), the small variance and covariance estimates for the random effects covariance matrix (see Table 1 in their publication) suggest that the authors may also have had difficulty in achieving convergence.

6. New Model-Based Graphical Displays for Evaluating 24-hour ABPM Data

As discussed in Section 1, O'Brien (2003) posed the important question regarding how the data should be presented. The author provided depictions of 24-hour ABPM data that contained a plot of normal bands across the 24 hours and overlaid selected individual subject profiles to

17 15 demonstrate examples of normal blood pressure and various instances of deviations from normal blood pressure (above or below normal range). O'Brien (2003) noted that as with conventional measurement, normal ranges for ABPM have been the subject of much debate over the years. The focus of the outcomes-based figures is to provide practicing physicians with a graphical approach to evaluating an individual's 24-hour BP. However, for clinical and observational studies that require group comparisons and exploring additional explanatory factors that may impact BP, the outcomes-based graphical display is not as useful. A model-based approach is required. The linear mixed model with orthonormal polynomials provides the statistical framework for producing a model-based graphical display which will be extremely useful for clinical and observational studies of 24-hour ABPM. Figures 1 and 2 and Web Figures 1 and 2 for DBP and SBP provide model-based graphical displays of 24-hour ABPM based on O'Brien (2008) outcomes-based concepts. In the figures, we used the 90% prediction interval for the 9th degree orthonormal polynomial model to provide a model-based definition of normal range. The 90% prediction interval corresponds to the outcomes-based mean 2 standard deviations for each time point. We constructed the 90% prediction interval by selecting all subjects that had normal SBP at each time point, fitting a linear mixed model with orthonormal polynomials, and then computing a 90% prediction interval using a SAS macro developed by Kunthel By (2005, see The 90% prediction intervals are the shaded regions in the figures. Predicted subject-specific profiles are of great importance in ABPM studies, especially in clinical practice. In practice, it is the subject-specific values that are used to provide diagnoses and to design treatment regimens. However, little emphasis has been placed on this dimension of the problem. 
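The subject-specific predictions referred to here rest on the random-effects computation described in Section 5. The authors did this in Proc IML; the following is only a minimal Python/NumPy sketch of the same two steps, assuming independent residuals with a common variance. The function name `predict_subject`, the argument names, and all numbers in the toy example are hypothetical and are not DASH estimates.

```python
import numpy as np


def predict_subject(y_i, X_i, Z_i, beta_hat, Sigma_d_hat, sigma2_hat):
    """Empirical BLUP of one subject's random effects and fitted profile.

    Assumption: residuals are independent with variance sigma2_hat, so the
    marginal covariance is Sigma_i = Z_i Sigma_d Z_i' + sigma2 * I.
    """
    n_i = len(y_i)
    Sigma_i = Z_i @ Sigma_d_hat @ Z_i.T + sigma2_hat * np.eye(n_i)
    resid = y_i - X_i @ beta_hat  # deviation from the population curve
    # d_hat = Sigma_d Z_i' Sigma_i^{-1} (y_i - X_i beta_hat)
    d_hat = Sigma_d_hat @ Z_i.T @ np.linalg.solve(Sigma_i, resid)
    # Subject-specific curve: population curve plus subject deviation
    y_hat = X_i @ beta_hat + Z_i @ d_hat
    return d_hat, y_hat


# Toy illustration with made-up values: intercept plus centered linear time.
t = np.linspace(0.0, 24.0, 13)
X = np.column_stack([np.ones_like(t), t - t.mean()])
Z = X.copy()  # same basis in the fixed and random effects
beta_hat = np.array([120.0, -0.2])
Sigma_d_hat = np.diag([25.0, 0.04])
sigma2_hat = 9.0
noise = np.array([1, -1, 2, -2, 0, 1, -1, 0, 2, -2, 1, -1, 0], dtype=float)
y = X @ beta_hat + 5.0 + noise  # subject sits about 5 units above the mean

d_hat, y_hat = predict_subject(y, X, Z, beta_hat, Sigma_d_hat, sigma2_hat)
```

In this toy case the estimated random intercept is pulled slightly toward zero relative to the subject's raw average deviation, which is the usual shrinkage behavior of such predictors.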
The linear mixed model with 9th degree orthonormal polynomials in both the fixed and random effects is used here as the basis for graphical presentations of DASH ABPM data. The results of the linear mixed model allow the depiction of a smoothed predicted curve for each individual. The smoothed predicted curves give researchers the ability to construct additional measures of hypertension, based on concepts such as the area under the curve, to address differences in severity of hypertension.

In Figures 1 and 2, there are 3 subject-specific curves (the same individuals for DBP and SBP), drawn as a solid black line, a heavy dashed line, and a small dashed line. The subject corresponding to the solid black line is above the upper limit of the 90% prediction interval for SBP for the entire 24-hour duration. The subject's predicted DBP is above the upper limit of the 90% prediction interval for upwards of 15 hours (most of the daytime) and then is within the 90% prediction interval during nighttime. The subject with the small dashed line is an example of a complicated scenario. The subject's predicted SBP stays mostly above the 90% prediction interval for SBP but spends 6-7 hours (10:00 am - 4:00 pm) below the upper limit of the 90% prediction interval during the daytime. For DBP, this subject's predicted curve vacillates above and below the 90% prediction interval for the entire 24-hour duration. The predicted BP curves of the subject with the heavy dashed line depict values that are primarily within the 90% prediction interval, with a couple of hours (7:00-9:00 pm) where the prediction is below the 90% prediction interval for both SBP and DBP.

For SBP and DBP, Web Figures 1 and 2 present a single subject-specific profile used in Figures 1 and 2 (small dashed line) in order to show how the predicted subject-specific profile (solid blue line) "fits" the observed data (dashed blue line). In addition, the population regression curve (solid black line) is plotted, which allows visual comparison of a selected subject and the overall population profile (all 357 subjects). The population regression profile can also be visually compared with the 90% prediction interval bounds.
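The area-under-the-curve idea mentioned above, and the hours a predicted curve spends above the upper limit, can be sketched numerically. The following is a minimal illustration in plain Python, assuming the predicted curve and the upper limit are both evaluated on a common time grid. The function names and the toy values are hypothetical, and the paper does not define a specific severity measure.

```python
def trapezoid_area(times, values):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum(
        0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    )


def excess_area_and_hours(times, predicted, upper):
    """Area of the predicted curve above the upper limit (mmHg x hours) and a
    grid-based approximation of the hours spent above that limit."""
    excess = [max(p - u, 0.0) for p, u in zip(predicted, upper)]
    above = [1.0 if p > u else 0.0 for p, u in zip(predicted, upper)]
    return trapezoid_area(times, excess), trapezoid_area(times, above)


# Toy example: hourly grid, flat upper limit of 140 mmHg (made-up values).
times = [float(h) for h in range(25)]
upper = [140.0] * 25
predicted = [150.0 if 8 <= h <= 16 else 135.0 for h in range(25)]

area, hours = excess_area_and_hours(times, predicted, upper)
```

Because the hours above the limit are approximated on the grid, a finer grid of predicted values gives a more precise figure. With a smooth model-based curve, the grid can be made as fine as needed.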
For SBP, the population regression curve borders or is only slightly below the upper limit of the 90% prediction interval, and for DBP it is below the upper limit of the 90% prediction interval for the entire duration. The population regression profiles indicate that at baseline the group of 357 subjects had possibly elevated SBP and normal but slightly elevated DBP as compared to the upper limit of the 90% prediction interval.

7. Discussion

The development of more appropriate, robust, and flexible statistical methods for analyzing 24-hour ABPM data will be important in refining its role in clinical research, clinical practice, and observational studies. We have demonstrated how the linear mixed model with orthonormal polynomials across time in both the fixed and random effects provides a powerful approach to the analysis of 24-hour ABPM data. We also have shown how the results can be used as the basis for standardizing analytical approaches and graphical presentations of ABPM data. Though there are no generally accepted 'standard' methods of analyzing 24-hour ABPM, the linear mixed model with orthonormal polynomials shows great promise in becoming a unifying statistical procedure for the analysis of 24-hour ABPM data.

In this paper, the linear mixed model with orthonormal polynomials demonstrated very good estimation, inference, computing, and goodness-of-fit properties. The linear mixed model is relatively easy to implement (given the complexity of the technique) using commercially available software, allows for straightforward testing of multiple hypotheses, and yields results that can be communicated effectively to clinicians using both graphical and tabular displays.

The linear mixed model with orthonormal polynomials provides a solid statistical foundation to produce the graphical displays in Figures 1 and 2 and Web Figures 1 and 2. The predicted subject-specific profiles were derived from random effects that represent deviations about the population regression curve. Thus, with the estimated population regression parameters and variance-covariance of random effects and residual error, predicted subject-specific curves are easily generated.
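As one illustration of how estimated population parameters and variance components translate into a display, a pointwise normal-theory band can be sketched in Python. This is a plug-in approximation that ignores uncertainty in the parameter estimates and assumes independent residuals. It is not the SAS macro used for the figures, and the function name `marginal_band` and the toy numbers are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def marginal_band(X, Z, beta_hat, Sigma_d_hat, sigma2_hat, level=0.90):
    """Pointwise plug-in band for a new subject's observation at each time.

    Marginal mean:     x(t)' beta
    Marginal variance: z(t)' Sigma_d z(t) + sigma2  (independent residuals)
    """
    mean = X @ beta_hat
    # Row-wise quadratic form: diag(Z Sigma_d Z')
    var = np.einsum("ij,jk,ik->i", Z, Sigma_d_hat, Z) + sigma2_hat
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    half_width = z * np.sqrt(var)
    return mean - half_width, mean + half_width


# Toy example: random-intercept model with made-up values.
X = np.ones((5, 1))
Z = np.ones((5, 1))
beta_hat = np.array([80.0])
Sigma_d_hat = np.array([[16.0]])
sigma2_hat = 9.0

lower90, upper90 = marginal_band(X, Z, beta_hat, Sigma_d_hat, sigma2_hat, level=0.90)
lower95, upper95 = marginal_band(X, Z, beta_hat, Sigma_d_hat, sigma2_hat, level=0.95)
```

Changing `level` widens or narrows the band, which is one simple way a display could reflect a more stringent or more relaxed definition of the normal range.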
In the case of 24-hour ABPM, if a properly constructed population study of individuals with normal blood pressure were available, then a reference set of estimated population regression parameters and variance-covariance parameters could be used to generate predicted subject-specific curves rather easily. The procedure could easily be embedded into devices or display software so that the predicted curves are made available as soon as the data are available. Several options could be made available to researchers to display the data as shown in Figures 1 and 2 and Web Figures 1 and 2, including changing the width of the prediction interval to reflect a more stringent requirement for "normal" blood pressure classification (or a more relaxed requirement).

We note that the DASH data have very little missing data for the analyses presented here. Where data were missing, the polynomials were only approximately orthogonal. However, we believe that as long as there is not a large amount of missing data, approximately orthonormal polynomials are good enough for analyzing 24-hour ABPM data with missing data.

The DASH study provided a great opportunity to explore the effectiveness of the linear mixed model with orthonormal polynomials for the analysis of 24-hour ABPM data. Further analyses will be required to estimate, test hypotheses about, and interpret the effects of potential risk factors such as treatment, gender, race, interactions with the 24-hour profiles, and daytime and nighttime profiles. However, the linear mixed model easily accommodates additional explanatory variables. Also, we focused our attention on baseline 24-hour ABPM. The DASH study measured ABPM during a baseline period and an intervention period. In order to include both baseline and intervention 24-hour ABPM, a multivariate linear mixed model could be used.

Finally, the discrepancy found between the Kenward-Roger denominator degrees of freedom and the Satterthwaite denominator degrees of freedom when the numerator degrees of freedom is greater than 1 is troublesome. Working with SAS programmers has convinced us that the software is not the issue, since SAS provided independent Proc IML code that verified their results. The Kenward-Roger and Satterthwaite approximations to ddf are just that, approximations. For use in defining statistics such as the $R^2$, we will need further research to find the relationship, if any, between the two approaches that perhaps can be used as a diagnostic measure to determine which is closer to the "truth". We are attempting to investigate this surprising result further.
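The earlier remark about approximate orthogonality under missing data can be checked numerically. The sketch below, in Python/NumPy, builds orthonormal polynomials by a QR decomposition of the natural polynomial (Vandermonde) matrix, which yields the same basis as Gram-Schmidt up to sign, and then measures how far the basis departs from orthonormality when a few readings are dropped. The half-hourly schedule, the choice of missing readings, and the function name `orthonormal_poly` are assumptions for illustration only.

```python
import numpy as np


def orthonormal_poly(times, degree):
    """Columns are orthonormal polynomials of degree 0..degree on `times`."""
    t = np.asarray(times, dtype=float)
    t = (t - t.mean()) / (t.max() - t.min())  # rescale to limit ill-conditioning
    V = np.vander(t, degree + 1, increasing=True)  # natural polynomials 1, t, t^2, ...
    Q, R = np.linalg.qr(V)  # reduced QR: Q has orthonormal columns
    return Q * np.sign(np.diag(R))  # make each leading coefficient positive


# Hypothetical schedule: 48 half-hourly readings over 24 hours.
times = np.arange(48) * 0.5
P = orthonormal_poly(times, 9)

# Complete data: P'P equals the identity up to rounding error.
max_dev_full = np.abs(P.T @ P - np.eye(10)).max()

# Pretend three readings are missing and reuse the same basis rows.
keep = np.ones(len(times), dtype=bool)
keep[[5, 17, 30]] = False
max_dev_miss = np.abs(P[keep].T @ P[keep] - np.eye(10)).max()
```

Comparing `max_dev_miss` with `max_dev_full` gives a rough sense of how much orthonormality is lost for a given missingness pattern. The same check can be repeated with more readings removed to see when the approximation starts to degrade.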

REFERENCES

Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control AC-19.
Appel, L. J., Moore, T. J., Obarzanek, E., Vollmer, W. M., Svetkey, L. P., Sacks, F. M., Bray, G. A., Vogt, T. M., Cutler, J. A., Windhauser, M. M., Lin, P. H., Karanja, N. (1997). A Clinical Trial of the Effects of Dietary Patterns on Blood Pressure. DASH Collaborative Research Group. New England Journal of Medicine 336.
Cheng, J., Edwards, L. J., Maldonado-Molina, M. M., Komro, K. A., Muller, K. E. (2010). Real Longitudinal Data Analysis for Real People: Building a Good Enough Mixed Model. Statistics in Medicine 29.
Dickson, D., Hasford, J. (1992). 24-hour Blood Pressure Measurement in Antihypertensive Drug Trials: Data Requirements and Methods of Analysis. Statistics in Medicine 11.
Edwards, L. J., Muller, K. E., Wolfinger, R. D., Qaqish, B. F., Schabenberger, O. (2008). An $R^2$ Statistic for Fixed Effects in the Linear Mixed Model. Statistics in Medicine 27.
Emerson, P. L. (1968). Numerical Construction of Orthogonal Polynomials From a General Recurrence Formula. Biometrics 24. (Correction: V25 p778).
Ferguson, J. H., Shaar, C. J. (1992). The Effective Diagnosis and Treatment of Hypertension by the Primary Care Physician: Impact of Ambulatory Blood Pressure Monitoring. Journal of the American Board of Family Practice 5.
Gaffney, M., Taylor, C., Cusenza, E. (1993). Harmonic Regression Analysis of the Effect of Drug Treatment on the Diurnal Rhythm of Blood Pressure and Angina. Statistics in Medicine 12.
Gurka, M. J., Edwards, L. J. (2008). Mixed Models. Handbook of Statistics, Volume 27: Epidemiology and Medical Statistics. Elsevier, North-Holland: Amsterdam.
Halberg, J., Halberg, F., Leach, C. N. (1984). Variability of Human Blood Pressure With Reference Mostly to the Non-Chronologic Literature. Chronobiologia 11.
Harville, D. A. (1976). Extension of the Gauss-Markov Theorem to Include the Estimation of Random Effects. Annals of Statistics 4.
Jaccard, J., Wan, C. K. (1993). Statistical Analysis of Temporal Data With Many Observations: Issues for Behavioral Medicine Data. Annals of Behavioral Medicine 15(1).
Laird, N. M., Ware, J. H. (1982). Random-effects Models for Longitudinal Data. Biometrics 38.
Lambert, P. C., Abrams, K. R., Jones, D. R., Halligan, A. W. F., Shennan, A. (2001). Analysis of Ambulatory Blood Pressure Monitor Data Using a Hierarchical Model Incorporating Restricted Cubic Splines and Heterogeneous Within-Subject Variances. Statistics in Medicine 20.
Marler, M. R., Jacob, R. G., Lehoczky, J. P., Shapiro, A. P. (1988). The Statistical Analysis of Treatment Effects in 24-Hour Ambulatory Blood Pressure Recordings. Statistics in Medicine 7.
Morrell, C. H., Brant, L. J., Ferrucci, L. (2009). Model Choice Can Obscure Results in Longitudinal Studies. Journal of Gerontology, Series A: Biological Sciences, Medical Sciences 64A(2).
Muller, K. E., Edwards, L. J., Simpson, S. L., Taylor, D. J. (2007). Statistical Tests With Accurate Size and Power for Balanced Linear Mixed Models. Statistics in Medicine 26.
Muller, K. E., Fetterman, B. A. (2002). Regression and ANOVA: An Integrated Approach Using SAS Software. SAS Institute: Cary, NC.
O'Brien, E. (2003). Ambulatory Blood Pressure Monitoring in the Management of Hypertension. Heart 89.
O'Brien, E. (2007). The Circadian Nuances of Hypertension: A Reappraisal of 24-H Ambulatory Blood Pressure Measurement in Clinical Practice. Irish Journal of Medical Sciences 176.
O'Brien, E. (2008). Ambulatory Blood Pressure Measurement: The Case for Implementation in Primary Care. Hypertension 51.
Owens, P., Atkins, N., O'Brien, E. (1999). Diagnosis of White Coat Hypertension by Ambulatory Blood Pressure Monitoring. Hypertension 34.
Schwartz, J. E., Warren, K., Pickering, T. G. (1994). Mood, Location and Physical Position as Predictors of Ambulatory Blood Pressure and Heart Rate: Application of a Multi-level Random Effects Model. Annals of Behavioral Medicine 16(3).
Schwarz, G. (1978). Estimating the Dimension of a Model. Annals of Statistics 6.
Selwyn, M. R., Difranco, D. M. (1993). The Application of Large Gaussian Mixed Models to the Analysis of 24 Hour Ambulatory Blood Pressure Monitoring Data in Clinical Trials. Statistics in Medicine 12.
Smith, P. L. (1979). Splines as a Useful and Convenient Statistical Tool. American Statistician 33.
Stinnett, S. S. (1993). Collinearity in Mixed Models. Unpublished Dissertation, Department of Biostatistics, University of North Carolina, Chapel Hill, NC.
Streitberg, B., Mayer-Sabellek, W. (1990). Smoothing Twenty-Four-Hour Ambulatory Blood Pressure Profiles: A Comparison of Alternative Methods. Journal of Hypertension 8(suppl 6), S21-S37.
Streitberg, B., Mayer-Sabellek, W., Baumgart, P. (1989). Statistical Analysis of Circadian Blood Pressure Recordings in Controlled Clinical Trials. Journal of Hypertension 7(suppl 3), S11-S17.
Weber, M. A., Cheung, D. G., Graettinger, W. F., Lipson, J. L. (1988). Characterization of Antihypertensive Therapy by Whole-Day Blood Pressure Monitoring. Journal of the American Medical Association 259.

Table 1. Estimates, Standard Errors (SE), P-values, and Semi-partial $R^2$ values for 9th Degree Orthonormal Polynomial Fit to DASH Data
Columns: Outcome (DBP, SBP); Parameter; Ortho Poly Degree; Estimate; SE; P-value; Semi-partial $R^2$

Table 2. Estimates, Standard Errors (SE), and P-values for Random Effects and Residual Variances
Columns: Outcome (DBP, SBP); Random Effects; Ortho Poly Degree; Variance Estimate; SE; P-value

Table 3. Model Comparisons: AIC, BIC, and $R^2$ (FE = Fixed Effects, RE = Random Effects)
Columns: Outcome; Model; AIC; BIC; $R^2$
DBP  Orthonormal Poly (FE_degree = 9, RE_degree = 9)  58, ,
DBP  Orthonormal Poly (FE_degree = 8, RE_degree = 8)  59, ,
DBP  Orthonormal Poly (FE_degree = 7, RE_degree = 7)  59, ,
DBP  Orthonormal Poly (FE_degree = 6, RE_degree = 6)  59, ,
DBP  Orthonormal Poly (FE_degree = 5, RE_degree = 5)  59, ,
DBP  Orthonormal Poly (FE_degree = 4, RE_degree = 4)  ,
DBP  Restricted Cubic Spline  59, ,
SBP  Orthonormal Poly (FE_degree = 9, RE_degree = 9)  61, ,
SBP  Orthonormal Poly (FE_degree = 8, RE_degree = 8)  61, ,
SBP  Orthonormal Poly (FE_degree = 7, RE_degree = 7)  61, ,
SBP  Orthonormal Poly (FE_degree = 6, RE_degree = 6)  61, ,
SBP  Orthonormal Poly (FE_degree = 5, RE_degree = 5)  61, ,
SBP  Orthonormal Poly (FE_degree = 4, RE_degree = 4)  62, ,
SBP  Restricted Cubic Spline  62, ,

Table 4. Post-baseline Diet Effects and P-values for 9th Degree Orthonormal Polynomial and Restricted Cubic Spline Model Fit to DASH Data
Entries are Effect (p-value).
Outcome  Model                    DASH vs Control  Fruit/Veg vs Control
DBP      Orthonormal Poly         2.68 (0.0069)    1.62 (0.1004)
DBP      Restricted Cubic Spline  2.78 (0.0054)    1.65 (0.0962)
SBP      Orthonormal Poly         3.68 (0.0099)    2.29 (0.1063)
SBP      Restricted Cubic Spline  3.82 (0.0076)    2.28 ( )

Figure 1. Predicted SBP by time for 3 subjects with the 90% prediction interval (shaded region) based on normal subjects.
Figure 2. Predicted DBP by time for 3 subjects with the 90% prediction interval (shaded region) based on normal subjects.

Web Figure 1

Web Figure 2
