Longitudinal Data Analysis


1 Longitudinal Data Analysis
Mike Allerhand. This document has been produced for the CCACE short course: Longitudinal Data Analysis. No part of this document may be reproduced, in any form or by any means, without permission in writing from the CCACE. The CCACE is jointly funded by the University of Edinburgh and four of the United Kingdom's research councils (BBSRC, EPSRC, ESRC, MRC) under the Lifelong Health and Wellbeing initiative. The document was generated using the R knitr package and typeset using MiKTeX (LaTeX for Windows) with the beamer and TikZ packages. 2016, Dr Mike Allerhand, CCACE Statistician.

2 Straight-line growth
[Figure: the same five measurements shown in wide format (one row, five occasions), in long format (columns j, y, x), and plotted as y against time x.]
Most longitudinal analysis programs require data in long format. Wide format is one row per case, and each row is a complete record. Here there's just one case: measures of something on 5 successive occasions. Long format is wide format reshaped so that repeated measures of a variable are stacked into a column, here y. It also needs a column to indicate which measurement occasion (wave or time-point) each measure belongs to, here j = 1,...,5. The measurement times are x = 1,...,5. Here x represents units of time. The coding assumes equal time intervals between successive measurements. The graph shows the measures y plotted against time x.

3 Straight-line growth
Equation of the straight line:
y_j = β_0 + β_1 x_j,   j = 1,...,5
[Figure: the fitted line, with intercept β_0 where x = 0 and slope β_1 the rise per unit x.]
The straight line is a model of how the measurements y change over time x. The parameters of the model are the intercept β_0 and slope β_1. The intercept β_0 is y when x = 0. The slope β_1 is the change in y per unit x: the change in y when x increases by 1. This particular model assumes a constant growth rate: y changes by the same amount for ANY unit increase of x. The model does not show how the rate of growth might tail off. This model is a perfect fit to these data.

4 Regression by ordinary least squares (OLS)
[Figure: left, data falling exactly on a line (r = 1); right, data scattered around a line (−1 < r < 1), with the residual error e_j the vertical distance from y_j to the line.]
Pearson correlation r = 1 indicates a perfect fit to a straight line. Correlation −1 < r < 1 indicates there is some residual error: a straight line is not a perfect fit. How do we choose the best line if no straight line is a perfect fit? OLS is a procedure for estimating the parameters of a regression model, such as a straight line. OLS estimates β_0 and β_1 for the line with the smallest residual variance. The residual variance σ²_e is the variance of the residual errors e_j around the line.

5 Residual variance
Straight line regression model:
y_j = β_0 + β_1 x_j + e_j
where β_0 + β_1 x_j is the fixed part and e_j the random part. The residuals e_j are assumed to be random measurement errors, as if drawn at random from a normal population with mean 0 and variance σ²_e:
e_j ~ N(0, σ²_e)
The errors have zero mean because they vary symmetrically around the line. The consequences of that assumption are:
1. Any regression line always passes through the point (x̄, ȳ): ȳ = β_0 + β_1 x̄.
2. If β_1 = 0 the intercept β_0 is ȳ, the mean response. y_j = β_0 + e_j is a model of the mean.
To see why, average both sides of the model equation over the sample:
ȳ = (1/n) Σ_j (β_0 + β_1 x_j + e_j) = β_0 + β_1 x̄ + ē
But ē = 0 because the errors have zero mean, so:
ȳ = β_0 + β_1 x̄
Therefore the point (x̄, ȳ) always lies on the regression line, and if β_1 = 0 then β_0 = ȳ. This is also true for multiple regression: ȳ = β_0 + β_1 x̄_1 + β_2 x̄_2 + ...

6 Unconditional and conditional models
Unconditional model of the mean: y_j = β_0 + e_j. Conditional model of the mean: y_j = β_0 + β_1 x_j + e_j.
[Figure: left, a flat line at the mean with residual variance σ²_e around it; right, a sloped line with smaller residual variance σ²_e around it.]
The unconditional model is intercept-only. There is no slope (it is flat), so the intercept β_0 estimates the mean response ȳ. Here the mean response is assumed not to depend upon x. The conditional model is conditional upon an explanatory variable x. Here the mean response is assumed to be different at different values of x. The intercept β_0 estimates the mean response when x = 0. The slope β_1 estimates the change in the mean per unit increase of x. If x is mean-centered so that x̄ = 0, the intercept β_0 = ȳ. In the unconditional model, the residual variance σ²_e equals the response variance Var(y). The conditional model has less residual variance: part of Var(y) is explained by x. Var(y) is decomposed into two parts: (a) the part explained by the straight-line relationship with x, and (b) the part that is unexplained residual variance σ²_e. Residual variance of 0 would indicate all of Var(y) is explained by x. In that sense the size of the residual variance tells how closely the data fit the regression line (how well x explains the variation in y).
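A minimal sketch of this decomposition, assuming a long-format data frame named data with columns y and x (the names are illustrative, not from the slides):
R: fit0 = lm(y ~ 1, data)    # unconditional model: intercept only
   fit1 = lm(y ~ x, data)    # conditional model: straight line in x
   var(resid(fit0))          # equals var(data$y): nothing is explained
   var(resid(fit1))          # smaller: the part of Var(y) explained by x is removed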

7 OLS assumptions (1)
Residuals should be:
1. Homoskedastic: the same residual variance σ²_e at all x.
2. Uncorrelated: residuals should not depend upon each other.
These assumptions are used to derive a formula for the variance of the slope estimator:
Var(β̂_1) = σ²_e / ((n − 1) σ²_x)
Its square root is the slope standard error. If these assumptions are violated, standard errors will be incorrect. Then confidence intervals and p-values will also be incorrect.
[Figure: a regression line with a box around it representing the slope standard error.]
The standard error is the standard deviation of the estimator's sampling distribution. A small standard error indicates greater precision: results are more repeatable. A small standard error (represented by a long thin box) is given by low residual variance, more degrees of freedom, and a greater range of x. Including more explanatory variables in the model does not always improve it. Overfitting a specific sample loses generality: residual variance goes down, but standard errors may increase. More parameters to estimate loses degrees of freedom, and explanatory variables may confound each other, reducing each variable's unique variance. Aim for a parsimonious model with acceptable fit.

8 OLS assumptions (2)
You have to assume a functional form for the relationship. A straight line is not the only model. It may be mis-specified.
[Figure: two scatterplots contrived to have identical variances and covariance: one roughly linear, one clearly curved.]
Correlation is blind to the difference: linear correlation only knows about straight lines. Fitting a straight-line regression model to both datasets, the estimates, standard errors, and p-values are identical. You have to compare different models fitted to the same data: compare their goodness-of-fit and test the difference. Compared with a straight-line model, a quadratic model is a much better fit for the data on the right. It also has much lower standard errors.
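A minimal sketch of such a comparison, assuming a data frame named data with columns y and x:
R: fit.lin = lm(y ~ x, data)              # straight-line model
   fit.quad = lm(y ~ x + I(x^2), data)    # quadratic model, nested over the line
   anova(fit.lin, fit.quad)               # F-test of the improvement in fit
   AIC(fit.lin, fit.quad)                 # information criteria: lower is better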

9 A small dataset
[Table: wide format, one row per child and one column per measurement occasion; long format with columns i, j, y, where i indexes the person and j the measurement occasion.]
These data are Table 11.5 in: Maxwell & Delaney (1990) Designing Experiments and Analyzing Data. 12 children were tested at age 30, 36, 42, and 48 months (McCarthy scale of children's abilities).
1. Is there, on average, systematic growth in ability over time?
2. Is there variability in growth over time?
In wide format each row is a case: one subject's record of observations. Long format is wide format reshaped so that repeated measures are stacked. Long format needs extra columns for indicator variables: i and j indicate which person and which time-point each measurement belongs to. y_ij denotes a measurement of the i-th person at the j-th time-point. Time-points j are repeated within each person i (and vice versa). Each pair (i, j) is unique because the indices are nested.
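A minimal reshaping sketch using base R, assuming the wide data frame is named wide with repeated-measures columns y.1 to y.4 (the names are assumptions for illustration):
R: long = reshape(wide,
                  varying = c("y.1","y.2","y.3","y.4"),
                  v.names = "y",        # stack repeated measures into one column
                  timevar = "j",        # indicator for measurement occasion
                  idvar = "i",          # indicator for person
                  direction = "long")
   long$x = c(30, 36, 42, 48)[long$j]   # chronological age in months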

10 Time as an independent variable
[Table: long format with columns i, j, y, x, where x holds the measurement times.]
Metrics for time: coded time-points (eg. 0, 1, 2, 3); chronological age (eg. 30, 36, 42, 48 months); time since baseline (eg. 0, 6, 12, 18 months); or any meaningful non-decreasing measure.
Mixed effects models treat time as data: time enters the model as an independent variable. Here variable x is time as chronological age in months. The growth rate is the slope of the response per unit time (per month). These data are strongly balanced. Everyone has the same time-points: the same baseline times and intervals. (Here the intervals are all equal, but that is not strictly necessary.) No-one has any missing time-points.

11 Pooled and subject-specific data
[Figure: left, all measurements plotted against x irrespective of subject; right, a spaghetti plot joining each subject's own measurements.]
Pooled data are irrespective of grouping by subject. Subject-specific data are indicated by a spaghetti plot: joining the dots that belong to a specific subject.

12 Pooled and subject-specific data
[Figure: left, the pooled regression line; right, the subject-specific regression lines.]
Subject-specific regression lines often show growth fan-in or fan-out. Here there is fan-in (except for some unusual subjects). If the data are strongly balanced (same time-points, none missing), the pooled regression line is the average of the subject-specific regression lines: the intercept of the pooled line is the average subject-specific intercept, and the slope is the average subject-specific slope.

13 Fitting a straight-line model
R: fit = lm(y ~ x, data)
   summary(fit)
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)
x
R: fit = lmer(y ~ x + (x | i), data)
   summary(fit)
Random effects:
 Groups   Name         Variance  Std.Dev.  Corr
 i        (Intercept)
          x
 Residual
Fixed effects:
             Estimate  Std. Error  df  t value  Pr(>|t|)
(Intercept)
x
The upper table shows a regression model fitted to pooled data by OLS. The lower table shows a mixed-effects model fitted to subject-specific data by REML (restricted maximum likelihood). The coefficients of the pooled analysis are the same as the fixed effects of the mixed-effects model (because these data are strongly balanced). But the standard errors, and hence p-values, are different. The growth rate (x) is non-significant in the pooled analysis, but its standard errors are incorrect because these data violate OLS assumptions. It is (just) significant in the mixed-effects model. This is achieved by accounting for individual growth (blocking on persons). The mixed-effects model has some additional parameters: the random effects. These represent variation around the average effects due to subject-specific differences. The intercept estimate is the expected response when time x = 0. The intercept variance is the variation in intercepts between subjects. These things have no meaning for a subject age 0. Centre time to give meaning to the intercept and its variance.

14 Centering time
Centering time gives meaning to the intercept. The centre is 0 on a continuous scale. Centre time by subtracting a value from the time variable. Centre x on the average baseline age:
x*_ij = x_ij − x̄_1
Here that means subtracting 30 months from each x value.
Long format makes centering and scaling easy. Subtract a mean, or some substantively meaningful time value close to the mean. Choose the centre to give meaning to the intercept: for example the expected response at the average baseline age, at the overall average age, or at some particular age. Note: if both time x and the response y are mean-centered (eg. standardized), then the intercept becomes 0 (at the point (x̄, ȳ)). Centering can change intercept variance and intercept-slope covariance, depending upon fan-in/out of the subject-specific slopes. Centering on a time where fan-in/out is large makes the intercept variance large. Changing the intercept variance also changes the intercept-slope covariance. Some other reasons for centering time are:
1. It reduces collinearity in quadratic (and higher-order polynomial) models.
2. It can change the size and direction of a TIC direct effect (if the TIC has a significant interaction with time).
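A minimal sketch of centering in long format, assuming a data frame named data with age in months in column x:
R: library(lme4)
   data$xc = data$x - 30                  # centre on the average baseline age
   # or centre on the overall mean age: data$xc = data$x - mean(data$x)
   fit = lmer(y ~ xc + (xc | i), data)    # the intercept now refers to 30 months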

15 Fitting a straight-line model
R: fit = lm(y ~ x, data)
   summary(fit)
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)
x
R: fit = lmer(y ~ x + (x | i), data)
   summary(fit)
Random effects:
 Groups   Name         Variance  Std.Dev.  Corr
 i        (Intercept)
          x
 Residual
Fixed effects:
             Estimate  Std. Error  df  t value  Pr(>|t|)
(Intercept)
x
Re-fitting the same models as before. The only difference is that x is now centered on the average baseline age. Now the intercept represents the expected response at 30 months. Again the coefficients of the regression model are the same as the fixed effects of the mixed-effects model, but the standard errors, and hence p-values, are different.

16 Variation in the data
Variance-covariance matrix of repeated measures:
          [ σ²_1  σ_12  σ_13  σ_14 ]
Cov(Y) =  [ σ_21  σ²_2  σ_23  σ_24 ]
          [ σ_31  σ_32  σ²_3  σ_34 ]
          [ σ_41  σ_42  σ_43  σ²_4 ]
Variances on the diagonal, covariances off the diagonal. Heteroskedasticity: different variance at different time-points. Serial correlation: non-zero covariance across time. OLS assumptions: equal variance on the diagonal, 0 covariance off the diagonal.
1. Why are data serially correlated? Because the same panel is measured repeatedly over time. Some individuals' measures are all relatively high, others relatively low. (The more so when there is greater difference between than within persons.) Dependency upon previous observations may also come from practice effects.
2. Why are data heteroskedastic? Growth trajectories tend to fan-in or fan-out. (Typically fan-in during development, fan-out during decline.) This makes the variance of measures different at different time-points.
Highly differential growth leads to independent measures. Consistent growth patterns lead to variance-covariance structure. The aim is to exploit patterns to account for individual growth and change in the context of many different individuals.
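Before modelling, the observed variance-covariance of the repeated measures can be inspected directly. A minimal sketch, assuming the wide-format data frame is named wide with the occasions in columns y.1 to y.4 (illustrative names):
R: cov(wide[, c("y.1","y.2","y.3","y.4")])   # variances on the diagonal, covariances off it
   cor(wide[, c("y.1","y.2","y.3","y.4")])   # serial correlation across time-points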

17 OLS regression of pooled data
Longitudinal data violate a statistical assumption for OLS regression: residuals must be IID (independent and identically distributed). Longitudinal residuals are:
Heteroskedastic: residual variance is not identical at each time point.
Serially correlated: residuals depend upon previous residuals.
Consequences for OLS regression of pooled longitudinal data: the estimate of the slope may be correct (provided the data are strongly balanced), but the slope standard error is incorrect, so the confidence interval and p-value are incorrect. How to account for the variation in the data? Decompose the total variance into between-person and within-person variance. Further decompose the between-person variance into variance of the growth parameters (intercept, slopes).

18 Linear mixed-effects
Preliminary assumptions: Subjects are a random sample of a population. Results are conditional upon the sample; if the sample is random the results are unbiased population estimates. Everyone's growth curve has the same functional form, but different people may have different values for the growth parameters. Assuming straight-line growth, for example, different people could have different intercepts and slopes.
A straight line is not the only model for the average person's growth trajectory. It's just the simplest.

19 Between-person variation
Everyone has a growth curve, for example a straight line. Different people have different parameters, for example different intercepts and slopes. Two kinds of parameters: the average intercept and slope, and the variation in intercepts and slopes.
[Figure: subject-specific regression lines with different intercepts and slopes.]

20 Subject-specific means
Each subject has a mean of their own repeated measures. The grand mean β_0 is the average of the subject-specific means. Each subject's own mean may deviate from the grand mean.
[Figure: each subject's mean level, with the grand mean β_0 as a horizontal line.]

21 Unconditional model of the mean
An unconditional regression line is a model of the grand mean (β_0 estimates ȳ):
y_j = β_0 + e_j
Suppose the i-th subject's mean deviates from the grand mean by u_0i. A model of the i-th subject's mean, incorporating the grand mean, is:
y_ij = β_0 + u_0i + e_ij
Re-write as a 2-level model, where π_0i represents the i-th subject's mean:
y_ij = π_0i + e_ij
π_0i = β_0 + u_0i
The second level is another model of the mean. Its outcome is the subject-specific means π_0i, so its intercept β_0 estimates the mean of those means. π_0i are random effects: here they are subject-specific means (the means of each subject's repeated measures). β_0 is a fixed effect, an average of random effects: here it is the grand mean, the mean of the subject-specific means.
If everyone had the same average there would be no need for random effects. The fixed effects would be ordinary regression coefficients where one size fits all. β_0 is the grand mean in the equation y_j = β_0 + e_j. β_0 is also the average of the subject-specific means π_0i in the equation π_0i = β_0 + u_0i. The point of estimating the grand mean as the average of subject-specific means is to divide the total variance into homogeneous subgroups. It is the same idea as ANOVA with a blocking factor in a split-plot design. The aim is to get a more correct estimate of the standard error.

22 Decomposing variance
y_ij = β_0 + u_0i + e_ij,   e_ij ~ N(0, σ²_e),   u_0i ~ N(0, σ²_0)
Deviations from β_0 are divided into two parts: u_0i is the deviation of the i-th subject's mean from β_0; e_ij is the deviation of the i-th subject at the j-th time-point from their own mean. Within and between-person variance:
Var(y_ij) = Var(β_0 + u_0i + e_ij) = σ²_0 + σ²_e + 2 Cov(u_0i, e_ij) = σ²_0 + σ²_e
σ²_0 is the between-person variance of the subject-specific means. σ²_e is the within-person residual variance. Collectively these are called the variance components.
Between-person variation is composed of deviations u_0i of subject-specific means π_0i from the grand mean β_0. Within-person variation is composed of deviations e_ij of a person's scores from their own mean π_0i. Modelling the subject-specific regressions decomposes the total variation into between-person and within-person components. These variance components are independent of each other (Cov(u_0i, e_ij) = 0). This decomposition is fundamental to mixed-effects models.

23 Fitting the unconditional model of the mean
R: fit = lmer(y ~ 1 + (1 | i), data)
   summary(fit)
Random effects:
              Variance  Std.Dev.
 (Intercept)  (b)
 Residual     (c)
Fixed effects:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  (a)
a. β_0: average subject-specific intercept (the average of the subjects' means).
b. σ²_0: between-person variation in intercepts (means).
c. σ²_e: average within-person variation.
In the R model formula y ~ 1 + (1 | i), the 1 denotes the intercept. Read this as: regress y on the intercept, but treat the intercept as a random effect grouped by i. In other words, calculate intercepts by fitting the model y ~ 1 individually to the repeated measures of each subject i.

24 Longitudinal intra-class correlation
Where is most of the variation? Within groups (people), or between, or somewhere in the middle.
ICC = σ²_0 / (σ²_0 + σ²_e)
The proportion of total variation that is between-persons. ICC = 0: no variation between-persons (σ²_0 = 0); no difference from regression of pooled data. ICC = 1: no change within-person (σ²_e = 0); people differ only in their mean level. Here the ICC gives the percentage of the total response variation due to differences in mean level between-persons.
The purpose of the unconditional model is to decompose variance. Low ICC (< 0.2) suggests people are very similar, as if one person: there is no advantage to grouping. High ICC (> 0.8) suggests growth curves are flat and there is little change over time: then there is little to be gained from repeated measures over time. Medium ICC (say between 0.2 and 0.8) suggests there is within-person change over time, and it is also worth grouping by persons to account for variation in change between-persons.
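A minimal sketch of computing the ICC from the fitted unconditional model (data frame and column names as assumed before):
R: library(lme4)
   fit = lmer(y ~ 1 + (1 | i), data)
   vc = as.data.frame(VarCorr(fit))      # one row per variance component
   v0 = vc$vcov[vc$grp == "i"]           # between-person intercept variance
   ve = vc$vcov[vc$grp == "Residual"]    # within-person residual variance
   v0 / (v0 + ve)                        # the ICC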

25 Between-person variation in intercepts and slopes
[Figure: left, subject-specific intercepts and slopes; right, one subject's deviations from the average.]
π_0i is the i-th subject's intercept. β_0 is the average of the subject-specific intercepts.
The left plot shows each person's subject-specific regression line. The right plot highlights one person's repeated measures and their subject-specific regression line. Between-person differences make each subject-specific regression line deviate from the average: the i-th subject's regression line deviates from the average intercept β_0 by u_0i, and from the average slope β_1 by u_1i. Within-person residuals e_ij are deviations of a subject's repeated measures from their own regression line.

26 Straight-line model of the mean conditional upon time
The conditional regression line upon time x is:
y_j = β_0 + β_1 x_j + e_j
A model of the i-th person with subject-specific deviations from the average intercept and slope:
y_ij = (β_0 + u_0i) + (β_1 + u_1i) x_ij + e_ij
Re-write as a 2-level model:
Level-1: y_ij = π_0i + π_1i x_ij + e_ij
Level-2: π_0i = β_0 + u_0i
         π_1i = β_1 + u_1i
The second-level models are again models of means. The outcomes are subject-specific intercepts π_0i and slopes π_1i, so their intercepts β_0 and β_1 estimate the mean intercept and mean slope. Random effects π_0i and π_1i are the i-th subject's intercept and slope. Fixed effects β_0 and β_1 are the averages of the subject-specific intercepts and slopes.
Compared with the unconditional model, this model has more random effects. In the unconditional model π_0i were subject-specific means. In the conditional model π_0i are subject-specific intercepts and π_1i are subject-specific slopes. To specify the model, you choose which level-1 coefficients you want to treat as random effects. (It doesn't have to be all of them.) Each random effect has some variance (due to individual differences). These are collectively called the variance components. The complete set of model parameters includes both the fixed effects and the variance components. You may mainly be interested in the fixed effects; then the variance components are nuisance parameters, used just to decompose variance so that the fixed effects have correct standard errors. Or the variance components may be of interest in their own right.

27 Variance components
Variance is decomposed by subject-specific deviations into between-person variance:
[u_0i]      ( [0]   [ σ²_0  σ_01 ] )
[u_1i]  ~ N ( [0] , [ σ_01  σ²_1 ] )
leaving residual within-person variance:
e_ij ~ N(0, σ²_e)
σ²_0 is the variance of subject-specific intercepts. σ²_1 is the variance of subject-specific slopes. σ_01 is the intercept-slope covariance. σ²_e is the within-person residual variance.
[Figure: subject-specific (intercept, slope) pairs scattered around the point (β_0, β_1).]
The between-person variance components are drawn from a bivariate normal to allow the random effects to covary. Their covariance is an additional variance component. (Generally the variance components include variances and covariances.) The plot indicates intercept-slope covariance. Covariance implies a fan-in or fan-out pattern of trajectories. For example, with negative covariance people with higher intercepts have a more negative slope; that suggests fan-in. When there is fan-in/out the intercept variance, and hence the intercept-slope covariance, depends upon centering. Slopes converge and cross over at some point. Re-centering can change the size and sign of the intercept-slope covariance.

28 Shrinkage estimators
An efficient estimator for average subject-specific parameters: shrink the subject-specific estimates towards their mean. The random effects are the estimates after shrinkage. The fixed effects are the averages of the random effects. The amount a subject shrinks depends upon their reliability: the distance to the mean, the subject's residual variance, and the number of non-missing observations of the subject. Unreliable estimates are shrunk more towards the mean. Individuals borrow strength from others in that population. Unreliable estimates have less influence on the fixed effects and their standard errors.
[Figure: subject-specific (intercept, slope) estimates with arrows showing the direction and amount of shrinkage towards the averages β_0 and β_1.]
Shrinkage estimators are efficient in the statistical sense of having lowest variance in the long run of repeated sampling. The blue dots on the plot are subject-specific estimates, the grey lines are their averages (β_0 and β_1), and the arrows show the direction and amount of shrinkage. Subject-specific estimates are considered unreliable when they are distant from the mean, with large residual errors and missing observations; these are shrunk more. As a result the mean and variance of the whole cloud of points becomes a more reliable estimator of the population. Shrinkage enables subjects with missing values to contribute, by allowing them to borrow strength from other subjects.
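A minimal sketch of inspecting shrinkage with lme4, comparing the shrunken subject-specific estimates against per-subject OLS fits (data frame and column names as assumed before):
R: library(lme4)
   fit = lmer(y ~ x + (x | i), data)
   fixef(fit)     # fixed effects: the average intercept and slope
   ranef(fit)$i   # subject-specific deviations, after shrinkage
   coef(fit)$i    # subject-specific intercepts and slopes (fixed + random)
   # per-subject OLS lines for comparison: no shrinkage, no borrowing of strength
   t(sapply(split(data, data$i), function(d) coef(lm(y ~ x, d))))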

29 Fitting the standard mixed effects model
R: fit = lmer(y ~ x + (x | i), data)
   summary(fit)
Random effects:
 Groups   Name         Variance  Std.Dev.  Corr
 i        (Intercept)  (c)
          x            (d)                 (e)
 Residual              (f)
Fixed effects:
             Estimate  Std. Error  df  t value  Pr(>|t|)
(Intercept)  (a)
x            (b)
a. β_0: average intercept.
b. β_1: average slope.
c. σ²_0: intercept variance.
d. σ²_1: slope variance.
e. σ_01: intercept-slope covariance (reported as a correlation coefficient r_01).
f. σ²_e: average within-person residual variance.
Slope-on-intercept regression coefficient: r_01 σ_1 / σ_0.
The complete set of model parameters includes both the fixed effects and the variance components. The R summary function reports the variance components as random effects in the upper part of the table. The R model formula has an implied intercept. It could be written as: y ~ 1 + x + (1 + x | i). Read as: regress y on the intercept and slope of x, but treat both the intercept and slope as random effects grouped by i, and allow them to covary. The same model with the covariance fixed at 0 could be specified as: y ~ x + (x || i). The intercept-slope covariance is given as a correlation coefficient. To convert between correlation and covariance: r_01 = σ_01 / (σ_0 σ_1). Confidence intervals for the variance components are provided by: confint(fit)

30 Fitting the standard mixed effects model
STATA: . mixed y x || i: x, covariance(unstructured) reml
         cog    Coef.   Std. Err.    z    P>|z|   [95% Conf. Interval]
           x
       _cons
Random-effects Parameters    Estimate   Std. Err.   [95% Conf. Interval]
id: Unstructured
        var(x)
    var(_cons)
  cov(x,_cons)
 var(residual)
Different programs have their own syntax for specifying models, and report the same results in their own way. Stata calls the intercept _cons. The option covariance(unstructured) specifies that no structural constraints be applied to the variance-covariance of the random effects; here this allows intercept-slope covariance. The option reml specifies that parameter estimation use the REML procedure (restricted maximum likelihood). This is the default for R.

31 Fitting the standard mixed effects model
Mplus: VARIABLE: NAMES = i j y x ;
         USEVARIABLES = i y x ;
         WITHIN = x ;
         CLUSTER = i ;
       ANALYSIS: TYPE = TWOLEVEL RANDOM ;
       MODEL: %WITHIN%
         s | y ON x ;
         %BETWEEN%
         y WITH s ;
Within Level                 Estimate   S.E.   Est./S.E.   Two-Tailed P-Value
 Residual Variances  Y
Between Level
 Y WITH S
 Means     Y  S
 Variances Y  S
Here Mplus calls the intercept Y and the slope of x S. For continuous outcome variables (as here), Mplus uses FIML (full information maximum likelihood) estimation.

32 REML versus FIML
These are methods for estimating parameters and fitting models to data. FIML = full information maximum likelihood. REML = restricted maximum likelihood.
Why REML? Variance components estimated by FIML are biased (under-estimated) in small samples, because the calculation uses the sample regression coefficients β. REML aims to correct small-sample bias. It estimates variance components by maximizing the likelihood of residuals without using β; the β are calculated afterwards.
REML versus FIML: REML estimates of variance components are more accurate than FIML in small samples; they become similar in larger samples. Model comparisons based on a likelihood calculated by REML cannot tell a difference in the β, so the fixed-effects specification must be tested under FIML.
Program defaults: R uses REML (function lmer); Stata uses FIML (command mixed); SAS uses REML (proc mixed); Mplus uses FIML.
The REML procedure is analogous to the correction factor 1/(n − 1) used for estimating a population variance from a random sample. Estimating a population variance is biased in small samples because the calculation uses the sample mean. Estimating population variance components is similarly biased because the calculation uses the sample regression coefficients β. Variance is corrected using n − 1 in the denominator for the average. Variance components are corrected by avoiding β in the REML calculation. To specify FIML using R: lmer(y ~ x + (x | i), data, REML=FALSE)

33 Fitting a latent growth curve model (LGC)
Mplus: VARIABLE: NAMES = y1 y2 y3 y4 ;
       MODEL: i s | y1@0 y2@6 y3@12 y4@18 ;
         y1 (err) ;
         y2 (err) ;
         y3 (err) ;
         y4 (err) ;
                              Estimate   S.E.   Est./S.E.   Two-Tailed P-Value
 S WITH I
 Means     I  S
 Variances I  S
 Residual Variances  Y1  Y2  Y3  Y4
Mplus can be used to fit growth curve models in the structural equation modelling framework. These are called latent growth curve models because the estimated growth parameters (here intercept and slope) are latent variables (factors). For equivalent results between LGC and mixed effects:
1. Hold the residual variances equal across time-points.
2. Use the same coding for the time-points.
3. Fit the mixed effects model using FIML.

34 Latent growth curve model
[Path diagram: observed outcomes y1-y4 as squares; latent growth factors i and s as circles; loadings from i fixed at 1 and from s at 0, 6, 12, 18; residual variance arrows at each y; a covariance arrow between i and s.]
Squares are observed variables (the outcome at each wave). Circles are latent variables for growth factors, equivalent to random effects: i = intercept, s = slope. Single-headed arrows point to a regression outcome. Four regression equations are solved simultaneously:
y1 = i
y2 = i + 6s
y3 = i + 12s
y4 = i + 18s
The regression coefficients (factor loadings) are fixed. They represent time-points coded to contrive a growth curve. Double-headed arrows are variances or covariances: the arrows at each y are residual variances, and the arrow between i and s is their covariance.

35 LGC models versus mixed-effects models
Advantages of mixed-effects models: treats time as data in a natural way; allows individually varying baseline times and intervals.
Advantages of latent growth curve models: provides several goodness-of-fit measures; can link multivariate measurement models into a growth model; can link growth models into a multivariate structural model.
The main disadvantage is that LGC models don't treat time as a variable, but as a structural constraint. It is difficult to allow individually varying time-points. Another disadvantage is that LGC models are more susceptible to convergence problems. Mixed-effects models handle missing values straightforwardly; LGC models can have convergence problems here. The main advantage is that LGC models are relatively easy to link into more complicated path models.

36 Multivariate measurement models
[Path diagram: at each of three time-points, indicators load on a factor (f_1, f_2, f_3); the factors are linked into a growth model with intercept i and slope s.]
Each time-point is a multivariate measurement model. These are linked into a latent growth curve model. The aim of these models is greater reliability through multivariate measurement: the measurement models measure what is common to the set of indicators at each time-point, and reject differential sources of measurement error. But it is necessary to establish longitudinal measurement invariance, to be sure the measurement models measure the same thing in the same way at each time-point.

37 Bivariate (cross-lagged) LGC model
[Path diagram: two LGC models, one for y1-y4 with factors i_y and s_y, one for x1-x4 with factors i_x and s_x, linked by cross-lagged paths γ_y and γ_x between the growth factors.]
Two growth processes, each modelled by a LGC model. The models are linked by cross-lagged regressions. These specify association at the level of growth factors: is the slope of one process determined by the baseline level of the other process?

38 Laird-Ware mixed-effects model
General mixed-effects model for the i-th person:
Y_i = X_i β + Z_i u_i + e_i
X_i is a design matrix for the fixed effects β; Z_i is a design matrix for the random effects u_i. The columns of Z_i are a subset of the columns of X_i (your choice of random effects). Z_i must contain only TVCs (time-varying within-subject covariates, such as time itself). The remaining columns of X_i must contain only TICs (between-subject covariates that are constant over time). The standard model (random intercepts and slopes) in Laird-Ware form, with no TICs included so that the columns of Z_i are all the columns of X_i:
[y_i1]   [1  x_i1]          [1  x_i1]          [e_i1]
[y_i2] = [1  x_i2] [β_0]  + [1  x_i2] [u_0i] + [e_i2]
[y_i3]   [1  x_i3] [β_1]    [1  x_i3] [u_1i]   [e_i3]
[y_i4]   [1  x_i4]          [1  x_i4]          [e_i4]

         [β_0 + β_1 x_i1]   [u_0i + u_1i x_i1]   [e_i1]
       = [β_0 + β_1 x_i2] + [u_0i + u_1i x_i2] + [e_i2]
         [β_0 + β_1 x_i3]   [u_0i + u_1i x_i3]   [e_i3]
         [β_0 + β_1 x_i4]   [u_0i + u_1i x_i4]   [e_i4]
Row by row:
y_ij = β_0 + β_1 x_ij + u_0i + u_1i x_ij + e_ij = (β_0 + u_0i) + (β_1 + u_1i) x_ij + e_ij
The random terms collected together form the composite residual:
y_ij = β_0 + β_1 x_ij + u_0i + u_1i x_ij + e_ij = β_0 + β_1 x_ij + ε_ij
The composite residuals of a mixed-effects model are more complicated than the independent residuals assumed for an OLS regression model. They depend upon time x. This gives the residuals a variance-covariance structure.
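The design matrices that lme4 builds can be inspected directly; a minimal sketch, assuming the standard model fitted as before:
R: library(lme4)
   fit = lmer(y ~ x + (x | i), data)
   head(getME(fit, "X"))   # fixed-effects design matrix: a column of 1s and x
   getME(fit, "Z")         # sparse random-effects design matrix, one block per subject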

39 The composite residual
The standard model has a composite residual that depends upon time:
y_ij = β_0 + β_1 x_ij + u_0i + u_1i x_ij + e_ij,   where ε_ij = u_0i + u_1i x_ij + e_ij
For the standard mixed-effects model, the residual variance (the diagonal elements of the variance-covariance matrix) is:
Var(ε_ij) = σ²_e + σ²_0 + 2 σ_01 x_ij + σ²_1 x²_ij
and the residual covariance between measurement occasions j and j′ (the off-diagonal elements) is:
Cov(ε_ij, ε_ij′) = σ²_0 + σ_01 (x_ij + x_ij′) + σ²_1 x_ij x_ij′
For the general mixed-effects model:
Cov(Y_i) = Z_i Cov(u_i) Z_i′ + σ²_e I_n
Random effects in the model induce a residual variance-covariance structure.
Residual variance depends upon time: it is heteroskedastic, and may be different at different time-points. Random effects in a model induce a correlation structure (a pattern of variances and covariances amongst the residuals). Without random effects the variance-covariance matrix reduces to:
Cov(Y_i) = σ²_e I_n = diag(σ²_e, ..., σ²_e)
This represents the OLS assumptions: homoskedasticity (identical variances on the diagonal) and uncorrelated residuals (zero covariances off the diagonal). Random effects induce structure (patterns) in the variance-covariance matrix. The model-implied correlation structure depends upon your choice of random effects. The aim is to choose a structure that reflects correlations in the observed data.
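A minimal sketch of computing the model-implied residual variance as a function of time from the estimated variance components (names as before; the selection of VarCorr rows by their var1/var2 labels is an assumption about that table's layout):
R: library(lme4)
   fit = lmer(y ~ x + (x | i), data)
   vc = as.data.frame(VarCorr(fit))
   v0  = vc$vcov[which(vc$var1 == "(Intercept)" & is.na(vc$var2))]   # sigma0^2
   v1  = vc$vcov[which(vc$var1 == "x" & is.na(vc$var2))]             # sigma1^2
   c01 = vc$vcov[which(vc$var1 == "(Intercept)" & vc$var2 == "x")]   # sigma01
   ve  = vc$vcov[which(vc$grp == "Residual")]                        # sigma_e^2
   resvar = function(x) ve + v0 + 2*c01*x + v1*x^2   # Var(epsilon_ij) at time x
   resvar(0:18)     # residual variance across the observed time range
   -c01 / v1        # the time at which the variance is smallest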

40 Time-dependent variance and heteroskedasticity
The induced variance-covariance structure depends upon time, and consequently it can reflect heteroskedasticity. The residual variance
σ²_e + σ²_0 + 2 σ_01 x_ij + σ²_1 x²_ij
is a parabola in time, with its minimum at x = −σ_01/σ²_1 and curvature 2σ²_1. Either side of the minimum the variance changes monotonically with time. The location of the minimum determines where variance increases or decreases. Increasing variance with time reflects a fan-out pattern of growth curves; decreasing variance reflects fan-in. The smaller the slope variance σ²_1, the less the curvature and the more homoskedastic the residuals.
The diagonal of Cov(Y_i) is the residual variance at different time-points. Homoskedasticity assumes it is the same at all time-points; heteroskedasticity means it changes over time. Residual variance depends upon time when the model includes random effects. The form this takes provides some account of heteroskedasticity in the data. The standard model (random intercepts and slope of time) induces a parabola. This accounts for the typical fan-in/fan-out patterns of growth curves. The time location of the minimum variance depends upon the slope variance and the intercept-slope covariance; the slope variance is usually dominant. Any TVCs added to the model must be added to Z_i so they appear at level-1. But the induced variance-covariance also depends upon Z_i. Therefore adding further TVCs makes the variance-covariance structure more complex and time-dependent.

41 Correlation structure
Correlation structure is a pattern in the variance-covariance matrix. The pattern in the block of the i-th subject's residuals is assumed the same for all subjects. Unstructured assumes no pattern: all variances and covariances may be different.
             [ σ²_1  σ_12  ...  σ_1n ]
Cov(Y_i) =   [ σ_21  σ²_2  ...       ]
             [  ...        ...       ]
             [ σ_n1  ...        σ²_n ]
Independence assumes a strong pattern (the OLS assumption): all variances are equal and all covariances are 0.
Cov(Y_i) = σ²_e I_n = diag(σ²_e, ..., σ²_e)
The i-th subject has n repeated measures, j = 1,...,n. Time-dependent variance on the diagonal reflects heteroskedasticity. Time-dependent covariance off the diagonal reflects serial correlation.

42 Covariance patterns and serial correlation
Two ways to add covariance structure:
1. Your choice of random effects induces a certain correlation structure.
2. Some programs provide options for a range of correlation structures. These aim to account for patterns of serial correlation.
Independence: the matrix has a diagonal structure; all variances are equal and all covariances are 0. Exchangeable (compound symmetry): all variances are equal, and all covariances are equal. Toeplitz: all variances are equal; covariance is the same across equal time intervals, which leads to a diagonally banded structure. AR(1): a first-order autoregressive relationship between successive time-points, e_ij = ρ e_i,j−1 + w_ij; all variances are equal, and covariance decreases as the time interval increases. Unstructured: no constraints; every variance and covariance is free to be estimated.
Correlation structure exploits stable patterns of residual variance-covariance in order to apply constraints and reduce the number of parameters to estimate. It is a trade-off between model fit and degrees of freedom. Unstructured correlation may give a better fit, but the model may be unestimable: there may not be enough unique bits of information in the data to estimate all the required parameters.
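lme4 does not provide residual correlation structures, but the nlme package does. A minimal sketch (assuming long-format data with time-point indicator j and person indicator i, names as before):
R: library(nlme)
   # random intercept with AR(1) serial correlation among the residuals
   fit.ar1 = lme(y ~ x, random = ~ 1 | i,
                 correlation = corAR1(form = ~ j | i), data = data)
   # exchangeable (compound symmetry) residual correlation instead
   fit.cs = lme(y ~ x, random = ~ 1 | i,
                correlation = corCompSymm(form = ~ 1 | i), data = data)
   anova(fit.ar1, fit.cs)   # compare the two structures by AIC/BIC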

43 Parallel slopes model
Regression equation for the i-th person with subject-specific deviations from the average intercept:
y_ij = (β_0 + u_0i) + β_1 x_ij + e_ij = β_0 + β_1 x_ij + ε_ij,   where ε_ij = u_0i + e_ij
Var(ε_ij) = σ²_e + σ²_0
Cov(ε_ij, ε_ij′) = σ²_0
Compound symmetry (sphericity): the residual variance-covariance is not time-dependent. Variance is constant at all time-points (homoskedasticity). Covariance is equal between any pair of time-points.
R: fit = lmer(y ~ x + (1 | i), data)
   summary(fit)
Random effects:
 Groups   Name         Variance  Std.Dev.
 id       (Intercept)
 Residual
Fixed effects:
             Estimate  Std. Error  df  t value  Pr(>|t|)
(Intercept)
x
The parallel slopes model induces an exchangeable correlation structure: all variances are equal (σ²_e + σ²_0 on the diagonal) and all covariances across time-points are equal (σ²_0 off the diagonal). This is also called compound symmetry (or sphericity). Observations separated in time are assumed to be correlated, but the correlation is assumed to be the same between any pair of time-points regardless of how far apart in time. Compared with the standard model (random intercepts and slopes): the fixed effects are the same; the within-person residual variance is higher, because the restriction of parallel slopes does not fit so well and the slope variation has been lumped into the residual variance; and the standard errors for the fixed effects are lower, because there are fewer parameters to estimate (no slope variance or intercept-slope covariance), so more degrees of freedom.

44 Model specification
Two sides of model specification:
1. Specify the functional form of the growth model: for example a straight line, or a quadratic curve, etc.
2. Specify the residual variance-covariance. This has two sides:
a. Choose random effects; your choice induces a variance-covariance structure.
b. Specify program options for variance-covariance structure, if provided.
How to choose random effects? This can be guided by model goodness-of-fit and comparison.

45 Model comparison
Assess random effects specifications by comparing nested models fitted to the same data using FIML. AIC, BIC, and deviance (lowest is best). Likelihood ratio test (chi-squared test of the difference in goodness-of-fit).
R: fit1 = lmer(y ~ x + (1 | i), data, REML=FALSE)    # a
   fit2 = lmer(y ~ x + (x || i), data, REML=FALSE)   # b
   fit3 = lmer(y ~ x + (x | i), data, REML=FALSE)    # c
   anova(fit1, fit2, fit3)
(The anova table reports Df, AIC, BIC, logLik, deviance, and chi-squared tests between successive models.)
a. Random intercept only (parallel slopes).
b. Independent random intercepts and slopes.
c. Covarying random intercepts and slopes (unstructured).
Which model fits best? BIC suggests model (a). AIC suggests model (b). Model comparison by the LR test suggests there is no significant difference between models (a) and (b), or between models (b) and (c). Conclusion: if the fixed effects are the main interest and the variance components are nuisance parameters, the random-intercept-only model (a) might be preferred. If the variance components are of interest, the random intercept and slope model (b) might be preferred. There is no significant benefit to allowing intercept-slope covariance.

46 Model comparison
Stata: mixed y x || i:                                  (a)
       estimates store fit1
       mixed y x || i: x                                (b)
       estimates store fit2
       mixed y x || i: x, covariance(unstructured)      (c)
       estimates store fit3
       lrtest fit1 fit2
       lrtest fit2 fit3
Likelihood-ratio test    LR chi2(1) = 3.51    (Assumption: fit1 nested in fit2)
Likelihood-ratio test    LR chi2(1) = 1.16    (Assumption: fit2 nested in fit3)
a. Random intercept only (parallel slopes).
b. Independent random intercepts and slopes.
c. Covarying random intercepts and slopes.
The same model comparison procedure using LR tests in Stata.

47 Including covariates to explain away residual variance
In the standard 2-level model, level-1 is the within-person or individual level, and level-2 is the between-person or group level. The levels decompose within and between-person variance. Between-person variance is further decomposed into variance of the growth parameters. One kind of variance might be the research interest, the others a nuisance to be controlled. Either way, variance is explained by including covariates. Covariates are classified according to the kinds of variation they can explain. Time-varying covariates (TVCs) are variables that change over time (eg. age); they explain variation within-person. Time-invariant covariates (TICs) are variables that are constant over time (eg. sex); they explain variation between-persons.
Level-1 describes change in the i-th person using variables that change over time. Level-2 describes differences in change between-persons using time-invariant variables that have different levels for different people. Time-invariant variables have time-invariant effects: they explain individual differences that are constant over time. This does not imply there is no differential growth. A straight-line model with random slopes, for example, allows people to grow differently with a constant difference in their growth rates. The order of the difference is determined by the model: a quadratic model, for example, allows the growth rate to change but assumes a constant 2nd-order difference in curvature.

48 Longitudinal dataset with a TIC
[Table: wide format with columns y.1-y.4 (the four occasions) and z (the TIC); long format with columns i, j, y, x, z.]
Willett J. B. (1988) Review of Research in Education.
A panel of 35 subjects were assessed at baseline for their cognitive function (z). The subjects were given an opposites-naming task on each of four consecutive days: they were given a long list of words and had to name the opposite of each word as quickly as possible. The data were the count of how many opposites they could name in 10 minutes. y.1 are the counts of the 35 persons on day 1, and so forth. The researcher was interested in whether clever people's performance improved at a faster rate. Their baseline cognitive function was assumed not to change over the four days. These data are strongly balanced: everyone has the same measurement times x with no missing time-points. Variable z is a TIC: by definition it does not change over time. It needs only to be measured once in each person, for example at baseline. TICs in long format must be repeated within-person at each time-point.

49 TICs and residual between-person variance
A TIC can only explain between-person variation. It cannot explain within-person variation because it is constant within-person. Between-person variation is further decomposed into growth parameters (eg. intercept and slopes). A TIC can be used to explain some or all of these parts, depending upon how it enters the model.
TIC effect on the intercept only:
y_ij = (β_00 + β_01 z_i + u_0i) + (β_10 + u_1i) x_ij + e_ij
As a 2-level model:
Level-1: y_ij = π_0i + π_1i x_ij + e_ij
Level-2: π_0i = β_00 + β_01 z_i + u_0i
         π_1i = β_10 + u_1i
TIC effects on the intercept and slope:
y_ij = (β_00 + β_01 z_i + u_0i) + (β_10 + β_11 z_i + u_1i) x_ij + e_ij
As a 2-level model:
Level-1: y_ij = π_0i + π_1i x_ij + e_ij
Level-2: π_0i = β_00 + β_01 z_i + u_0i
         π_1i = β_10 + β_11 z_i + u_1i
TICs appear as level-2 covariates. TICs are assumed to stay constant within subjects. It makes no sense for TICs to have random effects within subjects: they don't change within subjects, so they can't change differently between subjects. One subject's constant TIC value may be different from another's, so there may be a TIC effect between subjects. For example, sex may have a fixed effect upon the slope: the slope may be different between female and male.

50 TIC direct effect and cross-level interaction
A TIC effect on the intercept enters the model as a direct (main) effect:
y_ij = (β_00 + β_01 z_i) + β_10 x_ij + ε_ij = β_00 + β_10 x_ij + β_01 z_i + ε_ij
where β_01 z_i is the direct effect.
R: lmer(y ~ 1 + x + z + (1 + x | i), data)
   lmer(y ~ x + z + (x | i), data)   # shorthand (implied intercept)
TIC effects on the intercept and slope of time enter the model as a direct effect and a cross-level interaction with time (a product term):
y_ij = (β_00 + β_01 z_i) + (β_10 + β_11 z_i) x_ij + ε_ij = β_00 + β_10 x_ij + β_01 z_i + β_11 z_i x_ij + ε_ij
where β_01 z_i is the direct effect and β_11 z_i x_ij is the interaction.
R: lmer(y ~ 1 + x + z + z:x + (1 + x | i), data)
   lmer(y ~ x * z + (x | i), data)   # shorthand
Error terms are collected into a composite residual ε_ij for convenience. The models include a random intercept and slope of time (x) and their covariance. The R formula syntax tries to look like the model equation. The cross-level interaction describes how an individual-level variable such as the slope of time (at level-1) is moderated by a group-level variable such as a TIC (at level-2). Interactions depend upon their constituent direct effects and how they are centered. If the interaction x:z is significant, the effect of x is conditional upon the value z is centered on, and vice versa.
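A minimal sketch of centering the TIC before fitting the cross-level interaction, so the effect of time is interpreted at the average level of z (names as assumed before):
R: library(lme4)
   data$zc = data$z - mean(data$z)          # grand-mean centre the TIC
   fit = lmer(y ~ x * zc + (x | i), data)   # direct effects plus cross-level interaction
   summary(fit)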


A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form Outline Statistical inference for linear mixed models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark general form of linear mixed models examples of analyses using linear mixed

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 217, Boston, Massachusetts Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Modelling the Covariance

Modelling the Covariance Modelling the Covariance Jamie Monogan Washington University in St Louis February 9, 2010 Jamie Monogan (WUStL) Modelling the Covariance February 9, 2010 1 / 13 Objectives By the end of this meeting, participants

More information

36-463/663: Hierarchical Linear Models

36-463/663: Hierarchical Linear Models 36-463/663: Hierarchical Linear Models Lmer model selection and residuals Brian Junker 132E Baker Hall brian@stat.cmu.edu 1 Outline The London Schools Data (again!) A nice random-intercepts, random-slopes

More information

1 A Review of Correlation and Regression

1 A Review of Correlation and Regression 1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then

More information

Longitudinal Invariance CFA (using MLR) Example in Mplus v. 7.4 (N = 151; 6 items over 3 occasions)

Longitudinal Invariance CFA (using MLR) Example in Mplus v. 7.4 (N = 151; 6 items over 3 occasions) Longitudinal Invariance CFA (using MLR) Example in Mplus v. 7.4 (N = 151; 6 items over 3 occasions) CLP 948 Example 7b page 1 These data measuring a latent trait of social functioning were collected at

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Multilevel/Mixed Models and Longitudinal Analysis Using Stata

Multilevel/Mixed Models and Longitudinal Analysis Using Stata Multilevel/Mixed Models and Longitudinal Analysis Using Stata Isaac J. Washburn PhD Research Associate Oregon Social Learning Center Summer Workshop Series July 2010 Longitudinal Analysis 1 Longitudinal

More information

SAS Syntax and Output for Data Manipulation:

SAS Syntax and Output for Data Manipulation: CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

Monday 7 th Febraury 2005

Monday 7 th Febraury 2005 Monday 7 th Febraury 2 Analysis of Pigs data Data: Body weights of 48 pigs at 9 successive follow-up visits. This is an equally spaced data. It is always a good habit to reshape the data, so we can easily

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM 1 REGRESSION AND CORRELATION As we learned in Chapter 9 ( Bivariate Tables ), the differential access to the Internet is real and persistent. Celeste Campos-Castillo s (015) research confirmed the impact

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 12 1 / 34 Correlated data multivariate observations clustered data repeated measurement

More information

Lab 11 - Heteroskedasticity

Lab 11 - Heteroskedasticity Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction

More information

Interactions among Continuous Predictors

Interactions among Continuous Predictors Interactions among Continuous Predictors Today s Class: Simple main effects within two-way interactions Conquering TEST/ESTIMATE/LINCOM statements Regions of significance Three-way interactions (and beyond

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure. STATGRAPHICS Rev. 9/13/213 Calibration Models Summary... 1 Data Input... 3 Analysis Summary... 5 Analysis Options... 7 Plot of Fitted Model... 9 Predicted Values... 1 Confidence Intervals... 11 Observed

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

De-mystifying random effects models

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Value Added Modeling

Value Added Modeling Value Added Modeling Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background for VAMs Recall from previous lectures

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Mixed effects models

Mixed effects models Mixed effects models The basic theory and application in R Mitchel van Loon Research Paper Business Analytics Mixed effects models The basic theory and application in R Author: Mitchel van Loon Research

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects Covariance Models (*) Mixed Models Laird & Ware (1982) Y i = X i β + Z i b i + e i Y i : (n i 1) response vector X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed

More information

Greene, Econometric Analysis (7th ed, 2012)

Greene, Econometric Analysis (7th ed, 2012) EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

More information

Correlated Data: Linear Mixed Models with Random Intercepts

Correlated Data: Linear Mixed Models with Random Intercepts 1 Correlated Data: Linear Mixed Models with Random Intercepts Mixed Effects Models This lecture introduces linear mixed effects models. Linear mixed models are a type of regression model, which generalise

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

Lecture 3 Linear random intercept models

Lecture 3 Linear random intercept models Lecture 3 Linear random intercept models Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The response is measures at n different times, or under

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

1 The basics of panel data

1 The basics of panel data Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge

More information

Estimation and Centering

Estimation and Centering Estimation and Centering PSYED 3486 Feifei Ye University of Pittsburgh Main Topics Estimating the level-1 coefficients for a particular unit Reading: R&B, Chapter 3 (p85-94) Centering-Location of X Reading

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one

More information

Regression. ECO 312 Fall 2013 Chris Sims. January 12, 2014

Regression. ECO 312 Fall 2013 Chris Sims. January 12, 2014 ECO 312 Fall 2013 Chris Sims Regression January 12, 2014 c 2014 by Christopher A. Sims. This document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License What

More information

Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions

Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions Douglas Bates Department of Statistics University of Wisconsin - Madison Madison January 11, 2011

More information

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology Serial Correlation Edps/Psych/Stat 587 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 017 Model for Level 1 Residuals There are three sources

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

Economics 308: Econometrics Professor Moody

Economics 308: Econometrics Professor Moody Economics 308: Econometrics Professor Moody References on reserve: Text Moody, Basic Econometrics with Stata (BES) Pindyck and Rubinfeld, Econometric Models and Economic Forecasts (PR) Wooldridge, Jeffrey

More information

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective

More information