Part III: Linear mixed effects models

Size: px
Start display at page:

Download "Part III: Linear mixed effects models"

Transcription

1 Part III: Linear mixed effects models 159 BIO 245, Spring 2018

2 Dental growth data Females Males Length, mm Length, mm Age, years Age, years One possible scientific goal is to estimate and formally compare the average growth trajectory between males and females 160 BIO 245, Spring 2018

3 In Part II we considered two-stage least squares as one possible analysis strategy: (1) estimate the growth trajectory for each child: Y k = Z k β k + ǫ k child-specific regression parameters, β k ǫ k are observation-specific random variations around each subjects underlying growth curve assumed to be i.i.d with mean 0 and variance σ 2 k (2) consider differences in the child-specific coefficients between males and females β k = W k β + γ k β characterizes the mean growth curve in the population (of children) γ k are child-specific deviations around the population growth curve assumed to be i.i.d with mean 0 and variance-covariance matrix G 161 BIO 245, Spring 2018

4 Results: Fitted lines Coefficients Length, mm Slope Age, years Intercept > summary(lm(beta1 ~ gender, data=betamat))... Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) gender * BIO 245, Spring 2018

5 Issues The design matrix is constrained at each stage of the analysis stage 1 is restricted to within-subject covariates stage 2 is restricted to between-subject covariates Information is lost by having summarized the response vector for subject k at stage 1 Differential uncertainty in the β k is not accounted for at stage 2 estimates, not the true β k uncertainty depends on n k and when the observations are collected The fact that observations are correlated is ignored All of these are key motivators for combining stages 1 and 2 into a single model formulation 163 BIO 245, Spring 2018

6 Linear mixed effects models Basic idea Assume that each cluster has a regression model characterized by cluster-specific regression parameters Structure the cluster-specific regressions across the population of clusters via a series of p fixed effects parameters that are common to all clusters in the population q random effects parameters that permit cluster-specific perturbations As we will see, both sets of parameters allow for differences in the regressions across subjects 164 BIO 245, Spring 2018

7 Notation Y k = (Y k1, Y k2,..., Y knk ) T Response vector β = (β 1, β 2,..., β p ) T X ki = (X ki,1, X ki,2,..., X ki,p ) X k = (X k1, X k2,..., X knk ) T γ k = (γ k1, γ k2,..., γ kq ) T Z ki = (Z ki,1, Z ki,2,..., Z ki,q ) Z k = (Z k1, Z k2,..., Z knk ) T ǫ k = (ǫ k1, ǫ k2,..., ǫ knk ) T Fixed effects Design matrix for the fixed effects n k p Random effects Design matrix for the random effects n k q Residual error terms 165 BIO 245, Spring 2018

8 Definition By a linear mixed effects model, we mean a statistical model with the following assumptions/components: model for the response of the k th cluster, given the fixed and random effects: Y k = X k β + Z k γ k + ǫ k model for the random effects and residual error: E[γ k ] = 0 E[ǫ k ] = 0 Cov[γ k ] = G(α) Cov[ǫ k ] = R k (α) Cov[γ k,ǫ k ] = 0 α referred to as covariance parameters responses across clusters are independent of eachother 166 BIO 245, Spring 2018

9 Note, this specification defines a marginal linear model: E[Y k X k ] = E γ [E Y [Y k X k, γ k ]] = E γ [X k β + Z k γ k ] = X k β µ k (β) Cov[Y k X k ] = Cov γ [E Y [Y k X k, γ k ]] + E γ [Cov Y [Y k X k, γ k ]] = Cov γ [X k β + Z k γ k ] + E γ [R k (α)] = Z k G(α)Zk T + R k (α) Σ k (α) Cov[Y k,y k ] = 0 In principle, therefore, we could use the methods developed in Part II to estimate components of this model 167 BIO 245, Spring 2018

10 From the expression for Σ(α), we can see that G k (α) and R k (α) jointly characterize variation in the observed responses around the population mean induced by β Σ k (α) = Z k G(α)Z T k + R k (α) G(α) is a q q matrix that quantifies random variation in the trajectories across clusters R k (α) is a n k n k matrix that quantifies random variation within subjects 168 BIO 245, Spring 2018

11 Interpretation Consider the interpretation of β in: Y k = X k β + Z k γ k + ǫ k Following standard principles, the components of β correspond to differences in the mean response between specific populations determined by the specific covariate under consideration require holding other terms constant, including the random effects interpretation is therefore conditional on the random effect Since the γ k are continuous, in theory they are unique to each cluster interpretation of β is therefore cluster-specific Some folks find this unappealing simultaneously thinking about averages and yet we are conditioning on a specific cluster 169 BIO 245, Spring 2018

12 Recall, however, the induced marginal model: E[Y k X k ] = X k β the interpretation of β in this model does not require conditioning on the subject interpretation is therefore marginal with respect to cluster membership more in line with our usual interpretation of regression parameters The main point here is that despite the model being conditionally specified, since the true conditional and marginal parameters (i.e. the regression coefficients) are the same numerically one can interpret results from a linear mixed effects model as pertaining to marginal associations relevant key question becomes how do you estimate β as well as perform inference? 170 BIO 245, Spring 2018

13 Q: What about the interpretation of the random effects? often interpreted as latent characteristics that are specific to the subject represent the collective impact of factors relevant to the response but not included in the model Need to be careful, however, not to interpret their inclusion as adjusting for unmeasured confounding 171 BIO 245, Spring 2018

14 Special case #1: Random intercepts model Arguably the simplest mixed effects model is one in which a single random intercept is introduced: Y ki = X T kiβ + γ 0k + ǫ ki γ 0k is a cluster-specific deviation around the population intercept, β 0 assume E[γ 0k ] = 0 and V[γ 0k ] = σ 2 γ assume E[ǫ ki ] = 0 and V[ǫ ki ] = σ 2 ǫ Consider the induced marginal covariance, Σ k, by considering two specific study units: Y ki = X T kiβ + γ 0k + ǫ ki Y kj = X T kjβ + γ 0k + ǫ kj notice how the two observations share the same value of γ 0k 172 BIO 245, Spring 2018

15 We then have: V[Y ki ] = V γ [E Y [Y ki γ 0k ]] + E γ [V Y [Y ki γ 0k ]] = σ 2 γ + σ 2 ǫ Cov[Y ki,y kj ] = Cov γ [E Y [Y ki γ 0k ],E Y [Y kj γ 0k ]] = σ 2 γ + E γ [Cov Y [Y ki,y kj γ 0k ]] so that for α = (σγ, 2 σǫ): 2 Σ(α) = σ 2 γ +σ 2 ǫ σ 2 γ... σ 2 γ σγ 2 σγ 2 +σǫ 2... σγ σ 2 γ σ 2 γ... σ 2 γ +σ 2 ǫ 173 BIO 245, Spring 2018

16 Consequently, the induced marginal model has an exchangeable dependence structure Notice that the correlation between two study units is: Cor[Y ki,y kj ] = σ 2 γ σ 2 γ + σ 2 ǫ = Between Between + Within [0,1] only positive correlation is entertained degree of correlation depends on the interplay between the between-cluster variation and the within-cluster variation in the responses if σ 2 γ = 0 there is no correlation if σ 2 ǫ = 0 there is perfect (positive) correlation 174 BIO 245, Spring 2018

17 Visualization of K=10 clusters with n k =5: σ γ = 1; σ ε = 1 σ γ = 1; σ ε = 0 Y ki Y ki T ki T ki σ γ = 0; σ ε = 1 σ γ = 2; σ ε = 1 Y ki Y ki T ki T ki 175 BIO 245, Spring 2018

18 Special case #2: Random intercepts/slopes model The random intercept model can be extended to permit the effects of select covariates to vary across the clusters For example in a longitudinal study in which T ki is the timing of the i th observation in the k th cluster, we might specify the model: Y ki = X T kiβ + γ 0k + T ki γ 1k + ǫ ki γ 0k is a cluster-specific deviation around the population intercept γ 1k is a cluster-specific deviation around the population slope for time assume T ki is an element of X ki assume Cov[γ k ] = G(α) = Σ γ,00 Σ γ,01 Σ γ,01 Σ γ,11 assume E[ǫ ki ] = 0 and V[ǫ ki ] = σǫ BIO 245, Spring 2018

19 Again consider the induced marginal covariance, Σ k, by considering two specific study units: Y ki = β 0 + β 1 T ki + γ 0k + γ 1k T ki + ǫ ki Y kj = β 0 + β 1 T kj + γ 0k + γ 1k T kj + ǫ kj for simplicity, just have the intercept and a term for T ki in X k We then have: V[Y ki ] = V γ [E Y [Y ki γ k ]] + E γ [V Y [Y ki γ k ]] = Σ γ,00 + 2Σ γ,01 T ki + Σ γ,11 T 2 ki + σ 2 ǫ Cov[Y ki,y kj ] = Cov γ [E Y [Y ki γ],e Y [Y kj γ k ]] + E γ [Cov Y [Y ki,y kj γ k ]] = Σ γ,00 + Σ γ,01 (T ki +T kj ) + Σ γ,11 T ki T kj 177 BIO 245, Spring 2018

20 In contrast to the exchangeable structure induced by the random intercepts model, the random intercepts/slopes model permits: heteroskedasticity across the study units within a cluster covariance (correlation) to depend on the two study units under consideration Note both of these extensions are specific to the covariate whose slope is permitted to vary across the clusters In principle, one could include any element of X k in Z k depending on what we believe about the structure of dependence the scientific interest(s) 178 BIO 245, Spring 2018

21 Visualization of K=10 clusters with n k =5: Σ γ00 = 1; Σ γ01 = 0; Σ γ11 = 1; σ ε = 0 Σ γ00 = 0.1; Σ γ01 = 0; Σ γ11 = 1; σ ε = 0 Y ki Y ki T ki T ki Σ γ00 = 1; Σ γ01 = 0; Σ γ11 = 0.1; σ ε = 0 Σ γ00 = 1; Σ γ01 = 0.8; Σ γ11 = 1; σ ε = 0 Y ki Y ki T ki T ki 179 BIO 245, Spring 2018

22 Special case #3: Random intercepts/serial dependence model The two special cases we ve seen so far build flexibility in the first component of: Cov[Y k X k ] = Σ k (α) = Z k G(α)Z T k }{{} Random effects + R k (α) }{{} Residual error i.e. solely via the random effects An alternative strategy to introducing flexibility is to explicitly model the correlation structure in the R(α) i.e. move beyond and permit the ǫ ki to be correlated R k (α) = σ 2 ǫi nk 180 BIO 245, Spring 2018

23 In a longitudinal study, one could do this by postulating that the residual error terms arise, in part, due to some (latent) stochastic process introduces serial dependence For example, the random intercepts/serial dependence model: Y ki = X T kiβ + γ 0k + W k (T ki ) + ǫ ki γ 0k is a cluster-specific random intercept assume E[γ 0k ] = 0 and V[γ 0k ] = σ 2 γ W k (T ki ) is a serial dependence term assume E[ǫ ki ] = 0 and V[ǫ ki ] = σ2 ǫ Assume the stochastic process W k ( ) is mean zero and is characterized by its covariance function: Cov[W k (T ki ), W k (T kj )] = σ 2 Wρ(U k,ij ) where U k,ij = T ki T kj 181 BIO 245, Spring 2018

24 Note, the ǫ ki terms in this model don t have the same interpretation as in the two previous models capture variation in the error terms beyond that explained by W k ( ) Under this model, assuming independence between γ 0k and W k ( ), for two specific study units we have: Y ki = β 0 + β 1 T ki + γ 0k + W k (T ki ) + ǫ ki Y kj = β 0 + β 1 T kj + γ 0k + W k (T kj ) + ǫ kj V[Y ki ] = V γ [E Y [Y ki γ 0k ]] + E γ [V Y [Y ki γ 0k ]] = σ 2 γ + σ 2 W + σ 2 ǫ Cov[Y ki,y kj ] = Cov γ [E Y [Y ki γ 0k ],E Y [Y kj γ 0k ]] + E γ [Cov Y [Y ki,y kj γ 0k ]] = σγ 2 + σwρ(u 2 k,ij ) 182 BIO 245, Spring 2018

25 Notice that the correlation between two study units is: Cor[Y ki,y kj ] = σ2 γ + σ 2 W ρ(u k,ij) σ 2 γ + σ 2 W + σ2 ǫ Specific examples for the correlation function include: ρ(u k,ij ) = ρ U kij ρ(u k,ij ) = exp{ U kij /range} auto-regressive model exponential spatial correlation 183 BIO 245, Spring 2018

26 Visualization of K=10 clusters with n k =5 based on an auto-regressive correlation function for W k ( ): σ γ = 0; σ W = 1; ρ W = 1.0; σ ε = 0 σ γ = 0; σ W = 1; ρ W = 0.8; σ ε = 0 Y ki Y ki T ki T ki σ γ = 0; σ W = 1; ρ W = 0.0; σ ε = 0 σ γ = 0; σ W = 2; ρ W = 0.8; σ ε = 0 Y ki Y ki T ki T ki 184 BIO 245, Spring 2018

27 Comment The random intercepts/slopes and the random intercepts/serial dependence models both structure Σ k (α) such that correlation among study units depends, in part, on the timing of their measurement One way of thinking how to distinguish between the two models is that the: random intercepts/slopes model structures dependence globally random intercepts/serial dependence structures dependence locally 185 BIO 245, Spring 2018

28 Estimation/inference Model components: response of the k th cluster, given the fixed and random effects: Y k = X k β + Z k γ k + ǫ k random effects and residual error: E[γ k ] = 0 E[ǫ k ] = 0 Cov[γ k ] = G(α) Cov[ǫ k ] = R k (α) Cov[γ k,ǫ k ] = 0 responses across clusters are independent of eachother If we are to proceed on the basis of likelihood-based estimation/inference we need to specify a complete probability distribution for the data 186 BIO 245, Spring 2018

29 The marginalized likelihood The most common way forward is to assume that and that γ k MVN q (0,G(α)) ǫ k MVN nk (0,R k (α)) γ k ǫ k Given these assumptions, we can write down the marginal distribution of the response for the k th cluster as: f Y (Y k β,α) = f(y k,γ k β,α) γ k = f Y γ (Y k γ k,β,α)f γ (γ k α) γ k marginal with respect to the random effects a convolution of two multivariate Normal distributions 187 BIO 245, Spring 2018

30 Since the convolution of two Normal distributions is itself a Normal distribution, and using results we derived on slide 167, we can write that: Y k β,α MVN nk (X k β, Σ k (α)) where Σ k (α) = Z k G(α)Z T k +R k (α) Use this as a basis for an intergated or marginal likelihood that can be used to perform estimation/inference with respect to (β, α): L(β,α) = K k=1 f Y (Y k β,α) assume that contributions from different clusters are independent From this, the log-likelihood is proportional to: l(β,α) K K log Σ k (α) + (Y k X k β) T Σ k (α) 1 (Y k X k β) k=1 k=1 188 BIO 245, Spring 2018

31 ML estimation of β Differentiating with respect to β yields the score: U β (β,α) = β l(β,α) = K k=1 X T k Σ k (α) 1 (Y k X k β) which yields the familiar MLE for β given α: ( K ) 1( K ) β(α) = Xk T Σ k (α) 1 X k Xk T Σ k (α) 1 Y k k=1 k=1 The covariance of β(α) is based on the inverse of the Fisher information matrix: [ 2 ] K I β (β,α) = E β β T l(β,α) = Xk T Σ k (α) 1 X k k=1 denoted this by A(α) when discussing least squares estimation 189 BIO 245, Spring 2018

32 Notice that the MLE is equivalent to the WLS estimator with W taken to be Σ k (α) 1 we could therefore use the theory of WLS to perform estimation/ inference for β since β(α) is a linear combination of the Y k, we have: β(α) MVN p (β, A(α) 1 ) since α is unknown, we could plug-in any consistent estimator and appeal to Slutsky s Theorem to see that: as K A( α) 1/2 ( β( α) β) MVN p (0, I) note, it is not sufficient for n k for fixed K 190 BIO 245, Spring 2018

33 ML estimation of α Differentiating the log-likelihood with respect to α yields: U α (β,α) = α l(β,α) K [ = (Y k X k β) T Σ k (α) 1 Σ k α Σ k(α) 1 (Y k X k β) k=1 trace ( )] Σ 1 Σ k k α In general one won t be able to solve U α (β,α) = 0 analytically However one proceeds, the solution α must satisfy the constraint that the resulting covariance matrix Σ( α) is positive definite several computation methods for parameterizing Σ(α) so that this constraint is satisfied e.g. Cholesky decomposition 191 BIO 245, Spring 2018

34 REML estimation Recall that, in general, the MLE for variance components in a multivariate Normal exhibit small sample bias intuition is that the standard ML estimator doesn t acknowledge the fact that the mean (i.e. β) is estimated implications for inference in small samples When considering estimation/inference for general linear models for dependent data we derived the REML estimator as an alternative REML can also be used here Operationally, the strategy involves the use of the linear transformation: Y (Z, β) where Z = B T Y with B the N (N p) matrix defined by the requirements that BB T = A and B T B = I. 192 BIO 245, Spring 2018

35 Exploiting the facts that E[Z] = 0 and Cov[Z, β] = 0, the pdf of Z, expressed as a function of Y, is proportional to: f Y (Y ) f( β) = (2π) (N p)/2 Σ(α) 1/2 X T Σ(α) 1 X 1/2 exp { 1 } 2 (Y X β) T Σ(α) 1 (Y X β) The REML estimator of Σ, therefore, maximizes the so-called restricted log-likelihood: l (α) log Σ log X T Σ(α) 1 X (Y X β) T Σ(α) 1 (Y X β) 193 BIO 245, Spring 2018

36 Inference From an inferential perspective, one might be interested in answering questions regarding the mean model and/or the dependence model Mean model: formally evaluate the association between some exposure of interest and the response compare competing specifications of some association investigate potential effect modification Dependence model: explain the nature of random variation in the data valid inference for the fixed effects requires a correct structure for Σ k (α) avoid over-parameterization which can lead to inefficient inference for the fixed effects 194 BIO 245, Spring 2018

37 Inference for the mean model Consider testing the fixed effects in nested linear mixed models: H 0 : β = β 1 0 versus H 1 : β = β 1 β 2 Usual Wald test will be valid, regardless of whether ML or REML was used If ML was used, one could also use a likelihood ratio test If REML was used, a likelihood ratio test will not be valid X k β under H 0 is not the same as X k β under H 1 correspondingly different Z = B T Y under H 0 and H 1 REML likelihoods are based on different sets of observations differences in REML deviances are not meaningful 195 BIO 245, Spring 2018

38 Inference for the dependence structure Consider testing whether the random slopes in a random intercepts/slopes model are necessary i.e. compare the fit based on to that based on random intercepts/slopes model: Y ki = X T kiβ + γ 0k + T ki γ 1k + ǫ ki to that based on the random intercepts model: Y ki = X T kiβ + γ 0k + ǫ ki Recall the covariance structure for the random intercepts and slopes: Cov[γ k ] = G(α) = Σ γ,00 Σ γ,01 Σ γ,01 Σ γ, BIO 245, Spring 2018

39 We can therefore structure the hypotheses for this question as: H 0 : G(α) = Σ γ, H 1 : G(α) = versus Σ γ,00 Σ γ,01 Σ γ,01 Σ γ,11 Notice that under the null, Σ γ,11 = 0 G(α) under H 0 is the boundary of the parameter space defined by the alternative violation of a standard assumption used to establish the (asymptotic) χ 2 distribution of the likelihood ratio test statistic 197 BIO 245, Spring 2018

40 In general, when testing variance components, the asymptotic sampling distribution of the LRT statistic is a mixture of χ 2 distributions nature of the mixture depends on the hypotheses one is testing For H 0 and H 1 above, the correct sampling distribution is a 50:50 mixture of a χ 2 2 distribution and a χ 2 1 distribution Density of χ LRT statistic Density of the mixture LRT statistic 198 BIO 245, Spring 2018

41 See that the naïve use of a χ 2 2 distribution will lead to overly conservative inference for a given value of the test statistic, the p-value based on a χ 2 2 distribution will be bigger than the p-value based on the correct mixture fail to reject H 0 when one should reject it leads to an incorrect over-simplification of the model for the dependence structure Additional details can be found in: Self and Liang (JASA, 1987) Stram and Lee (Biometrics, 1994) Finally, note that, in contrast to likelihood-based inference for β, one can perform valid likelihood-based inference for α whether ML or REML is used X k β is the same under H 0 and H 1 REML likelihoods are based on the same set of observations 199 BIO 245, Spring 2018

42 Empirical Bayes So far estimation/inference for (β, α) has been based on a marginalized likelihood in which the random effects have been integrated out In some settings, one might be interested in the random effects themselves visualization of heterogeneity across clusters characterization of cluster-specific associations profiling and ranking If (β, α) were known, one could proceed within the Bayesian paradigm for the unknown γ k : take the conditional distribution of the data as the likelihood take the distributional assumptions for the random effects as the prior the posterior of γ k given the data Y k : π(γ k Y k,β,α) L(γ k ;Y k,β,α) π(γ k α) 200 BIO 245, Spring 2018

43 As an estimate of γ k, one might compute and report the mean of the posterior distribution: γ k = E[γ k Y k,β,α] For a given form of this expectation, one could plug in estimates of (β, α) based on ML or REML Referred to as the empirical Bayes estimator empirical in the sense that the data was used to inform the prior i.e. we plugged α into π(γ k α) 201 BIO 245, Spring 2018

44 Empirical Bayes in a simple setting Consider the simple setting in which there are no covariates: Y ki = β 0k + ǫ ki β 0k Normal(β 0, τ 2 ) ǫ ki Normal(0, σ 2 ) The induced marginal distribution of the response vector is: Y k1 Y k2... Y knk MVN nk β 0 β 0... β 0, τ 2 +σ 2 τ 2... τ 2 τ 2 τ 2 +σ 2... τ τ 2 τ 2... τ 2 +σ BIO 245, Spring 2018

45 It is relatively straightforward to show that the ML estimator of β 0, the population mean, is simply the WLS estimator: ˆβ 0 = k w ky k k w k where w k = τ 2 +σ 2 /n k obtain estimates of σ 2 and τ 2 via ML or REML Now consider the joint distribution of the cluster-specific mean, Y k and β 0k : Y k β 0k MVN 2 β 0 β 0, τ2 +σ 2 /n k τ 2 τ 2 τ BIO 245, Spring 2018

46 The corresponding conditional distribution of β 0k Y k is a Normal with mean E[β 0k Y k,β 0,σ 2,τ 2 ] = β 0 + = τ 2 τ 2 +σ 2 /n k (Y k β 0 ) σ 2 /n k τ 2 +σ 2 /n k β 0 + τ 2 τ 2 +σ 2 /n k Y k weighted average of the population mean and the cluster-specific mean of the responses as n k gets large, more weight is attached to the sample mean also depends on the relative degree of between- versus within-cluster variation Operationally one would plug-in the estimates of β 0, σ 2 and τ 2, we get the empirical Bayes estimate of β 0k : β 0k = ˆσ 2 /n k ˆτ 2 + ˆσ 2 /n k ˆβ0 + ˆτ 2 ˆτ 2 + ˆσ 2 /n k Y k 204 BIO 245, Spring 2018

47 Empirical Bayes for linear mixed effects models Now consider the linear mixed effects model: Y k = X k β + Z k γ k + ǫ k γ k MVN q (0,G(α)) ǫ k MVN nk (0,R k (α)) γ k ǫ k for which the induced marginal distribution of Y k is a multivariate Normal: Y k MVN nk (X k β, Σ k (α)) where Σ k (α) = Z k G(α)Z T k +R k (α) Suppose we obtain estimates of (β, α) via ML or REML 205 BIO 245, Spring 2018

48 Now consider the (cluster-specific) joint distribution of Y k and γ k : Y k γ k MVN nk +q X kβ 0, Σ k(α) G(α)Zk T Z k G(α) G(α) since Cov[Y k, γ k ] = Cov[X k β + Z k γ k + ǫ k, γ k ] = Cov[X k β, γ k ] + Cov[Z k γ k, γ k ]+ Cov[ǫ k, γ k ] = 0 + Z k Cov[γ k, γ k ] + 0 = Z k G(α) The induced conditional distribution of γ k Y k is: γ k Y k MVN q (G(α)Z T k Σ k (α) 1 (Y k X k β), Σ k(α)) where Σ k(α) = G(α) G(α)Z T k Σ k (α) 1 Z k G(α) 206 BIO 245, Spring 2018

49 Consequently, the empirical Bayes estimator of γ k is: γ k = G( α)z T k Σ k ( α) 1 (Y k X k β) simply plug in the estimates (ML or REML) of (β, α) Can also examine the fitted values of the response based on the empirical Bayes estimator: Ŷ k = X k β + Zk γ k = X k β + Zk [ G( α)z T k Σ k ( α) 1 (Y k X k β) ] = [I nk Z k G( α)z T k Σ k ( α) 1 ]X k β + Zk G( α)z T k Σ k ( α) 1 Y k weighted average of the estimated population profile, X k β, and the observed data, Y k. 207 BIO 245, Spring 2018

50 Fitting models in R There are many ways of fitting linear mixed effects models in R Two functions/packages written by Doug Bates and colleagues are: lme() function in the nlme package lmer() function in the lme4 package Unfortunately, no single function/package has the functionality to fit all forms of a linear mixed effects model that one could be interested in e.g. lme() permits direct specification of structure for the off-diagonal elements of R k (α), whereas lmer() does not (at least as far as I can tell) e.g. lmer() permits non-nested levels of clustering, whereas lme() does not (at least as far as I can tell) 208 BIO 245, Spring 2018

51 The basic call to lme() is has the following elements: fixed random correlation weights method Specification for the regression model, X k β Specification for the random effects model, Z k γ k Specification for serial dependence in R k (α) Specification for heteroskedasticity in R k (α) Indicator of whether ML or REML is used > ## >?lme ## Help on specifying a linear mixed model >?corclasses ## Help on specifying serial dependence >?varclasses ## Help on specifying heteroskedasticity Additional resources: Pinheiro J. and Bates D. Mixed-Effects Models in S and S-PLUS (2006) Galecki A. and Tomasz B. Linear mixed-effects models using R: A step-by-step approach (2013). 209 BIO 245, Spring 2018

52 Dental growth data Fit a series of models with the same mean structure: E[Y ki ] = β 0 + β 1 A ki + β 1 G k + β 3 A kig k where A ki = A ki 8 Consider a range of specifications for dependence: Σ k (α) = Z k G(α)Z T k +R k (α) > ## > load("growth.rdata") > growth$agestar <- growth$age - 8 > library(nlme) > > ## Naive - independent errors > ## > fit0.ml <- glm(length ~ agestar * gender, data=growth, family=gaussian) 210 BIO 245, Spring 2018

53 > ## Random intercepts + independent errors > ## > fit1.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + data=growth, method="ml") > > ## Independent random intercepts/slopes + independent errors > ## * two subject-specific random effects are independent > ## > fit2.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ agestar id, pdclass="pddiag"), + data=growth, method="ml") > > ## Correlated random intercepts/slopes + independent errors > ## * two subject-specific random effects are correlated > ## > fit3.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ agestar id), + data=growth, method="ml") 211 BIO 245, Spring 2018

54 > ## Random intercepts + auto-regressive errors > ## > fit4.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + correlation=corar1(form= ~ agestar id), + data=growth, method="ml") > > ## Random intercepts + exponential spatial errors > ## > fit5.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + correlation=corexp(form= ~ agestar id), + data=growth, method="ml") > > ## Random intercepts + (exponential spatial and independent measurement error) > ## > fit6.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + correlation=corexp(form= ~ agestar id, nugget=true), + data=growth, method="ml") 212 BIO 245, Spring 2018

55 > ## Random intercepts + independent errors > ## * heteroskedasticity across age > ## > fit7.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + weights=varident(form= ~1 agestar), + data=growth, method="ml") > > ## Random intercepts + independent errors > ## * heteroskedasticity across gender > ## > fit8.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + weights=varident(form= ~1 gender), + data=growth, method="ml") 213 BIO 245, Spring 2018

56 Output: > ## > summary(fit0.ml)... Estimate Std. Error t value Pr(> t ) (Intercept) < 2e-16 *** agestar ** gendermale * agestar:gendermale (Dispersion parameter for gaussian family taken to be ) Null deviance: on 103 degrees of freedom Residual deviance: on 100 degrees of freedom AIC: > sqrt(summary(fit0.ml)$dispersion) [1] BIO 245, Spring 2018

57 > ## Random intercepts + independent errors > ## > summary(fit1.ml) Linear mixed-effects model fit by maximum likelihood Data: growth AIC BIC loglik Random effects: Formula: ~1 id (Intercept) Residual StdDev: Fixed effects: length ~ agestar * gender Value Std.Error DF t-value p-value (Intercept) agestar gendermale agestar:gendermale Number of Observations: 104 Number of Groups: BIO 245, Spring 2018

58 > ## Independent random intercepts/slopes + independent errors > ## > summary(fit2.ml)... Random effects: Formula: ~agestar id Structure: Diagonal (Intercept) agestar Residual StdDev: > > ## Correlated random intercepts/slopes + independent errors > ## > summary(fit3.ml)... Random effects: Formula: ~agestar id Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr (Intercept) (Intr) agestar Residual BIO 245, Spring 2018

59 > ## Random intercepts + auto-regressive errors > ## > summary(fit4.ml)... Random effects: Formula: ~1 id (Intercept) Residual StdDev: Correlation Structure: ARMA(1,0) Formula: ~agestar id Parameter estimate(s): Phi BIO 245, Spring 2018

60 > ## Random intercepts + exponential spatial errors > ## > summary(fit5.ml)... Random effects: Formula: ~1 id (Intercept) Residual StdDev: Correlation Structure: Exponential spatial correlation Formula: ~agestar id Parameter estimate(s): range > > ## Correlation between two observation 1 time unit apart > ## > range < > exp(-1/range) [1] e BIO 245, Spring 2018

61 > ## Random intercepts + (exponential spatial and independent measurement error) > ## > summary(fit6.ml)... Random effects: Formula: ~1 id (Intercept) Residual StdDev: Correlation Structure: Exponential spatial correlation Formula: ~agestar id Parameter estimate(s): range nugget > > ## Correlation between two observation 1 time unit apart > ## > range < > nugget < > (1-nugget) * exp(-1/range) [1] BIO 245, Spring 2018

62 > ## Random intercepts + independent errors > ## * heteroskedasticity across age > summary(fit7.ml)... Random effects: Formula: ~1 id (Intercept) Residual StdDev: Variance function: Structure: Different standard deviations per stratum Formula: ~1 agestar Parameter estimates: BIO 245, Spring 2018

63 > ## Random intercepts + independent errors > ## * heteroskedasticity across gender > summary(fit8.ml)... Random effects: Formula: ~1 id (Intercept) Residual StdDev: Variance function: Structure: Different standard deviations per stratum Formula: ~1 gender Parameter estimates: female male BIO 245, Spring 2018

64 Comparison of model fit: Dependence model log-like AIC 0. Independence Random intercepts + inde. errors Independent random intercepts/slopes + inde. errors Correlated random intercepts/slopes + inde. errors Random intercepts + AR errors Random intercepts + ES errors Random intercepts + (ES and ME errors) Random intercepts + heteroske inde. errors (age) Random intercepts + heteroske inde. errors (gender) seems clear that model 8 provides the best fit (from among those presented, at least) 222 BIO 245, Spring 2018

65 Results for: E[Y ki ] = β 0 + β 1 A ki + β 1 G k + β 3 A kig k Dependence Estimate Standard error model β 0 β 1 β 2 β 3 β 0 β 1 β 2 β standard error estimates are notably smaller under model BIO 245, Spring 2018

66 Empirical Bayes estimates of the random effects estimates from a random intercept/slopes model compare estimates with those from K=26 separate regressions Subject specific intercepts Subject specific slopes Random effects Random effects Fixed effects Fixed effects see structure imposed by the model 224 BIO 245, Spring 2018

67 Finally consider the impact of using REML for model 8 > > fit8.reml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + weights=varident(form= ~1 gender), + data=growth, method="reml") > > summary(fit8.reml) Linear mixed-effects model fit by REML... Random effects: Formula: ~1 id (Intercept) Residual StdDev: Variance function: Structure: Different standard deviations per stratum Formula: ~1 gender Parameter estimates: female male BIO 245, Spring 2018

68 ML REML Est SE Est SE Fixed effects Intercept, β Main effect for age, β Main effect for gender, β Interaction, β Variance components SD of random intercepts SD for errors females males differences don t appear to be substantial 226 BIO 245, Spring 2018

69 MACS CD4+ cell count data Consider the mean model: E[CD4 ki ] = β 0 + β 1 time ki + β 2 age ki + β 3 smoker ki based on the following data manipulations: + β 4 drug ki + β 5 cesd-cs ki + β 5 cesd-lo ki > ## > load("macs.rdata") > macs$smoker <- as.numeric(macs$packs > 0) > macs$age <- macs$age / 5 > macs$cesd <- (macs$cesd - 10) / 5 restricted to K=266 participants those who seroconverted with the pre measurement no more than 6 months prior to seroconversion cesd-cs and cesd-lo are the cross-sectional and longitudinal forms of CESD we ve considered previously 227 BIO 245, Spring 2018

70 Fit seven of the models for the dependence structure considered for the dental growth data for simplicity do not consider heteroskedasticity Dependence model log-like AIC 0. Naïve independence -10,956 21, Random intercepts + inde. errors -10,705 21, Independent random intercepts/slopes + inde. errors -10,674 21, Correlated random intercepts/slopes + inde. errors -10,663 21, Random intercepts + AR errors -10,832 21, Random intercepts + ES errors -10,677 21, Random intercepts + (ES and ME errors) -10,667 21,357 indications that dependence models 3, and 6 provide the best fits to the data 228 BIO 245, Spring 2018

71 > ## Correlated random intercepts/slopes + independent errors > ## > summary(fit3.ml)... Random effects: Formula: ~time id Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr (Intercept) (Intr) time Residual Fixed effects: list(my.form) Value Std.Error DF t-value p-value (Intercept) time substantial variation in the random slopes relative to that in the random intercepts and residual error negative correlation between the random intercepts and slopes 229 BIO 245, Spring 2018

72 Subject specific slope for time Subject specific intercept vast majority of subject-specific slopes are negative participants with higher initial CD4 count tend to have steeper declines post-seroconversion 230 BIO 245, Spring 2018

73 > ## Random intercepts + (exponential spatial and independent measurement error) > ## > summary(fit6.ml) Random effects: Formula: ~1 id (Intercept) Residual StdDev: Correlation Structure: Exponential spatial correlation Formula: ~time id Parameter estimate(s): range nugget > > ## Correlation between two observation 1 time unit apart > ## > (1-nugget) * exp(-1/range) [1] Fairly substantial correlation in the serial dependence term 231 BIO 245, Spring 2018

74 Point estimates: E[CD4 ki ] = β 0 + β 1 time ki + β 2 age ki + β 3 smoker ki + β 4 drug ki + β 5 cesd-cs ki + β 5 cesd-lo ki Dependence Int time age smoker drug cesd-cs cesd-lo model BIO 245, Spring 2018

75 Standard error estimates: E[CD4 ki ] = β 0 + β 1 time ki + β 2 age ki + β 3 smoker ki + β 4 drug ki + β 5 cesd-cs ki + β 5 cesd-lo ki Dependence Int time age smoker drug cesd-cs cesd-lo model BIO 245, Spring 2018

76 Model diagnostics Since estimation/inference for linear mixed effects models is likelihood-based, the validity of results relies on a series of assumptions: correct specification of the mean model correct specification of the dependence model Normality of γ k and ǫ k K is sufficiently large for asymptotic results to apply Prior to formal modeling, one can (and should) perform an initial EDA to examine possible errors in the data as well as to get insight into model structure Once a model has been fit, the assumptions can be re-assessed using an analysis of residuals and the empirical Bayes estimates of the random effects 234 BIO 245, Spring 2018

77 Residuals In mixed effects models, the notion of a residual can be defined at a number of different levels marginal (population-level) residuals: stage-one (cluster-level) residuals: e k = Y k X k β ǫ k = Y k X k β Z k γ k γ k can also be viewed as form of stage two residual deviation around the population regression Estimates of these quantities can be obtained by plugging in β and γ k : ê k = Y k X k β ǫ k = Y k X k β Zk γ k 235 BIO 245, Spring 2018

78 Standardization The observed residuals arise from having fit a model and, therefore, should exhibit variation-covariation in accordance with the structure of the model e.g. if the model permits the residual error variance to differ by some covariate, then the observed residuals will reflect this If the goal is to investigate the extent to which the model is inadequate, we should acknowledge this phenomenon One way forward is to standardize (or normalize) the residuals let L T kl k be the Cholesky decomposition of Σ k ( α) use L k to form the standardized marginal residuals ê k = L 1 k êk = L 1 k (Y k X k β) 236 BIO 245, Spring 2018

79 If Σ k (α) has been correctly specified then Cov[ê k] I nk no mean-variance relationship in the variance components no relationship between the residuals and any of the covariates no residual correlation among observations within a cluster Note, as appropriate, one might also consider standardizing the stage one residuals ǫ k = Y k X k β Z k γ k particularly useful for investigating residual (unaccounted) serial dependence in the specification of R k (α) 237 BIO 245, Spring 2018

80 Diagnostics Conventionally, folks tend to focus on the stage one residuals and the random effects estimates when performing diagnostics for linear mixed models patterns in the population residuals could result from inadequacy in the model structure in various places Mean model: plots of ˆǫ ki versus the columns of X k to investigate whether the assumed specification of X k β is adequate plots of γ kq versus the columns of X k to investigate whether the assumed specification of Z k γ k is adequate Note, if the residuals have been standardized, one should compare them with the columns of X k similarly standardized 238 BIO 245, Spring 2018

81 Dependence model: plot of ˆǫ ki versus the fitted values, ˆµ ki = X k β + Zk γ k, to see if there is a residual mean-variance relationship plot of ˆǫ ki versus lag(ˆǫ ki ) to see if there is any residual serial dependence Normality: standard approach is to present a Q-Q plot of the fitted residual versus the expected normalized residuals from a Normal distribution stage one residuals, ˆǫ ki random effect estimates, γ k,j for j= 1,..., q as appropriate, one can also examine scatter plots of the components of γ k to assess joint Normality 239 BIO 245, Spring 2018

82 Dental growth data Recall the best fitting model was model 8 random intercepts + heteroskedastic error terms by gender Present select diagnostics based on model 1: random intercepts + homoskedastic independent error > > ## Random intercepts + independent errors > ## > fit1.ml <- lme(fixed=length ~ agestar * gender, + random=restruct(~ 1 id), + data=growth, method="ml") > > epshat <- resid(fit1.ml, type="normalized") > gammahat <- ranef(fit1.ml)[,1] 240 BIO 245, Spring 2018

83 Investigate heteroskedasticity in the standardized stage 1 residuals: years 10 years 12 years 14 years Females Males inconclusive evidence regarding heteroskedasticity by age strong evidence of heteroskedasticity by gender 241 BIO 245, Spring 2018

84 Compare residuals and lagged residuals to investigate potential serial dependence: only for observations at ages 10, 12 and 14 Stage 1 residual Lagged stage 1 residual indication of unaccounted for local correlation 242 BIO 245, Spring 2018

85 Q-Q plots: Stage 1 residuals Random intercepts Sample quantiles Sample quantiles Theoretical quantiles Theoretical quantiles looking for linearity between the theoretical and sample quantiles some evidence of heavier-than-normal tails in the stage 1 residuals random intercepts seems fine tough to tell with only K=26 children 243 BIO 245, Spring 2018

86 Summary Goals: perform estimation/inference for regression parameters from a model for a continuous response while acknowledging within-cluster dependence also perform estimation/inference for the dependence structure estimate cluster-specific parameters/effects Approach: combine fixed effects with random effects into a single model formulation: Y k = X k β + Z k γ k + ǫ k consider various strategies for structuring the model components so that a range of dependence models can be entertained structure for Cov[γ k ] = G(α) structure for Cov[ǫ k ] = R k (α) 244 BIO 245, Spring 2018

87 Estimation/inference: integrated likelihood for (β, α) ML or REML empirical Bayes for the cluster-specific random effects, γ k Since estimation/inference is fully parametric, diagnostics for model structure and assumptions are critical mean model dependence model distributional assumptions 245 BIO 245, Spring 2018

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects Covariance Models (*) Mixed Models Laird & Ware (1982) Y i = X i β + Z i b i + e i Y i : (n i 1) response vector X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed

More information

Part IV: Marginal models

Part IV: Marginal models Part IV: Marginal models 246 BIO 245, Spring 2018 Indonesian Children s Health Study (ICHS) Study conducted to determine the effects of vitamin A deficiency in preschool children Data on K=275 children

More information

A brief introduction to mixed models

A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

STAT3401: Advanced data analysis Week 10: Models for Clustered Longitudinal Data

STAT3401: Advanced data analysis Week 10: Models for Clustered Longitudinal Data STAT3401: Advanced data analysis Week 10: Models for Clustered Longitudinal Data Berwin Turlach School of Mathematics and Statistics Berwin.Turlach@gmail.com The University of Western Australia Models

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 14 1 / 64 Data structure and Model t1 t2 tn i 1st subject y 11 y 12 y 1n1 2nd subject

More information

Figure 36: Respiratory infection versus time for the first 49 children.

Figure 36: Respiratory infection versus time for the first 49 children. y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects

More information

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

13. October p. 1

13. October p. 1 Lecture 8 STK3100/4100 Linear mixed models 13. October 2014 Plan for lecture: 1. The lme function in the nlme library 2. Induced correlation structure 3. Marginal models 4. Estimation - ML and REML 5.

More information

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data Outline Mixed models in R using the lme4 package Part 3: Longitudinal data Douglas Bates Longitudinal data: sleepstudy A model with random effects for intercept and slope University of Wisconsin - Madison

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject

More information

These slides illustrate a few example R commands that can be useful for the analysis of repeated measures data.

These slides illustrate a few example R commands that can be useful for the analysis of repeated measures data. These slides illustrate a few example R commands that can be useful for the analysis of repeated measures data. We focus on the experiment designed to compare the effectiveness of three strength training

More information

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form Outline Statistical inference for linear mixed models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark general form of linear mixed models examples of analyses using linear mixed

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of

More information

Modeling the Covariance

Modeling the Covariance Modeling the Covariance Jamie Monogan University of Georgia February 3, 2016 Jamie Monogan (UGA) Modeling the Covariance February 3, 2016 1 / 16 Objectives By the end of this meeting, participants should

More information

22s:152 Applied Linear Regression. Returning to a continuous response variable Y...

22s:152 Applied Linear Regression. Returning to a continuous response variable Y... 22s:152 Applied Linear Regression Generalized Least Squares Returning to a continuous response variable Y... Ordinary Least Squares Estimation The classical models we have fit so far with a continuous

More information

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression 22s:52 Applied Linear Regression Ch. 4 (sec. and Ch. 5 (sec. & 4: Logistic Regression Logistic Regression When the response variable is a binary variable, such as 0 or live or die fail or succeed then

More information

22s:152 Applied Linear Regression. In matrix notation, we can write this model: Generalized Least Squares. Y = Xβ + ɛ with ɛ N n (0, Σ)

22s:152 Applied Linear Regression. In matrix notation, we can write this model: Generalized Least Squares. Y = Xβ + ɛ with ɛ N n (0, Σ) 22s:152 Applied Linear Regression Generalized Least Squares Returning to a continuous response variable Y Ordinary Least Squares Estimation The classical models we have fit so far with a continuous response

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Random and Mixed Effects Models - Part III

Random and Mixed Effects Models - Part III Random and Mixed Effects Models - Part III Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Quasi-F Tests When we get to more than two categorical factors, some times there are not nice F tests

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs) 36-309/749 Experimental Design for Behavioral and Social Sciences Dec 1, 2015 Lecture 11: Mixed Models (HLMs) Independent Errors Assumption An error is the deviation of an individual observed outcome (DV)

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Course topics (tentative) The role of random effects

Course topics (tentative) The role of random effects Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen

More information

De-mystifying random effects models

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

More information

36-463/663: Hierarchical Linear Models

36-463/663: Hierarchical Linear Models 36-463/663: Hierarchical Linear Models Lmer model selection and residuals Brian Junker 132E Baker Hall brian@stat.cmu.edu 1 Outline The London Schools Data (again!) A nice random-intercepts, random-slopes

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

Likelihood-Based Methods

Likelihood-Based Methods Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)

More information

36-402/608 Homework #10 Solutions 4/1

36-402/608 Homework #10 Solutions 4/1 36-402/608 Homework #10 Solutions 4/1 1. Fixing Breakout 17 (60 points) You must use SAS for this problem! Modify the code in wallaby.sas to load the wallaby data and to create a new outcome in the form

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics

More information

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij Hierarchical Linear Models (HLM) Using R Package nlme Interpretation I. The Null Model Level 1 (student level) model is mathach ij = β 0j + e ij Level 2 (school level) model is β 0j = γ 00 + u 0j Combined

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

Mixed effects models

Mixed effects models Mixed effects models The basic theory and application in R Mitchel van Loon Research Paper Business Analytics Mixed effects models The basic theory and application in R Author: Mitchel van Loon Research

More information

Lecture 9 STK3100/4100

Lecture 9 STK3100/4100 Lecture 9 STK3100/4100 27. October 2014 Plan for lecture: 1. Linear mixed models cont. Models accounting for time dependencies (Ch. 6.1) 2. Generalized linear mixed models (GLMM, Ch. 13.1-13.3) Examples

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 12 1 / 34 Correlated data multivariate observations clustered data repeated measurement

More information

Introduction to the Analysis of Hierarchical and Longitudinal Data

Introduction to the Analysis of Hierarchical and Longitudinal Data Introduction to the Analysis of Hierarchical and Longitudinal Data Georges Monette, York University with Ye Sun SPIDA June 7, 2004 1 Graphical overview of selected concepts Nature of hierarchical models

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Repeated measures, part 2, advanced methods

Repeated measures, part 2, advanced methods enote 12 1 enote 12 Repeated measures, part 2, advanced methods enote 12 INDHOLD 2 Indhold 12 Repeated measures, part 2, advanced methods 1 12.1 Intro......................................... 3 12.2 A

More information

Estimation: Problems & Solutions

Estimation: Problems & Solutions Estimation: Problems & Solutions Edps/Psych/Stat 587 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline 1. Introduction: Estimation of

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

Non-independence due to Time Correlation (Chapter 14)

Non-independence due to Time Correlation (Chapter 14) Non-independence due to Time Correlation (Chapter 14) When we model the mean structure with ordinary least squares, the mean structure explains the general trends in the data with respect to our dependent

More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

Modelling the Covariance

Modelling the Covariance Modelling the Covariance Jamie Monogan Washington University in St Louis February 9, 2010 Jamie Monogan (WUStL) Modelling the Covariance February 9, 2010 1 / 13 Objectives By the end of this meeting, participants

More information

Notes on the Multivariate Normal and Related Topics

Notes on the Multivariate Normal and Related Topics Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions

More information

Simple linear regression

Simple linear regression Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single

More information

Estimation: Problems & Solutions

Estimation: Problems & Solutions Estimation: Problems & Solutions Edps/Psych/Soc 587 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline 1. Introduction: Estimation

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Mixed effects models - II Henrik Madsen, Jan Kloppenborg Møller, Anders Nielsen April 16, 2012 H. Madsen, JK. Møller, A. Nielsen () Chapman & Hall

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Longitudinal Data Analysis

Longitudinal Data Analysis Longitudinal Data Analysis Mike Allerhand This document has been produced for the CCACE short course: Longitudinal Data Analysis. No part of this document may be reproduced, in any form or by any means,

More information

Correlation and the Analysis of Variance Approach to Simple Linear Regression

Correlation and the Analysis of Variance Approach to Simple Linear Regression Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation

More information

Spatial analysis of a designed experiment. Uniformity trials. Blocking

Spatial analysis of a designed experiment. Uniformity trials. Blocking Spatial analysis of a designed experiment RA Fisher s 3 principles of experimental design randomization unbiased estimate of treatment effect replication unbiased estimate of error variance blocking =

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

Coping with Additional Sources of Variation: ANCOVA and Random Effects

Coping with Additional Sources of Variation: ANCOVA and Random Effects Coping with Additional Sources of Variation: ANCOVA and Random Effects 1/49 More Noise in Experiments & Observations Your fixed coefficients are not always so fixed Continuous variation between samples

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36 20. REML Estimation of Variance Components Copyright c 2018 (Iowa State University) 20. Statistics 510 1 / 36 Consider the General Linear Model y = Xβ + ɛ, where ɛ N(0, Σ) and Σ is an n n positive definite

More information

Linear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.

Linear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks. Linear Mixed Models One-way layout Y = Xβ + Zb + ɛ where X and Z are specified design matrices, β is a vector of fixed effect coefficients, b and ɛ are random, mean zero, Gaussian if needed. Usually think

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

GAUSSIAN PROCESS REGRESSION

GAUSSIAN PROCESS REGRESSION GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Workshop 9.3a: Randomized block designs

Workshop 9.3a: Randomized block designs -1- Workshop 93a: Randomized block designs Murray Logan November 23, 16 Table of contents 1 Randomized Block (RCB) designs 1 2 Worked Examples 12 1 Randomized Block (RCB) designs 11 RCB design Simple Randomized

More information

Recall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as:

Recall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as: 1 Joint hypotheses The null and alternative hypotheses can usually be interpreted as a restricted model ( ) and an model ( ). In our example: Note that if the model fits significantly better than the restricted

More information

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking

More information

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology Serial Correlation Edps/Psych/Stat 587 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 017 Model for Level 1 Residuals There are three sources

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Chapter 4 - Fundamentals of spatial processes Lecture notes

Chapter 4 - Fundamentals of spatial processes Lecture notes Chapter 4 - Fundamentals of spatial processes Lecture notes Geir Storvik January 21, 2013 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites Mostly positive correlation Negative

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

1. Simple Linear Regression

1. Simple Linear Regression 1. Simple Linear Regression Suppose that we are interested in the average height of male undergrads at UF. We put each male student s name (population) in a hat and randomly select 100 (sample). Then their

More information

Specification Errors, Measurement Errors, Confounding

Specification Errors, Measurement Errors, Confounding Specification Errors, Measurement Errors, Confounding Kerby Shedden Department of Statistics, University of Michigan October 10, 2018 1 / 32 An unobserved covariate Suppose we have a data generating model

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Package HGLMMM for Hierarchical Generalized Linear Models

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

Lecture 2: Univariate Time Series

Lecture 2: Univariate Time Series Lecture 2: Univariate Time Series Analysis: Conditional and Unconditional Densities, Stationarity, ARMA Processes Prof. Massimo Guidolin 20192 Financial Econometrics Spring/Winter 2017 Overview Motivation:

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

Likelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science

Likelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science 1 Likelihood Ratio Tests that Certain Variance Components Are Zero Ciprian M. Crainiceanu Department of Statistical Science www.people.cornell.edu/pages/cmc59 Work done jointly with David Ruppert, School

More information

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

Sample Size and Power Considerations for Longitudinal Studies

Sample Size and Power Considerations for Longitudinal Studies Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous

More information

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do

More information

Statistical Inference: The Marginal Model

Statistical Inference: The Marginal Model Statistical Inference: The Marginal Model Edps/Psych/Stat 587 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Inference for fixed

More information

STAT 526 Advanced Statistical Methodology

STAT 526 Advanced Statistical Methodology STAT 526 Advanced Statistical Methodology Fall 2017 Lecture Note 10 Analyzing Clustered/Repeated Categorical Data 0-0 Outline Clustered/Repeated Categorical Data Generalized Linear Mixed Models Generalized

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

A Bayesian perspective on GMM and IV

A Bayesian perspective on GMM and IV A Bayesian perspective on GMM and IV Christopher A. Sims Princeton University sims@princeton.edu November 26, 2013 What is a Bayesian perspective? A Bayesian perspective on scientific reporting views all

More information