British Journal of Mathematical and Statistical Psychology (2016), 69. The British Psychological Society


The relationship between multilevel models and non-parametric multilevel mixture models: Discrete approximation of intraclass correlation, random coefficient distributions, and residual heteroscedasticity

Jason D. Rights* and Sonya K. Sterba
Vanderbilt University, Nashville, Tennessee, USA

Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed.

1. Introduction

Multilevel data structures are common in social science research, as when students are nested within schools or patients are nested within clinicians. Such nested data are typically analysed with multilevel models (MLMs, also known as hierarchical linear models or mixed effects models) in which heterogeneity between clusters is modelled by continuous, normally distributed random intercepts and/or slopes (e.g., Goldstein, 2011; Hox, 2010; Raudenbush & Bryk, 2002). Multilevel mixture models are also used to accommodate nested data. One type of multilevel mixture that has been increasingly applied in practice has been termed the non-parametric multilevel mixture model (NPMM; e.g., Asparouhov & Muthén, 2008; Finch & Marchant, 2013; Henry & Muthén, 2010; Kaplan & Keller, 2011; Karakos, 2015;

*Correspondence should be addressed to Jason D. Rights, Department of Psychology and Human Development, Vanderbilt University, Peabody #552, 230 Appleton Place, Nashville, TN 37203, USA (jason.d.rights@vanderbilt.edu).

DOI: 10.1111/bmsp.12073

Park & Yu, 2015; Van Horn, Feng, Kim, Lamont, Feaster, & Jaki, 2016; Vermunt, 2003, 2004, 2008; Yu & Park, 2014). In this model, latent classes are extracted at multiple levels of a hierarchical data structure (e.g., level 1 and level 2) and, unlike the conventional MLM, there are no continuously distributed random effects. In this paper, we consider the regression¹ version of the NPMM, where intercepts and slopes of covariates can vary discretely across level-1 and level-2 classes (e.g., Vermunt, 2010; Vermunt & Magidson, 2005). Compared to the MLM, the NPMM provides the advantages of relaxed distributional assumptions (e.g., distributions of effects across clusters are not assumed normal) and the ability to interpret classes both directly (as representing literal, discrete subpopulations) and indirectly (e.g., as approximating an underlying continuous distribution of effects across clusters; Titterington, Smith, & Makov, 1985). Indeed, the NPMM is termed non-parametric because it has often been motivated for use with an indirect interpretation in which classes can represent a discrete approximation of continuous distributions, such as those assumed in the MLM.² However, precisely how this NPMM non-parametrically approximates MLM parameters has not been shown mathematically. Having only a conceptual or incomplete³ understanding of the relationship between the NPMM and MLM leaves us with several unknowns that limit our understanding of the indirect interpretation of the NPMM: (1) What are the analytic relationships through which the parameters of the NPMM can approximate the parameters of the MLM? (2) Which standard MLM specifications can be approximated by which NPMM specifications? (3) Can the NPMM also indirectly approximate non-standard MLM specifications?

In contrast, model parameters from other mixtures have been analytically related to the non-mixture-model parameters they approximate (e.g., Bauer, 2005, 2007; Vermunt & Van Dijk, 2001), which has aided the indirect interpretation of classes (e.g., Muthén & Asparouhov, 2008; Pek, Sterba, Kok, & Bauer, 2009). A recent review of psychology mixture applications noted that whereas a direct interpretation of classes is intuitive, an indirect semiparametric function of classes is less so, and called for methodologists to provide more concrete details on an indirect interpretation to facilitate its use in practice (Sterba, Baldasaro, & Bauer, 2012, pp. 593, 605).

Understanding the relationship between the NPMM and MLM (i.e., the indirect interpretation of the NPMM) would be useful to applied researchers for several reasons. First, understanding this relationship can aid both multilevel modelling and mixture modelling researchers in developing competing explanations of their data-generating mechanisms. This can be particularly useful in mixture applications given that many applied researchers currently only consider a direct interpretation of classes, which may be seductive in its simplicity, yet risks reifying classes as literal subpopulations (Bauer, 2007; Sampson & Laub, 2005).

¹ Note that Muthén and Asparouhov (2009) address a different kind of multilevel regression mixture that involves both random effects and classes, requires numerical integration, and is most often used for directly interpreting classes. They do not discuss the NPMM. The NPMM involves only classes, does not require numerical integration, and is often used for indirectly interpreting classes.

² When the normality assumptions of the MLM hold, the NPMM non-parametrically approximates MLM parameters, including the first moment(s) and second central moment(s) of the random effect(s) distribution. When these normality assumptions do not hold, the NPMM has the flexibility to not only more accurately recover lower-order moments of the random effects distribution (e.g., Asparouhov & Muthén, 2008; Brame, Nagin, & Wasserman, 2006), but also recover its higher-order moments. As a shorthand, throughout this paper we refer to the NPMM as non-parametrically approximating MLM parameters when referring to its discrete approximation of such lower-order moments, regardless of whether MLM assumptions are upheld or not.

³ Certain relationships have been delineated for simpler mixtures with classes exclusively at level 2 (Vermunt & Van Dijk, 2001), as described in detail later.

Second, understanding this relationship would illuminate how the NPMM can be used as a preliminary, exploratory tool for multilevel modelling researchers. It will be shown that fitting our general NPMM automatically and flexibly approximates a number of non-standard MLM specifications that would not typically be considered by applied researchers fitting MLMs, yet could diagnostically suggest novel future directions for multilevel model-building. Third, understanding this link can help synthesize or even meta-analyse results across already published studies that have used the MLM and/or NPMM with similar dependent variables and covariates. Without the linkages established in this paper, there would be no way to formally compare empirical results across methods, as is often desired (e.g., Bergman, 2015; Sterba & Bauer, 2014).

The purpose of this paper is to fill this gap by enumerating the relationships between the NPMM and MLM useful for indirectly interpreting the NPMM and, moreover, to highlight the practical importance of understanding these relationships. In doing so, we hope to facilitate the use of the NPMM as a non-parametric approximation of the MLM in practice. We first consider a series of NPMMs of increasing complexity. For each, we (a) define the corresponding MLM and (b) provide formulas underlying how parameters of each MLM are indirectly approximated. We begin by relating null NPMM parameters to the intraclass correlation (ICC) from a random-intercept-only MLM; subsequently, we discuss these relationships for MLM fixed effects, MLM random slope variances, and also non-standard features not previously known to be approximated by the NPMM, namely MLM heteroscedastic residual variances at level 1 and level 2. To ground these analytic developments, we contrast direct and indirect interpretations of latent classes, employ simulated graphical illustrations, and provide an empirical example involving predicting mathematics achievement for students nested within schools. In our discussion (Section 8), we consider additional novel uses for the indirect interpretation of the NPMM. Mplus (Muthén & Muthén) syntax to implement the NPMM and an R function to implement indirect approximation calculations are provided in our Appendix S1.

Regarding our scope, note that subsequent research can use our analytic formulas to investigate via simulation the degree of correspondence between MLM parameters and their NPMM approximations under different design conditions. Though we provide a single-sample empirical illustration, a full-scale simulation, as has been done for other mixtures (e.g., Brame et al., 2006; Muthén & Asparouhov, 2008; Sterba & Bauer, 2014),⁴ is beyond our scope here.

Before continuing, it is important to note the ways in which our analytic developments build on and differ from those provided for other non-parametric mixtures with classes only at level 2 (Vermunt & Van Dijk, 2001). First, the additional presence of level-1 classes in the NPMM not only changes computational aspects of these analytic relationships (see Sections 2 and 3), but also introduces new features (see Sections 4 and 5). Second, previous work has not related the variety of possible NPMM specifications each to particular MLMs, nor focused on approximating the ICC. Third, we show steps to derive each indirect approximation formula. This will allow our analytic approach to be generalized to non-parametric versions of other multilevel mixtures (e.g., multilevel latent class analysis) and to be used to clarify indirect relationships in other contexts.

⁴ In general, the NPMM would be expected to achieve a closer approximation of first moments and second central moments of the MLM's random effects distribution with greater numbers of classes. Also note that the NPMM can fit aspects of the data that the MLM cannot (i.e., higher-order moments). See, for example, Bauer and Curran (2003).

2. Relationship of the NPMM to the random-intercept-only MLM and ICC

To begin, we consider a null NPMM. In this model, there are no covariates; rather, the distribution of the outcome variable, y_ij, is allowed to differ across latent classes at level 1 and level 2. Let i denote a level-1 unit, or observation within cluster (i = 1, ..., N_j), and j denote a level-2 unit or cluster (j = 1, ..., J). Latent class membership at level 1 and level 2 is denoted c_ij and d_j, respectively, with particular level-1 classes denoted by k (k = 1, ..., K) and level-2 classes by h (h = 1, ..., H).⁵ For instance, suppose we were analysing scores on a mathematics achievement test for students (level-1 units) nested within schools (level-2 units). For the distribution of maths scores, we may specify a model with multiple school-level latent classes (e.g., level-2 classes of schools with primarily high-achieving students vs. level-2 classes of schools with primarily low-achieving students) and, within each school-level class, multiple student-level latent classes (e.g., level-1 classes of high-achieving students vs. low-achieving students).

More formally, this null NPMM is specified in equation (1) in Table 1⁶ as

$$y_{ij|c_{ij}=k,\,d_j=h} = \gamma_0^{kh} + e_{ij}. \qquad (1)$$

In equation (1), the outcome variable y_ij, where c_ij = k and d_j = h, is modelled by a kh class-combination intercept, γ₀^{kh}, and residual, e_ij. The latter is assumed normally distributed with class-combination-specific⁷ variance θ^{kh}, that is, e_ij ~ N(0, θ^{kh}). Thus, the γ₀^{kh} and θ^{kh} parameters are specific to a particular combination of level-1 class and level-2 class.

There are three types of probabilities important to the NPMM. The marginal probability of level-2 class membership, p(d_j = h) = π^h, represents the probability of a randomly selected observation belonging to level-2 class h. It is modelled by a multinomial regression,

$$p(d_j = h) = \pi^h = \exp(\varpi^h)\Big/\sum_{h'=1}^{H}\exp(\varpi^{h'}),$$

with multinomial intercept ϖ^h fixed at 0 in class H for identification. Additionally, the conditional probability of level-1 class membership given level-2 class membership, p(c_ij = k | d_j = h) = π^{k|h}, represents the probability of belonging to level-1 class k for a randomly selected observation within level-2 class h. It is modelled by another multinomial regression,

$$p(c_{ij} = k \mid d_j = h) = \pi^{k|h} = \exp(\omega^k + \delta^{kh})\Big/\sum_{k'=1}^{K}\exp(\omega^{k'} + \delta^{k'h}),$$

with multinomial intercept ω^k and slope (k on h) δ^{kh}. For identification, ω^K = δ^{Kh} = δ^{kH} = 0. Finally, the kh class-combination probability π^{kh} represents the joint probability of a randomly selected observation belonging to level-1 class k and level-2 class h. This is computed as the product of π^h and π^{k|h}.

Under a direct interpretation of equation (1), one could consider intercept and residual variance differences between classes to represent substantively meaningful differences between distinct subpopulations. For example, average maths achievement scores may differ across distinct school-level (i.e., level-2) classes, and we may be particularly interested in examining membership in kh class combinations such as a high-achieving student-level (i.e., level-1) class within a low-performing school-level class, or a low-achieving student-level class within a high-performing school-level class.

⁵ Though selecting K and H is not our focus, we revisit this topic later in the context of an empirical example.

⁶ Equations (1)-(14) are contained in Table 1, and equations (15) and (16) in Table 2.

⁷ In empirical research, it is common for some or all class-combination residual variances to be constrained equal across class combinations, that is, θ^{kh} = θ, for estimation stability (for discussion, see McLachlan & Peel, 2000).
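To make these three probability types concrete, the following minimal R sketch (hypothetical parameter values; not the authors' Appendix S1 code) computes π^h, π^{k|h}, and π^{kh} from multinomial intercepts and slopes for a K = 2, H = 2 specification.

```r
# Hypothetical multinomial parameters; reference classes K and H are fixed at 0.
varpi <- c(0.4, 0)                          # level-2 intercepts varpi^h
omega <- c(-0.2, 0)                         # level-1 intercepts omega^k
delta <- matrix(c(0.8, 0,
                  0,   0), nrow = 2, byrow = TRUE)  # slopes delta^kh (row k, column h)

pi_h <- exp(varpi) / sum(exp(varpi))        # marginal level-2 class probabilities
pi_k_given_h <- sapply(1:2, function(h) {   # conditional level-1 class probabilities
  num <- exp(omega + delta[, h])
  num / sum(num)
})                                          # column h holds pi(k | h)
pi_kh <- sweep(pi_k_given_h, 2, pi_h, `*`)  # joint kh class-combination probabilities
```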

Table 1. Multilevel model (MLM) specifications with corresponding non-parametric multilevel regression mixture model (NPMM) specificationsᵃ

Random-intercept-only model
MLM (2): y_ij = γ₀ + u₀j + e_ij, with u₀j ~ N(0, τ₀₀) and e_ij ~ N(0, σ²)
NPMM (1): y_ij|c_ij=k, d_j=h = γ₀^{kh} + e_ij, with e_ij ~ N(0, θ^{kh})

+ Fixed slope of x_ij
MLM (4): y_ij = γ₀ + γ₁x_ij + u₀j + e_ij, with u₀j ~ N(0, τ₀₀) and e_ij ~ N(0, σ²)
NPMM (3): y_ij|c_ij=k, d_j=h = γ₀^{kh} + γ₁x_ij + e_ij, with e_ij ~ N(0, θ^{kh})

+ Random slope of x_ij
MLM (6): y_ij = γ₀ + γ₁x_ij + u₀j + u₁jx_ij + e_ij, with (u₀j, u₁j)′ ~ N(0, T), where T has elements τ₀₀, τ₁₀, τ₁₁, and e_ij ~ N(0, σ²)
NPMM (5): y_ij|c_ij=k, d_j=h = γ₀^{kh} + γ₁^h x_ij + e_ij, with e_ij ~ N(0, θ^{kh})

+ Level-1 heteroscedasticity on x_ij
MLM (8): y_ij = γ₀ + γ₁x_ij + u₀j + u₁jx_ij + e₀ij + e₁ijx_ij, with (u₀j, u₁j)′ ~ N(0, T) as in (6) and (e₀ij, e₁ij)′ ~ N(0, Σ), where Σ has elements σ₀², σ₀,₁, σ₁²
NPMM (7): y_ij|c_ij=k, d_j=h = γ₀^{kh} + γ₁^{kh}x_ij + e_ij, with e_ij ~ N(0, θ^{kh})

+ Fixed slope of w_j
MLM (10): y_ij = γ₀ + γ₁x_ij + γ₂w_j + u₀j + u₁jx_ij + e₀ij + e₁ijx_ij, with random effects distributed as in (8)
NPMM (9): y_ij|c_ij=k, d_j=h = γ₀^{kh} + γ₁^{kh}x_ij + γ₂w_j + e_ij, with e_ij ~ N(0, θ^{kh})

(Continued)

Table 1. (Continued)

+ Level-2 heteroscedasticity on w_j
MLM (12): y_ij = γ₀ + γ₁x_ij + γ₂w_j + u₀j + u₃jw_j + u₁jx_ij + e₀ij + e₁ijx_ij, with (u₀j, u₃j, u₁j)′ ~ N(0, T), where T has elements τ₀₀, τ₀₃, τ₃₃, τ₀₁, τ₁₃, τ₁₁, and (e₀ij, e₁ij)′ ~ N(0, Σ) as in (8)
NPMM (11): y_ij|c_ij=k, d_j=h = γ₀^{kh} + γ₁^{kh}x_ij + γ₂^h w_j + e_ij, with e_ij ~ N(0, θ^{kh})

+ Level-1 heteroscedasticity on w_j
MLM (14): y_ij = γ₀ + γ₁x_ij + γ₂w_j + u₀j + u₃jw_j + u₁jx_ij + e₀ij + e₁ijx_ij + e₂ijw_j, with (u₀j, u₃j, u₁j)′ ~ N(0, T) as in (12) and (e₀ij, e₁ij, e₂ij)′ ~ N(0, Σ), where Σ has elements σ₀², σ₀,₁, σ₁², σ₀,₂, σ₁,₂, σ₂²
NPMM (13): y_ij|c_ij=k, d_j=h = γ₀^{kh} + γ₁^{kh}x_ij + γ₂^{kh}w_j + e_ij, with e_ij ~ N(0, θ^{kh})

Note. ᵃ For all NPMM models: p(d_j = h) = π^h = exp(ϖ^h)/Σ_{h′=1}^H exp(ϖ^{h′}) and p(c_ij = k | d_j = h) = π^{k|h} = exp(ω^k + δ^{kh})/Σ_{k′=1}^K exp(ω^{k′} + δ^{k′h}). For identification, ϖ^H = ω^K = δ^{Kh} = δ^{kH} = 0.
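To make the simplest specification in Table 1 concrete, the following minimal R sketch (hypothetical parameter values; not the authors' SAS code used for the later figures) generates data from the null NPMM in equation (1) with K = 2 and H = 2.

```r
set.seed(1)
J <- 200; n_j <- 30                              # number of clusters and cluster size
pi_h   <- c(.5, .5)                              # marginal level-2 class probabilities
pi_k_h <- cbind(c(.6, .4), c(.3, .7))            # conditional pi(k | h), column = h
gamma0 <- cbind(c(0, 4), c(8, 12))               # intercepts gamma0^kh, indexed [k, h]
theta  <- matrix(1, 2, 2)                        # residual variances theta^kh

d <- sample(1:2, J, replace = TRUE, prob = pi_h) # level-2 class for each cluster
dat <- do.call(rbind, lapply(1:J, function(j) {
  k <- sample(1:2, n_j, replace = TRUE, prob = pi_k_h[, d[j]])  # level-1 class per observation
  y <- gamma0[cbind(k, d[j])] + rnorm(n_j, 0, sqrt(theta[cbind(k, d[j])]))
  data.frame(school = j, y = y)
}))
```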

Table 2. Matrix specification of corresponding general NPMM (equation 15) and MLM (equation 16)

MLM general specification:

$$y_{ij} = \mathbf{x}_{ij}'\boldsymbol{\gamma} + \mathbf{w}_{ij}'\mathbf{u}_j + \mathbf{v}_{ij}'\mathbf{e}_{ij}, \quad \mathbf{u}_j \sim N(\mathbf{0}, \mathbf{T}), \quad \mathbf{e}_{ij} \sim N(\mathbf{0}, \boldsymbol{\Sigma}) \qquad (16)$$

where:
x_ij = vector of 1 and all L1 or L2 covariates with fixed components
γ = vector of corresponding fixed components
v_ij = vector of 1 and L1 or L2 covariates affecting L1 residuals
e_ij = vector of L1 random effects modelling L1 heteroscedasticity
Σ = covariance matrix of e_ij
w_ij = vector of 1 and all covariates with random components
u_j = vector of corresponding L2 random effects
T = covariance matrix of u_j
Φ = submatrix of T containing covariances of random effects modelling L2 heteroscedasticity (see equation 37)
w_j = subvector of w_ij containing 1 and L2 covariates affecting L2 residuals (see equation 37)

NPMM general specification:

$$y_{ij|c_{ij}=k,\,d_j=h} = \mathbf{z}_{ij}'\boldsymbol{\gamma}^{kh} + e_{ij}, \quad e_{ij} \sim N(0, \theta^{kh}), \quad p(d_j = h) = \pi^h, \quad p(c_{ij} = k \mid d_j = h) = \pi^{k|h} \qquad (15)$$

where:
z_ij = vector containing 1 and all L1 and L2 covariates
γ^{kh} = vector of coefficients for class combination kh
θ^{kh} = residual variance for class combination kh
π^h and π^{k|h}: see Table 1

Notes. L1 = level 1; L2 = level 2.

An indirect interpretation of equation (1), however, could consider this NPMM a discrete approximation of a corresponding MLM. For example, rather than the school-level and student-level latent classes representing distinct subpopulations, they may be approximating continuous distributions of deviations in maths scores at each level. To illustrate, we begin by defining the corresponding MLM in equation (2), the random-intercept-only (null) MLM. Specifically, in equation (2) we have y_ij = γ₀ + u₀j + e_ij, where outcome y_ij is modelled by a mean intercept, γ₀, cluster-specific deviation, u₀j ~ N(0, τ₀₀), and level-1 residual, e_ij ~ N(0, σ²). In contrast to the NPMM, all variance (within-cluster and between-cluster) is modelled by continuously distributed residuals, rather than through discrete latent class variation.

Researchers fitting MLMs typically use this null MLM exclusively to calculate and interpret the ICC (e.g., Snijders & Bosker, 2012), which involves partitioning outcome variance into within-cluster (σ²) and between-cluster (τ₀₀) components. Specifically, the ICC is the proportion of variance that is between clusters:

$$\mathrm{ICC} = \frac{\tau_{00}}{\tau_{00} + \sigma^2}. \qquad (17)$$

Because the ICC is typically the only quantity of interpretive interest from the null MLM, to mathematically relate the null NPMM and null MLM, thus aiding in an indirect interpretation of the NPMM, we first will focus on how the ICC would be computed from NPMM parameters.
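For reference, a minimal R sketch of equation (17) from a fitted null MLM is given below, assuming the lme4 package and a hypothetical data frame school_dat with outcome math and cluster identifier school (these names are illustrative, not from the article).

```r
library(lme4)

null_mlm <- lmer(math ~ 1 + (1 | school), data = school_dat)  # random-intercept-only MLM, equation (2)
vc <- as.data.frame(VarCorr(null_mlm))
tau00  <- vc$vcov[vc$grp == "school"]                          # between-cluster intercept variance
sigma2 <- vc$vcov[vc$grp == "Residual"]                        # within-cluster residual variance
icc <- tau00 / (tau00 + sigma2)                                # equation (17)
```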

Preliminarily, it is important to consider why one would want to compute the ICC from NPMM parameters. In an MLM context, researchers typically compute the ICC using equation (17) to determine if their degree of nesting is great enough to necessitate fitting MLMs and, if so, report their non-trivial ICC as a rationale for using the MLM. In an NPMM context, no such dependency measure currently exists that uses parameters of a fitted null NPMM model, and consequently there is often no rationale given for the necessity of fitting multilevel data with an NPMM as opposed to, say, fitting a single-level latent class regression model (see Park & Yu, 2015). Having such a measure of ICC for the NPMM would allow researchers to understand the degree of nesting in a familiar correlation metric, without having to fit a second model (i.e., a null MLM).

In MLMs, calculation of the ICC is straightforward. In NPMMs, however, there is no explicit (i.e., directly estimated) within-cluster or between-cluster variance component. Therefore, we must first partition variance analytically in the NPMM to obtain these quantities, and then use them in constructing the implied ICC. Because all clusters vary discretely in the NPMM (i.e., by level-2 class), partitioning variance into within-cluster and between-cluster components amounts to partitioning variance into within level-2 class and between level-2 class components, respectively.

Figure 1 illustrates the concept of partitioning within-cluster and between-cluster variance in a null NPMM. In this figure, both panels depict a mixture of four kh class-combination densities generated from a null NPMM with two level-1 classes (K = 2) and two level-2 classes (H = 2). As shown in Figure 1a, to examine within-cluster variance, we examine how parameters vary by level-1 class k within level-2 class h. Specifically, in Figure 1a, each vertical line represents a kh class-combination intercept, γ₀^{kh}. Comparing the two leftmost vertical lines (denoted γ₀^{11} and γ₀^{21}, respectively), we can see that there is class-combination intercept variance within level-2 class 1 (i.e., the intercepts γ₀^{11} and γ₀^{21}, both belonging to h = 1, differ). Assessing within level-2 class variance in class-combination intercepts (and/or residuals) parallels assessing within-cluster variance in the MLM. Though γ₀^{kh} is specific to class combination kh, we can marginalize across k within h (denoted E_{k|h}(·), where E represents expectation). We thus obtain marginal level-2 class parameters, denoted with a ·h superscript: E_{k|h}(γ₀^{kh}) = γ₀^{·h}. To examine between-cluster variance, we examine how marginal level-2 class parameters vary by h, that is, between level-2 classes, as shown in Figure 1b. There, each vertical line represents a marginal level-2 class intercept, γ₀^{·h}. Comparing the two vertical lines (denoted γ₀^{·1} and γ₀^{·2}), we can see how there is marginal level-2 class intercept variance between level-2 classes. Assessing this between level-2 class variance in marginal level-2 class intercepts parallels assessing between-cluster variance in MLMs.

[Figure 1 appears here: two kernel density plots of y_ij, each a mixture of four kh class-combination densities, with vertical lines marking (a) class-combination intercepts and (b) marginal level-2 class intercepts.]

Figure 1. Conceptualizing within-cluster and between-cluster variance in the null NPMM (equation 1) through weighted within and between level-2 class intercept differences. (a) Comparing class-combination intercepts within level-2 classes to assess within-cluster variance. (b) Comparing marginal level-2 class intercepts between level-2 classes to assess between-cluster variance. Notes. These plots show the kernel density of data generated (in SAS 9.4; SAS Institute Inc., Cary, NC, USA) from the null NPMM in equation (1). There are two level-1 classes (K = 2) and two level-2 classes (H = 2), making four kh class combinations. Each plot is a mixture of four normal densities; each density corresponds to a kh class combination. In (a), each vertical line represents a class-combination intercept, γ₀^{kh}. In (b), each vertical line represents a marginal level-2 class intercept, γ₀^{·h}. By comparing these two plots, we can see conceptually how class-combination parameters (here intercepts, but in other contexts also/instead level-1 class probabilities) can vary within level-2 classes (by level-1 class). We can also see how marginal level-2 class parameters can vary between level-2 classes. Comparing parameters across level-1 classes within level-2 class parallels assessing within-cluster variance in a fitted MLM; comparing parameters between level-2 classes parallels assessing between-cluster variance in a fitted MLM.

With this background, we proceed to first calculate within level-2 class variance from the null NPMM. Because we are relating it to σ² in the MLM, it is termed an implied σ², denoted σ²_NP. Specifically, we take the variance of the intercepts and residuals from equation (1) across level-1 classes within each level-2 class, denoted var_{k|h}(·), and pool across level-2 classes, denoted E_h(·), to obtain an overall variance,

$$\sigma^2_{NP} = E_h[\mathrm{var}_{k|h}(\gamma_0^{kh} + e_{ij})] = E_h[E_{k|h}[(\gamma_0^{kh} - \gamma_0^{\cdot h})^2 + \theta^{kh}]]. \qquad (18)$$

The population expression in equation (18) can be converted to a sample estimate by replacing expectations with weighted sums and replacing parameters with corresponding estimates:

$$\hat{\sigma}^2_{NP} = \sum_{h=1}^{H}\hat{\pi}^h\sum_{k=1}^{K}\hat{\pi}^{k|h}\left((\hat{\gamma}_0^{kh} - \hat{\gamma}_0^{\cdot h})^2 + \hat{\theta}^{kh}\right) = \sum_{h=1}^{H}\sum_{k=1}^{K}\hat{\pi}^{kh}\left((\hat{\gamma}_0^{kh} - \hat{\gamma}_0^{\cdot h})^2 + \hat{\theta}^{kh}\right). \qquad (19)$$

We now calculate the between level-2 class variance from the null NPMM, which we term the implied τ₀₀, denoted τ₀₀(NP). To do so, we take the variance of marginal level-2 class intercepts between level-2 classes,

$$\tau_{00(NP)} = \mathrm{var}_h(E_{k|h}[\gamma_0^{kh}]) = \mathrm{var}_h(\gamma_0^{\cdot h}) = E_h[(\gamma_0^{\cdot h})^2] - E_h[\gamma_0^{\cdot h}]^2. \qquad (20)$$

This population expression also can be converted to a sample estimate,

$$\hat{\tau}_{00(NP)} = \widehat{\mathrm{var}}_h\left(\sum_{k=1}^{K}\hat{\pi}^{k|h}\hat{\gamma}_0^{kh}\right) = \sum_{h=1}^{H}\hat{\pi}^h(\hat{\gamma}_0^{\cdot h})^2 - \left(\sum_{h=1}^{H}\hat{\pi}^h\hat{\gamma}_0^{\cdot h}\right)^2. \qquad (21)$$

Using σ²_NP and τ₀₀(NP) from the NPMM, we then compute the implied ICC:⁸

$$\mathrm{ICC}_{NP} = \frac{\tau_{00(NP)}}{\tau_{00(NP)} + \sigma^2_{NP}}. \qquad (22)$$

With these implied variance components now enumerated for the NPMM, we can better understand how NPMM parameters relate to the degree of nesting, that is, the magnitude of the ICC if an MLM were fitted. This can aid researchers in indirectly interpreting classes from a null NPMM. Specifically, examining equation (22), we see that greater spread of marginal level-2 class intercepts in the NPMM implies larger between-cluster variance in a fitted MLM (increasing ICC). In other words, with more level-2 class separation, there is more distinction between clusters and, thus, greater effects of nesting. This is illustrated by comparing Figure 2a,b. Furthermore, examining equation (18), we see that smaller kh class-combination residual variance in the NPMM implies smaller within-cluster variance in a fitted MLM (increasing ICC). This is illustrated by comparing Figure 2a,c. Also, from equation (18) we see that less spread of intercepts across level-1 classes within level-2 classes in the NPMM implies smaller within-cluster variance in the MLM (increasing ICC). This indicates less distinction across level-1 classes within level-2 class and, thus, less overall variability within clusters. This pattern is illustrated by comparing Figure 2a,d.⁹

In sum, in addition to directly interpreting kh class-combination parameter differences in the null NPMM, researchers can better understand how the NPMM accounts for nestedness by indirectly interpreting these parameters in relation to the familiar ICC from the MLM. That is, it is now possible for researchers to simultaneously consider two options: that nesting is driven by some continuous distribution of intercepts across clusters, or that nesting is driven by patterns of discrete class differences. We will revisit both in the empirical example in Section 7.

⁸ Note that each term in equation (22) is a variance and is thus bounded by 0. The implied ICC is thus also bounded by 0.

⁹ In the Figure 2 demonstration, between level-2 class differences are driven by intercepts that vary freely across class combination. For simplicity, generating kh class-combination probabilities are equal. Alternatively, even if parameters (e.g., intercepts) were equal across h for all k (i.e., γ₀^{11} = γ₀^{12} and γ₀^{21} = γ₀^{22}), between level-2 class differences could still be driven by level-1 class proportions varying by level-2 class. Similar ICC patterns could be found.
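The implied variance components and ICC above can be assembled directly from estimated null-NPMM parameters. The following minimal R sketch (hypothetical estimates; not the authors' Appendix S1 function) computes equations (19), (21), and (22) for a K = 2, H = 2 null NPMM.

```r
pi_kh  <- matrix(c(.30, .20, .15, .35), 2, 2)   # estimated joint probabilities pi^kh, [k, h]
gamma0 <- matrix(c(0, 4, 8, 12), 2, 2)          # estimated intercepts gamma0^kh
theta  <- matrix(1, 2, 2)                       # estimated residual variances theta^kh

pi_h <- colSums(pi_kh)                          # marginal level-2 class probabilities
g0_h <- colSums(sweep(pi_kh, 2, pi_h, `/`) * gamma0)            # marginal intercepts gamma0^(.h)
sigma2_np <- sum(pi_kh * (sweep(gamma0, 2, g0_h)^2 + theta))    # equation (19)
tau00_np  <- sum(pi_h * g0_h^2) - sum(pi_h * g0_h)^2            # equation (21)
icc_np    <- tau00_np / (tau00_np + sigma2_np)                  # equation (22)
```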

3. Relationship of NPMM to random slope MLM

Suppose now that a researcher is interested in fitting a more complex NPMM, involving a single level-1 covariate, x_ij, whose effect on y_ij is allowed to differ across level-2 classes, as in equation (5).¹⁰ In equation (5), γ₁^h denotes the level-2 class h slope of x_ij; this slope does not vary across k within h. In equation (5), the intercept γ₀^{kh} is again allowed to differ across all kh class combinations.

¹⁰ Covariates can also/instead predict class membership (e.g., Muthén & Asparouhov, 2009; Vermunt, 2003), but we do not include this here because our focus is on non-parametric multilevel mixture regression models, where covariates predict outcomes within class.

From a direct interpretation of classes, the marginal level-2 class slopes (γ₁^h) could be interpreted as substantively meaningful slopes corresponding to different subpopulations of clusters. For example, suppose the number of hours students spend studying at home (x_ij) has a strong effect on maths achievement in one school-level class but a weak effect in another school-level class, perhaps because these school-level classes differ in how effectively course material is taught in the classroom. Alternatively, we can indirectly interpret these slopes as approximating an underlying continuous distribution of slopes across clusters. We delineate the indirect interpretation by first defining the MLM corresponding to the NPMM in equation (5), namely the random slope MLM in equation (6). In equation (6), there are normally distributed random effects at level 2 for the intercept, u₀j ~ N(0, τ₀₀), and slope of x_ij, u₁j ~ N(0, τ₁₁), which are allowed to covary (τ₁₀). This yields a 2 × 2 covariance matrix of random effects, T. There is also a fixed component for the slope, denoted γ₁, and intercept, denoted γ₀. Let γ denote the vector of fixed effects. Here, γ = [γ₀ γ₁].

Next, to aid in an indirect interpretation, we compute the implied γ and implied T from equation (5) NPMM parameters. To approximate γ using NPMM parameters, we first define γ^{kh} as the vector of all coefficients in the NPMM and take its expectation across k and h:

$$\boldsymbol{\gamma}_{NP} = E_h[E_{k|h}[\boldsymbol{\gamma}^{kh}]] = \boldsymbol{\gamma}^{\cdot\cdot}. \qquad (23)$$

The corresponding sample estimate of equation (23) is

$$\hat{\boldsymbol{\gamma}}_{NP} = \sum_{h=1}^{H}\sum_{k=1}^{K}\hat{\pi}^{kh}\hat{\boldsymbol{\gamma}}^{kh} = \hat{\boldsymbol{\gamma}}^{\cdot\cdot}. \qquad (24)$$

To approximate T using NPMM parameters, we first define γ^{·h} as a vector of marginal level-2 class coefficients in the NPMM with implied random effects. Here, γ^{·h} = [γ₀^{·h} γ₁^{·h}]. Because the intercept (γ₀^{kh}) is allowed to differ across all kh in equation (5), the marginal level-2 class h intercept, γ₀^{·h}, is obtained as E_{k|h}[γ₀^{kh}] = γ₀^{·h}. Because the slope of x_ij differs only across h in equation (5), the marginal level-2 class h slope γ₁^{·h} is directly estimated.¹¹ We can then take the variance of γ^{·h} across level-2 classes to obtain the implied T, approximated using NPMM parameters:

$$\mathbf{T}_{NP} = \mathrm{var}_h(\boldsymbol{\gamma}^{\cdot h}) = E_h[\boldsymbol{\gamma}^{\cdot h}(\boldsymbol{\gamma}^{\cdot h})'] - E_h[\boldsymbol{\gamma}^{\cdot h}]E_h[\boldsymbol{\gamma}^{\cdot h}]'. \qquad (25)$$

Expanding this expression for our example yields a 2 × 2 implied T consisting of the implied intercept variance (as in equation 20), implied slope variance, and implied intercept-slope covariance:

$$\mathbf{T}_{NP} = \begin{bmatrix} E_h[(\gamma_0^{\cdot h})^2] - E_h[\gamma_0^{\cdot h}]^2 & \\ E_h[\gamma_0^{\cdot h}\gamma_1^{\cdot h}] - E_h[\gamma_0^{\cdot h}]E_h[\gamma_1^{\cdot h}] & E_h[(\gamma_1^{\cdot h})^2] - E_h[\gamma_1^{\cdot h}]^2 \end{bmatrix}. \qquad (26)$$

The matrix expression in equation (25) would apply generally, that is, to any number of level-1 covariates whose effects in γ^{·h} differ across level-2 classes.

¹¹ If the slope of x_ij does vary across kh, it would be computed as γ₁^{·h} = E_{k|h}[γ₁^{kh}], as done later.

[Figure 2 appears here: four kernel density plots of y_ij, panels (a)-(d), each with marginal level-2 class intercepts marked and annotated with the ICC, τ₀₀, and σ² from the fitted null MLM.]

Figure 2. Relationship between NPMM parameters and the ICC from a fitted MLM: generating model = null NPMM (equation 1) and fitted model = null MLM (equation 2). (a) versus (b): Increased marginal level-2 class intercept separation in the NPMM increases the ICC in the fitted MLM. (a) versus (c): Decreased class-combination residual variance in the NPMM increases the ICC in the fitted MLM. (a) versus (d): Decreased class-combination intercept separation within level-2 classes in the NPMM increases the ICC in the fitted MLM. Notes. Plots show kernel densities of data generated from a K = 2 (level-1 classes), H = 2 (level-2 classes) null NPMM and fitted with a random-intercept-only MLM. For illustrations in Figures 2-5, data were generated in SAS 9.4 and MLM estimates were computed using Proc MIXED. Note that the same pattern of results would hold with smaller ICC values; the conditions here were chosen for clarity of graphical illustration.

The corresponding sample estimate of equation (25) is

$$\hat{\mathbf{T}}_{NP} = \widehat{\mathrm{var}}_h(\hat{\boldsymbol{\gamma}}^{\cdot h}) = \sum_{h=1}^{H}\hat{\pi}^h\hat{\boldsymbol{\gamma}}^{\cdot h}(\hat{\boldsymbol{\gamma}}^{\cdot h})' - \left(\sum_{h=1}^{H}\hat{\pi}^h\hat{\boldsymbol{\gamma}}^{\cdot h}\right)\left(\sum_{h=1}^{H}\hat{\pi}^h\hat{\boldsymbol{\gamma}}^{\cdot h}\right)'. \qquad (27)$$

A key result from equations (25) to (27) is that the implied random effect variance from the NPMM will be non-zero only when there is variance in the marginal level-2 class parameters. This is illustrated in Figure 3a. On the other hand, if the marginal level-2 class parameters are equal, the implied random effect variance will be zero. This is illustrated in Figure 3b. In the context of fitting an NPMM, if marginal level-2 class parameters are explicitly constrained equal, this yields an approximation of a fixed slope, as shown in equations (23) and (24). In the discussion in Section 8, we consider the potential for utilizing this knowledge in significance testing of implied random effect variances using NPMM parameters.

4. Relationship of NPMM to MLM with heteroscedasticity of residuals at level 1

Suppose now that a researcher is interested in fitting an NPMM where the effect of a level-1 covariate, x_ij, is allowed to vary by level-1 class within level-2 class, as in equation (7). That is, in the equation (7) NPMM, the slope of x_ij is allowed to vary not only across h (as in equation 5), but also across k within each h. The slope of x_ij in class combination kh is denoted γ₁^{kh}.

From a direct interpretation standpoint, we could consider x_ij slope variation by level-1 class within level-2 class to be substantively meaningful. For instance, within the school-level class where the number of hours spent studying (x_ij) strongly affects maths achievement, the effect of studying may still vary across student-level classes (perhaps because student-level classes differ in ability levels and test anxiety). Alternatively, we could consider an indirect interpretation for this x_ij slope variation within level-2 class. However, no previous sources have addressed precisely what this slope variation in the NPMM would be approximating in the MLM. Here we show that, under an indirect interpretation, slope variation within level-2 class in the NPMM approximates heteroscedastic level-1 residual variance in the MLM, in which MLM residuals (e_ij) depend linearly on x_ij.

To explain this, we start by defining the MLM corresponding to the NPMM in equation (7). This corresponding MLM is provided in equation (8), where e_ij = e₀ij + e₁ij x_ij (e.g., Goldstein, 2011; Snijders & Berkhof, 2007). This non-standard MLM specification is conceptually equivalent to having a random effect of x_ij at level 1, meaning that the effect of x_ij varies across level-1 units (unlike a conventional random effect that varies across clusters). The equation (8) MLM,¹² where residuals depend linearly on x_ij, yields the following quadratic residual variance expression (Goldstein, 2011; Snijders & Bosker, 2012):

¹² Note that residuals can also depend linearly on a covariate in single-level contexts. Thus, the approximation outlined in this section is also relevant for relating single-level regression mixture models to conventional single-level regression models with residual heteroscedasticity.

[Figure 3 appears here: two panels, (a) and (b), plotting predicted ŷ_ij against x_ij with one regression line per cluster.]

Figure 3. Relationship between NPMM parameters and the random slope from a fitted MLM. (a) Implied random slope and (b) implied fixed slope. Notes. Both plots show predicted scores (ŷ_ij | x_ij, u_j) in a fitted MLM. Each line represents the regression line for a single cluster. Data were generated from a K = 4, H = 4 NPMM with a single cluster-mean-centred level-1 covariate. Analyses in (a) are of data generated from an NPMM with varying marginal level-2 class slopes of x_ij, whereas analyses in (b) are of data generated from an NPMM with constant marginal level-2 class slopes of x_ij. As such, the data for (a) are consistent with a random slope in a fitted MLM and the data for (b) are consistent with a fixed slope in a fitted MLM.

$$\sigma^2_{ij} = \mathrm{var}_{ij}(e_{0ij} + e_{1ij}x_{ij}) = \sigma^2_0 + x_{ij}^2\sigma^2_1 + 2x_{ij}\sigma_{0,1}. \qquad (28)$$

Variances σ₀² and σ₁² and covariance σ₀,₁ are defined in equation (8). For instance, this MLM allowing for level-1 heteroscedasticity can be useful when within-cluster variability in maths achievement is expected to be smaller for students who spend moderate hours studying, but larger for students who spend many or few hours studying. Within a cluster, achievement can be highly variable for students who study intensively (some perform well whereas others perform poorly due to test anxiety) or study infrequently (some perform poorly whereas others perform well due to high ability). Taken together, equation (28) clarifies that the residual variance in the equation (8) MLM is heteroscedastic: σ²_ij is specific to the x_ij score for observation i in cluster j.

To further clarify how the equation (7) NPMM indirectly approximates the MLM's level-1 heteroscedasticity in equation (28), we compute the corresponding quadratic variance expression with NPMM parameters. Specifically, we take the variance of y_ij|c_ij=k, d_j=h across level-1 classes within level-2 class, conditional on x_ij, and pool across level-2 classes:

$$\sigma^2_{ij(NP)} = E_h[\mathrm{var}_{k|h}(\gamma_0^{kh} + \gamma_1^{kh}x_{ij} + e_{ij})] = E_h[E_{k|h}[(\gamma_0^{kh} - \gamma_0^{\cdot h})^2 + x_{ij}^2(\gamma_1^{kh} - \gamma_1^{\cdot h})^2 + 2x_{ij}(\gamma_0^{kh} - \gamma_0^{\cdot h})(\gamma_1^{kh} - \gamma_1^{\cdot h}) + \theta^{kh}]]. \qquad (29)$$

The sample estimate of equation (29) is

$$\hat{\sigma}^2_{ij(NP)} = \sum_{h=1}^{H}\sum_{k=1}^{K}\hat{\pi}^{kh}\left((\hat{\gamma}_0^{kh} - \hat{\gamma}_0^{\cdot h})^2 + x_{ij}^2(\hat{\gamma}_1^{kh} - \hat{\gamma}_1^{\cdot h})^2 + 2x_{ij}(\hat{\gamma}_0^{kh} - \hat{\gamma}_0^{\cdot h})(\hat{\gamma}_1^{kh} - \hat{\gamma}_1^{\cdot h}) + \hat{\theta}^{kh}\right). \qquad (30)$$

Comparing equation (29) to (28), the expectation of (γ₀^{kh} − γ₀^{·h})² + θ^{kh} approximates σ₀², the expectation of (γ₁^{kh} − γ₁^{·h})² approximates σ₁², and the expectation of (γ₀^{kh} − γ₀^{·h})(γ₁^{kh} − γ₁^{·h}) approximates σ₀,₁.

More generally, with q covariates affecting level-1 residuals, the MLM level-1 residual variance in equation (28) is expressed as

$$\sigma^2_{ij} = \mathbf{v}_{ij}'\boldsymbol{\Sigma}\mathbf{v}_{ij}, \qquad (31)$$

where the vector v_ij contains 1 and q covariates affecting level-1 residuals and Σ is a (q + 1) × (q + 1) covariance matrix of level-1 random effects used in modelling σ²_ij. The corresponding expression implied by the NPMM generalizes to

$$\sigma^2_{ij(NP)} = E_h[\mathrm{var}_{k|h}(\mathbf{v}_{ij}'\boldsymbol{\upsilon}^{kh} + e_{ij})] = \mathbf{v}_{ij}'\,E_h[E_{k|h}[(\boldsymbol{\upsilon}^{kh} - \boldsymbol{\upsilon}^{\cdot h})(\boldsymbol{\upsilon}^{kh} - \boldsymbol{\upsilon}^{\cdot h})' + \mathbf{a}\theta^{kh}\mathbf{a}']]\,\mathbf{v}_{ij} = \mathbf{v}_{ij}'\boldsymbol{\Sigma}_{NP}\mathbf{v}_{ij}, \qquad (32)$$

where a = v_ij(v_ij′v_ij)⁻¹ and υ^{kh} is the (q + 1) × 1 subvector of γ^{kh} containing coefficients for v_ij. In the sample, this yields

$$\hat{\sigma}^2_{ij(NP)} = \mathbf{v}_{ij}'\left(\sum_{h=1}^{H}\sum_{k=1}^{K}\hat{\pi}^{kh}\left((\hat{\boldsymbol{\upsilon}}^{kh} - \hat{\boldsymbol{\upsilon}}^{\cdot h})(\hat{\boldsymbol{\upsilon}}^{kh} - \hat{\boldsymbol{\upsilon}}^{\cdot h})' + \mathbf{a}\hat{\theta}^{kh}\mathbf{a}'\right)\right)\mathbf{v}_{ij} = \mathbf{v}_{ij}'\hat{\boldsymbol{\Sigma}}_{NP}\mathbf{v}_{ij}. \qquad (33)$$

From equations (29) and (30), a key result is evident: if there is no slope variation across level-1 class within level-2 class (e.g., γ₁^{kh} = γ₁^{·h} for all h, as in equation 5), the linear and quadratic terms drop out. All that remains is a constant, implied homoscedastic residual variance, given by equation (19). This point is illustrated in Figure 4b, in which, under these conditions, the variance of residuals ê_ij from a fitted MLM is shown to be homoscedastic (constant across x_ij). Figure 4a depicts the alternative situation: when level-1 class slopes vary within level-2 class (γ₁^{kh} ≠ γ₁^{·h} for some h and k), the variance of residuals ê_ij from a fitted MLM is heteroscedastic (greater at the extremes of x_ij).

In sum, in the NPMM, slope differences by level-1 class within level-2 class afford not only a direct interpretation, but also an indirect interpretation in terms of accommodating level-1 heteroscedasticity in the MLM. Given that the latter MLM specification is underused, unfamiliar (Goldstein, 2011; Snijders & Bosker, 2012) and, moreover, unavailable in some MLM software, an indirect interpretation of the NPMM can be used to highlight or investigate level-1 heteroscedasticity.
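As a companion to equation (30), the following minimal R sketch (hypothetical estimates; not the authors' Appendix S1 function) evaluates the implied heteroscedastic level-1 residual variance at several values of x_ij for a K = 2, H = 2 NPMM with one level-1 covariate.

```r
pi_kh  <- matrix(c(.30, .20, .25, .25), 2, 2)    # joint class-combination probabilities, [k, h]
gamma0 <- matrix(c(1.0, 2.0, 3.5, 5.0), 2, 2)    # intercepts gamma0^kh
gamma1 <- matrix(c(0.2, 0.9, 0.4, 1.1), 2, 2)    # slopes gamma1^kh
theta  <- matrix(0.8, 2, 2)                      # residual variances theta^kh

pi_h   <- colSums(pi_kh)                         # marginal level-2 class probabilities
pi_k_h <- sweep(pi_kh, 2, pi_h, `/`)             # conditional pi(k | h)
g0_h   <- colSums(pi_k_h * gamma0)               # marginal level-2 class intercepts
g1_h   <- colSums(pi_k_h * gamma1)               # marginal level-2 class slopes

implied_sigma2 <- function(x) {                  # equation (30), pooled over k and h
  sum(pi_kh * (sweep(gamma0, 2, g0_h)^2 +
               x^2 * sweep(gamma1, 2, g1_h)^2 +
               2 * x * sweep(gamma0, 2, g0_h) * sweep(gamma1, 2, g1_h) +
               theta))
}
sapply(c(-2, 0, 2), implied_sigma2)              # larger at extreme x when slopes vary within h
```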

[Figure 4 appears here: two panels, (a) and (b), plotting estimated level-1 residuals ê_ij against x_ij.]

Figure 4. Relationship between NPMM parameters and heteroscedasticity of residuals at level 1 in the MLM. (a) Level-1 heteroscedasticity in fitted MLM and (b) level-1 homoscedasticity in fitted MLM. Notes. Both plots show estimated level-1 residuals from a fitted MLM, ê_ij, plotted by x_ij. Data were generated from a K = 4, H = 4 NPMM with a single group-mean-centred level-1 covariate. Analyses in (a) are of data generated from an NPMM with varying slopes across level-1 class within level-2 class, while analyses in (b) are of data generated from an NPMM with constant slopes across level-1 class within level-2 class. As such, the data for (a) are consistent with level-1 heteroscedasticity in a fitted MLM and the data for (b) are consistent with level-1 homoscedasticity in a fitted MLM.

5. Relationship of NPMM to MLM with heteroscedasticity of residuals at level 2

Now suppose a researcher is interested in fitting a more complex NPMM that also includes a level-2 covariate, w_j, such as student-teacher ratio. If we fit the NPMM in equation (9), where the slope of w_j (γ₂) does not vary across level-2 or level-1 classes, the corresponding MLM is defined as having a fixed slope of w_j, as shown in equation (10). Suppose instead we fit the NPMM in equation (11), where the slope of w_j (i.e., γ₂^h) varies across level-2 classes.

From a direct interpretation standpoint, one could consider variance in the marginal level-2 class slope of w_j between level-2 classes in equation (11) to represent substantively meaningful slope differences between cluster-level subpopulations.

For example, perhaps the effect of student-teacher ratio (w_j) on maths achievement is stronger for some school-level classes (e.g., those with more focus on class participation) compared to other school-level classes. If we instead desired an indirect interpretation of equation (11), previous sources have not addressed what the w_j slope variation across level-2 classes in the NPMM would be approximating in the MLM. Under an indirect interpretation, w_j slope variation across level-2 classes in the NPMM approximates heteroscedastic level-2 residual variance in the MLM, where MLM intercept residuals, u_j, depend linearly on w_j.

To illustrate this, we first define the MLM corresponding to the NPMM in equation (11). This non-standard MLM is provided in equation (12), where u_j = u₀j + u₃j w_j (see Goldstein, 2011; Snijders & Berkhof, 2007). Modelling level-2 heteroscedasticity in this way in the MLM essentially adds a random effect, u₃j, for a level-2 covariate at level 2, only u₃j is not interpreted as a conventional random effect, but as a means to model heteroscedasticity. The equation (12) MLM yields the following quadratic variance expression for intercept residuals (e.g., Goldstein, 2011; Snijders & Bosker, 2012):

$$\tau^2_j = \mathrm{var}(u_{0j} + u_{3j}w_j) = \tau_{00} + w_j^2\tau_{33} + 2w_j\tau_{03}. \qquad (34)$$

The variances τ₀₀ and τ₃₃ and covariance τ₀₃ were defined in equation (12). Such an expression can be useful when, for example, larger intercept residual variance is anticipated for more extreme student-teacher ratios. This heteroscedastic intercept variance, τ²_j, is specific to cluster j because it depends on the value of w_j.

To further illustrate this indirect interpretation of the equation (11) NPMM, we compute the quadratic variance expression corresponding to equation (34) that is implied by NPMM parameters, as shown in the following equation, which computes the variance of the sum of the marginal level-2 class intercept and slope of w_j, conditional on w_j:

$$\tau^2_{j(NP)} = \mathrm{var}_h(\gamma_0^{\cdot h} + \gamma_2^{\cdot h}w_j) = E_h[(\gamma_0^{\cdot h} - \gamma_0^{\cdot\cdot})^2 + w_j^2(\gamma_2^{\cdot h} - \gamma_2^{\cdot\cdot})^2 + 2w_j(\gamma_0^{\cdot h} - \gamma_0^{\cdot\cdot})(\gamma_2^{\cdot h} - \gamma_2^{\cdot\cdot})]. \qquad (35)$$

Comparing equation (35) to (34), the expectation of (γ₀^{·h} − γ₀^{··})² approximates the variance τ₀₀, the expectation of (γ₂^{·h} − γ₂^{··})² approximates the variance τ₃₃, and the expectation of (γ₀^{·h} − γ₀^{··})(γ₂^{·h} − γ₂^{··}) approximates the covariance τ₀₃. The sample counterpart to equation (35) is:

$$\hat{\tau}^2_{j(NP)} = \sum_{h=1}^{H}\hat{\pi}^h\left((\hat{\gamma}_0^{\cdot h} - \hat{\gamma}_0^{\cdot\cdot})^2 + w_j^2(\hat{\gamma}_2^{\cdot h} - \hat{\gamma}_2^{\cdot\cdot})^2 + 2w_j(\hat{\gamma}_0^{\cdot h} - \hat{\gamma}_0^{\cdot\cdot})(\hat{\gamma}_2^{\cdot h} - \hat{\gamma}_2^{\cdot\cdot})\right). \qquad (36)$$

We now consider a more general expression with p level-2 covariates affecting intercept residuals. The equation (34) MLM expression generalizes to

$$\tau^2_j = \mathbf{w}_j'\boldsymbol{\Phi}\mathbf{w}_j, \qquad (37)$$

where the vector w_j contains 1 and p level-2 covariates affecting intercept residuals and Φ is a (p + 1) × (p + 1) covariance matrix, a submatrix of T, used in modelling the heteroscedastic intercept variance, τ²_j.

The corresponding expression implied by the NPMM generalizes to

$$\tau^2_{j(NP)} = \mathrm{var}_h(\mathbf{w}_j'\boldsymbol{\alpha}^{\cdot h}) = \mathbf{w}_j'\,E_h[(\boldsymbol{\alpha}^{\cdot h} - \boldsymbol{\alpha}^{\cdot\cdot})(\boldsymbol{\alpha}^{\cdot h} - \boldsymbol{\alpha}^{\cdot\cdot})']\,\mathbf{w}_j = \mathbf{w}_j'\boldsymbol{\Phi}_{NP}\mathbf{w}_j, \qquad (38)$$

where α^{·h} is a (p + 1) × 1 subvector of γ^{·h} containing marginal level-2 coefficients for w_j. In the sample this yields

$$\hat{\tau}^2_{j(NP)} = \mathbf{w}_j'\left(\sum_{h=1}^{H}\hat{\pi}^h(\hat{\boldsymbol{\alpha}}^{\cdot h} - \hat{\boldsymbol{\alpha}}^{\cdot\cdot})(\hat{\boldsymbol{\alpha}}^{\cdot h} - \hat{\boldsymbol{\alpha}}^{\cdot\cdot})'\right)\mathbf{w}_j = \mathbf{w}_j'\hat{\boldsymbol{\Phi}}_{NP}\mathbf{w}_j. \qquad (39)$$

From equations (35) and (36) a key result is evident: if there is no marginal level-2 class slope variation for level-2 covariates in the NPMM (e.g., γ₂^{·h} = γ₂^{··} for all h, as in equation 9), the linear and quadratic terms drop out. This leaves the implied homoscedastic intercept variance expression given by equation (20). This point is illustrated in Figure 5b, in which, under these conditions, the variance of level-2 intercept residuals in a fitted MLM, û_j, remains constant across w_j (i.e., homoscedastic). Figure 5a illustrates the opposite situation: provided γ₂^{·h} ≠ γ₂^{··} for some h, there is implied heteroscedasticity of level-2 intercept residuals in a fitted MLM. Thus, an indirect interpretation of the NPMM offers the ability to explore and accommodate MLM level-2 intercept heteroscedasticity, another underused and unfamiliar MLM specification (Snijders & Bosker, 2012).

6. Relationship of NPMM to general matrix formulation of MLM

In previous sections, we demonstrated specific relationships between particular NPMM and MLM specifications in isolation. In this section, we integrate these results to (a) describe a general NPMM expression, (b) define its counterpart MLM, and (c) summarize how this general NPMM indirectly accommodates any number of covariates and any combination of implied fixed or random effects and implied homoscedasticity or heteroscedasticity of level-1 and/or level-2 residuals in the MLM. The general NPMM specification is given in equation (15) in Table 2. Here, intercepts and both level-1 and level-2 covariate slopes are allowed to freely vary across all kh class combinations. This general NPMM in equation (15) corresponds with the MLM specification in equation (16) when v_ij contains all covariates (i.e., v_ij = z_ij) and w_j contains all level-2 covariates. Specifically, the general equation (15) NPMM indirectly approximates the following in the corresponding MLM: a random intercept, random slopes for all level-1 covariates (because their γ^{·h} vary across h), level-1 heteroscedasticity for all covariates (because their γ^{kh} vary across k within h), and level-2 heteroscedasticity for all level-2 covariates (because their γ^{·h} vary across h). To summarize how the MLM parameters in equation (16) are indirectly approximated with general NPMM parameters: γ is indirectly approximated by equation (24), T by equation (27), Σ by equation (33), and Φ by equation (39). An R function for computing elements of each matrix is provided in our Appendix S1 to aid in indirect interpretation of the NPMM.
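The general computation can be sketched compactly. The function below is a minimal illustration of equations (24) and (27) only (it is not the Appendix S1 function): Gamma is assumed to be a matrix with one row per kh class combination holding the estimated coefficient vector γ^{kh}, with matching vectors pi_kh and h_index giving each row's joint probability and level-2 class.

```r
implied_gamma_T <- function(Gamma, pi_kh, h_index) {
  hs   <- sort(unique(h_index))
  pi_h <- tapply(pi_kh, h_index, sum)[as.character(hs)]   # marginal level-2 class probabilities
  # Marginal level-2 class coefficient vectors gamma^(.h): weighted means over k within h
  gamma_dot_h <- t(sapply(hs, function(h) {
    w <- pi_kh[h_index == h] / pi_h[as.character(h)]
    colSums(w * Gamma[h_index == h, , drop = FALSE])
  }))
  gamma_np <- colSums(pi_h * gamma_dot_h)                  # equation (24): implied fixed effects
  centred  <- sweep(gamma_dot_h, 2, gamma_np)
  T_np     <- t(centred) %*% (pi_h * centred)              # equation (27): implied T
  list(gamma_np = gamma_np, T_np = T_np)
}
```

For instance, for the equation (5) NPMM, Gamma would have two columns (the intercept and the x_ij slope), and the diagonal of T_np would then give the implied intercept and slope variances of equation (26).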

[Figure 5 appears here: two panels, (a) and (b), plotting estimated level-2 intercept residuals û_j against w_j.]

Figure 5. Relationship between NPMM parameters and heteroscedasticity of intercept residuals at level 2 in the MLM. (a) Level-2 heteroscedasticity in fitted MLM and (b) level-2 homoscedasticity in fitted MLM. Notes. Both plots show estimated level-2 intercept residuals in a fitted MLM, û_j, plotted by the level-2 covariate w_j. Data were generated from an NPMM. Analyses in (a) are of data generated from an NPMM with varying marginal level-2 class slopes, while analyses in (b) are of data generated from an NPMM with constant marginal level-2 class slopes. As such, the data for (a) are consistent with level-2 heteroscedasticity in a fitted MLM and the data for (b) are consistent with level-2 homoscedasticity in a fitted MLM. *Generating models included a single level-2 covariate, w_j, which is slightly different from equations (9) and (11), which also include x_ij.

7. Empirical example

We now present an empirical application implementing the indirect interpretation of the NPMM and, further, demonstrate how this interpretation can provide additional utility beyond simply implementing either an MLM or a direct interpretation of the NPMM. Using a data set of students nested within schools from the Trends in International Mathematics and Science Study (TIMSS; Mullis, Martin, Gonzalez, & Chrostowski, 2004), we investigate effects of mathematics confidence (MATHCONF, cluster-mean-centred level-1 x_ij) and percentage of students receiving free/reduced lunch (PERLUNCH, level-2 w_j) on mathematics exam scores involving data/probability (MATH) for eighth graders. Confidence should be positively associated with maths achievement (Pajares & Miller, 1994), as it is thought to enable students to better demonstrate their knowledge. School-level socioeconomic deprivation should be negatively associated with maths achievement (Bradley & Corwyn, 2002), as it is a proxy for disadvantages that impede learning (for further discussion, see Lubinski, 2009). Level-1 residual heteroscedasticity by confidence could arise if students with high or low confidence levels have more performance variability


More information

Determining the number of components in mixture models for hierarchical data

Determining the number of components in mixture models for hierarchical data Determining the number of components in mixture models for hierarchical data Olga Lukočienė 1 and Jeroen K. Vermunt 2 1 Department of Methodology and Statistics, Tilburg University, P.O. Box 90153, 5000

More information

Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models

Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models Journal of Of cial Statistics, Vol. 14, No. 3, 1998, pp. 267±275 Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models Michael P. ohen 1 Behavioral and social data commonly

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Three-level meta-analysis of dependent effect sizes

Three-level meta-analysis of dependent effect sizes Behav Res (213) 45:576 594 DOI 1.3758/s13428-12-261-6 Three-level meta-analysis of dependent effect sizes Wim Van den Noortgate & José Antonio López-López & Fulgencio Marín-Martínez & Julio Sánchez-Meca

More information

Abstract Title Page. Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size

Abstract Title Page. Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size Abstract Title Page Title: Degenerate Power in Multilevel Mediation: The Non-monotonic Relationship Between Power & Effect Size Authors and Affiliations: Ben Kelcey University of Cincinnati SREE Spring

More information

Multilevel Analysis of Grouped and Longitudinal Data

Multilevel Analysis of Grouped and Longitudinal Data Multilevel Analysis of Grouped and Longitudinal Data Joop J. Hox Utrecht University Second draft, to appear in: T.D. Little, K.U. Schnabel, & J. Baumert (Eds.). Modeling longitudinal and multiple-group

More information

Growth models for categorical response variables: standard, latent-class, and hybrid approaches

Growth models for categorical response variables: standard, latent-class, and hybrid approaches Growth models for categorical response variables: standard, latent-class, and hybrid approaches Jeroen K. Vermunt Department of Methodology and Statistics, Tilburg University 1 Introduction There are three

More information

Two-Level Models for Clustered* Data

Two-Level Models for Clustered* Data Two-Level Models for Clustered* Data Today s Class: Fixed vs. Random Effects for Modeling Clustered Data ICC and Design Effects in Clustered Data Group-Mean-Centering vs. Grand-Mean Centering Model Extensions

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

An Efficient State Space Approach to Estimate Univariate and Multivariate Multilevel Regression Models

An Efficient State Space Approach to Estimate Univariate and Multivariate Multilevel Regression Models An Efficient State Space Approach to Estimate Univariate and Multivariate Multilevel Regression Models Fei Gu Kristopher J. Preacher Wei Wu 05/21/2013 Overview Introduction: estimate MLM as SEM (Bauer,

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

This is the publisher s copyrighted version of this article.

This is the publisher s copyrighted version of this article. Archived at the Flinders Academic Commons http://dspace.flinders.edu.au/dspace/ This is the publisher s copyrighted version of this article. The original can be found at: http://iej.cjb.net International

More information

Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars)

Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars) STAT:5201 Applied Statistic II Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars) School math achievement scores The data file consists of 7185 students

More information

Introduction to. Multilevel Analysis

Introduction to. Multilevel Analysis Introduction to Multilevel Analysis Tom Snijders University of Oxford University of Groningen December 2009 Tom AB Snijders Introduction to Multilevel Analysis 1 Multilevel Analysis based on the Hierarchical

More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Modeling Heterogeneity in Indirect Effects: Multilevel Structural Equation Modeling Strategies. Emily Fall

Modeling Heterogeneity in Indirect Effects: Multilevel Structural Equation Modeling Strategies. Emily Fall Modeling Heterogeneity in Indirect Effects: Multilevel Structural Equation Modeling Strategies By Emily Fall Submitted to the Psychology and the Faculty of the Graduate School of the University of Kansas

More information

Logistic And Probit Regression

Logistic And Probit Regression Further Readings On Multilevel Regression Analysis Ludtke Marsh, Robitzsch, Trautwein, Asparouhov, Muthen (27). Analysis of group level effects using multilevel modeling: Probing a latent covariate approach.

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective Second Edition Scott E. Maxwell Uniuersity of Notre Dame Harold D. Delaney Uniuersity of New Mexico J,t{,.?; LAWRENCE ERLBAUM ASSOCIATES,

More information

The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs

The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs Chad S. Briggs, Kathie Lorentz & Eric Davis Education & Outreach University Housing Southern Illinois

More information

Describing Change over Time: Adding Linear Trends

Describing Change over Time: Adding Linear Trends Describing Change over Time: Adding Linear Trends Longitudinal Data Analysis Workshop Section 7 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

Centering Predictor and Mediator Variables in Multilevel and Time-Series Models

Centering Predictor and Mediator Variables in Multilevel and Time-Series Models Centering Predictor and Mediator Variables in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén Part 2 May 7, 2018 Tihomir Asparouhov and Bengt Muthén Part 2 Muthén & Muthén 1/ 42 Overview

More information

Longitudinal and Panel Data: Analysis and Applications for the Social Sciences. Table of Contents

Longitudinal and Panel Data: Analysis and Applications for the Social Sciences. Table of Contents Longitudinal and Panel Data Preface / i Longitudinal and Panel Data: Analysis and Applications for the Social Sciences Table of Contents August, 2003 Table of Contents Preface i vi 1. Introduction 1.1

More information

Growth Mixture Modeling and Causal Inference. Booil Jo Stanford University

Growth Mixture Modeling and Causal Inference. Booil Jo Stanford University Growth Mixture Modeling and Causal Inference Booil Jo Stanford University booil@stanford.edu Conference on Advances in Longitudinal Methods inthe Socialand and Behavioral Sciences June 17 18, 2010 Center

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

ROBUSTNESS OF MULTILEVEL PARAMETER ESTIMATES AGAINST SMALL SAMPLE SIZES

ROBUSTNESS OF MULTILEVEL PARAMETER ESTIMATES AGAINST SMALL SAMPLE SIZES ROBUSTNESS OF MULTILEVEL PARAMETER ESTIMATES AGAINST SMALL SAMPLE SIZES Cora J.M. Maas 1 Utrecht University, The Netherlands Joop J. Hox Utrecht University, The Netherlands In social sciences, research

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

Diagnostic Procedures for Detecting Nonlinear Relationships Between Latent Variables

Diagnostic Procedures for Detecting Nonlinear Relationships Between Latent Variables Structural Equation Modeling, 19:157 177, 2012 Copyright Taylor & Francis Group, LLC ISSN: 1070-5511 print/1532-8007 online DOI: 10.1080/10705511.2012.659612 Diagnostic Procedures for Detecting Nonlinear

More information

Testing and Interpreting Interaction Effects in Multilevel Models

Testing and Interpreting Interaction Effects in Multilevel Models Testing and Interpreting Interaction Effects in Multilevel Models Joseph J. Stevens University of Oregon and Ann C. Schulte Arizona State University Presented at the annual AERA conference, Washington,

More information

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018 SRMR in Mplus Tihomir Asparouhov and Bengt Muthén May 2, 2018 1 Introduction In this note we describe the Mplus implementation of the SRMR standardized root mean squared residual) fit index for the models

More information

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and

More information

IMPACT OF NOT FULLY ADDRESSING CROSS-CLASSIFIED MULTILEVEL STRUCTURE IN TESTING MEASUREMENT INVARIANCE AND CONDUCTING MULTILEVEL MIXTURE MODELING

IMPACT OF NOT FULLY ADDRESSING CROSS-CLASSIFIED MULTILEVEL STRUCTURE IN TESTING MEASUREMENT INVARIANCE AND CONDUCTING MULTILEVEL MIXTURE MODELING IMPACT OF NOT FULLY ADDRESSING CROSS-CLASSIFIED MULTILEVEL STRUCTURE IN TESTING MEASUREMENT INVARIANCE AND CONDUCTING MULTILEVEL MIXTURE MODELING WITHIN STRUCTURAL EQUATION MODELING FRAMEWORK A Dissertation

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs) 36-309/749 Experimental Design for Behavioral and Social Sciences Dec 1, 2015 Lecture 11: Mixed Models (HLMs) Independent Errors Assumption An error is the deviation of an individual observed outcome (DV)

More information

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA Topics: Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA What are MI and DIF? Testing measurement invariance in CFA Testing differential item functioning in IRT/IFA

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

Multi-level Models: Idea

Multi-level Models: Idea Review of 140.656 Review Introduction to multi-level models The two-stage normal-normal model Two-stage linear models with random effects Three-stage linear models Two-stage logistic regression with random

More information

Outline

Outline 2559 Outline cvonck@111zeelandnet.nl 1. Review of analysis of variance (ANOVA), simple regression analysis (SRA), and path analysis (PA) 1.1 Similarities and differences between MRA with dummy variables

More information

Multilevel Structural Equation Modeling

Multilevel Structural Equation Modeling Multilevel Structural Equation Modeling Joop Hox Utrecht University j.hox@uu.nl http://www.joophox.net 14_15_mlevsem Multilevel Regression Three level data structure Groups at different levels may have

More information

ABSTRACT. Futoshi Yumoto, Doctor of Philosophy Fairness is necessary to successful evaluation, whether the context is simple and

ABSTRACT. Futoshi Yumoto, Doctor of Philosophy Fairness is necessary to successful evaluation, whether the context is simple and ABSTRACT Title of dissertation: EFFECTS OF UNMODELED LATENT CLASSES ON MULTILEVEL GROWTH MIXTURE ESTIMATION IN VALUE-ADDED MODELING Futoshi Yumoto, Doctor of Philosophy 2011 Dissertation directed by: Professor

More information

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011 INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA Belfast 9 th June to 10 th June, 2011 Dr James J Brown Southampton Statistical Sciences Research Institute (UoS) ADMIN Research Centre (IoE

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

From Micro to Macro: Multilevel modelling with group-level outcomes

From Micro to Macro: Multilevel modelling with group-level outcomes From Micro to Macro: Multilevel modelling with group-level outcomes by C.A. (Marloes) Onrust s274 A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science Methodology

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today s Class (or 3): Summary of steps in building unconditional models for time What happens to missing predictors Effects of time-invariant predictors

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

MLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017

MLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017 MLMED User Guide Nicholas J. Rockwood The Ohio State University rockwood.19@osu.edu Beta Version May, 2017 MLmed is a computational macro for SPSS that simplifies the fitting of multilevel mediation and

More information

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS NOTES FROM PRE- LECTURE RECORDING ON PCA PCA and EFA have similar goals. They are substantially different in important ways. The goal

More information

SC705: Advanced Statistics Instructor: Natasha Sarkisian Class notes: Introduction to Structural Equation Modeling (SEM)

SC705: Advanced Statistics Instructor: Natasha Sarkisian Class notes: Introduction to Structural Equation Modeling (SEM) SC705: Advanced Statistics Instructor: Natasha Sarkisian Class notes: Introduction to Structural Equation Modeling (SEM) SEM is a family of statistical techniques which builds upon multiple regression,

More information

1 Introduction. 2 Example

1 Introduction. 2 Example Statistics: Multilevel modelling Richard Buxton. 2008. Introduction Multilevel modelling is an approach that can be used to handle clustered or grouped data. Suppose we are trying to discover some of the

More information

How to run the RI CLPM with Mplus By Ellen Hamaker March 21, 2018

How to run the RI CLPM with Mplus By Ellen Hamaker March 21, 2018 How to run the RI CLPM with Mplus By Ellen Hamaker March 21, 2018 The random intercept cross lagged panel model (RI CLPM) as proposed by Hamaker, Kuiper and Grasman (2015, Psychological Methods) is a model

More information

A NEW MODEL FOR THE FUSION OF MAXDIFF SCALING

A NEW MODEL FOR THE FUSION OF MAXDIFF SCALING A NEW MODEL FOR THE FUSION OF MAXDIFF SCALING AND RATINGS DATA JAY MAGIDSON 1 STATISTICAL INNOVATIONS INC. DAVE THOMAS SYNOVATE JEROEN K. VERMUNT TILBURG UNIVERSITY ABSTRACT A property of MaxDiff (Maximum

More information

Bayesian Mixture Modeling

Bayesian Mixture Modeling University of California, Merced July 21, 2014 Mplus Users Meeting, Utrecht Organization of the Talk Organization s modeling estimation framework Motivating examples duce the basic LCA model Illustrated

More information

Ronald Heck Week 14 1 EDEP 768E: Seminar in Categorical Data Modeling (F2012) Nov. 17, 2012

Ronald Heck Week 14 1 EDEP 768E: Seminar in Categorical Data Modeling (F2012) Nov. 17, 2012 Ronald Heck Week 14 1 From Single Level to Multilevel Categorical Models This week we develop a two-level model to examine the event probability for an ordinal response variable with three categories (persist

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

Multilevel modeling and panel data analysis in educational research (Case study: National examination data senior high school in West Java)

Multilevel modeling and panel data analysis in educational research (Case study: National examination data senior high school in West Java) Multilevel modeling and panel data analysis in educational research (Case study: National examination data senior high school in West Java) Pepi Zulvia, Anang Kurnia, and Agus M. Soleh Citation: AIP Conference

More information

Regression in R. Seth Margolis GradQuant May 31,

Regression in R. Seth Margolis GradQuant May 31, Regression in R Seth Margolis GradQuant May 31, 2018 1 GPA What is Regression Good For? Assessing relationships between variables This probably covers most of what you do 4 3.8 3.6 3.4 Person Intelligence

More information

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations

Spring RMC Professional Development Series January 14, Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations Spring RMC Professional Development Series January 14, 2016 Generalized Linear Mixed Models (GLMMs): Concepts and some Demonstrations Ann A. O Connell, Ed.D. Professor, Educational Studies (QREM) Director,

More information

THE EFFECTS OF MULTICOLLINEARITY IN MULTILEVEL MODELS

THE EFFECTS OF MULTICOLLINEARITY IN MULTILEVEL MODELS THE EFFECTS OF MULTICOLLINEARITY IN MULTILEVEL MODELS A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy By PATRICK C. CLARK B.A., Indiana University

More information

Notation Survey. Paul E. Johnson 1 2 MLM 1 / Department of Political Science

Notation Survey. Paul E. Johnson 1 2 MLM 1 / Department of Political Science MLM 1 / 52 Notation Survey Paul E. Johnson 1 2 1 Department of Political Science 2 Center for Research Methods and Data Analysis, University of Kansas 2015 MLM 2 / 52 Outline 1 Raudenbush and Bryk 2 Snijders

More information

FIT CRITERIA PERFORMANCE AND PARAMETER ESTIMATE BIAS IN LATENT GROWTH MODELS WITH SMALL SAMPLES

FIT CRITERIA PERFORMANCE AND PARAMETER ESTIMATE BIAS IN LATENT GROWTH MODELS WITH SMALL SAMPLES FIT CRITERIA PERFORMANCE AND PARAMETER ESTIMATE BIAS IN LATENT GROWTH MODELS WITH SMALL SAMPLES Daniel M. McNeish Measurement, Statistics, and Evaluation University of Maryland, College Park Background

More information

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts A Study of Statistical Power and Type I Errors in Testing a Factor Analytic Model for Group Differences in Regression Intercepts by Margarita Olivera Aguilar A Thesis Presented in Partial Fulfillment of

More information

Multilevel Analysis of Assessment Data

Multilevel Analysis of Assessment Data 8 Multilevel Analysis of Assessment Data Jee-Seon Kim University of Wisconsin-Madison Carolyn J. Anderson University of Illinois Bryan Keller Teachers College, Columbia University Contents Introduction...390

More information

Technical and Practical Considerations in applying Value Added Models to estimate teacher effects

Technical and Practical Considerations in applying Value Added Models to estimate teacher effects Technical and Practical Considerations in applying Value Added Models to estimate teacher effects Pete Goldschmidt, Ph.D. Educational Psychology and Counseling, Development Learning and Instruction Purpose

More information

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis Path Analysis PRE 906: Structural Equation Modeling Lecture #5 February 18, 2015 PRE 906, SEM: Lecture 5 - Path Analysis Key Questions for Today s Lecture What distinguishes path models from multivariate

More information

Testing Indirect Effects for Lower Level Mediation Models in SAS PROC MIXED

Testing Indirect Effects for Lower Level Mediation Models in SAS PROC MIXED Testing Indirect Effects for Lower Level Mediation Models in SAS PROC MIXED Here we provide syntax for fitting the lower-level mediation model using the MIXED procedure in SAS as well as a sas macro, IndTest.sas

More information

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12)

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) Following is an outline of the major topics covered by the AP Statistics Examination. The ordering here is intended to define the

More information

Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models

Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén August 5, 2018 Abstract We discuss different methods for centering a predictor

More information

Confidence Intervals for a Semiparametric Approach to Modeling Nonlinear Relations among Latent Variables

Confidence Intervals for a Semiparametric Approach to Modeling Nonlinear Relations among Latent Variables Structural Equation Modeling, 18:537 553, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1070-5511 print/1532-8007 online DOI: 10.1080/10705511.2011.607072 Confidence Intervals for a Semiparametric Approach

More information

Online Appendices for: Modeling Latent Growth With Multiple Indicators: A Comparison of Three Approaches

Online Appendices for: Modeling Latent Growth With Multiple Indicators: A Comparison of Three Approaches Online Appendices for: Modeling Latent Growth With Multiple Indicators: A Comparison of Three Approaches Jacob Bishop and Christian Geiser Utah State University David A. Cole Vanderbilt University Contents

More information