Psychological Methods

A Cautionary Note on Modeling Growth Trends in Longitudinal Data
Goran Kuljanin, Michael T. Braun, and Richard P. DeShon
Online First Publication, April 5, 2011. doi: /a

CITATION
Kuljanin, G., Braun, M. T., & DeShon, R. P. (2011, April 5). A Cautionary Note on Modeling Growth Trends in Longitudinal Data. Psychological Methods. Advance online publication. doi: /a003348

Psychological Methods, 2011. American Psychological Association.

A Cautionary Note on Modeling Growth Trends in Longitudinal Data

Goran Kuljanin, Michael T. Braun, and Richard P. DeShon
Michigan State University

Random coefficient and latent growth curve modeling are currently the dominant approaches to the analysis of longitudinal data in psychology. The application of these models to longitudinal data assumes that the data-generating mechanism behind the psychological process under investigation contains only a deterministic trend. However, if a process, at least partially, contains a stochastic trend, then random coefficient regression results are likely to be spurious. This problem is demonstrated via a data example, previous research on simple regression models, and Monte Carlo simulations. A data analytic strategy is proposed to help researchers avoid making inaccurate inferences when observed trends may be due to stochastic processes.

Keywords: random coefficient model, latent growth curve model, stochastic trend, unit root tests, spurious regression

Longitudinal data structures enable the evaluation and modeling of psychological processes that evolve over time. These data can often reveal interesting, dynamic relationships not readily apparent in cross-sectional data (e.g., Block, 1995; Mitchell & James, 2001; Molenaar, 2004; Vancouver, Thompson, & Williams, 2001). It is not surprising, then, that longitudinal data collections occur more frequently in psychological research than in the past (Collins, 2006). At the same time, technological advances in experience sampling methods make it easier to collect more observations over a longer time span, yielding intensive longitudinal data structures (e.g., Walls & Schafer, 2006). The unique problems encountered in longitudinal data, compared with cross-sectional data, necessitate the use of more complex methods to support accurate psychological inference.
Advances in the development of longitudinal data analysis provide researchers with multiple analytical tools. Repeated measures analysis of variance and regression provide researchers with average trajectories across a group of individuals. However, interest in interindividual differences in intraindividual change led to the development of more complex analytic methods, such as latent growth and random coefficient models (a.k.a. multilevel models, hierarchical linear models, mixed effects models). These more complex analytic methods provide the average trajectory across a group of individuals and also capture heterogeneity in individual trajectories if it exists. Irrespective of whether trajectory heterogeneity is substantively interesting or thought of as nuisance variance, it must be represented in the model, and thus, random coefficient modeling is currently the most widely used longitudinal data analysis technique in psychology. Although random coefficient modeling has substantially advanced longitudinal data analysis in psychology, the validity of the parameter estimates, significance tests, and the resulting scientific inferences depends on understanding the applicability of the model assumptions to the longitudinal processes under investigation.

Author note: Goran Kuljanin, Michael T. Braun, and Richard P. DeShon, Department of Psychology, Michigan State University. Correspondence concerning this article should be addressed to Goran Kuljanin, Department of Psychology, Michigan State University, East Lansing, MI. E-mail: kuljanin@msu.edu

A model fit to a longitudinal process results in a trajectory or set of trajectories. When modeling multiple longitudinal processes (e.g., individuals, teams), the observed trajectories are frequently heterogeneous (Collins & Horn, 1991; Collins & Sayer, 2001). Across individuals, trajectory slopes may differentially increase or decrease.
Slope heterogeneity further implies a change in the variance of the modeled variable over time, and commonly, the model-implied variance increases, especially when there is a positive covariance between intercepts and slopes (e.g., McArdle & Nesselroade, 2003). Figure 1 is a graphic depiction of heterogeneous trajectories consistent with those commonly observed in psychological research (e.g., Bollen & Curran, 2006; DeLucia & Pitts, 2006; Grimm, 2007; McArdle & Nesselroade, 2003; Willett, 1989). Such observed trajectories are uniformly modeled as a noisy deterministic process by entering a polynomial time index (e.g., linear, quadratic) as a response predictor. Although not widely recognized in psychology, trajectories can also result, at least partially, from a non-deterministic or stochastic process. If the trends present in the trajectories observed in Figure 1 reflect the functioning of a stochastic process, then using a random coefficient model to reflect the dependencies present in the longitudinal data will almost certainly yield spurious results and inferences. The purpose of this presentation is to highlight the existence of this statistical and inferential problem, to evaluate its magnitude, and to provide recommendations for avoiding spurious inference when using random coefficient models to analyze longitudinal data.

The Problem Presented

The heterogeneous trajectories depicted in Figure 1 are the result of regressing a dependent variable, Y, onto an index variable, Time, either separately for each individual or jointly using random coefficient modeling. The recommended analytic approach when using random coefficient modeling to represent the variances and covariances in these longitudinal data is to fit a sequence of increasingly complex models (e.g., Singer & Willett, 2003). The initial model, termed the unconditional means model, is used to evaluate the proportion of variance due to clustering of observations within individuals. This model also serves as a baseline model for evaluating improvements in model fit achieved by fitting more complex models. After fitting the unconditional means model, a more complex model, termed the unconditional growth model, is used to evaluate the variability of intercepts and slopes. Next, conditional growth models may be fit that incorporate predictors of the observed heterogeneity.

Figure 1. Regression lines fit to 50 simulated longitudinal trajectories representative of psychological data. (Axes: Y against Time.)

To understand the extent to which the modeling recommendations are implemented in empirical research, we reviewed the modeling practices reported in five of the leading American Psychological Association journals (Developmental Psychology, Journal of Applied Psychology, Journal of Consulting and Clinical Psychology, Journal of Educational Psychology, and Journal of Personality and Social Psychology) from January 2006 to August 2010. In this time span, researchers in 73 journal articles reported using either random coefficient or latent growth curve models to analyze longitudinal data. We found it interesting that researchers in only 12 (i.e., less than 17%) journal articles reported examining the unconditional means model. This is surprising because knowledge of the relative magnitudes of between and within variance can productively inform inference in these models. In contrast, researchers in 55 (i.e., approximately 75%) journal articles reported and interpreted the unconditional growth model. Given that these unconditional models are a recommended analytic procedure and that unconditional growth models are analyzed so frequently, we first examine the impact of stochastic processes on the unconditional models. Following this analysis, we switch attention to the impact of stochastic processes on conditional growth models.
Unconditional Means Model

Using the notation presented in Singer and Willett (2003), the unconditional means model is specified as

$$Y_{ij} = \pi_{0i} + \epsilon_{ij}, \qquad \pi_{0i} = \gamma_{00} + \zeta_{0i}, \quad (1)$$

where $\epsilon_{ij} \sim N(0, \sigma^2_{\epsilon})$ and $\zeta_{0i} \sim N(0, \sigma^2_0)$, $Y_{ij}$ is the dependent variable measured for person i on occasion j, $\pi_{0i}$ is the mean of Y for individual i, $\gamma_{00}$ is the mean of Y across everyone in the population, $\epsilon_{ij}$ is the residual for individual i on occasion j, $\sigma^2_{\epsilon}$ is the pooled within-person variance of each individual's data around his or her mean, $\zeta_{0i}$ is the random effect for individual i (i.e., deviation of the person-specific mean from the grand mean), and $\sigma^2_0$ is the random effect variance.

The unconditional means model splits the total variance into within- and between-person variance. For the data presented in Figure 1, the within-person variation, $\hat{\sigma}^2_{\epsilon}$, is 1.08, and the between-person variation, $\hat{\sigma}^2_0$, is 2.09. An intraclass correlation coefficient, ICC(1), indicating the proportion of total variance due to individual differences, may be computed using these variance component estimates. For these data, ICC(1) is $\hat{\sigma}^2_0 / (\hat{\sigma}^2_0 + \hat{\sigma}^2_{\epsilon}) = 2.09/(2.09 + 1.08) = 0.66$. That is, 66% of the total variance resides between individuals. A $\chi^2$ test may be used to evaluate whether the estimated between-person variance, $\hat{\sigma}^2_0$, differs from zero (Raudenbush & Bryk, 2002). In this example, the test, $\chi^2(49, N = 50)$, is significant at $p < .05$, and thus, there is statistically significant between-person variation. This result suggests that it is reasonable to examine predictors that may explain the observed between-person variability. Another useful statistic for examining variability is the reliability of the estimate, indicating how much of the between-person variation in observed scores is due to variability in true scores. This index is equivalent to ICC(1), with the within-person variance divided by the number of time periods observed. If the number of measurement points for each individual is the same, then this estimate is referred to as ICC(2) (Bliese, 2000).
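The two indices described above can be illustrated with a minimal sketch. The function names `icc1` and `icc2` are hypothetical, the between-person variance of 2.09 is the value consistent with the reported 66%, and the use of five measurement occasions is an assumption chosen because it reproduces a reliability of roughly 90%.

```python
def icc1(between_var, within_var):
    """Proportion of total variance that lies between persons."""
    return between_var / (between_var + within_var)


def icc2(between_var, within_var, n_occasions):
    """Reliability of person means: the within-person variance is
    divided by the number of occasions before forming the ratio."""
    return between_var / (between_var + within_var / n_occasions)


between, within = 2.09, 1.08   # variance component estimates for the Figure 1 data
n_occasions = 5                # assumption: five time points per person

print(f"ICC(1) = {icc1(between, within):.2f}")
print(f"ICC(2) = {icc2(between, within, n_occasions):.2f}")
```

Because only the within-person variance shrinks as occasions are added, ICC(2) is never smaller than ICC(1) for the same variance components.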
In this example, the reliability of the sample means indicates that 90% of the between-person variation in observed means is variability in population means. In other words, a large proportion of the observed mean differences between individuals reflects differences in the underlying population means. Thus, the unconditional means model indicates that the majority of variance is between-person, the between-person variation is significant, and the variability in sample means is, in large part, a reflection of the variability in population means.

Unconditional Growth Model

The unconditional growth model, as specified by Singer and Willett (2003), is

$$Y_{ij} = \pi_{0i} + \pi_{1i}\,\mathrm{Time}_{ij} + \epsilon_{ij}, \qquad \pi_{0i} = \gamma_{00} + \zeta_{0i}, \qquad \pi_{1i} = \gamma_{10} + \zeta_{1i}, \quad (2)$$

where

$$\epsilon_{ij} \sim N(0, \sigma^2_{\epsilon}) \quad \text{and} \quad \begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2_0 & \sigma_{01} \\ \sigma_{10} & \sigma^2_1 \end{bmatrix} \right),$$

$\pi_{0i}$ is now the initial status (i.e., intercept) of Y for individual i, $\gamma_{00}$ is the average initial status of Y across everyone in the population, $\pi_{1i}$ is the rate of change (i.e., slope) of Y for individual i, $\gamma_{10}$ is the average rate of change of Y across everyone in the population, $\sigma^2_{\epsilon}$ is the pooled variance of each individual's data around his or her linear change trajectory, $\zeta_{0i}$ is the intercept random effect for individual i, $\sigma^2_0$ is the variance of intercept random effects, $\zeta_{1i}$ is the slope random effect for individual i, $\sigma^2_1$ is the variance of slope random effects, $\sigma_{10}$ is the population covariance between intercepts and slopes, and all other terms are as defined above. Although we focus on the most common unconditional growth model applied in psychological research, other unconditional growth models are possible, such as those that model nonlinear growth or alternative assumptions about random effects and error structures.

A log-likelihood ratio test allows the researcher to evaluate whether the unconditional growth model fits the data better than the unconditional means model. For the data presented in Figure 1, the log-likelihood ratio test is significant, $\chi^2(3, N = 50) = 85.06$, $p < .05$, indicating that the unconditional growth model provides a better representation of the data than the unconditional means model. Following this omnibus evaluation, individual variance components are typically examined.
In the example data, the estimated intercept variance, $\hat{\sigma}^2_0 = 1.3$, $\chi^2(49, N = 50) = 67.66$, $p < .05$, and slope variance, $\hat{\sigma}^2_1 = 0.3$, $\chi^2(49, N = 50) = 74.48$, $p < .05$, both differ from zero. In other words, there is significant between-person heterogeneity in both intercepts and slopes. The estimated average intercept ($\hat{\gamma}_{00}$) and slope ($\hat{\gamma}_{10}$) do not differ from zero ($p > .05$ for both). The reliability of observed intercepts, $\hat{\lambda}_0$, and slopes, $\hat{\lambda}_1 = 0.8$, indicates that a large proportion of differences in sample intercepts and slopes are differences in population intercepts and slopes. Given this pattern of results, additional models would likely be investigated by adding predictors to the model (i.e., conditional growth models) to explain the observed variance in initial status (i.e., intercepts) and rate of change (i.e., slopes) across individuals.

The random coefficient, longitudinal modeling process just provided is representative of the presentations that may be found in any one of the hundreds of publications that now exist using this methodology. However, an important assumption concerning the source of the modeled heterogeneity using this class of methods has received virtually no attention. In reality, the data presented in Figure 1 were generated from a completely random process with no individual differences in slope trajectories, and yet, the conclusion reached from the unconditional growth model indicated substantial slope heterogeneity across individuals. This result is entirely spurious because, given the data-generating mechanism, any observed variance in slopes is a result of only sampling error and not actual variance in population slopes, despite what the slope reliability statistic suggests. The recommended next steps in this modeling effort would be to search for predictors of the observed heterogeneity.
However, given the data-generating mechanism responsible for the observed heterogeneity, this search would be erroneous, and any predictors found to significantly reduce the observed heterogeneity must be Type I errors. Thus, random coefficient models may yield spurious results supporting incorrect inferences when applied to completely random longitudinal processes of a certain type. The specific nature of this stochastic process is described in the following section.

Random Walks and Spurious Regression

Random walks are one of the most commonly encountered and studied stochastic processes, and they are prevalent in virtually every scientific discipline, including computer science models of information search (Tang, Jin, & Zhang, 2008), physics models of Brownian motion (Uhlenbeck & Ornstein, 1930), genetics models of genetic drift (Wright, 1931), ecological models of biodiffusion (Skellam, 1951) and population dynamics (Wang & Getz, 2007), and economic models of real gross national product and employment (Nelson & Plosser, 1982). In psychology, random walks are fundamental to the study of neuronal firing (Gerstein & Mandelbrot, 1964), speeded categorization (Nosofsky & Palmeri, 1997), diffusion models of decision processes (Busemeyer & Townsend, 1993), and consumer behavior such as new product adoption (Eliashberg & Chatterjee, 1986). The trajectories in Figure 1 represent regression lines fitted to realizations of an underlying random walk (with drift) described by

$$Y_t = \delta + Y_{t-1} + \varepsilon_t, \quad (3)$$

where $Y_t$ is the current status on a variable, $Y_{t-1}$ is the status at time $t - 1$, $\delta$ is a constant (set to 0 for the data in Figure 1) known as drift, and $\varepsilon_t$ is a series with mean zero and constant variance $\sigma^2_{\varepsilon}$.¹ Note that this variance is distinguished from the residual variance resulting from the application of regression to random walks, $\sigma^2_{\epsilon}$.

At any time point, t, the expected value of such a random walk is $Y_0 + \delta t$ (where $Y_0$ is the initial value), the variance is $t\sigma^2_{\varepsilon}$, and the elements of the covariance matrix, $\Sigma$, are $\sigma_{j,k} = \min(j, k)\,\sigma^2_{\varepsilon}$.

¹ The random walk model as presented here assumes no measurement error in Y. However, the substantive conclusions drawn in this article do not change when measurement error is included in Y.

Additional insight into the functioning of random walks is obtained using an equivalent representation of Equation 3 as

$$Y_t = Y_0 + \delta t + \sum_{i=1}^{t} \varepsilon_i. \quad (4)$$

Starting with an initial value, $Y_0$, a random walk process is an accumulation of a deterministic trend component (i.e., $\delta t$) and error (i.e., $\sum_{i=1}^{t} \varepsilon_i$). It is the accumulation of random errors that results in a stochastic trend. If $\delta = 0$, then the trends in the data are only stochastic in nature.

When using regression methods to model a data-generating process, it is either explicitly or implicitly assumed that the observed trajectory trends are the sole result of a deterministic process (e.g., $Y_t = \alpha + \beta t + \epsilon_t$). Unfortunately, it is difficult to visually distinguish trends resulting from deterministic and stochastic processes. As an example, Figure 2 plots a regression line on data generated with a purely deterministic trend (i.e., $Y_t = \alpha + \beta t + \epsilon_t$) or a purely stochastic trend (i.e., $Y_t = Y_0 + \sum_{i=1}^{t} \varepsilon_i$). The similarity of these two trajectories, generated from very different mechanisms, highlights the nature of the problem. It is difficult to determine whether observed trends reflect the functioning of a deterministic or a stochastic process, and the distinction is critical because it has long been recognized that applying regression models to data-generating processes that contain stochastic trends leads to spurious results and inferences.

Nelson and Kang (1984) demonstrated the pitfalls of applying the simple regression model (i.e., $Y_t = \alpha + \beta t + \epsilon_t$) to data containing only stochastic trends (i.e., random walks with no drift, $Y_t = Y_0 + \sum_{i=1}^{t} \varepsilon_i$). Using both mathematical analysis and Monte Carlo simulations, they found that the deterministic time trend (Time) explained 44% of the variation in the random walk, even though the dependent variable, Y, did not in fact depend on Time. The true null hypotheses of a zero intercept ($\alpha = 0$) and a zero slope ($\beta = 0$) were rejected in 80% and 87% of 1,000 replications, respectively, at a nominal 5% significance level when there were 100 time periods.
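The spurious regression result attributed to Nelson and Kang (1984) can be approximated with a small Monte Carlo sketch. This is not their code: the helper names `random_walk`, `ols_on_time`, and `spurious_regression`, the seed, the number of replications, and the fixed starting value of zero are all assumptions made for illustration, and ordinary least squares is computed by hand so that only the standard library is needed.

```python
import math
import random


def random_walk(n_steps, rng, drift=0.0):
    """Return Y_1..Y_T for a random walk started at 0 with N(0, 1) shocks."""
    y, path = 0.0, []
    for _ in range(n_steps):
        y += drift + rng.gauss(0.0, 1.0)
        path.append(y)
    return path


def ols_on_time(y):
    """Regress y on t = 1..T; return the slope, its t statistic, and R^2."""
    n = len(y)
    t_bar = (n + 1) / 2
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(range(1, n + 1), y))
    syy = sum((v - y_bar) ** 2 for v in y)
    slope = sxy / sxx
    rss = syy - slope * sxy                 # residual sum of squares
    se = math.sqrt(rss / (n - 2) / sxx)     # conventional OLS standard error
    return slope, slope / se, 1.0 - rss / syy


def spurious_regression(n_reps=500, n_steps=100, seed=1):
    """Average R^2 and slope rejection rate when Y does not depend on Time."""
    rng = random.Random(seed)
    r2_total, rejections = 0.0, 0
    for _ in range(n_reps):
        _, t_stat, r2 = ols_on_time(random_walk(n_steps, rng))
        r2_total += r2
        rejections += abs(t_stat) > 1.984   # approx. two-sided 5% cutoff, 98 df
    return r2_total / n_reps, rejections / n_reps


mean_r2, reject_rate = spurious_regression()
print(f"mean R^2 = {mean_r2:.2f}, slope rejection rate = {reject_rate:.2f}")
```

With 100 time periods, the average R-squared and the rejection rate should land near the 44% and 87% figures quoted above, although the exact values depend on the seed and the number of replications.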
When the regression model accounted for the autocorrelation present in the data, the model still incorrectly indicated that Time explained 1% of the variation in the random walk. The true null hypotheses of a zero intercept ($\alpha = 0$) and a zero slope ($\beta = 0$) were now rejected with 45% and 58% frequency, respectively. That is, in both cases, the regression model performs poorly by providing inaccurate fixed effects tests and leading to mistaken inferences more often than not.

The spurious regression results, found largely in the economics literature, focus on a single longitudinal series analogous to single-subject research in psychology. This existing literature does not address the situation most interesting to psychologists, in which a large set of individuals is observed over time and inference is often directed at trajectory heterogeneity. The implications of the spurious regression results for random coefficient models are developed in the following section.

Spurious Random Coefficient Model Results

The standard, fixed-effects regression models considered in the spurious regression literature possess a single probability distribution associated with the errors, $\epsilon_t$. Random coefficient models represent a generalization of the fixed-effects model where additional probability distributions are associated with the model coefficients. In this case, a single regression represents a sampled realization of an infinite set of possible regressions that are consistent with the underlying coefficient probability distribution(s). To understand the effects of applying a random coefficient model to a set of trajectories generated by a random walk process, a number of mathematical results are needed. Nelson and Kang (1984) provided some useful mathematics, but the bulk of their results are based on simulation. Durlauf and Phillips (1988) provided a rigorous mathematical foundation (i.e., the coefficient sampling distributions) for the spurious regression simulation results described in Nelson and Kang (1984).
Using this work, we discuss the expected random coefficient regression results when the unconditional means model and the unconditional growth model are fit to random walks (i.e., stochastic trends). For the unconditional growth model, the fixed effects estimates and tests are discussed first, followed by a discussion of the variance component estimates and tests. Finally, to keep the presentation as simple as possible, the current focus is on random walks with no drift. Generalization to random walks with drift is straightforward using the mathematical results discussed in the next two sections, unless otherwise noted.

Unconditional Means Model

When applied to longitudinal data, the primary purpose of the unconditional means model is to estimate the variance between and within individual trajectories. The resulting variance components are typically interpreted using the ICC(1). A surprising result occurs when the unconditional means model is used to summarize data generated by an underlying random walk process with no drift. As highlighted above, the variance of $Y_t$ for a set of random walk trajectories at any point in time is $t\sigma^2_{\varepsilon}$. Assuming that each trajectory is of equal length, the total variance in a longitudinal sample of trajectories is simply the average of these values at each time point,

$$\sigma^2_{Y_{tot}} = \frac{1}{T}\left(\sigma^2_{\varepsilon} + 2\sigma^2_{\varepsilon} + \cdots + T\sigma^2_{\varepsilon}\right), \quad (5)$$

where T is the length of all trajectories. This simplifies to $\frac{T+1}{2}\sigma^2_{\varepsilon}$. Nelson and Kang (1984) showed that the within-trajectory (i.e., individual) variance in the sample is $\frac{T+1}{6}\sigma^2_{\varepsilon}$. The between-trajectory variance may then be determined by subtracting the within-trajectory variance from the total variance, resulting in $\frac{T+1}{3}\sigma^2_{\varepsilon}$ as the approximate between-person variance. As a result, the ICC(1) is closely approximated by

$$\mathrm{ICC}(1) \approx \frac{\frac{T+1}{3}\sigma^2_{\varepsilon}}{\frac{T+1}{3}\sigma^2_{\varepsilon} + \frac{T+1}{6}\sigma^2_{\varepsilon}} = \frac{2}{3}. \quad (6)$$

² Terms such as random, disturbance, and stochastic process may need some clarification. A disturbance is a random realization from a distribution of possible values. A stochastic process is the evolution of a trajectory subject to disturbances at each point in time (Basu, 2003).
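The variance decomposition behind Equation 6 can be checked numerically with a rough sketch. The code below uses simple method-of-moments (one-way ANOVA) estimators rather than the likelihood-based estimates a random coefficient model would produce, starts every walk at zero, and uses a hypothetical function name, seed, and sample size, so it should be read as an approximation under those assumptions.

```python
import random
from statistics import mean, variance


def anova_icc(n_people=2000, n_steps=5, seed=2):
    """Method-of-moments ICC(1) for driftless random walks with unit shocks."""
    rng = random.Random(seed)
    person_means, within_vars = [], []
    for _ in range(n_people):
        y, path = 0.0, []
        for _ in range(n_steps):
            y += rng.gauss(0.0, 1.0)
            path.append(y)
        person_means.append(mean(path))
        within_vars.append(variance(path))   # sample variance, T - 1 denominator
    within = mean(within_vars)               # pooled within-person variance
    # Variance of person means, less the part expected from within-person noise.
    between = variance(person_means) - within / n_steps
    return between / (between + within), within, between


icc, within, between = anova_icc()
print(f"within = {within:.2f}, between = {between:.2f}, ICC(1) = {icc:.2f}")
```

For five time points and unit shock variance, the within and between components should come out near (T + 1)/6 = 1 and (T + 1)/3 = 2, giving a ratio near 2/3 even though no stable individual differences were built into the data.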

Figure 2. Modeling deterministic and stochastic trends with a regression line. (Panels: Deterministic Trend and Stochastic Trend; axes: Response against Time.)

In words, the ICC(1) for the unconditional means model when applied to trajectories generated by a random walk with no drift will always be approximately 2/3. This value could serve as a diagnostic for the inappropriate application of a random coefficient model to random walks. Unfortunately, more complex random walk processes, such as random walks with drift, result in different values for ICC(1). Using the equation for ICC(1), the reliability of the sample means, $\lambda_0$, is then closely approximated by

$$\lambda_0 \approx \frac{\frac{T+1}{3}\sigma^2_{\varepsilon}}{\frac{T+1}{3}\sigma^2_{\varepsilon} + \frac{T+1}{6T}\sigma^2_{\varepsilon}} = \frac{2T}{2T+1}. \quad (7)$$

Therefore, the reliability of the sample means will approach 1 as the length of the trajectories increases.

Unconditional Growth Model

Fixed effects: Estimates. Durlauf and Phillips (1988) derived the asymptotic theory for applying simple regression (i.e., $Y_t = \alpha + \beta t + \epsilon_t$) to random walks with no drift (i.e., $Y_t = Y_0 + \sum_{i=1}^{t} \varepsilon_i$). Here, the initial condition, $Y_0$, and error, $\varepsilon_t$, for the random walks are assumed to be realizations from a normal distribution with a mean of zero and a constant variance, $\sigma^2_{\varepsilon}$. In such a case, Durlauf and Phillips showed that the expected value of both the intercept and slope is 0 for the simple regression applied to random walks. Therefore, the estimates of the intercept (i.e., $\hat{\alpha}$) and slope (i.e., $\hat{\beta}$) are unbiased and will, on average, equal the true intercept (i.e., 0) and slope (i.e., 0) of the data-generating mechanism (i.e., initial condition and drift of the random walk). This mathematical result is supported by the simulation results provided in Nelson and Kang (1984), where the average intercept (i.e., $\hat{\alpha}$) and slope (i.e., $\hat{\beta}$) estimates across 1,000 replications were effectively 0.
Thus, we expect that, on average, the fixed effects in the unconditional growth model, $\hat{\gamma}_{00}$ and $\hat{\gamma}_{10}$, will equal the true fixed effects, $\gamma_{00}$ and $\gamma_{10}$, which are both 0.

Fixed effects: Tests. Nelson and Kang (1984) found that the statistical tests on the fixed effects in the simple regression model did not perform well, as the true null hypotheses of a zero intercept ($\alpha = 0$) and a zero slope ($\beta = 0$) were rejected with frequency 80% and 87%, respectively, at a 5% significance level. Durlauf and Phillips (1988) found that the tests for both the intercept and slope diverge as the number of time periods increases. These results occur because the standard errors for both the intercept and slope greatly underestimate the actual standard deviation of their respective sampling distributions. However, as we show in Appendix A, the standard errors for the fixed effects in the unconditional growth model closely approximate the standard deviation of the sampling distributions for the fixed effects $\gamma_{00}$ and $\gamma_{10}$. Therefore, we expect that the tests on the fixed effects in the unconditional growth model will perform well and not exceed the nominal Type I error rate.

Variance components: Estimates. The variance of the single time series regression parameters is equal to the variance of intercepts and slopes in the unconditional growth model. Durlauf and Phillips (1988) provided the needed distributional theory, showing that the variances of the intercepts and slopes, respectively, are

$$\sigma^2_0 = \frac{2T}{15}\sigma^2_{\varepsilon}, \quad (8)$$

$$\sigma^2_1 = \frac{6}{5T}\sigma^2_{\varepsilon}. \quad (9)$$

The simulation results in Nelson and Kang (1984) are consistent with these values. Therefore, as the number of time periods increases, the variance of intercepts will tend to approach infinity, whereas the variance of slopes will tend to approach zero. If the initial conditions in the random walks are allowed to vary randomly and the variance of the drift is set to zero, then intercept variance is either under- or overestimated, and slope variance is only correct when many time periods are observed.

Variance components: Tests. Single-parameter tests for the variance components $\sigma^2_0$ and $\sigma^2_1$ are discussed by Raudenbush and Bryk (2002). Given the variance estimates from Durlauf and Phillips (1988) and the residual variance estimate from Nelson and Kang (1984), the behavior of the variance components tests is discussed in Appendix B. When the unconditional growth model is fit to random walks, the variance component tests are effectively increasing functions of the number of time periods and individuals sampled. This is not surprising for the variance of intercepts, because the intercept variance increases without bound as the number of time periods observed increases. More surprising, the variance of slopes decreases as the number of time periods observed increases, but the test on the slope variance diverges. Thus, it would lead researchers to reject the null hypothesis of no variance, even if the estimate of slope variance is very small, as is the case when the number of time periods is large.

Model fit statistics. A number of additional statistics are often reported for the unconditional growth model to evaluate model fit. Implications for the reliability of intercepts and slopes, the likelihood ratio test, and the pseudo-$R^2$ are presented here. The problem examined here is fundamentally one of model misspecification, and as a result, virtually all known model fit indices are similarly inaccurate.
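The slope variance result in Equation 9 can be illustrated by fitting a separate least squares line to each simulated walk and looking at the spread of the fitted slopes. This is only a sketch: the function names, seed, sample size, and the choice of 20 time points coded 0 to T - 1 are assumptions, and per-person regressions stand in for the random coefficient model.

```python
import random
from statistics import mean, variance


def ols_slope(y):
    """Least squares slope of y on time coded 0..T-1."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    return sxy / sxx


def slope_spread(n_people=2000, n_steps=20, seed=3):
    """Mean and variance of per-person slopes fit to driftless random walks."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_people):
        y, path = 0.0, []
        for _ in range(n_steps):
            y += rng.gauss(0.0, 1.0)   # unit shock variance, no drift
            path.append(y)
        slopes.append(ols_slope(path))
    return mean(slopes), variance(slopes)


avg_slope, var_slope = slope_spread()
print(f"mean slope = {avg_slope:.3f}, slope variance = {var_slope:.3f}")
print(f"6 / (5T)   = {6 / (5 * 20):.3f}")
```

Every walk here has a true slope of zero, yet the fitted slopes vary with a variance close to 6/(5T), which is the apparent slope heterogeneity that the unconditional growth model picks up.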
For the unconditional growth model, the reliability of intercepts is approximately

$$\lambda_0 \approx \frac{\frac{2T}{15}\sigma^2_{\varepsilon}}{\frac{2T}{15}\sigma^2_{\varepsilon} + \frac{4T-2}{15T}\sigma^2_{\varepsilon}}. \quad (10)$$

The second term in the denominator is rapidly dominated by the first term as T increases and, as a result, the reliability of intercepts approaches 1.0. The reliability of slopes is approximately

$$\lambda_1 \approx \frac{\frac{6}{5T}\sigma^2_{\varepsilon}}{\frac{6}{5T}\sigma^2_{\varepsilon} + \frac{4}{5T(T-1)}\sigma^2_{\varepsilon}}. \quad (11)$$

As is the case for the reliability of intercepts, the second term in the denominator is rapidly dominated by the first term as T increases, and thus, the reliability of slopes also approaches 1.0.

The likelihood ratio chi-square test ($\chi^2$) is used to evaluate whether the unconditional growth model yields improved model fit relative to the unconditional means model. If the trajectory trends are purely stochastic, then a deterministic Time trend has no impact on the response variable. In this case, the null hypothesis associated with the $\chi^2$ test is true (i.e., the unconditional growth model should not result in improved model fit relative to the unconditional means model), and the test should maintain the stated alpha rate. As discussed above, however, the $\chi^2$ test associated with the slope variance increases without bound as the number of time periods increases, and so the likelihood ratio test will fail to maintain the selected Type I error rate.

Despite many ambiguities and inconsistencies, it is increasingly common to report a pseudo-$R^2$ as an index of effect size for random coefficient models. One approach to this index (e.g., Snijders & Bosker, 1999) is to compare the error variance associated with the unconditional means model ($\sigma^2_{\epsilon 1}$) with the error variance resulting from the unconditional growth model ($\sigma^2_{\epsilon 2}$). For the random walks with no drift case, the expected value is

$$\text{pseudo-}R^2_{\text{Within}} = \frac{\sigma^2_{\epsilon 1} - \sigma^2_{\epsilon 2}}{\sigma^2_{\epsilon 1}} = \frac{\frac{T+1}{6}\sigma^2_{\varepsilon} - \frac{T+1}{15}\sigma^2_{\varepsilon}}{\frac{T+1}{6}\sigma^2_{\varepsilon}} = \frac{3}{5}, \quad (12)$$

which estimates the amount of within-variance in the response variable explained by Time.
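The pseudo-R-squared comparison in Equation 12 can be approximated by pooling residual variances from per-person fits. The sketch below is an assumption-laden stand-in for the model-based variance estimates: the function name, seed, sample size, and 20 time points are hypothetical choices, the "means model" residual is each person's variance around his or her own mean, and the "growth model" residual is the variance around each person's own least squares line.

```python
import random


def pseudo_r2_within(n_people=1000, n_steps=20, seed=4):
    """Share of within-person variance 'explained' by Time for random walks."""
    rng = random.Random(seed)
    t_bar = (n_steps - 1) / 2
    sxx = sum((t - t_bar) ** 2 for t in range(n_steps))
    means_total, growth_total = 0.0, 0.0
    for _ in range(n_people):
        y, path = 0.0, []
        for _ in range(n_steps):
            y += rng.gauss(0.0, 1.0)
            path.append(y)
        y_bar = sum(path) / n_steps
        syy = sum((v - y_bar) ** 2 for v in path)
        sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(path))
        means_total += syy / (n_steps - 1)                     # around own mean
        growth_total += (syy - sxy ** 2 / sxx) / (n_steps - 2)  # around own line
    var_means = means_total / n_people
    var_growth = growth_total / n_people
    return (var_means - var_growth) / var_means


r2_within = pseudo_r2_within()
print(f"pseudo-R^2 (within) = {r2_within:.2f}")
```

The result should fall close to the 3/5 in Equation 12, with a small shortfall at short trajectory lengths because the equation is an approximation.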
Although Y is not a function of Time in the data-generating mechanism, the pseudo-$R^2$ statistic indicates that Time is responsible for 60% of the within-trajectory variance.

Summary of Mathematical Results

To summarize, when a random coefficient model is used to represent trends that are due, at least in part, to a stochastic process, the fixed effects estimates in the unconditional growth model are unbiased, and the tests on the fixed effects maintain nominal alpha levels. In contrast, the estimates of the intercept and slope variance will largely be inaccurate, and their tests will lead researchers to conclude that there is significant variance in intercepts and slopes. In addition, the model fit indices will indicate that the unconditional growth model better explains the observed trajectories than the unconditional means model. These results would likely lead to an inappropriate search for predictors of individual differences in trajectories. Because random walks result in random trends, any predictors found to be significant must be Type I errors.

Monte Carlo Evidence for Unconditional Growth Models

The mathematical results presented above are clear and general. To verify the mathematical results and to provide estimates of the magnitude of the problem under realistic conditions, a small set of Monte Carlo simulations is now presented. Using the statistical software R (R Development Core Team, 2008), random walks were generated by Equation 3 with the drift parameter set to zero and $\varepsilon_t$ sampled from a standard normal distribution (mean 0, standard deviation 1.0), including the initial value (i.e., $Y_0$). This implies that the initial conditions for the trajectories (i.e., intercepts in regression terminology) have an expected value of zero and a variance equal to one, and the drift (i.e., slope in regression terminology) has an

expected value and variance of zero. Four simulated conditions were examined by varying the number of time periods (five or 20) and the number of individuals (50 or 100) for both the unconditional means and growth models. These values are consistent with the sample size and length of longitudinal research designs in psychology. For each combination of simulation parameters, 1,000 data sets were generated, and both the unconditional means and growth models were fit to the resulting data sets. Table 1 presents the simulation results for the unconditional means model, and Table 2 presents the results of the unconditional growth model.

Unconditional Means Model

When 50 random walks (e.g., individuals) are observed over five time points, and the unconditional means model is fit, the average fixed effect, $\hat{\gamma}_{00}$, is given in Table 1, and the rejection rate of the true null hypothesis, $\gamma_{00} = 0$, is 6.7% at the 5% nominal significance level. Thus, the observed rejection rate approximated the nominal significance level. However, the average variance of the means, $\hat{\sigma}^2_0$ (see Table 1), is significant in every replication. The average reliability of the observed means implies that 90% of the interindividual differences in observed means are interindividual differences in true (population) means. Thus, a researcher may conclude not only that there is significant variation in observed means but also that this variation of observed means captures a large proportion of variance in true means. As expected, the average ICC(1) = 0.66, and thus, researchers would conclude that 66% of the variance in the dependent variable is attributable to between-person variance.
Examining the other simulations in Table 1, where the number of time periods or sample size increases, the average grand mean is effectively zero, the rejection rate maintains the nominal 5% significance level, the variance of means is always significant, the reliability of observed means increases as the number of time periods increases, and the average ICC(1) is given in Table 1. The simulations of fitting the unconditional means model to random walks suggest that researchers would conclude that there is heterogeneity in observed means and that much of this heterogeneity is variability in true means.

Unconditional Growth Model

Now consider fitting the unconditional growth model to 50 random walks observed over five time points. The average fixed effects, $\hat{\gamma}_{00}$ and $\hat{\gamma}_{10}$, are reported in Table 2, and the rejection rates of the true null hypotheses, $\gamma_{00} = 0$ and $\gamma_{10} = 0$, are 6.8% and 5.4%, respectively, using a 5% Type I error rate ($\alpha = .05$). Thus, observed rejection rates maintained the nominal rejection rates because the standard errors of the fixed effects closely approximate their standard deviations. The average variance of the intercepts is $\hat{\sigma}^2_0 = 0.95$, and that of the slopes is $\hat{\sigma}^2_1 = 0.1$, and both are significant in every replication. Both variance component estimates are incorrect because the variance of intercepts was set to one and the variance of slopes was set to zero. The fact that the intercept variance was close to one is due to the particular number of time periods used in this condition, and, as can be seen in Table 2, more observations produce clearly inaccurate results. The average reliabilities of the observed intercepts and slopes (see Table 2) imply that about 80% of the interindividual differences in observed intercepts and slopes are interindividual differences in true (population) intercepts and slopes. Thus, researchers would conclude that much of the variation in observed intercepts and slopes is variability in true intercepts and slopes. However, as previously mentioned, there is no variability in slopes.
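To see where the apparent slope heterogeneity comes from, it can help to look at person-level least-squares slopes directly. The sketch below is a simplification and not the random coefficient model itself: it fits a separate OLS line to each driftless random walk and summarizes the spread of the resulting slopes. The helper names, the seed, and the choice of 2,000 walks are assumptions made for illustration.

```python
import random

def random_walk(n_times, rng):
    """Driftless random walk with a standard normal starting value."""
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(n_times - 1):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    return y

def ols_slope(y):
    """Least-squares slope of y on Time = 0, 1, ..., len(y) - 1."""
    k = len(y)
    t_bar = (k - 1) / 2
    y_bar = sum(y) / k
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    sxx = sum((t - t_bar) ** 2 for t in range(k))
    return sxy / sxx

rng = random.Random(5)  # arbitrary seed
slopes = [ols_slope(random_walk(5, rng)) for _ in range(2000)]
mean_slope = sum(slopes) / len(slopes)
var_slope = sum((s - mean_slope) ** 2 for s in slopes) / (len(slopes) - 1)
print(f"mean slope = {mean_slope:.3f}, variance of slopes = {var_slope:.3f}")
```

With five time points, the variance of these fitted slopes works out to about 0.26 in expectation even though every true slope is zero, while the mean slope stays near zero. This mirrors the pattern reported above: well-behaved fixed effects alongside slope variability that the data-generating mechanism does not contain.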
The average log-likelihood ratio statistic comparing the fit of the unconditional growth model with the unconditional means model, $\chi^2(3, N = 50)$, was significant in every replication. Based on this result, the natural but inaccurate conclusion would be that the unconditional growth model is a better representation of the data than the unconditional means model. This result is largely due to the significance of the slope variance, which, again, does not exist in the actual data. Together, these results indicate that random coefficient models will lead researchers to conclude that there is between-person variation in intercepts and slopes, that this observed variation largely represents true variation, and, ultimately, that a search for predictors of the observed heterogeneity is warranted. Given the data-generating mechanism, any predictors identified as significant must be Type I errors.

Table 1
Random Coefficient Model Parameter Estimates and Tests for the Unconditional Means Model Across 1,000 Replications
[Table body not recoverable; surviving parenthesized entries, possibly with missing digits: (0.1), (0.37), (0.14), (0.7).]
Note. T = length of each time series; $\hat{\gamma}_{00}$ = average estimate of grand mean; $S_{\gamma_{00}}$ = standard deviation of grand mean; $SE_{\gamma_{00}}$ = standard error of grand mean; $p_{\gamma_{00}}$ = rejection rate of hypothesis test on the grand mean; $\hat{\sigma}^2_0$ = average variance of individual means; $p_{\sigma^2_0}$ = rejection rate of hypothesis test on the variance of individual means; $\hat{\sigma}^2_\varepsilon$ = average within-person variance; ICC(1) = intraclass correlation coefficient; reliability = average reliability of sample means.

Table 2
Random Coefficient Model Parameter Estimates and Tests for the Unconditional Growth Model Across 1,000 Replications
[Table body not recoverable; column groups: fixed effects, variance components, model statistics. Surviving parenthesized entries, possibly with missing digits: (0.16), (0.07), (0.5), (0.04), (0.11), (0.05), (0.18), (0.0).]
Note. T = length of each time series; $\hat{\gamma}_{00}$ = average estimate of population intercept; $S_{\gamma_{00}}$ = standard deviation of population intercept; $SE_{\gamma_{00}}$ = standard error of intercept estimate; $p_{\gamma_{00}}$ = rejection rate of hypothesis test on the population intercept; $\hat{\gamma}_{10}$ = average estimate of population slope; $S_{\gamma_{10}}$ = standard deviation of population slope; $SE_{\gamma_{10}}$ = standard error of slope estimate; $p_{\gamma_{10}}$ = rejection rate of hypothesis test on the population slope; $\hat{\sigma}^2_0$ = average variance of intercepts; $p_{\sigma^2_0}$ = rejection rate of hypothesis test on the variance of intercepts; $\hat{\sigma}^2_1$ = average variance of slopes; $p_{\sigma^2_1}$ = rejection rate of hypothesis test on the variance of slopes; $\hat{\sigma}^2_\varepsilon$ = average within-person variance around the linear trajectories; $\hat{\rho}_{01}$ = average correlation between intercepts and slopes; $p_{\rho_{01}}$ = rejection rate of hypothesis test on correlation between intercepts and slopes; the two reliability columns give the average reliability of sample intercepts and of sample slopes; $p$ (model fit) = rejection rate of the hypothesis test on the difference between model fit of unconditional means and unconditional growth models; $pr^2_w$ = proportion of within-person variance explained by Time.

Examining the other simulations presented in Table 2, when the number of time periods or sample size increases, the average intercepts and slopes are all close to the true value of zero, and the observed rejection rates are close to the nominal 5% significance level. The amount of variance in intercepts increases, whereas the variance in slopes decreases, as the number of time periods observed increases, and both approach the expected values from Equations 8 and 9. The variance of intercepts and slopes is significant in every replication, the reliabilities of the observed intercepts and slopes increase as the number of time points increases, and the log-likelihood ratio test is significant in every replication. All of these results follow expectations, and regardless of the size of the longitudinal data set, they indicate that statistics from random coefficient models would mislead researchers into believing that much of the significant variation in intercepts and slopes is attributable to variation in true intercepts and slopes. However, the true slope for each person in every simulation and replication is zero. Thus, there is no variability in true slopes.
The observed variance in intercepts and slopes would lead researchers to model explanatory variables of this variance. Consistent with this conclusion, researchers in 46 of 73 (i.e., approximately 63%) journal articles in our literature review used the tests on variance components in their models to justify the search for predictors of intercept or slope heterogeneity. If the underlying data-generating mechanism is random, as is the case here, this search can only identify predictors of heterogeneity that are entirely spurious.

Monte Carlo Evidence for Conditional Growth Models

The simulation results presented above highlight the consequences of fitting unconditional models to a stochastic process. The results for these models indicate that the variance component estimates and tests are inaccurate, whereas the fixed effect estimates and tests are accurate. Investigations of growth typically focus on predictors of heterogeneity in the random effects, and the potential predictors of heterogeneity are modeled as fixed effects in random coefficient and latent growth curve models. The fact that the fixed effects estimates and tests are well behaved in the results presented above may produce a mistaken belief that random coefficient and latent growth curve models protect against finding spurious predictors of random effect heterogeneity. The following simulations evaluate the accuracy of this inference.

The focus here is to examine what happens to the tests on fixed effects and variance components when a Level 2 predictor is incorporated into the unconditional growth model. In our literature review of 73 journal articles utilizing random coefficient or latent growth curve models, we identified two distinct approaches for incorporating predictors into the unconditional growth model. In 45 (i.e., approximately 62%) journal articles, researchers added a Level 2 predictor as a fixed effect to the unconditional growth model.
However, in the other 28 journal articles, researchers examined predictors of heterogeneity by including a Level 2 predictor as a fixed effect and excluding the random slope effects. To reflect this practice, we examined four new simulated conditions in which stochastic processes were created as described in the previous section with the same set of time (five or 20) and sample size (50 or 100) conditions. Two conditional growth models were then fit to the resulting data by including a Level 2 predictor in Equation 2 as a predictor of both intercepts and slopes. Then, the random effect of slopes was either included (i.e., $\pi_{1i} = \gamma_{10} + \gamma_{11} P_i + \zeta_{1i}$, where P is for predictor) or excluded (i.e., $\pi_{1i} = \gamma_{10} + \gamma_{11} P_i$) in the conditional growth model. Without loss of generality to continuous predictors, values of 0 or 1 were randomly assigned with equal probability to each random walk, which, in substantive research, may represent dichotomous predictors such as gender, race, or experimental condition, among other variables.

Conditional Growth Model Without a Random Effect for Slopes

In this model, there are four fixed effects and random effects for intercepts. Focusing on the accuracy of the tests on fixed effects, the results in Table 3 indicate that when a conditional model without a random effect for slopes is fit to random walks, the tests for all four fixed effects are not close to the nominal 5% significance level. Bradley (1978) proposed a liberal and a stringent criterion for robustness of a statistical test. According to his liberal criterion, a test is robust if the probability of a Type I error is between $0.5\alpha$ (i.e., 0.025) and $1.5\alpha$ (i.e., 0.075). The rejection rates for the average slope (i.e., $\gamma_{10}$) and the moderating effect (i.e., $\gamma_{11}$) of the Level 2 predictor are well above the upper limit of the liberal criterion, and the fixed effects tests on those two parameters become less accurate with longer time series.
On the other hand, the rejection rates for the fixed effects tests on the average intercept (i.e., $\gamma_{00}$) and the main effect (i.e., $\gamma_{01}$) of the Level 2 predictor are well below the lower limit of the liberal criterion and similarly become less accurate with longer time series. These results are particularly interesting when one considers that, in the actual data-generating mechanism, there are no differences in slopes because every random walk is generated without drift. Although this conditional growth model correctly specifies that there is no variance in slopes, it suffers from highly inflated Type I error rates for the fixed effects tests related to slopes.

Table 3
Random Coefficient Model Parameter Estimates and Tests for the Conditional Growth Model Without a Random Effect for Slopes Across 1,000 Replications
[Table body not recoverable; column groups: fixed effects, variance components.]
Note. T = length of each time series; $\hat{\gamma}_{00}$ = average estimate of population intercept; $SE_{\gamma_{00}}$ = standard error of intercept estimate; $p_{\gamma_{00}}$ = rejection rate of hypothesis test on the population intercept; $\hat{\gamma}_{01}$ = estimate of main effect of Level 2 predictor; $SE_{\gamma_{01}}$ = standard error of main effect of Level 2 predictor; $p_{\gamma_{01}}$ = rejection rate of hypothesis test on the main effect of Level 2 predictor; $\hat{\gamma}_{10}$ = average estimate of population slope; $SE_{\gamma_{10}}$ = standard error of slope estimate; $p_{\gamma_{10}}$ = rejection rate of hypothesis test on the population slope; $\hat{\gamma}_{11}$ = estimate of moderating effect of Level 2 predictor; $SE_{\gamma_{11}}$ = standard error of moderating effect of Level 2 predictor; $p_{\gamma_{11}}$ = rejection rate of hypothesis test on the moderating effect of Level 2 predictor; $\hat{\sigma}^2_0$ = average variance of intercepts; $p_{\sigma^2_0}$ = rejection rate of hypothesis test on the variance of intercepts; $\hat{\sigma}^2_\varepsilon$ = average within-person variance around the linear trajectories.

Conditional Growth Model With a Random Effect for Slopes

When the conditional growth model includes the random effect for slopes, the results in Table 4 indicate that the previously problematic fixed effects tests on the average slope (i.e., $\gamma_{10}$) and the moderating effect (i.e., $\gamma_{11}$) of the Level 2 predictor are accurate. The previously conservative tests on the average intercept (i.e., $\gamma_{00}$) and the main effect (i.e., $\gamma_{01}$) of the Level 2 predictor are now at the 5% nominal significance level as well.
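The nominal Type I error rate for a single randomly assigned predictor can be illustrated with a two-stage approximation. This sketch is not the conditional random coefficient model: it assumes that comparing person-level OLS slopes across the two predictor groups with a pooled-variance t test is a reasonable stand-in for the test of the moderating effect when slopes are allowed to vary. The function names, seed, and replication count are hypothetical choices.

```python
import math
import random

T_CRIT_DF48 = 2.0106  # two-sided 5% critical value of Student's t with 48 df

def ols_slope_of_random_walk(n_times, rng):
    """Simulate one driftless random walk and return its OLS slope on Time."""
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(n_times - 1):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    t_bar = (n_times - 1) / 2
    y_bar = sum(y) / n_times
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    sxx = sum((t - t_bar) ** 2 for t in range(n_times))
    return sxy / sxx

def pooled_t(group0, group1):
    """Two-sample pooled-variance t statistic for a difference in mean slopes."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss = sum((v - m0) ** 2 for v in group0) + sum((v - m1) ** 2 for v in group1)
    pooled_var = ss / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))

def one_replication(n_people, n_times, rng):
    """Return True if the randomly assigned 0/1 predictor 'predicts' slopes at the 5% level."""
    while True:  # redraw in the (very unlikely) event that one group is empty
        groups = [rng.randint(0, 1) for _ in range(n_people)]
        if 0 < sum(groups) < n_people:
            break
    slopes = [ols_slope_of_random_walk(n_times, rng) for _ in range(n_people)]
    g0 = [s for s, g in zip(slopes, groups) if g == 0]
    g1 = [s for s, g in zip(slopes, groups) if g == 1]
    return abs(pooled_t(g0, g1)) > T_CRIT_DF48

rng = random.Random(73)  # arbitrary seed
n_reps = 1000
rejection_rate = sum(one_replication(50, 5, rng) for _ in range(n_reps)) / n_reps
print(f"rejection rate for the spurious predictor: {rejection_rate:.3f}")
```

The rejection rate should sit near 5%: each individual test behaves well, yet every rejection is a false positive because no slope differences exist in the simulated data.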
Although the fixed effects tests pertaining to slopes are accurate, the variance component test for slopes is inaccurate: it indicates significant variance between slopes even though the data-generating mechanism does not contain any heterogeneity in slopes. Thus, when a random coefficient model appropriately excludes heterogeneity in slopes, the model becomes inaccurate in its fixed effects tests (see Table 3). On the other hand, when a random coefficient model includes a random effect for slopes (i.e., growth heterogeneity), the test on this random effect is inaccurately significant (see Table 4). In either case, the random coefficient model is incapable of correctly representing the data-generating mechanism when that mechanism contains a stochastic trend.

The simulation results from the unconditional growth model indicate that the variance components for the random effects of intercepts and slopes are always significant if stochastic trends are present in the data. As discussed above, our literature review demonstrated that researchers relied on this evidence to justify their search for predictors of heterogeneity of intercepts and slopes. The results in Table 4 indicate that the Type I error rate for each fixed effect test maintains the nominal alpha level when a random effect for slopes is included in a conditional growth model. Because the primary focus of conditional models is the fixed effects tests of predictors, this may lead researchers to the mistaken belief that, as long as all random effects are included, the random coefficient model behaves well. Although the probability of a Type I error for a single predictor of intercept or slope heterogeneity may be 5%, as suggested by our simulation results, our literature review indicated that researchers typically include somewhere between four and eight predictors of heterogeneity of intercepts and slopes. As is the case for other modeling techniques, such as regression and structural equation models, the probability of obtaining at least one Type I error increases well beyond the nominal 5% level with several predictors in a model. If eight independent tests are conducted at a nominal 5% level, then the probability of at least one Type I error is $1 - (1 - 0.05)^8 \approx 0.34$, or 34%. If the tests are dependent, then the probability of at least one Type I error may be lower or higher than this value. In any case, if stochastic trends are present in the data, researchers cannot rely on random coefficient or latent growth curve models to protect against inferential mistakes, even when all random effects are modeled, because each examined predictor increases the risk of finding at least one spurious predictor of intercept or slope heterogeneity.

Table 4
Random Coefficient Model Parameter Estimates and Tests for the Conditional Growth Model With a Random Effect for Slopes Across 1,000 Replications
[Table body not recoverable; column groups: fixed effects, variance components.]
Note. T = length of each time series; $\hat{\gamma}_{00}$ = average estimate of population intercept; $SE_{\gamma_{00}}$ = standard error of intercept estimate; $p_{\gamma_{00}}$ = rejection rate of hypothesis test on the population intercept; $\hat{\gamma}_{01}$ = estimate of main effect of Level 2 predictor; $SE_{\gamma_{01}}$ = standard error of main effect of Level 2 predictor; $p_{\gamma_{01}}$ = rejection rate of hypothesis test on the main effect of Level 2 predictor; $\hat{\gamma}_{10}$ = average estimate of population slope; $SE_{\gamma_{10}}$ = standard error of slope estimate; $p_{\gamma_{10}}$ = rejection rate of hypothesis test on the population slope; $\hat{\gamma}_{11}$ = estimate of moderating effect of Level 2 predictor; $SE_{\gamma_{11}}$ = standard error of moderating effect of Level 2 predictor; $p_{\gamma_{11}}$ = rejection rate of hypothesis test on the moderating effect of Level 2 predictor; $\hat{\sigma}^2_0$ = average variance of intercepts; $p_{\sigma^2_0}$ = rejection rate of hypothesis test on the variance of intercepts; $\hat{\sigma}^2_1$ = average variance of slopes; $p_{\sigma^2_1}$ = rejection rate of hypothesis test on the variance of slopes; $\hat{\sigma}^2_\varepsilon$ = average within-person variance around the linear trajectories; $\hat{\rho}_{01}$ = average correlation between intercepts and slopes; $p_{\rho_{01}}$ = rejection rate of hypothesis test on correlation between intercepts and slopes.

Recommendations

Random walks and stochastic trends, as previously mentioned, are encountered in virtually all scientific disciplines. In particular, economists commonly deal with stochastic trends in their data. Because of the statistical and inferential problems that arise in regression models as a result of stochastic trends, economists first attempt to identify whether the trends present in their data are due to a deterministic or a stochastic process before applying a statistical model. To do so, economists use one or more statistical tests known as unit root tests. If a unit root test indicates that the trends present in the data are not stochastic, then economists frequently use regression models similar to those found in psychology.
However, if a unit root test indicates that a time series is not distinguishable from a random walk, then economists use an alternative class of models that reflect the stochastic trend in the data. Further research is needed to evaluate comprehensively whether this practice of using unit root tests as a precondition of applying regression-based models is appropriate for psychological data structures. For now, we recommend that longitudinal data analysts report the results of one or more unit root tests and appropriately qualify inferences from random coefficient models if it is found that the trends present in the data are not distinguishable from random walks.

One example of a unit root test is the augmented Dickey-Fuller (ADF) test (Dickey & Fuller, 1979; Said & Dickey, 1984). It is the most common method used in economics to distinguish between deterministic and stochastic trends in growth data. For a single time series, the equation for the ADF is

$$\Delta Y_t = \mu + \delta Y_{t-1} + \sum_{i=1}^{p} \phi_i \, \Delta Y_{t-i} + \varepsilon_t, \tag{13}$$

where $Y_t$ is the series, $\Delta Y_t = Y_t - Y_{t-1}$, $\mu$ is the drift, $\delta = 0$ is the null hypothesis associated with a random walk, $p$ is the lag order of the autoregressive process, $\phi_i$ are the structural autoregressive effects, and $\varepsilon_t$ is the error term. If the null hypothesis, $\delta = 0$, is rejected, then the series is distinguishable from a random walk with drift. To run the test, the analyst needs to determine the lag structure of the time series and whether there is a drift. This is not a high-power test, and estimating unnecessary parameters for long lags and drift wastes degrees of freedom. The lag structure of a time series is investigated by looking at the autocorrelation and partial autocorrelation functions, whereas the existence of drift in the series is generally assessed visually. The free statistical software R includes the ADF in its set of analytical techniques as well as the autocorrelation and partial autocorrelation functions.
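The regression behind the test can be sketched for the simplest case. The example below assumes a lag order of zero (no lagged difference terms) and a drift term, so it is the plain Dickey-Fuller regression rather than a full ADF implementation; the function name `df_regression`, the seed, the series length of 200, and the AR(1) coefficient of 0.5 are illustrative assumptions. The comparison value of -2.88 is the approximate tabulated 5% Dickey-Fuller critical value for a model with a constant and roughly 200 observations; the usual Student's t critical values do not apply here.

```python
import math
import random

def df_regression(y):
    """Regress dY_t on an intercept (drift) and Y_{t-1}.

    Returns (delta_hat, t_statistic) for the coefficient on Y_{t-1}.
    """
    x = y[:-1]                                   # Y_{t-1}
    d = [b - a for a, b in zip(y[:-1], y[1:])]   # dY_t = Y_t - Y_{t-1}
    n = len(d)
    x_bar = sum(x) / n
    d_bar = sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    delta = sxy / sxx
    mu = d_bar - delta * x_bar
    rss = sum((di - mu - delta * xi) ** 2 for xi, di in zip(x, d))
    se = math.sqrt(rss / (n - 2) / sxx)
    return delta, delta / se

rng = random.Random(13)  # arbitrary seed

walk = [0.0]             # driftless random walk (unit root)
for _ in range(199):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

ar1 = [0.0]              # stationary AR(1) series, no stochastic trend
for _ in range(199):
    ar1.append(0.5 * ar1[-1] + rng.gauss(0.0, 1.0))

DF_CRIT_5PCT = -2.88     # approximate Dickey-Fuller 5% critical value (constant, ~200 obs)

delta_walk, t_walk = df_regression(walk)
delta_ar1, t_ar1 = df_regression(ar1)
print(f"random walk: delta = {delta_walk:.3f}, t = {t_walk:.2f}")
print(f"AR(1):       delta = {delta_ar1:.3f}, t = {t_ar1:.2f}")
print("AR(1) distinguishable from a random walk:", t_ar1 < DF_CRIT_5PCT)
```

For the stationary series the estimate of $\delta$ should be clearly negative (near -0.5 here) with a t statistic far below the critical value, whereas for the random walk the estimate should stay close to zero.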
The standard ADF test is used to evaluate the trends present in a single trajectory. Psychologists generally gather data on several individuals, and a panel version of the ADF is needed to determine whether the sample of trajectories, as a whole, is distinguishable from multiple random walks. To use the test, it is necessary to examine the autocorrelation and partial autocorrelation functions of each series to determine the most common lag structure across the sample and to determine whether drift exists in at least the majority of the series. This process is described in most introductory time series texts (e.g., Enders, 1989). Once these decisions are made, the panel version of the ADF test developed by Im, Pesaran, and Shin (2003) is computed by applying the standard ADF to each series and then taking the average value of the estimated coefficient on $Y_{t-1}$ ($\hat{\delta}$ in Equation 13). This average value is compared with a percentile (e.g., 90th or 95th) from the distribution of estimated unit roots (i.e., $\hat{\delta}$) on random walks for the specified lag order, drift, and length and number of time series (see Im et al., 2003). An example of running the panel ADF test is given in Appendix C using one of the data sets discussed in the next section.

If stochastic trends are present in psychological data, then the common longitudinal methods (i.e., latent growth curve and random coefficient models) used in psychology do not adequately capture the data-generating mechanism. As previously mentioned, economists use methods that are capable of modeling stochastic trends. Most commonly, economists use autoregressive integrated moving average (ARIMA) models. The dependent variable in these models is differenced the appropriate number of times to produce a series without a stochastic trend. Then researchers estimate the desired model parameters (Enders, 1989). Although this is the dominant approach in economics, it is primarily useful for a single time series.
Therefore, ARIMA methods are not particularly useful for the typical longitudinal data structures found in psychology. Less common in economics, but perhaps more promising, is the use of state space models and seemingly unrelated time series equations (Chu & Durango-Cohen, 2008; Harvey & Shephard, 1993). Chow, Ho, Hamaker, and Dolan (2010) and Yang and Chow (2010) provide examples of state space analyses using psychological data. These methods make it possible to model deterministic and stochastic trends simultaneously while offering the flexibility to include predictors of deterministic trend components. Although these models are most commonly applied to single time series, they may also be used to examine time series obtained from multiple participants (see Harvey & Koopman, 1996). State space models provide a general longitudinal modeling framework, and psychologists may find them useful even when their data do not contain stochastic trends.
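Returning to the panel unit root check recommended in this section, the logic of comparing an averaged statistic with a reference distribution simulated from random walks can be sketched as follows. Several simplifications are assumed and should not be read as the published procedure: the lag order is zero, the averaged quantity is the t statistic on the $Y_{t-1}$ coefficient (chosen here because it is scale free), the reference percentile is the lower 5th percentile from 300 simulated panels, and all function names and the seed are hypothetical.

```python
import math
import random

def df_tstat(y):
    """t statistic for delta in dY_t = mu + delta * Y_{t-1} + e_t (no lagged differences)."""
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(d)
    x_bar, d_bar = sum(x) / n, sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    delta = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d)) / sxx
    mu = d_bar - delta * x_bar
    rss = sum((di - mu - delta * xi) ** 2 for xi, di in zip(x, d))
    return delta / math.sqrt(rss / (n - 2) / sxx)

def simulate_series(n_times, phi, rng):
    """AR(1) series Y_t = phi * Y_{t-1} + e_t; phi = 1 gives a driftless random walk."""
    y = [rng.gauss(0.0, 1.0)]
    for _ in range(n_times - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def panel_statistic(panel):
    """Average of the single-series statistics across all trajectories."""
    return sum(df_tstat(y) for y in panel) / len(panel)

def null_critical_value(n_series, n_times, rng, reps=300, tail=0.05):
    """Lower-tail percentile of the panel statistic when every series is a random walk."""
    stats = sorted(
        panel_statistic([simulate_series(n_times, 1.0, rng) for _ in range(n_series)])
        for _ in range(reps)
    )
    return stats[int(tail * reps)]

rng = random.Random(2003)  # arbitrary seed
critical = null_critical_value(n_series=20, n_times=20, rng=rng)

# A panel with no stochastic trend (weakly autocorrelated, stationary series).
stationary_panel = [simulate_series(20, 0.2, rng) for _ in range(20)]
stat = panel_statistic(stationary_panel)

print(f"simulated 5% critical value: {critical:.2f}")
print(f"panel statistic:             {stat:.2f}")
print("panel distinguishable from random walks:", stat < critical)
```

Simulating the reference distribution for the exact number and length of series at hand avoids relying on tabulated values, at the cost of some Monte Carlo noise in the critical value.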


More information

E 4160 Autumn term Lecture 9: Deterministic trends vs integrated series; Spurious regression; Dickey-Fuller distribution and test

E 4160 Autumn term Lecture 9: Deterministic trends vs integrated series; Spurious regression; Dickey-Fuller distribution and test E 4160 Autumn term 2016. Lecture 9: Deterministic trends vs integrated series; Spurious regression; Dickey-Fuller distribution and test Ragnar Nymoen Department of Economics, University of Oslo 24 October

More information

A Test of Cointegration Rank Based Title Component Analysis.

A Test of Cointegration Rank Based Title Component Analysis. A Test of Cointegration Rank Based Title Component Analysis Author(s) Chigira, Hiroaki Citation Issue 2006-01 Date Type Technical Report Text Version publisher URL http://hdl.handle.net/10086/13683 Right

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

Outline. Nature of the Problem. Nature of the Problem. Basic Econometrics in Transportation. Autocorrelation

Outline. Nature of the Problem. Nature of the Problem. Basic Econometrics in Transportation. Autocorrelation 1/30 Outline Basic Econometrics in Transportation Autocorrelation Amir Samimi What is the nature of autocorrelation? What are the theoretical and practical consequences of autocorrelation? Since the assumption

More information

Time Series Methods. Sanjaya Desilva

Time Series Methods. Sanjaya Desilva Time Series Methods Sanjaya Desilva 1 Dynamic Models In estimating time series models, sometimes we need to explicitly model the temporal relationships between variables, i.e. does X affect Y in the same

More information

Topic 4 Unit Roots. Gerald P. Dwyer. February Clemson University

Topic 4 Unit Roots. Gerald P. Dwyer. February Clemson University Topic 4 Unit Roots Gerald P. Dwyer Clemson University February 2016 Outline 1 Unit Roots Introduction Trend and Difference Stationary Autocorrelations of Series That Have Deterministic or Stochastic Trends

More information

Estimation and Hypothesis Testing in LAV Regression with Autocorrelated Errors: Is Correction for Autocorrelation Helpful?

Estimation and Hypothesis Testing in LAV Regression with Autocorrelated Errors: Is Correction for Autocorrelation Helpful? Journal of Modern Applied Statistical Methods Volume 10 Issue Article 13 11-1-011 Estimation and Hypothesis Testing in LAV Regression with Autocorrelated Errors: Is Correction for Autocorrelation Helpful?

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

Testing Main Effects and Interactions in Latent Curve Analysis

Testing Main Effects and Interactions in Latent Curve Analysis Psychological Methods 2004, Vol. 9, No. 2, 220 237 Copyright 2004 by the American Psychological Association 1082-989X/04/$12.00 DOI: 10.1037/1082-989X.9.2.220 Testing Main Effects and Interactions in Latent

More information

Time Invariant Predictors in Longitudinal Models

Time Invariant Predictors in Longitudinal Models Time Invariant Predictors in Longitudinal Models Longitudinal Data Analysis Workshop Section 9 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

How likely is Simpson s paradox in path models?

How likely is Simpson s paradox in path models? How likely is Simpson s paradox in path models? Ned Kock Full reference: Kock, N. (2015). How likely is Simpson s paradox in path models? International Journal of e- Collaboration, 11(1), 1-7. Abstract

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

EC821: Time Series Econometrics, Spring 2003 Notes Section 9 Panel Unit Root Tests Avariety of procedures for the analysis of unit roots in a panel

EC821: Time Series Econometrics, Spring 2003 Notes Section 9 Panel Unit Root Tests Avariety of procedures for the analysis of unit roots in a panel EC821: Time Series Econometrics, Spring 2003 Notes Section 9 Panel Unit Root Tests Avariety of procedures for the analysis of unit roots in a panel context have been developed. The emphasis in this development

More information

Sample Size Planning for Longitudinal Models: Accuracy in Parameter Estimation for Polynomial Change Parameters

Sample Size Planning for Longitudinal Models: Accuracy in Parameter Estimation for Polynomial Change Parameters Psychological Methods 011, Vol. 16, No. 4, 391 405 011 American Psychological Association 108-989X/11/$1.00 DOI: 10.1037/a00335 Sample Size Planning for Longitudinal Models: Accuracy in Parameter Estimation

More information

Regression-Discontinuity Analysis

Regression-Discontinuity Analysis Page 1 of 11 Home» Analysis» Inferential Statistics» Regression-Discontinuity Analysis Analysis Requirements The basic RD Design is a two-group pretestposttest model as indicated in the design notation.

More information

Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems *

Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems * February, 2005 Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems * Peter Pedroni Williams College Tim Vogelsang Cornell University -------------------------------------------------------------------------------------------------------------------

More information

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM 1 REGRESSION AND CORRELATION As we learned in Chapter 9 ( Bivariate Tables ), the differential access to the Internet is real and persistent. Celeste Campos-Castillo s (015) research confirmed the impact

More information

Inflation Revisited: New Evidence from Modified Unit Root Tests

Inflation Revisited: New Evidence from Modified Unit Root Tests 1 Inflation Revisited: New Evidence from Modified Unit Root Tests Walter Enders and Yu Liu * University of Alabama in Tuscaloosa and University of Texas at El Paso Abstract: We propose a simple modification

More information

ARDL Cointegration Tests for Beginner

ARDL Cointegration Tests for Beginner ARDL Cointegration Tests for Beginner Tuck Cheong TANG Department of Economics, Faculty of Economics & Administration University of Malaya Email: tangtuckcheong@um.edu.my DURATION: 3 HOURS On completing

More information

Technical Appendix C: Methods

Technical Appendix C: Methods Technical Appendix C: Methods As not all readers may be familiar with the multilevel analytical methods used in this study, a brief note helps to clarify the techniques. The general theory developed in

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective Second Edition Scott E. Maxwell Uniuersity of Notre Dame Harold D. Delaney Uniuersity of New Mexico J,t{,.?; LAWRENCE ERLBAUM ASSOCIATES,

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017 Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

More information

Testing and Interpreting Interaction Effects in Multilevel Models

Testing and Interpreting Interaction Effects in Multilevel Models Testing and Interpreting Interaction Effects in Multilevel Models Joseph J. Stevens University of Oregon and Ann C. Schulte Arizona State University Presented at the annual AERA conference, Washington,

More information

interval forecasting

interval forecasting Interval Forecasting Based on Chapter 7 of the Time Series Forecasting by Chatfield Econometric Forecasting, January 2008 Outline 1 2 3 4 5 Terminology Interval Forecasts Density Forecast Fan Chart Most

More information

A Bootstrap Test for Causality with Endogenous Lag Length Choice. - theory and application in finance

A Bootstrap Test for Causality with Endogenous Lag Length Choice. - theory and application in finance CESIS Electronic Working Paper Series Paper No. 223 A Bootstrap Test for Causality with Endogenous Lag Length Choice - theory and application in finance R. Scott Hacker and Abdulnasser Hatemi-J April 200

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models

Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén August 5, 2018 Abstract We discuss different methods for centering a predictor

More information

Technical Appendix C: Methods. Multilevel Regression Models

Technical Appendix C: Methods. Multilevel Regression Models Technical Appendix C: Methods Multilevel Regression Models As not all readers may be familiar with the analytical methods used in this study, a brief note helps to clarify the techniques. The firewall

More information

Empirical Market Microstructure Analysis (EMMA)

Empirical Market Microstructure Analysis (EMMA) Empirical Market Microstructure Analysis (EMMA) Lecture 3: Statistical Building Blocks and Econometric Basics Prof. Dr. Michael Stein michael.stein@vwl.uni-freiburg.de Albert-Ludwigs-University of Freiburg

More information

Goals for the Morning

Goals for the Morning Introduction to Growth Curve Modeling: An Overview and Recommendations for Practice Patrick J. Curran & Daniel J. Bauer University of North Carolina at Chapel Hill Goals for the Morning Brief review of

More information

Econometric Methods for Panel Data

Econometric Methods for Panel Data Based on the books by Baltagi: Econometric Analysis of Panel Data and by Hsiao: Analysis of Panel Data Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies

More information

Rejection Probabilities for a Battery of Unit-Root Tests

Rejection Probabilities for a Battery of Unit-Root Tests WORKING PAPERS IN ECONOMICS No 568 Rejection Probabilities for a Battery of Unit-Root Tests Authors Florin G. Maican Richard J. Sweeney May 2013 ISSN 1403-2473 (print) ISSN 1403-2465 (online) Department

More information

Longitudinal Data Analysis of Health Outcomes

Longitudinal Data Analysis of Health Outcomes Longitudinal Data Analysis of Health Outcomes Longitudinal Data Analysis Workshop Running Example: Days 2 and 3 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

On Consistency of Tests for Stationarity in Autoregressive and Moving Average Models of Different Orders

On Consistency of Tests for Stationarity in Autoregressive and Moving Average Models of Different Orders American Journal of Theoretical and Applied Statistics 2016; 5(3): 146-153 http://www.sciencepublishinggroup.com/j/ajtas doi: 10.11648/j.ajtas.20160503.20 ISSN: 2326-8999 (Print); ISSN: 2326-9006 (Online)

More information

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 An Introduction to Multilevel Models PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012 Today s Class Concepts in Longitudinal Modeling Between-Person vs. +Within-Person

More information

Regression tree-based diagnostics for linear multilevel models

Regression tree-based diagnostics for linear multilevel models Regression tree-based diagnostics for linear multilevel models Jeffrey S. Simonoff New York University May 11, 2011 Longitudinal and clustered data Panel or longitudinal data, in which we observe many

More information

Chapter 5. Introduction to Path Analysis. Overview. Correlation and causation. Specification of path models. Types of path models

Chapter 5. Introduction to Path Analysis. Overview. Correlation and causation. Specification of path models. Types of path models Chapter 5 Introduction to Path Analysis Put simply, the basic dilemma in all sciences is that of how much to oversimplify reality. Overview H. M. Blalock Correlation and causation Specification of path

More information

Multilevel Analysis of Grouped and Longitudinal Data

Multilevel Analysis of Grouped and Longitudinal Data Multilevel Analysis of Grouped and Longitudinal Data Joop J. Hox Utrecht University Second draft, to appear in: T.D. Little, K.U. Schnabel, & J. Baumert (Eds.). Modeling longitudinal and multiple-group

More information

Time Metric in Latent Difference Score Models. Holly P. O Rourke

Time Metric in Latent Difference Score Models. Holly P. O Rourke Time Metric in Latent Difference Score Models by Holly P. O Rourke A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Approved June 2016 by the Graduate

More information

Hypothesis Testing for Var-Cov Components

Hypothesis Testing for Var-Cov Components Hypothesis Testing for Var-Cov Components When the specification of coefficients as fixed, random or non-randomly varying is considered, a null hypothesis of the form is considered, where Additional output

More information

SC705: Advanced Statistics Instructor: Natasha Sarkisian Class notes: Introduction to Structural Equation Modeling (SEM)

SC705: Advanced Statistics Instructor: Natasha Sarkisian Class notes: Introduction to Structural Equation Modeling (SEM) SC705: Advanced Statistics Instructor: Natasha Sarkisian Class notes: Introduction to Structural Equation Modeling (SEM) SEM is a family of statistical techniques which builds upon multiple regression,

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Economics 308: Econometrics Professor Moody

Economics 308: Econometrics Professor Moody Economics 308: Econometrics Professor Moody References on reserve: Text Moody, Basic Econometrics with Stata (BES) Pindyck and Rubinfeld, Econometric Models and Economic Forecasts (PR) Wooldridge, Jeffrey

More information

This chapter reviews properties of regression estimators and test statistics based on

This chapter reviews properties of regression estimators and test statistics based on Chapter 12 COINTEGRATING AND SPURIOUS REGRESSIONS This chapter reviews properties of regression estimators and test statistics based on the estimators when the regressors and regressant are difference

More information

A Statistical Input Pruning Method for Artificial Neural Networks Used in Environmental Modelling

A Statistical Input Pruning Method for Artificial Neural Networks Used in Environmental Modelling A Statistical Input Pruning Method for Artificial Neural Networks Used in Environmental Modelling G. B. Kingston, H. R. Maier and M. F. Lambert Centre for Applied Modelling in Water Engineering, School

More information

11/18/2008. So run regression in first differences to examine association. 18 November November November 2008

11/18/2008. So run regression in first differences to examine association. 18 November November November 2008 Time Series Econometrics 7 Vijayamohanan Pillai N Unit Root Tests Vijayamohan: CDS M Phil: Time Series 7 1 Vijayamohan: CDS M Phil: Time Series 7 2 R 2 > DW Spurious/Nonsense Regression. Integrated but

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

Nonstationary Time Series:

Nonstationary Time Series: Nonstationary Time Series: Unit Roots Egon Zakrajšek Division of Monetary Affairs Federal Reserve Board Summer School in Financial Mathematics Faculty of Mathematics & Physics University of Ljubljana September

More information

LM threshold unit root tests

LM threshold unit root tests Lee, J., Strazicich, M.C., & Chul Yu, B. (2011). LM Threshold Unit Root Tests. Economics Letters, 110(2): 113-116 (Feb 2011). Published by Elsevier (ISSN: 0165-1765). http://0- dx.doi.org.wncln.wncln.org/10.1016/j.econlet.2010.10.014

More information

The Number of Bootstrap Replicates in Bootstrap Dickey-Fuller Unit Root Tests

The Number of Bootstrap Replicates in Bootstrap Dickey-Fuller Unit Root Tests Working Paper 2013:8 Department of Statistics The Number of Bootstrap Replicates in Bootstrap Dickey-Fuller Unit Root Tests Jianxin Wei Working Paper 2013:8 June 2013 Department of Statistics Uppsala

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

Introduction to Structural Equation Modeling

Introduction to Structural Equation Modeling Introduction to Structural Equation Modeling Notes Prepared by: Lisa Lix, PhD Manitoba Centre for Health Policy Topics Section I: Introduction Section II: Review of Statistical Concepts and Regression

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018 SRMR in Mplus Tihomir Asparouhov and Bengt Muthén May 2, 2018 1 Introduction In this note we describe the Mplus implementation of the SRMR standardized root mean squared residual) fit index for the models

More information

Inferential statistics

Inferential statistics Inferential statistics Inference involves making a Generalization about a larger group of individuals on the basis of a subset or sample. Ahmed-Refat-ZU Null and alternative hypotheses In hypotheses testing,

More information

Estimation and Centering

Estimation and Centering Estimation and Centering PSYED 3486 Feifei Ye University of Pittsburgh Main Topics Estimating the level-1 coefficients for a particular unit Reading: R&B, Chapter 3 (p85-94) Centering-Location of X Reading

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

Department of Economics, UCSB UC Santa Barbara

Department of Economics, UCSB UC Santa Barbara Department of Economics, UCSB UC Santa Barbara Title: Past trend versus future expectation: test of exchange rate volatility Author: Sengupta, Jati K., University of California, Santa Barbara Sfeir, Raymond,

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today s Class (or 3): Summary of steps in building unconditional models for time What happens to missing predictors Effects of time-invariant predictors

More information

An Introduction to Parameter Estimation

An Introduction to Parameter Estimation Introduction Introduction to Econometrics An Introduction to Parameter Estimation This document combines several important econometric foundations and corresponds to other documents such as the Introduction

More information

Problems with Stepwise Procedures in. Discriminant Analysis. James M. Graham. Texas A&M University

Problems with Stepwise Procedures in. Discriminant Analysis. James M. Graham. Texas A&M University Running Head: PROBLEMS WITH STEPWISE IN DA Problems with Stepwise Procedures in Discriminant Analysis James M. Graham Texas A&M University 77842-4225 Graham, J. M. (2001, January). Problems with stepwise

More information

Ron Heck, Fall Week 3: Notes Building a Two-Level Model

Ron Heck, Fall Week 3: Notes Building a Two-Level Model Ron Heck, Fall 2011 1 EDEP 768E: Seminar on Multilevel Modeling rev. 9/6/2011@11:27pm Week 3: Notes Building a Two-Level Model We will build a model to explain student math achievement using student-level

More information