Dr. StrangeLOVE, or How I Learned to Stop Worrying and Love Omitted Variables
Adam W. Meade, Tara S. Behrend and Charles E. Lance

A well-known problem in path analysis and structural equation modeling (SEM) is that even the largest and most comprehensive models cannot contain all of the causes of a model's endogenous variables. This violates one of the underlying assumptions of path analysis and SEM and gives rise to a commonly held belief that failing to include all relevant causes of endogenous variables may invalidate study results. This problem has been referred to variously as the unmeasured variables problem (Duncan, 1975; James, 1980), the omitted variables problem (James, 1980; Kenny, 1979; Sackett, Laczo, & Lippe, 2003), left out variables error (LOVE; Mauro, 1990), a lack of perfect isolation (i.e., pseudo-isolation; Bollen, 1989), and a lack of self-containment (James, Mulaik, & Brett, 1982). It has also been discussed as a particular type of model specification error (Hanushek & Jackson, 1977; Kenny, 1979). The omitted variables problem arises when the assumption that all relevant variables that influence the dependent (endogenous) variables are included in the model is violated. In the social sciences, this assumption is rarely, if ever, fulfilled.

Although there is no shortage of scholarly discussion and writing related to omitted variables, it is less clear how often this issue arises in substantive academic and applied research. This is because discussion of omitted variables usually takes place behind the scenes, for example during the manuscript review process. In response to a post to the RMNET message board on June 11, 2007, several authors
indicated that omitted variable discussions have arisen during the review process. In one example, an anonymous reviewer commented on a paper related to sources of work absenteeism:

"However, omitted variables that are tied to absenteeism still remain a concern as family size, number of children, and being single head of household are also related to race/ethnicity. The issue is not that perceived value of diversity and children, etc. are related (as the authors contend), it is that race is correlated with both reports of value of diversity and number of children etc., and then with absenteeism. Hence, absenteeism is potentially being driven by factors other than what the author(s) allege. Simply acknowledging the lack of critical data (pages 26 & 27) does not eliminate the concern that major confounds were not adequately controlled." (S. Tonidandel, personal communication, June 12, 2007)

This comment is undoubtedly typical of those that researchers regularly encounter.

In order to provide some index of the extent to which researchers consider omitted variable issues in their work, we conducted a cited reference search. Specifically, we used the Social Science Citation Index to identify works that cited two seminal papers on omitted variables, James (1980) and Mauro (1990), on the assumption that authors dealing with omitted variables issues in their research would be likely to cite these works. A total of 63 sources were found that cited these studies. We then coded each of these sources into one of four categories based on the context in which they discussed omitted variables.
Of the 63 sources, 12 actually took steps to assess risk from omitted variables or acted to minimize the impact of omitted variables in some way (e.g., including relevant variables not of central focus to the model [Prussia, Kinicki, & Bracker, 1993], testing alternative models with and without potential additional determinant variables [Colquitt, LePine, & Noe, 2000; Prussia & Kinicki, 1996]). An additional 21 articles cited James (1980) or Mauro (1990) when discussing the potential biasing effect of omitted variables but did not attempt to account for such variables in any way. Twenty-six sources cited these works as part of a methodological review of path analysis or SEM. Finally, four sources mentioned the potential of omitted variables as a limitation of previous research in order to help justify their current study.

In sum, it seems that reviewers and others critically evaluating organizational research are aware of the omitted variables issue and voice concerns over LOVE, perhaps even in contexts in which there is minimal risk of omitted variables compromising research
conclusions. On the other hand, authors seem to address omitted variables in a meaningful way less frequently than would be desired. This is not surprising given that authors may not want to call attention to methodological issues that could call the validity of their study conclusions into question. However, there are some instances in which omitted variables do pose a considerable threat to the conclusions of path analyses and SEM. In order to provide a better understanding of when omitted variables may or may not jeopardize the validity of path analysis and SEM, this chapter has three goals: (a) review the relevant assumptions in path analysis and SEM and present a mathematical explanation of the omitted variables problem, (b) discuss the conditions under which omitted variables are likely to be problematic and those under which the effects of omitted variables are negligible, and (c) provide recommendations for minimizing the risk of LOVE.

Theoretical and Mathematical Definition of the Omitted Variables Problem

Conceptually, the problems that may be caused by omitted variables are not difficult to understand. When researchers specify path or structural equation models in order to evaluate a theory, path coefficients are estimated based on the correlations among the measured variables in the model and the pattern of structural relations specified. If an endogenous (dependent) variable is affected by a variable that is unmeasured, and the unmeasured variable correlates to a moderate degree with other causal determinants in the model, the effects of the unmeasured variable can be incorrectly attributed to the measured causal determinants in the model. While the effect of the omitted variable could serve to decrease the magnitude of the path coefficient of the measured variable (i.e., a suppressor effect), it is more often assumed that the effect would cause a positive bias in the path coefficient of the measured variable.
This positive bias could also lead to the conclusion that a determinant has a statistically significant effect on an endogenous variable when no such conclusion would have been reached had the unmeasured variable been included in the path model. This error is referred to as LOVE.

The omitted variables problem is perhaps best understood by first looking at the basic mathematics supporting path modeling. In
order to clearly demonstrate this issue, we outline a series of progressively more complex path models based on standardized variables (i.e., $\beta$ will be used as the symbol for path coefficients and regression weights). These models may then be generalized to the case of latent variables in SEM, as the underlying conceptual issues are the same.

The simplest linear causal model includes one exogenous variable ($X$) and a single endogenous variable ($Y$). Assuming that both are expressed in standard score form, the relationship between them can be expressed as

$$Y = \beta_{yx} X + d \tag{4.1}$$

where $\beta_{yx}$ is the standardized regression coefficient, and $d$ is a disturbance term composed of (a) random shocks, (b) nonsystematic measurement error, (c) unmeasured relevant causes, and (d) unmeasured nonrelevant causes (James et al., 1982). Random shocks can be thought of as unstable causal influences, measurement error refers to nonsystematic error, and unmeasured causes are omitted variables (see James et al., 1982). Whether or not a cause is relevant depends on the nature of its relationship with other variables in the model and is illustrated below.

Figure 4.1 illustrates the path model for the case of a single causal exogenous variable and a single endogenous variable. In Figure 4.1a, the disturbance term ($d$) consists exclusively of random shocks (RS), measurement error (ME), and unmeasured nonrelevant causes (NRC). For this model, the expected relationship between $X$ and $Y$ is given by the equation

$$E(XY) = \beta_{yx} E(XX) + E(Xd) \tag{4.2}$$

For Figure 4.1a, $E(XY)$ reduces to $\beta_{yx}$, as $E(XX) = 1.0$ for standardized variables and $E(Xd) = 0$ because the expected relationship between each of the three components of $d$ (random shocks, measurement error, nonrelevant causes) and $X$ equals zero. In this case $r_{xy}$ is an unbiased estimate of the causal parameter $\beta_{yx}$.
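The reduction of the expected cross-product of $X$ and $Y$ to the path coefficient can be written out step by step. The following is a short sketch that assumes, as the text does, that $X$ is standardized and that every component of $d$ in Figure 4.1a is uncorrelated with $X$:

```latex
\begin{align*}
E(XY) &= E\bigl[X(\beta_{yx}X + d)\bigr] \\
      &= \beta_{yx}\,E(XX) + E(Xd) \\
      &= \beta_{yx}(1.0) + 0 \\
      &= \beta_{yx}
\end{align*}
```

Because $E(XY)$ is the correlation $r_{xy}$ when both variables are standardized, the last line is the statement that $r_{xy} = \beta_{yx}$ for this model.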
In Figure 4.1b, however, an additional component is present in the disturbance term: an omitted relevant cause ($O$). As before, the expected relationship between $X$ and the random shocks, measurement error, and nonrelevant causes equals zero. However, the
expected relationship between $X$ and $d$ equals $r_{xo}\beta_{yo}$, as there is an indirect effect of $X$ on $d$ due to the omitted variable that is present in $d$.

[Figure 4.1. Path model for one exogenous and one endogenous variable. Panel (a): $X \rightarrow Y$ with path $\beta_{yx}$ and disturbance $d$ (= RS + ME + NRC). Panel (b): $X \rightarrow Y$ with path $\beta_{yx}$ and disturbance $d$ (= RS + ME + NRC + O), where $X$ relates to $d$ through $r_{xo}$.]

An important concept to highlight is that the relevance of an omitted determinant of the endogenous variable is based entirely on the omitted variable's relationship with other variables in the model. That is, if the omitted causal variable correlates with other determinants of $Y$, the omitted variable is by definition a relevant omitted variable. Conversely, if the omitted variable does not correlate with other determinants of $Y$, it is by definition a nonrelevant cause of $Y$.

Consider now the case of a path model in which one of two exogenous variables is erroneously omitted from the path model ($O$ in Figure 4.2). Assume further that $O$ correlates significantly with both $X$ and $Y$. In this case, the measured correlation between $X$ and $Y$ reflects not only the direct effect of $X$ on $Y$, but also the indirect effect of $X$ on $Y$ via the shared correlation both variables have with $O$. In other words, the observed correlation is determined by the equation

$$r_{xy} = \beta_{yx} + r_{xo}\beta_{yo} \tag{4.3}$$

[Figure 4.2. Path model for two exogenous variables (one omitted). Panel (a): $X_1$ and $X_2$, correlated $r_{x_1x_2}$, affect $Y$ through $\beta_{yx_1}$ and $\beta_{yx_2}$, with disturbance $d$. Panel (b): $X$ and $O$, correlated $r_{xo}$, affect $Y$ through $\beta_{yx}$ and $\beta_{yo}$, with disturbance $d$.]
However, because $O$ is omitted from the path model, the (naively) estimated path between $X$ and $Y$ ($\hat{\beta}_{yx}$) will be equal to $r_{xy}$, though $r_{xy}$ is actually determined by both $\beta_{yx}$ and $r_{xo}\beta_{yo}$. As a result, $r_{xy}$ as an estimate of $\beta_{yx}$ will be biased by an amount equal to $r_{xo}\beta_{yo}$.

The effect of $r_{xo}$ is obvious. If $X$ were not correlated with $O$, then $r_{xy}$ would not be affected by $O$ and $r_{xy}$ would be an unbiased estimate of $\beta_{yx}$. In this case, $O$ is a nonrelevant omitted cause of $Y$. That is, its omission from the path equation has minimal effect on the estimated path coefficient of the included exogenous variables or on their associated tests of statistical significance. Conversely, if $X$ were nontrivially correlated with $O$, $r_{xy}$ would differ from $\beta_{yx}$ by an amount equal to $r_{xo}\beta_{yo}$, so that $r_{xy}$ would be a biased estimate of $\beta_{yx}$. This bias can affect tests of statistical significance and lead to erroneous conclusions regarding the model. In this case, $O$ is a relevant omitted cause of $Y$.

Although the potential biasing effect of $r_{xo}$ on $\beta_{yx}$ is obvious, the effect of $\beta_{yo}$ is less transparent. The equation for the path coefficient $\beta_{yo}$ is

$$\beta_{yo} = \frac{r_{yo} - r_{yx} r_{xo}}{1 - r_{xo}^{2}} \tag{4.4}$$

so that in order for $\beta_{yo}$ to have a biasing effect on $r_{xy}$, which could be taken as the estimate of $\beta_{yx}$, the correlation between $X$ and $O$ must be nonzero. If the correlation between $X$ and $O$ is nontrivially positive, bias in the estimate of $\beta_{yx}$ will be greater when the correlation between $Y$ and $O$ is large and the correlation between $Y$ and $X$ is small.

In order to provide some context for illustration, Table 4.1 includes several hypothetical values for $r_{xy}$, $r_{xo}$, and $r_{yo}$. Note that no values of $r_{xo} = 0$ are presented because there is no bias in $r_{xy}$ as an estimate of $\beta_{yx}$ when there is no correlation between the exogenous variable and the omitted variable (i.e., $O$ is a nonrelevant cause of $Y$).
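The two-determinant formulas can be checked numerically. The following is a minimal sketch, not taken from the chapter: the function name `two_determinant_bias` and the input correlations are hypothetical, chosen only to illustrate a weak correlation between X and Y combined with an omitted variable that relates strongly to both.

```python
def two_determinant_bias(r_xy, r_xo, r_yo):
    """Return (beta_yo, beta_yx, bias) for one measured cause X and one omitted cause O.

    All inputs are correlations among standardized variables.
    """
    if abs(r_xo) >= 1.0:
        raise ValueError("r_xo must lie strictly between -1 and 1")
    denom = 1.0 - r_xo ** 2
    beta_yo = (r_yo - r_xy * r_xo) / denom  # unique effect of O on Y (Equation 4.4)
    beta_yx = (r_xy - r_yo * r_xo) / denom  # true path of X once O is included
    bias = r_xy - beta_yx                   # naive estimate minus true path
    return beta_yo, beta_yx, bias


# Hypothetical inputs: r_xy = .20, with O correlating .60 with both X and Y.
beta_yo, beta_yx, bias = two_determinant_bias(r_xy=0.20, r_xo=0.60, r_yo=0.60)
print(round(beta_yo, 2), round(beta_yx, 2), round(bias, 2))  # 0.75 -0.25 0.45
```

In this hypothetical case the bias of .45 equals $r_{xo}\beta_{yo}$ = .60 × .75, as Equation 4.3 requires, and the naive estimate even has the wrong sign relative to the true path.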
As can be seen in Table 4.1, bias is greatest when the correlation between X and Y is somewhat low (.20) yet the omitted variable correlates highly with both X and Y. This is the classic third variable problem (e.g., the spurious correlation between ice cream sales and drowning deaths) and a primary reason that correlation cannot be interpreted as causation. In this case, much of the effect attributed to the relationship between X and Y is actually due to their mutual correlation with and/or dependence on O.
[Table 4.1. Biasing Effects of an Omitted Variable in a Two-Determinant Model. Columns: $\hat{\beta}_{yx} = r_{xy}$, $r_{xo}$, $r_{yo}$, $\beta_{yo}$, $\beta_{yx}$, Bias.]

Note. Bias is the estimated path coefficient ($\hat{\beta}_{yx} = r_{xy}$) minus the true path coefficient $\beta_{yx}$. This value is equal to $r_{xo}\beta_{yo}$. Conditions in which $r_{xo} = 0$ are not displayed, as there is no bias under these conditions.

Note that when the correlation between the endogenous variable ($Y$) and the omitted variable ($O$) is close to zero, $\beta_{yo}$ can take on negative values. When $\beta_{yo}$ is negative and $r_{xo}$ is positive, $r_{xy}$ (which is used to estimate $\beta_{yx}$ but is mathematically equal to $\beta_{yx} + r_{xo}\beta_{yo}$) will actually be less than $\beta_{yx}$. In this case, the omission of $O$ causes an underestimate of the path coefficient between $X$ and $Y$, and variable $O$ is said to have a suppressor effect such that its inclusion in the model serves to increase the estimated path coefficient between $X$ and $Y$. Examples of such negative bias are present in Table 4.1. Suppressor effects are most readily manifested when the omitted variable has a very low correlation with
the endogenous variable but a moderate or large correlation with the exogenous variable in question. In such cases, the true path coefficient for the observed exogenous variable is considerably larger than the zero-order correlation between the exogenous variable and endogenous variable that is used as an estimate of the path coefficient.

In sum, several important points result from the discussion of a model with one observed determinant ($X$) and one omitted determinant ($O$) of a single endogenous variable:

1. $r_{xy}$ will be a biased estimate of $\beta_{yx}$ to the extent that there exist omitted relevant causes of $Y$.
2. This bias will be upward (i.e., $r_{xy} > \beta_{yx}$) to the extent that $r_{xo}\beta_{yo} > 0$.
3. By extension, both $r_{xo}$ and $\beta_{yo}$ must be nonzero for bias to occur. If either $r_{xo} = 0$ ($O$ is unrelated to $X$ and thus is a nonrelevant cause) or $\beta_{yo} = 0$ (there is no unique effect of $O$ on $Y$; it is not a determinant of $Y$), no bias occurs.
4. If one of the terms, $r_{xo}$ or $\beta_{yo}$, is negative and the other is positive, a suppression situation occurs (i.e., $r_{xy} < \beta_{yx}$).
5. If $r_{xo}$ and $\beta_{yo}$ are both negative, there will be upward bias in the estimation of $\beta_{yx}$ from $r_{xy}$.

Violated Assumptions

An omitted relevant variable represents a violation of the assumption of self-containment in causal modeling (James et al., 1982; Simon, 1977) and is but one type of model misspecification. We cannot isolate an endogenous variable from all potential causal explanatory variables in the social sciences. Instead, we replace the assumption of isolation with one of pseudo-isolation by assuming that the disturbance term, that is, the variance in the endogenous variable not accounted for by its modeled causes, is uncorrelated with exogenous variables (Bollen, 1989), or with endogenous variables that precede the variable in question in the causal path (Duncan, 1975; James, 1980). This can be seen by again examining Figure 4.2b.
In Figure 4.2b, the disturbance term, $d$, would now include the effect of the standardized omitted variable ($\beta_{yo}$). Clearly, the self-containment assumption is violated, as $X$ will correlate with the disturbance term by a magnitude of $r_{xo}\beta_{yo}$.
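The violation of pseudo-isolation can be illustrated with simulated data. The sketch below is an assumption-laden illustration rather than anything from the chapter: the function names, the parameter values (a path of .3 for X, .4 for O, and a correlation of .5 between X and O), and the use of normally distributed variables are all hypothetical. It generates a fully specified model, leaves O in the disturbance, and reports the covariance between X and that disturbance along with the naive slope of Y on X.

```python
import math
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def simulate_omitted_cause(beta_yx=0.3, beta_yo=0.4, r_xo=0.5, n=20000, seed=0):
    """Simulate Y = beta_yx*X + beta_yo*O + e, then treat O as unmeasured.

    Returns (cov_x_d, naive_slope), where d = beta_yo*O + e is the disturbance
    of the model that omits O and naive_slope is the slope of Y on X alone.
    """
    rng = random.Random(seed)
    explained = beta_yx ** 2 + beta_yo ** 2 + 2 * beta_yx * beta_yo * r_xo
    if not 0.0 <= explained < 1.0:
        raise ValueError("parameters imply more than 100% explained variance")
    noise_sd = math.sqrt(1.0 - explained)  # keeps the variance of Y at 1.0
    xs, ds, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # O is standardized and correlates r_xo with X.
        o = r_xo * x + math.sqrt(1.0 - r_xo ** 2) * rng.gauss(0.0, 1.0)
        d = beta_yo * o + noise_sd * rng.gauss(0.0, 1.0)  # O hides inside d
        xs.append(x)
        ds.append(d)
        ys.append(beta_yx * x + d)
    return covariance(xs, ds), covariance(xs, ys) / covariance(xs, xs)


cov_x_d, naive_slope = simulate_omitted_cause()
print(round(cov_x_d, 2), round(naive_slope, 2))
```

With these hypothetical values the covariance between X and the disturbance should land near $r_{xo}\beta_{yo}$ = .20, and the naive slope near .30 + .20 = .50, up to sampling error.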
[Figure 4.3. Partially mediated path model with omitted variable. $X$ and $O$ are correlated ($r_{xo}$); $X \rightarrow M$ ($\beta_{mx}$), $O \rightarrow M$ ($\beta_{mo}$), $X \rightarrow Y$ ($\beta_{yx}$), $M \rightarrow Y$ ($\beta_{ym}$), $O \rightarrow Y$ ($\beta_{yo}$), with disturbance $d$.]

More Complex Models

Although the effects of the omitted variable are clearly visible in a model with two exogenous variables, things rapidly become more complex when more variables are added to the model. Figure 4.3 depicts a path model in which a mediator ($M$) partially mediates the effects of an exogenous variable ($X$) and an omitted relevant causal variable ($O$) on the endogenous variable ($Y$). The path model for $M$ is identical to that of a model with two exogenous variables. As in the previous example, if $O$ is omitted, then the expected path coefficients and potential for bias in the equation for $M$ are identical to those of a path model with two determinants. There are three causes of $Y$, yet one of these is omitted. The true population path equation for this model is

$$Y = \beta_{yx} X + \beta_{ym} M + \beta_{yo} O + d \tag{4.5}$$

and the path coefficient $\beta_{yx}$ in the true model is given as

$$\beta_{yx} = \frac{r_{yx}\left(1 - r_{mo}^{2}\right) + r_{ym}\left(r_{xo} r_{mo} - r_{xm}\right) + r_{yo}\left(r_{xm} r_{mo} - r_{xo}\right)}{1 + 2 r_{xm} r_{mo} r_{xo} - r_{xo}^{2} - r_{mo}^{2} - r_{xm}^{2}} \tag{4.6}$$

More complicated models are obviously possible as well, though algebraic expressions for the path coefficients rapidly become unwieldy. In the current example, if variable $O$ were omitted, the estimated path coefficient for the direct effect of $X$ on $Y$ would be
that of a two-determinant model, in which the effect of the omitted variable is ignored:

$$\hat{\beta}_{yx} = \frac{r_{yx} - r_{ym} r_{xm}}{1 - r_{xm}^{2}} \tag{4.7}$$

In order to further illustrate the effects of an omitted variable in this model, data were simulated for several levels of correlation between variables $O$ and $Y$. Table 4.2 contains the level of bias observed in the path coefficient of $X$ for different levels of correlation between the omitted variable and the other causal variables in the model.

Readily apparent from Table 4.2 is that the magnitude of bias is not large in any of the conditions when the correlation between $O$ and $Y$ is .20. Results are more mixed for those conditions in which the correlation between the omitted variable and $Y$ is .60. In these conditions, the magnitude of the bias in the path coefficient of $X$ can be large, but only when the correlation between $X$ and $O$ is also quite large. Also, the magnitude of the bias is mitigated somewhat by the correlation between the omitted variable and $M$, though the bias is still sizable. Note the values presented in Table 4.2 that represent the case in which there is a relatively small correlation between $X$ and $Y$, and large correlations between $O$ and both $Y$ and $X$. Under these circumstances, bias can be sizable.

We set the correlations in Tables 4.1 and 4.2 to arbitrary values in order to demonstrate their effects, but in practice correlation coefficients may not plausibly vary independently of one another (Mauro, 1990). In other words, a situation in which two variables correlate very highly, and one of those two correlates highly with a third variable while the other correlates negatively with the third variable, is mathematically improbable.
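The true and estimated coefficients for the three-determinant model can be computed directly from the correlations. The sketch below is illustrative only: the function names and the example correlations are hypothetical and are not the values used in Table 4.2. The check on the denominator reflects the point that correlations cannot vary freely; when the determinant of the predictor correlation matrix is not positive, the combination of $r_{xm}$, $r_{xo}$, and $r_{mo}$ is not admissible.

```python
def true_beta_yx(r_yx, r_ym, r_yo, r_xm, r_xo, r_mo):
    """Path of X on Y when X, M, and O are all in the model (Equation 4.6)."""
    det = 1 + 2 * r_xm * r_mo * r_xo - r_xo ** 2 - r_mo ** 2 - r_xm ** 2
    if det <= 0:
        raise ValueError("correlations among X, M, and O are not jointly admissible")
    num = (r_yx * (1 - r_mo ** 2)
           + r_ym * (r_xo * r_mo - r_xm)
           + r_yo * (r_xm * r_mo - r_xo))
    return num / det


def estimated_beta_yx(r_yx, r_ym, r_xm):
    """Path of X on Y when O is left out of the model (Equation 4.7)."""
    return (r_yx - r_ym * r_xm) / (1 - r_xm ** 2)


# Hypothetical correlations: low r_yx, with O strongly related to X and Y.
r = dict(r_yx=0.20, r_ym=0.30, r_yo=0.60, r_xm=0.30, r_xo=0.60, r_mo=0.30)
true_b = true_beta_yx(**r)
est_b = estimated_beta_yx(r["r_yx"], r["r_ym"], r["r_xm"])
print(round(true_b, 3), round(est_b, 3), round(est_b - true_b, 3))
```

Setting `r_xo` and `r_mo` to zero in this sketch makes the two functions agree, which mirrors the nonrelevant-cause case described for the two-determinant model.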
The patterns of correlations that result in the most bias are those in which there is a very low correlation between the measured determinants and the endogenous variable, and high correlations between both the measured determinants and omitted variables and the omitted and endogenous variables (refer to Tables 4.1 and 4.2). While such patterns of correlations are mathematically possible, they may be unlikely in some domains of study given what is known from previous research.

To summarize, omitted variables can introduce bias in estimated path coefficients and this bias may be positive or negative in
direction.

[Table 4.2. Biasing Effects of an Omitted Variable in a Three-Determinant Model. Columns: $r_{yx}$, $r_{ym}$, $r_{yo}$, $r_{xm}$, $r_{xo}$, $r_{mo}$, $\beta_{yx}$, $\hat{\beta}_{yx}$, Bias.]

Note. $\beta_{yx}$ represents the true path coefficient of the exogenous variable $X$ in the completely specified model. $\hat{\beta}_{yx}$ represents the estimated path coefficient of $X$ in the omitted variable model. Bias is the difference between these two.

The issue is then, under what conditions is it possible for an omitted variable to bias path coefficients? Below is a summary for a model with one observed exogenous variable and one relevant omitted variable:

- If $O$ is uncorrelated with the exogenous variable, $r_{xy}$ is an unbiased estimator of $\beta_{yx}$ and the omitted variable has no effect.
- If the variance in $Y$ accounted for by $O$ is completely redundant with the variables in the model, its unique effect ($\beta_{yo}$) will be near zero and it will have little biasing effect.
- If $O$ is uncorrelated with the endogenous variable but strongly correlated with the exogenous variable, $r_{xy}$ may underestimate $\beta_{yx}$ (i.e., a suppressor effect).

Thus, there are three conditions which must be present in order for an omitted variable to cause positive bias in estimated path coefficients; that variable must (a) correlate at a nonzero level with other determinants of $Y$, (b) not be completely redundant with other variables included in the path model, and (c) correlate with the endogenous variable. If (a) and (b) are true, but (c) is not, the omitted variable may serve to artificially deflate the estimate of the path coefficient of the variables included in the model. In sum, the potential for LOVE is greatest when the omitted variable correlates highly with the outcome variable and moderately with other determinants in the model.

Path Coefficient Bias Versus Significance Testing

It is important to make a distinction between the biasing effect of omitted variables on the magnitude of path coefficients and the effect of omitted variables on the significance tests of those path coefficients. Generally speaking, in theory building via path analysis and SEM, there are two important outcomes of interest to the researcher: the magnitudes of the estimates of the path coefficients themselves and associated significance tests. Often in early stages of research, the primary outcome of interest in path analyses is the significance test associated with the path coefficient. In other words, the answer to the question "does the variable have a unique effect on the outcome?" would seem more important than the question "what is the precise magnitude of the unique effect of the variable on the outcome?"
If early forays into model testing with a given set of variables indicate that the effect of a determinant on an endogenous variable is nonsignificant, future researchers are less likely to include this variable as a measured cause than they would be if the variable had shown a significant effect on the outcome. In this context, the magnitude of the path coefficient per se is less important than the decision as to the presence or absence of an effect of X on Y. If there does appear to be an effect (i.e., the test is significant), then future use and, importantly, replication of this effect is much more likely. While the rough magnitude of the effect
is undoubtedly important, small bias in the path coefficients would likely be of little concern so long as the conclusion of the significance test is not affected at this stage of investigation.

The second outcome of path analysis is the magnitude of the path coefficients themselves. Estimates of path coefficients are important in that standardized coefficients are one index of the unique variance in the endogenous variable accounted for by the determinant. Additionally, unstandardized coefficients can be compared over time, and cumulative evidence can be collected such that the relative effect of a determinant on an outcome can be estimated. As research cumulates over time, the precision of estimated paths becomes important to future meta-analysts such that an accurate estimate of the effect of a determinant on an endogenous variable can be calculated. Thus, even though precise estimates of effects may not be of primary interest to a researcher in early stages of research on a topic, these estimates take on additional importance over time as research accumulates and meta-analyses are conducted.

Recall that if the omitted variable does not correlate with the endogenous variable but correlates with other variables in the model, it may act as a suppressor variable. This was shown in Tables 4.1 and 4.2, where the exclusion of an omitted variable resulted in negative bias of the estimated path coefficient. That is, its inclusion in the model could serve to increase the estimated path coefficients of the observed variables. In regard to significance testing, omitted variables that do not correlate with the endogenous variable are potentially problematic in that they may result in Type II errors (i.e., failure to detect an effect that truly exists).
However, reviewer criticisms of a lack of comprehensive path models typically center more on the potential upward biasing effects of omitted variables and associated Type I error (i.e., wrongly identifying an effect that does not exist). The focus on Type I errors is understandable, as such errors may translate to immediate implications for practice and use of a determinant variable, whereas Type II errors are less likely to be published and likely will be rectified in future studies. If Type II error is seen as less problematic than Type I error, the requirement of a significant correlation between the omitted variable and the outcome may be added to the list of conditions that must be met before the possibility of an omitted variable becomes a concern in path models. Omitted variables that do not correlate with the outcome cannot cause
upward bias in path coefficient estimates, which is typically the focus of LOVE concerns.

Minimizing the Risk of LOVE

There are specific conditions under which omitted variables can be problematic, and it is true that no matter how comprehensive a path model, there are always omitted relevant variables in organizational research. We have also illustrated that there can be substantial bias under some conditions; thus, there is a kernel of truth relating to LOVE in organizational research. To this extent, educating researchers on the ways in which to minimize the risk of omitted variable problems is of paramount importance. There are several ways in which organizational researchers can minimize the risk of omitted variables biasing path coefficients, discussed below.

Experimental Control

First, one could incorporate design characteristics that minimize the correlation between measured exogenous variables and omitted variables. Random assignment of participants is extremely successful in controlling for a wide range of known or unknown omitted individual difference variables. As we have emphasized, there can be no possible biasing effect of an omitted variable if that variable does not correlate with the observed variables in the path model (given sufficient sample size). As such, random assignment is highly effective for controlling for almost any individual difference variable in a path model. Although random assignment may not be possible in many instances of organizational research, there are some cases in which it may be employed. For example, participants may be randomly assigned to different types of training courses, reward systems, equipment and other environmental factors, or organizational interventions for which the effectiveness may be evaluated.
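The protective effect of random assignment can be sketched with a small simulation. Everything in the example is an assumption made for illustration: the function names, the effect sizes (.3 for the treatment and .5 for an omitted individual difference O), the binary treatment indicator, and the self-selection rule are hypothetical. Because the treatment is coded 0/1 rather than standardized, the slope reported here is an unstandardized treatment effect.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def naive_treatment_slope(assign_randomly, effect_x=0.3, effect_o=0.5,
                          n=20000, seed=1):
    """Slope of Y on a 0/1 treatment X when the individual difference O is omitted."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        o = rng.gauss(0.0, 1.0)  # unmeasured individual difference
        if assign_randomly:
            x = 1.0 if rng.random() < 0.5 else 0.0  # coin flip, unrelated to O
        else:
            # Self-selection: people high on O tend to end up in the treatment.
            x = 1.0 if o + rng.gauss(0.0, 1.0) > 0 else 0.0
        xs.append(x)
        ys.append(effect_x * x + effect_o * o + rng.gauss(0.0, 1.0))
    return covariance(xs, ys) / covariance(xs, xs)


print(round(naive_treatment_slope(assign_randomly=True), 2))
print(round(naive_treatment_slope(assign_randomly=False), 2))
```

Under random assignment the naive slope should sit close to the hypothetical true effect of .3, whereas under self-selection it should be clearly inflated, because the treatment indicator then carries part of the effect of O.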
In more mathematical terms, recall that in the case of one exogenous variable ($X$) and one omitted variable ($O$), the estimated effect of $X$ on the endogenous variable ($Y$) is the zero-order correlation between $X$ and $Y$. However, the true effect of $X$ on $Y$ should be given as Equation 4.8:
$$\beta_{yx} = \frac{r_{xy} - r_{yo} r_{xo}}{1 - r_{xo}^{2}} \tag{4.8}$$

When random assignment is used, the correlation between $X$ and $O$ will be near zero (with sufficient sample size). Thus, Equation 4.8 reduces to $r_{xy}$ and there is no bias.

More Inclusive Models

Second, researchers should include as many known causes of the endogenous variable as is practically possible in the path model. The potential for bias in path coefficient estimates caused by omitted variables is much greater when they serve as unique causal agents of the endogenous variable. Recall that for a two-determinant model with one determinant omitted, the bias present is equal to $r_{xo}\beta_{yo}$. By incorporating more determinants of the outcome, the unique effects of omitted variables may be reduced as $\beta_{yo}$ approaches zero. Note, however, that there is a paradoxical side effect of including more variables. That is, each additional determinant that is included in the model is also prone to LOVE and is subject to the assumption of model self-containment.

Use Previous Research to Justify Assumptions

Researchers may also use what is already known from past research to demonstrate that omitted variables are not likely to be problematic. For example, when estimating the effects of ability determinants of job performance, one could legitimately leave out entire classes of other performance determinants such as personality and motivation, because these are likely to be uncorrelated with ability determinants and therefore are nonrelevant causes (Ackerman & Heggestad, 1997; Sackett, Gruys, & Ellingson, 1998; Salgado, Viswesvaran, & Ones, 2001; Schmidt & Hunter, 1998; see also Lance & James, 1999). On the other hand, if both verbal and quantitative aptitude were thought to be causes of employee job performance, it is unlikely that the omission of similar types of tests (e.g., mechanical ability) would
produce a strong biasing effect on path coefficients of those tests in the model, as mechanical ability is exceedingly likely to have a large correlation with (i.e., be redundant with) the measured ability test variables. As such, the plausibility of bias due to omitting mechanical ability tests is very low, as again $\beta_{yo}$ will be closer to zero. Put differently, in many instances nonrelevant causes can largely be ignored because they are either (a) not related to measured causes or (b) largely redundant with relevant causes that are already measured. To this extent, prior research on correlates of both the outcome and other determinants can provide guidance on which variables are essential to include in the model and which may be safely omitted.

Consideration of Research Purpose

If the goal is to provide a precise estimate of path coefficients, or to compare the relative variance accounted for by different determinants, omitted variables are considerably more problematic than if the goal is to test the statistical significance of the effect of a determinant on an outcome. Examining again the simple two-determinant case, influence due to omitted variables can result in bias in the estimated path coefficient ($r_{xy}$) with respect to its true value (Equation 4.8). However, with large sample sizes, even sizable bias in estimated path coefficients is less likely to change decisions drawn from the statistical significance test associated with those coefficients. With large sample sizes, power is such that even small estimated effects tend to be statistically significant.

In sum, omitted variables are a fact of life in organizational research and they can be problematic.
Researchers should be particularly vigilant in cases in which (a) there is a large number of determinants of the outcome variable, (b) the study in question includes only a small subset of those determinants, (c) it is likely that the omitted variables have moderate or large correlations with the measured determinants, and (d) it is likely that the omitted variables would account for unique variance in the outcome variables. However, the notion that omitted variables are always problematic is a myth, as the threat to the inferences that we tend to draw may not be as serious as some have believed.
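
As a rough numerical illustration of Equation 4.8 and of the bias term $r_{xo}\beta_{yo}$ discussed above, the following Python sketch computes the standardized path coefficient for the two-determinant case and the bias that results from using $r_{xy}$ in its place. The function names and the example correlations are illustrative assumptions, not values taken from the chapter.

```python
import math


def beta_yx(r_xy, r_yo, r_xo):
    """Standardized path from X to Y with O in the model (Equation 4.8)."""
    if abs(r_xo) >= 1:
        raise ValueError("r_xo must lie strictly between -1 and 1")
    return (r_xy - r_yo * r_xo) / (1 - r_xo ** 2)


def beta_yo(r_xy, r_yo, r_xo):
    """Standardized path from O to Y with X in the model (same formula, roles swapped)."""
    if abs(r_xo) >= 1:
        raise ValueError("r_xo must lie strictly between -1 and 1")
    return (r_yo - r_xy * r_xo) / (1 - r_xo ** 2)


def love_bias(r_xy, r_yo, r_xo):
    """Bias from omitting O: the estimate r_xy minus the value beta_yx."""
    return r_xy - beta_yx(r_xy, r_yo, r_xo)


if __name__ == "__main__":
    # Hypothetical correlations chosen only for illustration.
    r_xy, r_yo = 0.5, 0.4
    for r_xo in (0.0, 0.3, 0.6):
        bias = love_bias(r_xy, r_yo, r_xo)
        # The bias should match r_xo * beta_yo for every value of r_xo.
        assert math.isclose(bias, r_xo * beta_yo(r_xy, r_yo, r_xo), abs_tol=1e-12)
        print(f"r_xo={r_xo:.1f}  beta_yx={beta_yx(r_xy, r_yo, r_xo):.3f}  bias={bias:.3f}")
```

With `r_xo` set to zero, as random assignment would approximate, `beta_yx` equals `r_xy` and the bias vanishes; the bias also shrinks toward zero whenever `beta_yo` does, which is the case the sections on more inclusive models and on prior research appeal to.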
References

Ackerman, P. L., & Heggestad, E. D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121.

Bollen, K. A. (1989). Structural equations with latent variables. Oxford, England: John Wiley and Sons.

Colquitt, J. A., LePine, J. A., & Noe, R. A. (2000). Toward an integrative theory of training motivation: A meta-analytic path analysis of 20 years of research. Journal of Applied Psychology, 85.

Duncan, O. D. (1975). Introduction to structural equation models. New York: Academic Press.

Hanushek, E. A., & Jackson, J. E. (1977). Statistical methods for social scientists. San Diego: Academic Press.

James, L. R. (1980). The unmeasured variables problem in path analysis. Journal of Applied Psychology, 65.

James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis: Assumptions, models and data. Beverly Hills: Sage.

Kenny, D. A. (1979). Correlation and causality. New York: Wiley-Interscience.

Lance, C. E., & James, L. R. (1999). ν²: A proportional variance-accounted-for index for some cross-level and person-situation research designs. Organizational Research Methods, 2.

Mauro, R. (1990). Understanding L.O.V.E. (left out variables error): A method for estimating the effects of omitted variables. Psychological Bulletin, 108.

Prussia, G. E., & Kinicki, A. J. (1996). A motivational investigation of group effectiveness using social-cognitive theory. Journal of Applied Psychology, 81.

Prussia, G. E., Kinicki, A. J., & Bracker, J. S. (1993). Psychological and behavioral consequences of job loss: A covariance structure analysis using Weiner's (1985) attribution model. Journal of Applied Psychology, 78.

Sackett, P. R., Gruys, M. L., & Ellingson, J. E. (1998). Ability-personality interactions when predicting job performance. Journal of Applied Psychology, 83.

Sackett, P. R., Laczo, R. M., & Lippe, Z. P. (2003). Differential prediction and the use of multiple predictors: The omitted variables problem. Journal of Applied Psychology, 88.

Salgado, J. F., Viswesvaran, C., & Ones, D. S. (2001). Predictors used for personnel selection: An overview of constructs, methods and techniques. In D. S. Ones et al. (Eds.), Handbook of industrial, work and organizational psychology, Vol. 1: Personnel psychology (pp ). London, England: Sage Publications.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124.

Simon, H. A. (1977). Models of discovery: And other topics in the methods of science. Dordrecht, Holland: D. Reidel.