Københavns Universitet. Correlations and Non-Linear Probability Models Breen, Richard; Holm, Anders; Karlson, Kristian Bernt
|
|
- Elinor Pope
- 6 years ago
- Views:
Transcription
1 university of copenhagen Køenhavns Universitet Correlations and Non-Linear Proaility Models Breen, Richard; Holm, Anders; Karlson, Kristian Bernt Pulished in: Sociological Methods & Research DOI: / Pulication date: 014 Document Version Peer reviewed version Citation for pulished version (APA: Breen, R, Holm, A, & Karlson, K B (014 Correlations and Non-Linear Proaility Models Sociological Methods & Research, 43(4, DOI: / Download date: 9 Apr 018
2 Correlations and Non-Linear Proaility Models Richard Breen*, Anders Holm**, and Kristian Bernt Karlson*** THIS PAPER IS PUBLISHED IN SOCIOLOGICAL METHODS & RESEARCH, 014, VOL 43, NO 4, Link: This is a post-print (ie final draft post-refereeing version according to SHERPA/ROMEO * Center for Research on Inequality and the Life Course, Department of Sociology, Yale University, richardreen@yaleedu ** Department of Sociology, University of Copenhagen; SFI The Danish National Centre for Social Research, ah@sockudk *** SFI - The Danish National Centre for Social Research; Danish School of Education, Aarhus University, kk@dpudk Tuesday, May 8, 013 Acknowledgements: We would like to thank the SMR editor and reviewers and the participants at the RC8 conference held in Trento, Italy, in May 013, for helpful comments on this paper
3 Correlations and Non-Linear Proaility Models Astract Although the parameters of logit and proit and other non-linear proaility models are often eplained and interpreted in relation to the regression coefficients of an underlying linear latent variale model, we argue that they may also e usefully interpreted in terms of the correlations etween the dependent variale of the latent variale model and its predictor variales We show how this correlation can e derived from the parameters of non-linear proaility models, develop tests for the statistical significance of the derived correlation, and illustrate its usefulness in two applications Under certain circumstances, which we eplain, the derived correlation provides a way of overcoming the prolems inherent in cross-sample comparisons of the parameters of non-linear proaility models
4 3 Correlations and Non-Linear Proaility Models Both tetook epositions and discussions of research findings commonly interpret the coefficients of non-linear proaility models (henceforth NLPMs such as logits, proits, the ordered logit and proit, and the multinomial logit, as the coefficients of an underlying latent linear model, aleit identified only up to scale In this paper drawing on and etending the work of McKelvey and Zavoina (1975 we point out that the coefficients of NLPMs are, in fact, closely related to the correlations etween the dependent and predictor variales of the latent linear model, and, in some circumstances, interpreting them in the light of this could e particularly useful This is especially so when we want to compare the effects of the same variale in the same NLPM fitted to different groups (for the prolems in doing this see Allison 1999 We first develop the logit and proit models in a latent variale framework and show how their coefficients can e used to recover the correlation etween a predictor variale,, and y*, the underlying latent dependent variale We discuss the conditions under which the correlation coefficient is likely to prove useful and we provide statistical tests of the derived correlation coefficient and for differences in correlations etween samples We also show how to derive the correlation in the ordered and multinomial case We conclude the paper with two eamples dealing with trends in educational inequality The first is a reanalysis of data on educational inequality in Europe (Breen et al 009 The second uses GSS data to study the trend in educational inequality in the US among cohorts orn etween 1930 and 1969 Four appendices contain some technical details of our approach
5 4 A Latent Variale Formulation of Logit and Proit Models Let y* denote a continuous outcome variale, and let e a predictor variale whose effect on y* we want to estimate We can write a regression model for y* as y* = α + β y * + u, with sd( u = σ u (1 Here u is a random error term and σ u its standard deviation, which summaries that part of the variation in y* uneplained y α is the intercept of the model, and β y* captures the effect of on y* α, β y*, and σ u are all unknown parameters However, we assume that y* is unoserved and instead we oserve: y = 1 if y* > τ y = 0 if y* τ ( This means that we only oserve whether an oservation s value of y* is greater or less than a constant, τ, which is a threshold parameter whose value is unknown Following a standard latent variale formulation of discrete choice models we rewrite the error term in (1 such that u = sω For the logit model, we assume that ω is a standard logistic random variale, with mean ero and variance π /3 and s is a scale parameter, yielding a variance of σ = s π / 3 for the error term in (1 For the proit model, we assume that ω is a standard normal random variale, with mean ero and unit variance and s is a scale parameter, yielding a variance of σ u = s for the error term in (1 For the logit case we specify the proaility of success as a function of : u α τ β y* ep + u β α τ y* s s Pr( y = 1 = Pr( y* > τ = Pr > + =, (3a s s s α τ β y* 1+ ep + s s
6 5 where the final equality holds ecause we assumed ω to e a standard logistic random variale Taking the logarithm of the odds of the proaility in (3a, we otain the well-known logistic regression model: α τ β y* logit(pr( y = 1 = + = a + y (3 s s For the proit case we also specify the proaility of success as a function of : α τ s * Pr( y* > τ = Φ + y = Φ ( a + y β s (4 Where Φ( denotes the cumulative distriution function of the standard normal distriution, and where the first equality holds ecause we assumed ω to e a standard normal random variale The intercept of the logit or proit model depends on the intercept of the underlying linear model (α and the threshold parameter (τ and these two cannot e recovered separately, eing asored in the intercept, a, of the logit or proit The parameters y in the logit and proit models are equal to the regression coefficients from the underlying linear model in (1 divided y the scale parameter, which is a function of the underlying conditional error variance Thus in proit and logit models we can identify regression coefficients only up to scale, which is a function of the conditional standard deviation of the latent outcome variale This presents particular difficulties for parameter comparisons, whether these are etween parameter estimates for the same variale in different model specifications (for eample, with and without control variales; see Karlson, Holm and Breen 01 or parameter estimates for the same variale in the same models fitted to different samples (Allison 1999 In oth cases, real differences in a variale s effect will e confounded y possile differences in scaling The issues pertaining to scale identification of coefficients from non-linear proaility models have wide-ranging consequences for comparative research analying discrete outcomes
7 6 Most significantly, ecause these coefficients are inherently standardied, researchers cannot generally follow the recognied advice of using "unstandardied coefficients" in group comparisons (Tukey 1954; Blalock 1967 Consequently, we need to look to other metrics that might prove useful in certain areas of sociological research The Prolem in Group Comparisons When Not Assuming a Latent Outcome Variale The previous derivations hold under the assumption that we oserve the latent outcome, y*, via its inary manifestation, y However, in some sociological applications, motivating the logit model in terms of latent variales is less ovious Yet, even when the inary outcome variale can e said to e truly inary, the prolem in comparing logit or other nonlinear proaility model coefficients across groups remains Using the distinction etween marginal and conditional models (Agresti 00, in Appendi A we show that the group comparison prolem also applies when the outcome can e said to e truly inary and the latent variale formulation consequently does not apply In either case, group comparisons of coefficients are distorted y differences in unmeasured heterogeneity across groups even when the unmeasured heterogeneity is independent of the oserved predictor variales The Correlation and Its Relation to the Logit or Proit Coefficient Rather than interpreting logit and proit coefficients as underlying effects identified up to scale, they can also e interpreted as functions of the correlation coefficient etween the predictor and the underlying latent variale, y* To show this, we use results from the early literature on the relationship etween coefficients from linear models and correlations (eg, Blalock 1964, 1967; Linn and Werts 1969; Theil 197 The only difference is that we apply the results to y*, allowing us to derive the correlation coefficient from the proit and logit model
8 7 We can write the variance of the underlying latent variale, y*, as var( y* = β var( + var( y* = β var( s var( (5 y * y* + ω and the correlation etween y* and is equal to r sd ( = β sd ( y* y* y* Using (5 and the fact that the logit or proit coefficient, y, equals β y* s we can write r y* ys sd( y sd( = =, (6 s var( + s var( ω var( + var( ω y y since the s terms cancel For the logit we sustitute var(ω = π /3 and for the proit var(ω =1 With a single predictor variale the correlation in (6 equals the fully standardied coefficient suggested y McKelvey and Zavoina (1975: 115 Using (6 we can write the logit or proit coefficient as a function of the correlation coefficient To do so, we solve for k in y = k ry * Using that * sd( y s = and writing the sd( ω correlation and y coefficient in terms of variances and covariances, we otain cov(, y* cov(, y* = k, var( s sd( sd( y* cov(, y* sd ( ω sd ( sd ( y* sd ( ω sd ( y* 1 sd ( ω k = = = cov(, y* var( sd ( y sd ( sd ( y sd ( * * 1 ry* and thus
9 8 y = r y* 1 r y* sd( ω sd( (7 Thus logit and proit coefficients can e epressed as the square root of the ratio of the eplained to uneplained variance of y* scaled y the ratio of the standardied standard deviation to the standard deviation of the predictor variale 1 From this we see that a logit or proit coefficient has two parts, one which is scale invariant, r y* 1 r y*, and one which is scale-dependent, sd ( ω sd ( As we will argue, this separation provides an entry into addressing the prolem of making comparisons etween groups when using NLPMs The derivative of the logit or proit coefficient with respect to the correlation is r sd ( ω y = 3 / y* sd ( (1 r The derivative tells us that the logit or proit coefficient increases as the correlation deviates from ero From the forgoing we see that the relationship etween the correlation and the y coefficients depends only on known quantities namely sd (ω, which is known (y assumption, and sd ( which is known from the data This is in contrast to the relationship etween y and β y * which depends on the unknown quantity, s 1 In the proit case, ( 1 sd ω =, and if we standardie to have unit standard deviation, the squared proit coefficient will equal the ratio of eplained to uneplained variance in the underlying latent variale regression
10 9 Partial, Semi-Partial, and Fully Standardied Partial Coefficients Equation (6 applies when there is a single predictor variale With two or more predictor variales, we can derive the partial correlation of and y* given (this is demonstrated in Appendi B as r y* = y y sd( var( + var( ω, (8 and, as a consequence, we can write the partial logit or proit coefficient as y = r y* 1 r y* sd( ω sd( (9 The etension to the case with several control variales is straightforward Despite this, the etension from the simple to the partial case has never efore een demonstrated, as far as we are aware As in the linear case (Blalock 1967: 133, the partial correlation in (8 will generally differ from the fully standardied partial coefficient, * y, used y McKelvey and Zavoina (1975, * y = y sd( y var( + y var( + y y cov(, + var( ω, (10 their analytical relation eing: = r * y y* 1 r y* 1 r, Similarly, we derive the semi-partial correlation etween y* and given as The semi-partial correlation is also known as the part correlation
11 10 r y*( = y sd ( y var( + y var( + y y cov(, + var( ω meaning that the semi-partial correlation and the fully standardied partial coefficient have the following relation: * sd( 1 y = ry*( = ry*( sd( 1 r Thus, the fully standardied partial coefficient can e viewed as a rescaled version of the semipartial correlation, with the scale factor eing the square root of the proportion of the variance in that is uneplained y, r 1 When Is the Correlation a Useful Metric? Using correlations rather than logit or proit coefficients themselves implies a shift of focus for researchers interested in group comparisons In this section, we clarify the conditions under which the correlation coefficient is a useful metric for group comparisons and suggest which methods researchers might consider if the conditions we outline are not met When the Correlation Is Useful We have shown how to derive the correlation from the parameters of an NLPM: ut when, and why, should we want to do this? We outline three circumstancesthe first circumstance in which the correlation will e useful is when we want to compare the variation in y* etween and within values of This is particularly the case when is categorical, as, for eample, in studies that focus on the relationship etween educational attainment and social class ackground In many studies the oserved dependent variale, y, is whether or not a student makes a given educational transition (such as the transition from High School to College and the unoserved y* could e
12 11 interpreted as his or her propensity to do so In these sorts of analysis we are often interested in the changing relationship, over irth cohorts, etween dummy variales,, representing social classes, and y* (eg Shavit and Blossfeld 1993 The conventional approach would take comparisons of β y * over cohorts as the ideal measure, and one would try to recover these y estimating the corresponding logit or proit coefficients, y Logit or proit coefficients declining in magnitude over successive irth cohorts would then suggest equaliation in the particular educational transition But the prolem with this interpretation is that we cannot know the etent to which declines over cohorts in y are due to declines in β y * or to increases in the residual variation of y* An alternative measure of equaliation is the degree to which variation in the propensity to make the transition eists etween students from different classes, relative to the variation that eists among students within the same class A decline over cohorts in the variation in y* tells us that there is less variation as a whole: in this case the correlation would decline only if the variation etween students from different classes was declining more quickly than the total variation Thus, if the correlation was constant over irth cohorts, y could decline only as a result of a reduction in the variance of y* and/or an increase in the variance of But it was to remove these spurious influences on the measure of inequality that led sociologists to the use of NLPMs in the first place (Mare 1981: y the same argument, they should then prefer the (partial correlation as a measure of educational inequality The second circumstance in which the correlation can e informative is when we care aout the relative, rather than the asolute, value of the outcome variale In studies of intergenerational income moility, for eample, it is reasonale to focus on an individual s position within a given income distriution (that is, relative to others rather than, or in addition
13 1 to, the individual s asolute level of income For this reason, the correlation of (the logarithm of parent s and child s incomes is sometimes preferred, as a measure of intergenerational immoility, to the elasticity (the coefficient from the regression of log of child s income on log of parent s income ecause it is insensitive to differences in the variance of income in the parental and child generations (see Björklund and Jäntti 009 Because we can recover the correlation etween y* and one or more predictor variales, we can estimate the standardied regression coefficients linking them, and these will e particularly useful when we care aout relative rather than asolute position Consider the following eample: one would epect the impact of a scholar s rate of pulication on his or her academic salary to e affected y the mean and variance of each of these two variales within the scholar s field The pulication of two papers per year more than the mean pulication rate in mathematics is a more impressive performance than the same achievement in chemistry ecause the variance of pulication rates is much larger in the latter field Since the variance of annual salaries also differs across fields, it is unreasonale to epect that pulishing one additional paper during a period of time will have an equivalent impact on annual salary across fields In contrast, it seems more reasonale to epect that the impact of increasing one s pulication rate y one field-specific standard deviation will have an equivalent impact on field-specific standard scores for academic salary across fields (Hargens 1976: 5 If we sustitute the phrase propensity to otain tenure for annual salary we have a inary outcome and here an interpretation of logit or proit coefficients in terms of the -y* correlations or standardied regression coefficients should e preferred over an interpretation of them as underlying regression coefficients identified up to scale
14 13 The third circumstance concerns the situation in which researchers are interested in conditional inference; that is, interested in the association etween two variales (y* and net of a third variale ( For eample, stratification researchers might want to compare across irth cohorts the magnitude of the association etween college completion and parental social status net of race In such situations, the semi-partial correlation will often e preferred over the partial correlation The latter correlation partials out the variance in oth y* and eplained y, while the former only partials out the variance in eplained y In other words, the semi-partial correlation (or its squared counterpart is a measure of the fraction of eplained variance in y* due to net of relative to the unconditional or total variance in y* In contrast to the partial correlation, the semi-partial correlation is unaffected y the predictive power of on y*, therey not conflating differences in the predictive power of in group comparisons An alternative to the semi-partial correlation is the fully standardied partial coefficient of McKelvey and Zavoina (1975, stated in Equation (10 While this coefficient is just a rescaled version of the semi-partial correlation, it is useful whenever researchers are concerned with the relative, rather than the asolute, value of the outcome variale, as outlined aove In this situation, the fully standardied partial coefficient can e interpreted as the epected standard deviation change in y* for a standard deviation change in, given For eample, stratification researchers might want to know whether the dependency of the son's relative position in the socioeconomic status distriution on the relative position of his father in the father's corresponding distriution, once race is controlled, has changed over the 0th century In such a situation, the fully standardied partial coefficient might e preferred One thing which is clear from these eamples of where the correlation may prove useful, is that, when these conditions prevail, interpreting the coefficients of NLPMs in terms of
15 14 correlations also provides a solution to the prolem of comparing the effects of variales across samples, etween which the residual variation, captured in the scale parameter, s, might differ When the Correlation Is Not Useful The conditions under which the correlation is a useful metric for group comparisons may not always e met In this susection we first riefly outline these conditions and then discuss two other methods that might e useful when the conditions are not met Unstandardied or Standardied Coefficients? Unstandardied regression coefficients have long een preferred over standardied coefficients in social science research ecause the former are unaffected y the distriutions of the variales involved in the model (Blalock 1967; Kim and Mueller 1976; Schoenerg 197; Tukey 1954 In the linear regression model, the unstandardied regression coefficient of on y is given y cov(, y β =, var( yielding the epected change in y for a unit change in The standardied or equivalent correlation coefficient is given y sd( r = β, sd( y and is affected y oth the effect of on y and the distriutions of and y Using standardied or correlation coefficients for group comparisons will conflate differences in true effects, captured
16 15 y β, with potential differences in the distriutions of oth and y, captured y sd( and sd(y For this reason, and for reasons stated throughout this paper, using the correlation coefficient for group comparisons is not informative aout differences in effects; it is merely informative aout differences in the predictive power of on y (Achen 1977; King 1986 or aout the association etween relative positions in the distriutions For eample, the correlation coefficient would not e useful for comparing the influence of high school grades on college enrollment etween men and women Because the correlation standardies grades and the latent college enrollment propensity within gender, we cannot know whether gender differences in correlations reflect differences in effects or differences in the dispersion in the latent propensity distriution or in the grade distriution If the effect, β, and the dispersion in the latent propensity, sd(y, were the same for men and women, ut the dispersion in the grade distriution, sd(, was larger among men than women, then we would see larger correlations among men than women Such differences would not e informative aout gender differences in the influence of grades on college enrollment, even if we interpreted them as associations etween relative positions in the distriutions This interpretation would imply that women (men only compete with other women (men in otaining college positions But this is unlikely to e true in most countries Similar eamples can e given of comparison across supopulations that compete for the same goods Nevertheless, although using the correlation for comparative purposes is restricted in many respects, the same applies to nonlinear proaility models in which the coefficients, y design, are inherently standardied on the latent outcome variale via the scale parameter, s Since the true effect, β y*, cannot e separately identified, otaining unstandardied coefficients is not possile in nonlinear proaility models And since the scale parameter, s, depends on oth
17 16 the total variance in the latent outcome and the predictive power of, coefficients of nonlinear proaility models are difficult to interpret 3 This is precisely the reason why we propose viewing these coefficients in terms of correlations which, although limited in their use, have a well-defined interpretation However, viewing nonlinear proaility model coefficients as correlation coefficients is only one way of approaching the limitations of the inherent standardiation of these coefficients As we have already pointed out, the correlation coefficient is standardied on oth the predictor variale and the outcome variale Yet, ecause we know the scale of the predictor variale,, standardiing the coefficient on is not necessary Researchers may instead opt for the coefficient standardied on the latent outcome alone This coefficient was introduced in the seminal work y Winship and Mare ( In the simple case with only two variales, it is given y YSTD y β y* y = = sd ( y* y var( + var( ω This partially standardied coefficient differs from the fully standardied counterpart in (6 y not including sd( in the numerator It yields the epected standard deviation change in y* for a unit change in, and can easily e etended to the multiple case, YSTD y β y* y = ( * = sd y var( + var( + cov(, + var( ω y y y y, 3 The interpretation of coefficients of nonlinear proaility models is even more complicated when several predictor variales are included In this situation, the coefficients reflect the true effects of interest and the predictive power of all predictor variales 4 Breen and Karlson (013 discuss the advantages of using coefficients standardied on the outcome in causal inference involving nonlinear proaility models and show the equivalence etween these coefficients and Cohen's d, an effect sie metric much used in evidence-ased research
18 17 In the multiple case, the coefficient epresses the epected standard deviation change in y* for a unit change in, holding constant The partially standardied coefficient may prove useful in certain areas of comparative research For eample, in studying the lack and white gap in college attainment over the 0th century US, researchers might want to know whether the gap in the underlying propensity to complete college narrowed over time In this situation, the partially standardied coefficient could e used to compare the lack-white gap in y*, measured in standard deviations, across different time periods Such an analysis would e informative aout the changes in the lackwhite gap in college completion as a positional good Moreover, if researchers were interested in comparing these gaps net of socioeconomic status, the partially standardied coefficient for the multiple case could e used Given the often complicated interpretation of nonlinear proaility model coefficients, the partially standardied coefficient is an attractive alternative that acknowledges the inherent standardiation on the latent outcome variale and retains the scale interpretation of the predictor variale of interest Because this coefficient is not sensitive to the distriution in it may e useful in certain areas of comparative research We now turn to two other alternatives to using logit and proit models in group comparisons in sociological research Using Predicted Proailities The issue of comparing logit or proit coefficients across groups was addressed y Ai and Norton (003 whose approach uses the predicted values from the logit or proit model, and Long (009 shows how to implement the method when researchers want to use it to make group
19 18 comparisons Because predicted proailities are non-linear, group differences in them will vary across the distriution of the predictor variale of interest This means that an interaction effect cannot e reduced to a single quantity, ut has to e studied graphically across the distriution of the predictor variale To implement this idea, Long suggests using a logit or proit model to estimate, for each group, the predicted proaility of the outcome at different values of a predictor variale of interest and then plotting the difference etween the groups' predicted proailities across the distriution of the predictor variale Long (and Ai and Norton also suggest statistical tests of group difference in proailities This approach could e applied when hypotheses aout group differences can e tested through their implications for proailities It is also likely to e valuale in showing that the magnitude of group differences can vary across the distriution of predictor variales This can often est e seen graphically; indeed, as Greene (010 notes, such graphical output is needed whenever we are to evaluate interaction effects on the proaility margin Nevertheless, this method has some limitations As Greene (010: 95 oserves, "partial [marginal] effects are neither coefficients nor elements of the specification of the model They are implications of the specified and estimated model" (rackets added One consequence of this property is that tests of significance of differences in predicted proailities cannot e interpreted in the way we would interpret tests of significance of differences in coefficients Furthermore, while the method is simple to implement in models with inary outcomes, in the ordinal or multinomial case the graphical output will rapidly ecome cumersome Using the Heteroskedastic Non-Linear Proaility Model As well as pointing to the prolem of making comparisons across groups when using non-linear proaility models, Allison (1999 also proposed a solution that involved estimating the ratio of
20 19 error variances etween groups Williams (009 showed that Allison s method is a special case of what is variously termed a "heteroskedastic non-linear proaility model", a "heterogeneous choice model", and a "location-scale model" This model is characteried y an equation for the variance as well as the mean The variance can e written as a function of any relevant predictor variales, ut, in the contet of group comparisons, we would e most interested in models that allowed the variance to differ according to dummy variales for groups For eample, applying the model to the male-female iochemist data originally used y Allison, Williams (009: 540 reports that the residual variance for women is 135 times larger than for men What appears to have een overlooked in discussions of these models, however, is the issue of identification; that is, our aility to draw inferences aout the parameters of interest, such as differences in true coefficients across groups If we consider the case of comparisons using the ordered logit or proit model (see equation (11, elow 5, then two groups can differ in one or all of their threshold parameters, τ j, their regression coefficients, β, and their scaling factor, s A model in which all of these differ is not identified This follows ecause such a model is equivalent to fitting a separate model for each group and so, if it were identified, this would allow us to recover the group specific scaling factor from that group s ordered logit model something which we know is not possile Thus, to identify the relative sies of the different groups scaling factors, it is necessary to impose a group constraint on either or oth of τ j and β (see Williams 009: 551, whose application of the heterogeneous choice ordered proit model is "contingent on the thresholds eing the same for oth men and women" 5 Everything that follows holds for inomial as well as ordinal outcomes and, indeed, for non-linear proaility models as a whole
21 0 Whenever we have reason to elieve that a constraint on β or τ j is defensile, the model may e a good choice for making comparisons (see, eg, Mouw and Soel 001 ecause we can recover the group difference in the true coefficients of interest However, as Long (009 argues, in sociology we rarely have either the ody of accumulated knowledge or strong theory that would make the imposition of such constraints anything other than ad hoc 6 Thus, whenever the condition cannot e met, differences in unknown thresholds will e confounded with differences in unknown scaling factors, and so heteroskedastic non-linear proaility models will not solve the group comparaility prolem Some Further Results Etension to the Ordered and Multinomial Case In the ordered logit and proit we oserve not merely whether y* eceeds an unknown value or not (as in the inary logit or proit ut into which interval of y* a given oservation falls Thresholds divide the range of y* into disjoint and ehaustive intervals, τ 1 <τ < <τ J where τ = and J 1 τ = These allow us to define y O = j if τ 1 < y* < τ where j j O y is an ordered, discrete variale with J categories and the ordered proit or logit model takes the form: g(pr( y τ β s * j > τ j = + t, (11 s where g( is a link function the logit or the cumulative normal in the ordered logit and ordered proit respectively and = β / s is the ordered logit or ordered proit coefficient of Because the underlying latent variale model in the ordered models is the same as in the inary case (the 6 It is, of course, possile to make the residual variance depend on other variales But in so far as these variales do not wholly determine the error variance we are still in the dark as to how much it differs etween groups
22 1 difference etween them eing only in the degree to which y* is oserved, the epressions we have derived for the correlation and partial correlation (and for the coefficient of determination which we derive in Appendi D apply directly to the ordered case The multinomial case is more complicated ecause we have a numer of underlying latent variales We define as efore, ut we now define y*a as the propensity of an individual to choose alternative a from among a set of A possiilities We assume that and y*a are related such that y = α + β + s u, (1 * a a a a a where β a captures the effect of for alternative a, ua is an alternative-specific random error term, which is independent across alternatives and follows a standard type-i etreme value distriution, and sa is an alternative-specific scale parameter We only oserve which of the A alternatives the individual actually chooses and we assume that the individual chooses the alternative for which he or she has the greatest propensity: y = a y y > a a * * if a a 0, McFadden (1974 shows that the proaility of choosing alternative a given is equal to ep( ka + a Pr( y = a =, (13 ep( ka + a a with normaliation such as k 1 = 1 = 0 to secure identification, and where a= a= k a α a = and s a a β a = s a The odds of choosing a rather than the aseline category a' are ( k ep a + a and from this we can derive the correlation etween y * * a y a ' and in much the same way as in the inary case, when we add one important qualifier: The standard deviation of used in the correlation should
23 e calculated not on the entire sample, ut on the sample which has chosen either the alternative, a, or the reference category, a' The reason for this is that the standard deviation of on the contrast-specific sample can differ from the standard deviation of in the full sample We may then derive the correlation as r = a a * * ( ya ya ', a a + π sd (, (14 var( / 3 where we define sd( a as the standard deviation of on the sample pertaining to the alternative a versus the references contrast a' (and var( a is its squared counterpart The epression in (14 generalies easily to any contrast etween two alternatives Sensitivity to Mis-Specified Error Term As we saw earlier, the correlation etween a predictor and the latent outcome can e recovered from the parameters of a NLPM ecause we know the variance of the predictor and var(ω is given y assumption Typically, when y is a inary realiation of y* we assume ω to have either a standard normal (for the proit or standard logistic (for the logit distriution But this raises the question: how sensitive are our estimates of the correlation to such an assumption? To answer it we ran a Monte Carlo simulation in which we fitted logit and proit models to data in which we varied the true underlying error distriution, the sample sie, and the magnitude of the correlation In Tale 1 we report the mean, over 500 replications, of the asolute deviation from the true correlation It is noticeale that, ecept for the uniformly distriuted error, the metric is largely insensitive to pure misspecifications, with the mean deviations for the misspecified cases differing little from those where the model is correctly specified (the logit-logistic and proit-
24 3 normal cases This holds even for a lognormal distriution with a skewness of -1 where the error term is highly asymmetric As we would have epected, the sensitivity is less pronounced the larger the sample sie --TABLE 1 HERE -- Relationship to the Polyserial Correlation Coefficient The correlation coefficient we present in this paper is closely related to the polyserial correlation coefficient widely used in psychometrics (Co 1974; Olsson, Drasgow, and Dorens 198 However, we elieve that our approach is more general, more fleile, and easier to implement The polyserial correlation assumes that y* and follow a ivariate normal distriution, whereas our measure places no assumptions on the distriution of and allows y* to have distriutions other than normal (eg, the logistic, complementary log-log Our method is easily generaliale to partial correlations (using partial logits and conditional standard deviation of whereas the partial polyserial correlation has to e derived from several polyserial correlations, severely complicating the derivation of standard errors Furthermore, unlike the polyserial coefficient, our method etends to partial correlations without making any assumptions aout the distriution of the control variale, Significance Testing Because the correlation is asymmetrically distriuted, tests of its statistical significance are usually undertaken on the Fisher transformed coefficient The Fisher transformation of the correlation coefficient, r, is
25 r F ( r = ln = arctanh( r (15 1 r F(r is approimately normally distriuted with a known mean and standard error under the null However, in our case the correlation is not directly calculated from data: rather it is computed as a function of an estimated quantity, y In calculating the standard error of F(r we therefore take this into account, ut, having done this, we can then eploit the normality of F(r to calculate confidence intervals and significance tests We compute the standard error of F(r using the delta method The asymptotic standard error otained from this approach depends on the partial derivative of F(r with respect to and we compute this using the chain rule F( r F( r r = r y y The standard error also depends on the variance of y We can write the asymptotic variance of F(r as: df ( r dr dr d y var( y (16 where df( r dr dr d y = y sd( var( + var( ω The asymptotic standard error of F(r is given y the square root of equation (16 Because the Fisher transformed correlation is asymptotically normally distriuted under the null, we can test the null hypothesis of ero correlation via: Z F( r = (17 se( F( r We analyed the power of the test score in (17 using a Monte Carlo study The results are reported in Tale We generated the data using equation (1 with normally distriuted and u having a standard logistic distriution and we derived the estimated correlation from a logit
26 5 model We varied the true correlation and the sample sie and, for each scenario, we report the percentage of times out of 1000 replications that our test rejected the null hypothesis (ased on a 05 critical value In scenario A of Tale the null hypothesis is true, and here our test rejects the null hypothesis 5% of the time or less In scenarios B through F, the null hypothesis is false, and our test fails to reject it only when the sample sie is small and the true correlation is quite small Even in samples of 00 and for a correlation of 05 a false null has a 90% rejection rate In samples of 1,000 oservations or larger, we otain a power of aove 80 percent even for small correlations -- TABLE HERE -- The Fisher transform can easily e etended to test the null hypothesis of ero difference etween groups in their correlations: Z F( ry*, F( ry*,1 = (18 var ( F( r + var( F( r y*,1 y*, Here the suscript 1 and indicates different samples etween which we want to compare correlations We conducted a Monte Carlo study to evaluate the power of this test The results can e found in Appendi C They show that the larger the difference etween correlations and the larger the sample sie, the more likely the test is to reject the null hypothesis of no difference in correlations With a sample of 1000 the test is effective at rejecting the null hypothesis when the difference in correlations is 0 or greater: for a sample of 5000 the test performs well for differences in correlations of more than aout 07 Applications
27 6 The final part of our eposition applies the methods we have proposed to two eamples that illustrate the relationship etween the correlation and the coefficients of NLPMs Our first eample is a reanalysis of data on trends in educational inequality in Europe (Breen et al 009 Using the correlation slightly modifies the findings of the original study and we eplain why y investigating the source of the differences that we find etween the ordered logit coefficients on which the original conclusions were ased and the correlations on which we rest our conclusions Our second eample uses data from the General Social Survey (GSS to eamine trends in the relationship etween educational attainment and father s socio-economic inde over cohorts orn etween 1930 and 1969, and here we find that the differences etween trends ased on an ordered logit model and those ased on the correlation are much more pronounced Non-Persistent Inequality in Educational Attainment Shavit and Blossfeld (1993 summarie the results of their seminal study of inequalities in educational attainment across different cohorts under the title Persistent Inequality They claim that, despite dramatic educational epansion during the twentieth century, all ut two (Sweden and the Netherlands of the thirteen countries studied in their project ehiit staility of socioeconomic inequalities of educational opportunities (Shavit and Blossfeld 1993: Recently, Breen, Luijk, Müller and Pollak (009 have challenged Shavit and Blossfeld s conclusions using data on educational inequality in the twentieth century in eight European countries (see also Breen and Jonsson 005; Breen, Luijk, Müller and Pollak 010 All these studies ase their comparisons etween irth cohorts on the use of NLPMs: logits in the case of the Shavit and Blossfeld (1993 volume, the ordered logit in the Breen et al (009, 010 analysis This implies that their conclusions may confound real differences in educational
28 7 inequality across cohorts with differences in residual variation Here we re-analye data from three of the eight countries in the Breen et al study and compare the coefficients and thus the conclusions drawn from the original model with the corresponding partial correlations The correlation, as a standardied measure, is particular appropriate in this case ecause, from the perspective of the study of inequality, education can e seen as a positional good: what matters is a person s position in the distriution rather than his or her asolute level of education Furthermore, when, as here, the predictor variales are dummies, the correlation has an attractive interpretation, telling us how much of the variance in the latent y* lies etween classes rather than within them The data we use relate to men in Germany, France, and the Netherlands, orn in one of five cohorts: , , , , and The dependent variale is highest level of educational attainment measured using five ordered categories: 1 Compulsory education with or without elementary vocational education, Secondary intermediate education, vocational or general, 3 Full secondary education, 4 Lower tertiary education, 5 Higher tertiary The predictor variale is social class origins, ased on the respondent s report of his father s occupation when he was growing up Seven classes are distinguished using the EGP class schema (Erikson and Goldthorpe 199, chapter : I II IIIa Upper service, Lower service, Higher grade routine non-manual workers, IVa Self-employed and small employers,
29 8 IVc Farmers, V+VI Skilled manual workers, technicians and supervisors, VIIa+III Semi- and unskilled manual, agricultural, and lower grade routine nonmanual workers Sample sies are 17,14 for Germany, 51,705 for France, and 19,751 for the Netherlands Further detail on the data and on the surveys from which they were otained can e found in Breen et al (009: These three countries were chosen for reanalysis here ecause Breen et al found significant decline in class inequality in educational outcomes over successive irth cohorts in all three countries, whereas in Persistent Inequality a decline was found for the Netherlands ut not for Germany (France was not included The question we now seek to answer is whether Breen et al s claim of non-persistent inequality is roust if we use the correlation as our measure of inequality in educational attainment We egin our analysis y replicating theirs Breen et al use an ordered logit model to regress the five educational categories, considered an ordinal ranking of educational attainment, on dummy variales representing the origin classes They fit a separate model to each of the five irth cohorts and present their results in a series of figures, of which the most important is Figure 4 on page 1495 of their paper Our replication of their analysis is shown here in Figure 1 and this is identical with their Figure 4 -- FIGURE 1 HERE -- Class I serves as the omitted category, having an implicit coefficient of ero, and the lines plotting the coefficients for the other classes (which measure the etent to which the particular class s log-odds of eceeding any given level of education differ from those of class I show a
30 9 general trend of convergence towards ero as we move from later- to earlier-orn cohorts This trend etends across the whole of the 0 th century in the Netherlands ut does not egin until after the cohort orn in Germany and France The trend is most pronounced among the initially most disadvantaged classes, particularly farmers (IVc and unskilled workers (VII+III (see Breen et al 009: Figure shows, for each class, the partial correlation (that is, controlling for the effect of the other class dummies with latent educational attainment, y* These are all negative (ecause they are all measured relative to class I and so a line in Figure that rises towards ero over cohorts (a notale eample is the line for class IVa in the Netherlands tells us that a declining share of the conditional variance in y* lies etween that origin class and class I -- FIGURE HERE -- The trend towards greater equality, measured as the convergence of class coefficients, is somewhat less pronounced in Figure (correlations than in Figure 1 (logits, especially for France, and, in oth France and Germany, Figure suggests that the trend to equaliation egan later (after the cohort than Breen et al found Trends for particular classes are not always the same in the two figures: eg in France the partial correlations show that the positions of classes III and V+VI worsened relative to class I, whereas the logit coefficients in Figure 1 show that their position either remained unchanged or improved Equation (8 tells us that such discrepancies must e due to changes in the conditional (on the other classes variance of the
31 30 class dummies which will, to a considerale etent, reflect the changing relative sies of the social classes 7 We can investigate the differences etween the logit coefficients and the partial correlations y returning to equation (9 that shows that the partial logit coefficient is equal to the square root of the ratio of the eplained to the uneplained conditional variation in the latent y*, r y* 1 r y*, scaled y the ratio of the standardied standard deviation of the error ( π / 3 in the logit case to the residual standard deviation of, sd(ω/sd( We can therefore decompose the logit coefficients into these two parts In Tale 3 we report these two parts for the three countries In these epressions, stands for the dummy variales for the classes other than the one eing studied The logit coefficients shown in Figure 1 are the product of these two components --TABLE 3 HERE -- The trend in the French results reported in Figure are very similar to those of the ratio r shown in Tale 3 But the more pronounced decline in class origin effects in y* / 1 ry* France seen in the logits of Figure 1 is due to the steep reduction in the ratio sd(ω/sd( And since sd (ω is constant over cohorts, this change reflects the increase in the conditional standard deviations of the dummy variales over cohorts The Dutch case is similar: increases in the conditional standard deviations of the dummy variales together with declines in their eplanatory power give rise to the even more marked equaliation shown in Figure 1 But the 7 The conditional variance of a class dummy is the variance of the residual from a regression of that dummy variale on the dummy variales for the other classes Thus, change over cohorts in the conditional variance of a particular dummy reflects not just changes in the relative sie of that particular class; it is also sensitive to changes in the entire class distriution, aleit in a complicated way
32 31 German eperience is somewhat different: here Figures 1 and are more similar ecause less of the decline in the logits is attriutale to change in the conditional standard deviations of the class dummies As the trend in the ratio, sd( ω / sd(, in Tale 3 shows, although the conditional standard deviations of the class dummies have mainly increased (and so the ratio reported in the figure declines they have remained within a narrower range In general, changes in the residual standard deviation of the predictor variales could cause logits (or, generally, NLPMs to eaggerate or underplay trends or differences in the degree to which the predictors account for variation in the latent y* If sd ( increases over cohorts, as here, it will eaggerate a declining trend in educational inequality when measured with logit coefficients ut underestimate the opposite trend when measured with partial correlation coefficients A declining sd ( will magnify an increasing trend in logit coefficients ut dampen a declining one In the case at hand, where changes in sd ( magnify a declining trend, our overall conclusion would nevertheless e that inequalities in educational attainment declined over the 0 th century To support this conclusion, we report in Figure 3 the estimated R values (derived in Appendi D for each cohort We find a clear decline in the degree to which class origins account for variation in latent educational attainment And, ecause correlations are standardied measures, we can also compare them across countries In the oldest cohort class origins eplained most variation in educational attainment in Germany (30%, ut this is the country in which there has een the greatest asolute decline in R and y the youngest cohort class origins account for aout 15% of the variance in oth Germany and France In all cohorts the Netherlands displays the lowest amount of variance eplained y class origins: in the youngest
FinQuiz Notes
Reading 9 A time series is any series of data that varies over time e.g. the quarterly sales for a company during the past five years or daily returns of a security. When assumptions of the regression
More informationThe Mean Version One way to write the One True Regression Line is: Equation 1 - The One True Line
Chapter 27: Inferences for Regression And so, there is one more thing which might vary one more thing aout which we might want to make some inference: the slope of the least squares regression line. The
More informationLogistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015
Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article
More information1. Define the following terms (1 point each): alternative hypothesis
1 1. Define the following terms (1 point each): alternative hypothesis One of three hypotheses indicating that the parameter is not zero; one states the parameter is not equal to zero, one states the parameter
More informationChapter 9 Regression. 9.1 Simple linear regression Linear models Least squares Predictions and residuals.
9.1 Simple linear regression 9.1.1 Linear models Response and eplanatory variables Chapter 9 Regression With bivariate data, it is often useful to predict the value of one variable (the response variable,
More informationPartitioning variation in multilevel models.
Partitioning variation in multilevel models. by Harvey Goldstein, William Browne and Jon Rasbash Institute of Education, London, UK. Summary. In multilevel modelling, the residual variation in a response
More informationEstimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement
Open Journal of Statistics, 07, 7, 834-848 http://www.scirp.org/journal/ojs ISS Online: 6-798 ISS Print: 6-78X Estimating a Finite Population ean under Random on-response in Two Stage Cluster Sampling
More informationRon Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)
Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October
More informationwhere Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.
Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter
More informationGroup comparisons in logit and probit using predicted probabilities 1
Group comparisons in logit and probit using predicted probabilities 1 J. Scott Long Indiana University May 27, 2009 Abstract The comparison of groups in regression models for binary outcomes is complicated
More informationIntroducing Generalized Linear Models: Logistic Regression
Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and
More informationInterpreting and using heterogeneous choice & generalized ordered logit models
Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2
More informationNon-Linear Regression Samuel L. Baker
NON-LINEAR REGRESSION 1 Non-Linear Regression 2006-2008 Samuel L. Baker The linear least squares method that you have een using fits a straight line or a flat plane to a unch of data points. Sometimes
More informationReview of Multiple Regression
Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate
More informationMinimum Wages, Employment and. Monopsonistic Competition
Minimum Wages, Employment and Monopsonistic Competition V. Bhaskar Ted To University of Essex Bureau of Laor Statistics August 2003 Astract We set out a model of monopsonistic competition, where each employer
More informationNELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation
NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable
More informationSociology 593 Exam 2 Answer Key March 28, 2002
Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably
More informationA Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts
A Study of Statistical Power and Type I Errors in Testing a Factor Analytic Model for Group Differences in Regression Intercepts by Margarita Olivera Aguilar A Thesis Presented in Partial Fulfillment of
More informationInvestigating Models with Two or Three Categories
Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might
More informationSchool of Business. Blank Page
Equations 5 The aim of this unit is to equip the learners with the concept of equations. The principal foci of this unit are degree of an equation, inequalities, quadratic equations, simultaneous linear
More informationEPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7
Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review
More informationEquation Number 1 Dependent Variable.. Y W's Childbearing expectations
Sociology 592 - Homework #10 - Advanced Multiple Regression 1. In their classic 1982 paper, Beyond Wives' Family Sociology: A Method for Analyzing Couple Data, Thomson and Williams examined the relationship
More informationApplication of Pareto Distribution in Wage Models
Chinese Business Review, ISSN 537-56 Novemer, Vol., No., 974-984 Application of areto Distriution in Wage Models Diana Bílková University of Economics in rague, rague, Czech Repulic This paper deals with
More informationKøbenhavns Universitet. Comparing linear probability model coefficients across groups Holm, Anders; Ejrnæs, Mette; Karlson, Kristian Bernt
university of copenhagen Københavns Universitet Comparing linear probability model coefficients across groups Holm, nders; Ejrnæs, Mette; Karlson, Kristian ernt Published in: Quality and Quantity DOI:
More informationChapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies)
Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies) Statistics and Introduction to Econometrics M. Angeles Carnero Departamento de Fundamentos del Análisis Económico
More informationPBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression.
PBAF 528 Week 8 What are some problems with our model? Regression models are used to represent relationships between a dependent variable and one or more predictors. In order to make inference from the
More informationA Practitioner s Guide to Generalized Linear Models
A Practitioners Guide to Generalized Linear Models Background The classical linear models and most of the minimum bias procedures are special cases of generalized linear models (GLMs). GLMs are more technically
More informationRegression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression
LOGISTIC REGRESSION Regression techniques provide statistical analysis of relationships. Research designs may be classified as eperimental or observational; regression analyses are applicable to both types.
More informationClass Notes: Week 8. Probit versus Logit Link Functions and Count Data
Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While
More information1 Motivation for Instrumental Variable (IV) Regression
ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data
More informationLectures 5 & 6: Hypothesis Testing
Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across
More informationab is shifted horizontally by h units. ab is shifted vertically by k units.
Algera II Notes Unit Eight: Eponential and Logarithmic Functions Sllaus Ojective: 8. The student will graph logarithmic and eponential functions including ase e. Eponential Function: a, 0, Graph of an
More informationCh 7: Dummy (binary, indicator) variables
Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male
More informationNotes to accompany Continuatio argumenti de mensura sortis ad fortuitam successionem rerum naturaliter contingentium applicata
otes to accompany Continuatio argumenti de mensura sortis ad fortuitam successionem rerum naturaliter contingentium applicata Richard J. Pulskamp Department of Mathematics and Computer Science Xavier University,
More informationOrdinary Least Squares Regression Explained: Vartanian
Ordinary Least Squares Regression Eplained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent
More informationComparing groups using predicted probabilities
Comparing groups using predicted probabilities J. Scott Long Indiana University May 9, 2006 MAPSS - May 9, 2006 - Page 1 The problem Allison (1999): Di erences in the estimated coe cients tell us nothing
More informationSociology 593 Exam 2 March 28, 2002
Sociology 59 Exam March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably means that
More informationSystematic error, of course, can produce either an upward or downward bias.
Brief Overview of LISREL & Related Programs & Techniques (Optional) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 STRUCTURAL AND MEASUREMENT MODELS:
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationPath analysis for discrete variables: The role of education in social mobility
Path analysis for discrete variables: The role of education in social mobility Jouni Kuha 1 John Goldthorpe 2 1 London School of Economics and Political Science 2 Nuffield College, Oxford ESRC Research
More informationThe Consequences of Unobserved Heterogeneity in a Sequential Logit Model
The Consequences of Unobserved Heterogeneity in a Sequential Logit Model Maarten L. Buis Department of Sociology, Tübingen University, Wilhelmstraße 36, 72074 Tübingen, Germany Abstract Cameron and Heckman
More informationEach element of this set is assigned a probability. There are three basic rules for probabilities:
XIV. BASICS OF ROBABILITY Somewhere out there is a set of all possile event (or all possile sequences of events which I call Ω. This is called a sample space. Out of this we consider susets of events which
More informationECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information
1. Qualitative Information Qualitative Information Up to now, we assume that all the variables has quantitative meaning. But often in empirical work, we must incorporate qualitative factor into regression
More informationIn Class Review Exercises Vartanian: SW 540
In Class Review Exercises Vartanian: SW 540 1. Given the following output from an OLS model looking at income, what is the slope and intercept for those who are black and those who are not black? b SE
More informationHarperCollinsPublishers Ltd
Guidance on the use of codes for this mark scheme M A C P cao oe ft Method mark Accuracy mark Working mark Communication mark Process, proof or justification mark Correct answer only Or equivalent Follow
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationMidterm 2 - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put
More informationHypothesis testing. Data to decisions
Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that that distribution, the
More informationCorrelation and simple linear regression S5
Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and
More informationSimple linear regression: linear relationship between two qunatitative variables. Linear Regression. The regression line
Linear Regression Simple linear regression: linear relationship etween two qunatitative variales The regression line Facts aout least-squares regression Residuals Influential oservations Cautions aout
More informationMATH 225: Foundations of Higher Matheamatics. Dr. Morton. 3.4: Proof by Cases
MATH 225: Foundations of Higher Matheamatics Dr. Morton 3.4: Proof y Cases Chapter 3 handout page 12 prolem 21: Prove that for all real values of y, the following inequality holds: 7 2y + 2 2y 5 7. You
More informationFinal Exam - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your
More informationSTA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3
STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae
More informationAP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
More informationOne-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
More informationGroup comparison test for independent samples
Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences between means. Supposing that: samples come from normal populations
More informationPolynomial Degree and Finite Differences
CONDENSED LESSON 7.1 Polynomial Degree and Finite Differences In this lesson, you Learn the terminology associated with polynomials Use the finite differences method to determine the degree of a polynomial
More informationHanmer and Kalkan AJPS Supplemental Information (Online) Section A: Comparison of the Average Case and Observed Value Approaches
Hanmer and Kalan AJPS Supplemental Information Online Section A: Comparison of the Average Case and Observed Value Approaches A Marginal Effect Analysis to Determine the Difference Between the Average
More informationAQA GCSE Maths Foundation Skills Book Statistics 1
Guidance on the use of codes for this mark scheme ethod mark A Accuracy mark ark awarded independent of method cao Correct answer only oe Or equivalent ft Follow through AQA GCSE aths Foundation Skills
More informationSTOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Saturday, May 9, 008 Examination time: 3
More informationSimplifying a Rational Expression. Factor the numerator and the denominator. = 1(x 2 6)(x 2 1) Divide out the common factor x 6. Simplify.
- Plan Lesson Preview Check Skills You ll Need Factoring ± ± c Lesson -5: Eamples Eercises Etra Practice, p 70 Lesson Preview What You ll Learn BJECTIVE - To simplify rational epressions And Why To find
More informationRobustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances
Discussion Paper: 2006/07 Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances J.S. Cramer www.fee.uva.nl/ke/uva-econometrics Amsterdam School of Economics Department of
More informationEvaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest
Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest Department of methodology and statistics, Tilburg University WorkingGroupStructuralEquationModeling26-27.02.2015,
More informationB. Weaver (24-Mar-2005) Multiple Regression Chapter 5: Multiple Regression Y ) (5.1) Deviation score = (Y i
B. Weaver (24-Mar-2005) Multiple Regression... 1 Chapter 5: Multiple Regression 5.1 Partial and semi-partial correlation Before starting on multiple regression per se, we need to consider the concepts
More informationSTA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.
STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory
More informationLogarithmic Functions. 4. f(f -1 (x)) = x and f -1 (f(x)) = x. 5. The graph of f -1 is the reflection of the graph of f about the line y = x.
SECTION. Logarithmic Functions 83 SECTION. Logarithmic Functions Objectives Change from logarithmic to eponential form. Change from eponential to logarithmic form. 3 Evaluate logarithms. 4 Use basic logarithmic
More informationConceptual Explanations: Logarithms
Conceptual Eplanations: Logarithms Suppose you are a iologist investigating a population that doules every year. So if you start with 1 specimen, the population can e epressed as an eponential function:
More informationAnswer Key: Problem Set 5
: Problem Set 5. Let nopc be a dummy variable equal to one if the student does not own a PC, and zero otherwise. i. If nopc is used instead of PC in the model of: colgpa = β + δ PC + β hsgpa + β ACT +
More informationLogistic Regression: Regression with a Binary Dependent Variable
Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression
More informationModule 9: Further Numbers and Equations. Numbers and Indices. The aim of this lesson is to enable you to: work with rational and irrational numbers
Module 9: Further Numers and Equations Lesson Aims The aim of this lesson is to enale you to: wor with rational and irrational numers wor with surds to rationalise the denominator when calculating interest,
More informationChaos and Dynamical Systems
Chaos and Dynamical Systems y Megan Richards Astract: In this paper, we will discuss the notion of chaos. We will start y introducing certain mathematical concepts needed in the understanding of chaos,
More informationLogistic regression: Uncovering unobserved heterogeneity
Logistic regression: Uncovering unobserved heterogeneity Carina Mood April 17, 2017 1 Introduction The logistic model is an elegant way of modelling non-linear relations, yet the behaviour and interpretation
More informationSimple, Marginal, and Interaction Effects in General Linear Models
Simple, Marginal, and Interaction Effects in General Linear Models PRE 905: Multivariate Analysis Lecture 3 Today s Class Centering and Coding Predictors Interpreting Parameters in the Model for the Means
More informationECON 497 Midterm Spring
ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain
More informationInteractions among Continuous Predictors
Interactions among Continuous Predictors Today s Class: Simple main effects within two-way interactions Conquering TEST/ESTIMATE/LINCOM statements Regions of significance Three-way interactions (and beyond
More informationCan you tell the relationship between students SAT scores and their college grades?
Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower
More informationStructure learning in human causal induction
Structure learning in human causal induction Joshua B. Tenenbaum & Thomas L. Griffiths Department of Psychology Stanford University, Stanford, CA 94305 jbt,gruffydd @psych.stanford.edu Abstract We use
More informationAn Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01
An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there
More informationRonald Heck Week 14 1 EDEP 768E: Seminar in Categorical Data Modeling (F2012) Nov. 17, 2012
Ronald Heck Week 14 1 From Single Level to Multilevel Categorical Models This week we develop a two-level model to examine the event probability for an ordinal response variable with three categories (persist
More informationStatistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018
Statistics Boot Camp Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018 March 21, 2018 Outline of boot camp Summarizing and simplifying data Point and interval estimation Foundations of statistical
More informationMore formally, the Gini coefficient is defined as. with p(y) = F Y (y) and where GL(p, F ) the Generalized Lorenz ordinate of F Y is ( )
Fortin Econ 56 3. Measurement The theoretical literature on income inequality has developed sophisticated measures (e.g. Gini coefficient) on inequality according to some desirable properties such as decomposability
More informationINTRODUCTION. 2. Characteristic curve of rain intensities. 1. Material and methods
Determination of dates of eginning and end of the rainy season in the northern part of Madagascar from 1979 to 1989. IZANDJI OWOWA Landry Régis Martial*ˡ, RABEHARISOA Jean Marc*, RAKOTOVAO Niry Arinavalona*,
More informationSTA102 Class Notes Chapter Logistic Regression
STA0 Class Notes Chapter 0 0. Logistic Regression We continue to study the relationship between a response variable and one or more eplanatory variables. For SLR and MLR (Chapters 8 and 9), our response
More informationIn matrix algebra notation, a linear model is written as
DM3 Calculation of health disparity Indices Using Data Mining and the SAS Bridge to ESRI Mussie Tesfamicael, University of Louisville, Louisville, KY Abstract Socioeconomic indices are strongly believed
More informationGeneralized Linear Models for Non-Normal Data
Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture
More informationECON 5350 Class Notes Functional Form and Structural Change
ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this
More informationComparing Logit & Probit Coefficients Between Nested Models (Old Version)
Comparing Logit & Probit Coefficients Between Nested Models (Old Version) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised March 1, 2016 NOTE: Long and Freese s spost9
More informationThis document contains 3 sets of practice problems.
P RACTICE PROBLEMS This document contains 3 sets of practice problems. Correlation: 3 problems Regression: 4 problems ANOVA: 8 problems You should print a copy of these practice problems and bring them
More informationComparing coefficients of nested nonlinear probability models
The Stata Journal (2011) 11, Number 3, pp. 420 438 Comparing coefficients of nested nonlinear probability models Ulrich Kohler Kristian Bernt Karlson Wissenschaftszentrum Berlin Department of Education
More informationBIOS 625 Fall 2015 Homework Set 3 Solutions
BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's
More informationMarginal effects and extending the Blinder-Oaxaca. decomposition to nonlinear models. Tamás Bartus
Presentation at the 2th UK Stata Users Group meeting London, -2 Septermber 26 Marginal effects and extending the Blinder-Oaxaca decomposition to nonlinear models Tamás Bartus Institute of Sociology and
More informationAn Introduction to Mplus and Path Analysis
An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression
More informationWHAT IS HETEROSKEDASTICITY AND WHY SHOULD WE CARE?
1 WHAT IS HETEROSKEDASTICITY AND WHY SHOULD WE CARE? For concreteness, consider the following linear regression model for a quantitative outcome (y i ) determined by an intercept (β 1 ), a set of predictors
More informationSociology Research Statistics I Final Exam Answer Key December 15, 1993
Sociology 592 - Research Statistics I Final Exam Answer Key December 15, 1993 Where appropriate, show your work - partial credit may be given. (On the other hand, don't waste a lot of time on excess verbiage.)
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor
More information11 CHI-SQUARED Introduction. Objectives. How random are your numbers? After studying this chapter you should
11 CHI-SQUARED Chapter 11 Chi-squared Objectives After studying this chapter you should be able to use the χ 2 distribution to test if a set of observations fits an appropriate model; know how to calculate
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems
Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including
More informationMaking sense of Econometrics: Basics
Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/
More information(Where does Ch. 7 on comparing 2 means or 2 proportions fit into this?)
12. Comparing Groups: Analysis of Variance (ANOVA) Methods Response y Explanatory x var s Method Categorical Categorical Contingency tables (Ch. 8) (chi-squared, etc.) Quantitative Quantitative Regression
More information