Sociology 593 Exam 2 Answer Key March 28, 2002

Similar documents
Sociology 593 Exam 2 March 28, 2002

Sociology Exam 2 Answer Key March 30, 2012

Sociology 593 Exam 1 Answer Key February 17, 1995

Sociology 593 Exam 1 February 17, 1995

Sociology Research Statistics I Final Exam Answer Key December 15, 1993

S o c i o l o g y E x a m 2 A n s w e r K e y - D R A F T M a r c h 2 7,

ECON 497 Midterm Spring

9. Linear Regression and Correlation

Sociology 593 Exam 1 February 14, 1997

Final Exam - Solutions

Statistics and Quantitative Analysis U4320

Review of Multiple Regression

Area1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)

Equation Number 1 Dependent Variable.. Y W's Childbearing expectations

Can you tell the relationship between students SAT scores and their college grades?

Chapter 4 Regression with Categorical Predictor Variables Page 1. Overview of regression with categorical predictors

Sociology 63993, Exam 2 Answer Key [DRAFT] March 27, 2015 Richard Williams, University of Notre Dame,

Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis

Sociology Exam 1 Answer Key Revised February 26, 2007

REVIEW 8/2/2017 陈芳华东师大英语系

Regression. Notes. Page 1. Output Created Comments 25-JAN :29:55

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs

2 Prediction and Analysis of Variance

Simple Linear Regression: One Quantitative IV

y response variable x 1, x 2,, x k -- a set of explanatory variables

Multicollinearity Richard Williams, University of Notre Dame, Last revised January 13, 2015

Chapter 10-Regression

Exam Applied Statistical Regression. Good Luck!

Systematic error, of course, can produce either an upward or downward bias.

4. Nonlinear regression functions

Simple Linear Regression: One Qualitative IV

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

In Class Review Exercises Vartanian: SW 540

PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression.

Chapter 9 - Correlation and Regression

Ordinary Least Squares Regression Explained: Vartanian

Unless provided with information to the contrary, assume for each question below that the Classical Linear Model assumptions hold.

Binary Logistic Regression

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Research Design - - Topic 19 Multiple regression: Applications 2009 R.C. Gardner, Ph.D.

Lecture 6: Linear Regression

Nonlinear relationships Richard Williams, University of Notre Dame, Last revised February 20, 2015

ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7

Final Exam - Solutions

EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

Advanced Quantitative Data Analysis

THE PEARSON CORRELATION COEFFICIENT

Soc 3811 Basic Social Statistics Second Midterm Exam Spring Your Name [50 points]: ID #: ANSWERS

Unit 6 - Introduction to linear regression

1 Correlation and Inference from Regression

Nonrecursive models (Extended Version) Richard Williams, University of Notre Dame, Last revised April 6, 2015

SPSS Output. ANOVA a b Residual Coefficients a Standardized Coefficients

Ch. 16: Correlation and Regression

Ref.: Spring SOS3003 Applied data analysis for social science Lecture note

Answer Key. 9.1 Scatter Plots and Linear Correlation. Chapter 9 Regression and Correlation. CK-12 Advanced Probability and Statistics Concepts 1

Marketing Research Session 10 Hypothesis Testing with Simple Random samples (Chapter 12)

Applied Statistics and Econometrics

Group Comparisons: Differences in Composition Versus Differences in Models and Effects

Inference in Regression Model

Correlation. A statistics method to measure the relationship between two variables. Three characteristics

15.8 MULTIPLE REGRESSION WITH MANY EXPLANATORY VARIABLES

Chapter 3: Examining Relationships

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Multiple Regression: Chapter 13. July 24, 2015

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Unit 6 - Simple linear regression

Overview. Overview. Overview. Specific Examples. General Examples. Bivariate Regression & Correlation

An Introduction to Mplus and Path Analysis

( ), which of the coefficients would end

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

Ch 7: Dummy (binary, indicator) variables

Psych 230. Psychological Measurement and Statistics

Multiple Regression. More Hypothesis Testing. More Hypothesis Testing The big question: What we really want to know: What we actually know: We know:

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Simple Linear Regression: One Qualitative IV

CHAPTER 6: SPECIFICATION VARIABLES

Interactions and Centering in Regression: MRC09 Salaries for graduate faculty in psychology

General Linear Model (Chapter 4)

An Introduction to Path Analysis

Chs. 16 & 17: Correlation & Regression

A Re-Introduction to General Linear Models (GLM)

Interpreting and using heterogeneous choice & generalized ordered logit models

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept

Practice exam questions

In order to carry out a study on employees wages, a company collects information from its 500 employees 1 as follows:

Answer Key: Problem Set 6

Job Training Partnership Act (JTPA)

Chs. 15 & 16: Correlation & Regression

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.

11 Correlation and Regression

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

A particularly nasty aspect of this is that it is often difficult or impossible to tell if a model fails to satisfy these steps.

Lecture 6: Linear Regression (continued)

Topic 1. Definitions

Transcription:

Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably means that only Catholics have a score on this variable. FALSE. More likely is that Catholics are coded on this variable, non-catholics are coded 0.. A researcher believes that Education is a cause of earnings: the more educated you are, the higher your earnings tend to be. Further, she knows that blacks make less, on average, than whites do. This means that the effect of education on earnings must be greater for whites than it is for blacks. FALSE. The effect of education could be the same for both groups. But, if whites tend to have more years of education, they will also tend to have higher incomes.. The correlation between X and X4 is.6. Hence, in a path model using standardized variables, the effect of X on X4 will be.6. FALSE. Variables can be correlated for many reasons, both causal and non-causal, e.g. they can be correlated because of direct effects, indirect effects and common causes. Hence, the direct effect need not equal the correlation. 4. A researcher regresses Y on X, X and X. In reality, X is irrelevant and does not belong in the model. The estimates of the betas will therefore be biased. FALSE. Including extraneous variables does not produce biased estimates, although they can cause standard errors to be higher than is necessary. 5. A researcher computes SUMXX = X + X. She regresses Y on X and SUMXX. She obtains the following. She should conclude that X does not affect Y. Model (Constant) X SUMXX a. Dependent Variable: Y Coefficients a Unstandardized Coefficients Standardized Coefficients B Std. Error Beta t Sig..9.94.44.660 -..87 -.05 -.4.808.9.658.775.64.000 FALSE. The correct conclusion is that the effect of X does not significantly differ from the effect of X. Sociology 59 Exam Answer Key Page

II. Path Analysis/Model specification. (0 points). A researcher is interested in the determinants of teenage smoking. She has measures of Parental Socio-economic status, hours worked at paying jobs during the school week, feelings of control over your destiny ( Do you feel you control what happens to you in life, or do you think things just happen? ), and the number of cigarettes smoked per week. All her variables are in standardized form. The estimated value of each path in her model is included in the diagram..7 X: Feel you control what happens to you u -.6 X: Parental SES -.6 X4: Amount Student Smokes -.5 v X: Hours worked at a job during the week -.4 w a. Write out the structural equation for each endogenous variable. X =.7X + u X = -.5X + v X4 = -.6X -.6X -.4X + w b. The correlation between X (hours worked) and X4 (# of cigarettes smoked) is E( X X ) = ρ 4 = β 4 4 β + β β β + β 4 = (.6 *.5) + (.6 *.7 *.5) + (.4) =.0 +..4 =. Decompose the correlation between X and X4 into Correlation due to direct effects -.4 (as shown by the path from X to X4) Correlation due to indirect effects 0. There is no indirect effect of X on X4. Correlation due to common causes.5. There are two sources of association due to common causes. X is a common cause of both X and X4 (-.6 * -.5 =.0). Also, X is a cause of X, and X is a cause of X which is a cause of X4 (-.5 *.7 * -.6 =.). 4 Sociology 59 Exam Answer Key Page

c. There is controversy over the harms and benefits of students working at jobs while attending school. Based on these results, do you think that working at a job tends to increase smoking, decrease smoking, or have little or no effect? Explain how, if variables were erroneously omitted from the above model, a researcher might reach a very different, and incorrect, conclusion. If you simply regressed X4 on X, the estimated coefficient would be the same as the correlation between X and X4, i.e... You would therefore conclude that working at a job tends to increase the amount you smoke. As the full model shows us (specifically, the effect of X on X4), this conclusion is wrong: in reality, working at a job is beneficial in that it reduces the amount a teenager smokes. Suppressor effects are present. Those who work more tend to come from lower SES families. Lower SES causes individuals to smoke more. Also, lower SES results in lower feelings of control, which in turn leads to more smoking. In short, those who work more have characteristics that increase the amount they smoke; but they would smoke even more if they weren t working. Perhaps working reduces the amount of time available for smoking, or produces some sort of incentive to not smoke so much. III. ). Short answer. Answer two of the following three questions. (5 points each; up to 0 points extra credit if you do all. Each of the following describes a nonlinear or nonadditive relationship between variables. Draw a scatterplot that illustrates the relationship. Describe the harms that might result if you simply regressed Y on X, e.g. would values be overestimated, under-estimated, or what? Indicate the model you think should be estimated, e.g. E(Y) = α + β X + β X. Explain what variables you would need to compute in order to actually estimate the model, e.g. logs of variables, interaction terms. a. Studies have found that moderate amounts of alcohol consumption actually tend to increase life expectancy. However, after a certain point, the more alcohol you consume, the more your life expectancy goes down. Y0N0 0 - -4-6 Observed Linear Quadratic -8-0 -4 - - - 0 4 X Relationship is curvilinear. A linear regression might lead you to believe there was no relationship whatsoever, and in any even would miss the fact that the relationship changes direction after a certain point. You should therefore estimate a polynomial model. Compute X and then estimate E(Y) = α + β X + β X. A piecewise model might also work, but I think it is less plausible, i.e. I don t think the effect of drinking would dramatically change all of a sudden. Plus, with a piecewise model, you have to figure out where the break point is. Sociology 59 Exam Answer Key Page

b. A political analyst wants to know how TV ads affect the opinions of voters. She finds that, for voters aged 50 and above, the more TV ads they see, the less they like her candidate. For voters younger than 50, just the opposite is true: the more TV ads they see, the more they like the candidate. Under 50 50+ Interaction effects are present. Watching TV ads increases candidate popularity for people under age 50 but has the opposite effect for those over 50. If you estimated a linear regression, you might find no effect at all, or effects that are much weaker than the above suggests (and which were in the wrong direction for one of the groups.) Compute a dummy variable coded = 50+, 0 = Under 50. Then, compute the interaction term DUMMYTV = DUMMY*TV. Then, estimate E(Y) = α + β X + β Dummy Dummy + β DummyTV DummyTV c. An employer believes that workers become more skillful and productive the longer they work on the job. Specifically, she believes that each year a worker is on the job, he or she will be 5% more productive than they were the year before. 0 YEXP 0 0 0 Observed Linear Exponential -0-4 - - - 0 4 X A growth model is called for. A linear regression would cause you to overestimate the amount of growth early on and then underestimate it later. Compute ln(y), where y = productivity. Estimate E(lnY) = α + β X.. [This problem is modified from Hamilton s Statistics with Stata 5. ]. A researcher believes that better students (as measured by Grade Point Average) drink less than do weaker students. However, she suspects that this is less true for men than for women, i.e. the effect of GPA on drinking is smaller for males than it is for females. She tests her ideas with a survey of undergraduate students collected by Ward and Ault (990). DRINKING is measured on a point scale, where higher values indicate higher levels of drinking. GPA is the student s Grade Point Average, centered so as to have a mean of zero (higher values indicate better grades). MALE is coded if the student is male, 0 if Female. MALEGPA = MALE * GPA. She obtains the following results: Sociology 59 Exam Answer Key Page 4

REGRESSION /MISSING LISTWISE/ descriptives /STATISTICS COEFF OUTS R ANOVA CHA /CRITERIA=PIN(.05) POUT(.0) /NOORIGIN /DEPENDENT drink /METHOD=ENTER GPA /METHOD=ENTER male /METHOD=ENTER MaleGpa. Regression Descriptive Statistics DRINK -point drinking scale MALE MALEGPA Mean Std. Deviation N 8.8 6.79 8.0000.4597 8.454.49904 8 -.0407.0988 8 Correlations Pearson Correlation DRINK -point drinking scale MALE MALEGPA DRINK -point drinking scale GPA Grade Point Average MALE MALEGPA.000 -.8.04 -.70 -.8.000 -.78.687.04 -.78.000 -.44 -.70.687 -.44.000 Model Summary Model Adjusted Std. Error of R Square R R Square R Square the Estimate Change F Change df df Sig. F Change.8 a.080.075 6.480.080 8.660 6.000.8 b.46.8 6.57.066 6.70 5.000.84 c.48.6 6.65.00.4 4.5 a. Predictors: (Constant), b. Predictors: (Constant),, MALE c. Predictors: (Constant),, MALE, MALEGPA Change Statistics Sociology 59 Exam Answer Key Page 5

Model Regression Residual Total Regression Residual Total Regression Residual Total ANOVA d Sum of Squares df Mean Square F Sig. 78.590 78.590 8.660.000 a 9070.4 6 4.99 9854.0 7 47.7 78.855 8.64.000 b 846. 5 9.46 9854.0 7 45.879 484.66.46.000 c 8400.44 4 9.5 9854.0 7 a. Predictors: (Constant), b. Predictors: (Constant),, MALE c. Predictors: (Constant),, MALE, MALEGPA d. Dependent Variable: DRINK -point drinking scale Model (Constant) (Constant) MALE (Constant) MALE MALEGPA Coefficients a Unstandardized Coefficients a. Dependent Variable: DRINK -point drinking scale Standardized Coefficients B Std. Error Beta t Sig. 8.8.49 4.88.000-4.8.958 -.8-4.0.000 7.5.578 9.794.000 -.45.940 -.5 -.67.000.56.865.6 4.088.000 7.57.58 9.640.000-4.0.8 -.7 -.9.00.55.867.6 4.00.000..889.056.64.5 Based on the above results, answer the following questions. Be sure to indicate how the printout supports your arguments. a) Is the researcher correct in hypothesizing that better students tend to drink less? Yes. GPA has a significant negative effect on drinking in all models and the GPAdrinking correlation is also negative b) Are there significant differences in the determinants of drinking for men and women? If so, are these differences limited to differences in the intercepts, or does the effect of GPA differ by gender? If differences are found, be specific as to what they are, e.g. how much greater (or weaker) is the effect of GPA on men than it is for women. There are differences in the intercepts, as you can tell from the T value for male (or else the F change statistic) in Model. However, in Model, the interaction term is not significant; ergo, counter to what the researcher hypothesized, it appears the effect of GPA is the same for both men and women. (If the interaction term were significant, it would support the researcher s hypothesis: a positive interaction term implies that the effect of GPA is less for men than it is for women. The problem may be that the sample isn t large enough to detect the predicted differences.) Sociology 59 Exam Answer Key Page 6

c) Of the three models presented, which one would you say is best, and why? Briefly discuss the substantive implications of your preferred model. Model is best; all the variables in it are statistically significant. The model says that those with higher GPAs tend to drink less. Also, men tend to drink more than do comparable women. The effect of GPA on drinking is the same for both men and women.. A researcher is interested in the following model: Y = α + β X + β X + β X + β 4 X 4 + ε Using OLS regression, briefly tell her how to go about testing each of the following hypotheses. You don t have to give exact formulas but you should describe the general procedures (e.g. what variables would you need to compute, what models would you need to estimate) and the types of statistics (e.g. T-Test, Global F-test, Incremental F-test) that are appropriate. For example, you might describe the unconstrained and constrained models that need to be estimated, and then say that an incremental F-test should be used to tell whether the two models significantly differ from each other. If necessary, describe any additional variables you would need to compute, such as interaction terms. β = 00 The simplest strategy is just to look at the confidence interval for B. If it includes 00, do not reject the null hypothesis that B = 00; otherwise you do reject it. Alternatively, you can do a T-Test: t = (b 00)/s b. SPSS won t give you this T-value (it tests the hypotheses that b = 0) but you can compute it easily enough yourself. β = β Estimate a constrained and unconstrained model and then compute an incremental F. The unconstrained model is the model given above, i.e. Y = α + β X + β X + β X + β 4 X 4 + ε This model is unconstrained in that B and B need not equal each other. To estimate the constrained model, we first compute SumXX = X + X. We then estimate Y = α + β SumXX + β X + β 4 X 4 + ε The model is constrained in that the effects of X and X are not allowed to differ. An incremental F test is then used to see whether this constraint is justified. If the incremental F is not statistically significant, we conclude that B = B. β = β 4 = 0 The unconstrained model is again Y = α + β X + β X + β X + β 4 X 4 + ε Sociology 59 Exam Answer Key Page 7

In the constrained model, we drop X and X4, which in effect means we are constraining their effects to be zero: Y = α + β X + β X + ε If the incremental F value is not significant, we conclude that the constraint is legitimate, i.e. B = B4 = 0 Suppose X4 is a dummy variable coded if Black, 0 Otherwise. How would you test the hypothesis that the value of β is different for blacks and whites? Assume that the effects of X and X are the same for both groups. The simplest approach is to compute the interaction term XX4 = X * X4. We then estimate Y = α + β X + β X + β X + β 4 X 4 + β 4 X 4 + ε If the T value for B4 is statistically significant, we conclude that the effect of X is different for blacks and whites; otherwise we conclude the effect is not different. Sociology 59 Exam Answer Key Page 8