Sociology Exam 2 Answer Key March 30, 2012


Sociology 63993 Exam 2 Answer Key March 30, 2012

I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain why.

1. A researcher has constructed scales that measure health, diet, exercise, income and education. When she regresses health on the other four variables, she finds that the effect of education is 0. This means that, at least when it comes to health, it makes no difference whether you are well educated or poorly educated.

False (or at least, not necessarily true). Education could have indirect effects on health, e.g. better educated people may have higher incomes, better diets, and may exercise more.

2. A key advantage of centering independent variables is that centering often eliminates the need to include interaction terms in the model.

False. Centering has no impact on whether or not interaction terms are needed in the model. It may make results easier to interpret.

3. A researcher runs the following:

    . ovtest

    Ramsey RESET test using powers of the fitted values of health
           Ho:  model has no omitted variables
                  F(3, 10330) =      5.62
                      Prob > F =      0.0008

This indicates that our model has too many extraneous variables.

False. The test says nothing about extraneous variables. It indicates that polynomial terms (e.g. $X^2$) should be added to the model.

4. A researcher hypothesizes that income positively affects the self-image of men but has a negative effect on the self-image of women. She gets

    $\hat{\beta}_{Income} = 12$
    $\hat{\beta}_{Female} = 0$
    $\hat{\beta}_{Income * Female} = -4$

where Female = 1 if female, 0 if male. The T values for Income and for the interaction term are both highly significant. The evidence supports the researcher's hypothesis.

False. The effect of income is smaller for women than it is for men (12 - 4 = 8 versus 12) but it is still positive.

5. A researcher has regressed Y on X1. However, X2 should also be in the model. As a result of this omission, the coefficient for X1 will definitely be biased.

False. It won't be biased if X1 and X2 are uncorrelated.
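The answer to item 5 can be checked against the standard omitted-variable result: when Y depends on X1 and X2 but is regressed on X1 alone, the expected slope is the true X1 coefficient plus the X2 coefficient times Cov(X1, X2)/Var(X1). The sketch below is only an illustration; the function name and the numeric values are hypothetical and do not come from the exam.

```python
def short_regression_slope(beta1, beta2, cov_x1_x2, var_x1):
    """Expected slope on X1 when Y is regressed on X1 alone and X2 is omitted.

    Uses the standard omitted-variable formula:
    beta1 + beta2 * Cov(X1, X2) / Var(X1).
    """
    return beta1 + beta2 * cov_x1_x2 / var_x1


# Hypothetical coefficients, chosen only for illustration.
beta1, beta2 = 2.0, 3.0

# X1 and X2 uncorrelated: the short regression still recovers beta1.
uncorrelated = short_regression_slope(beta1, beta2, cov_x1_x2=0.0, var_x1=1.0)

# X1 and X2 correlated: the slope absorbs part of the effect of X2.
correlated = short_regression_slope(beta1, beta2, cov_x1_x2=0.5, var_x1=1.0)

print("slope, X1 and X2 uncorrelated:", uncorrelated)
print("slope, X1 and X2 correlated:  ", correlated)
```

With a covariance of zero the bias term vanishes, which is why the statement "definitely biased" is false.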

II. Path Analysis/Model specification (25 pts). A sociologist believes that the following model describes the relationship between X1, X2, X3, and X4. All her variables are in standardized form. The estimated value of each path in her model is included in the diagram.

[Path diagram: X1 → X2 (-.7); X2 → X3 (.5); X1 → X4 (.6); X2 → X4 (.3); disturbance terms u, v, w]

a. (5 pts) Write out the structural equation for each endogenous variable, using both the names for the paths (e.g. $\beta_{42}$) and the estimated value of the path coefficient.

    $X2 = \beta_{21} X1 + u = -.7 * X1 + u$
    $X3 = \beta_{32} X2 + v = .5 * X2 + v$
    $X4 = \beta_{41} X1 + \beta_{42} X2 + w = .6 * X1 + .3 * X2 + w$

b. (10 pts) Part of the correlation matrix is shown below. Determine the complete correlation matrix. (Remember, variables are standardized. You can use either normal equations or Sewall Wright, but you might want to use both as a double-check.)

                 |       x1       x2       x3       x4
    -------------+------------------------------------
              x1 |   1.0000
              x2 |  -0.7000   1.0000
              x3 |        ?        ?   1.0000
              x4 |        ?        ?        ?   1.0000

The uncensored printout is

                 |       x1       x2       x3       x4
    -------------+------------------------------------
              x1 |   1.0000
              x2 |  -0.7000   1.0000
              x3 |  -0.3500   0.5000   1.0000
              x4 |   0.3900  -0.1200  -0.0600   1.0000

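Using normal equations,

    $\rho_{12} = \beta_{21} E(X1^2) = \beta_{21} = -.7$
    $\rho_{13} = \beta_{32} E(X1 X2) = \beta_{32} \rho_{12} = .5 * -.7 = -.35$
    $\rho_{23} = \beta_{32} E(X2^2) = \beta_{32} = .5$
    $\rho_{41} = \beta_{41} E(X1^2) + \beta_{42} E(X1 X2) = \beta_{41} + \beta_{42} \rho_{12} = .6 + (.3 * -.7) = .39$
    $\rho_{42} = \beta_{41} E(X1 X2) + \beta_{42} E(X2^2) = \beta_{41} \rho_{12} + \beta_{42} = (.6 * -.7) + .3 = -.12$
    $\rho_{43} = \beta_{41} E(X1 X3) + \beta_{42} E(X2 X3) = \beta_{41} \rho_{13} + \beta_{42} \rho_{23} = (.6 * -.35) + (.3 * .5) = -.06$

Or, alternatively, using Sewall Wright's rule (paths that do not appear in the model, $\beta_{31}$ and $\beta_{43}$, are 0),

    $\rho_{21} = \beta_{21} = -.7$
    $\rho_{31} = \beta_{31} + \beta_{21} \beta_{32} = 0 + (-.7 * .5) = 0 - .35 = -.35$
    $\rho_{32} = \beta_{32} + \beta_{31} \beta_{21} = .5 + (0 * -.7) = .5 + 0 = .5$
    $\rho_{41} = \beta_{41} + \beta_{21} \beta_{42} + \beta_{31} \beta_{43} + \beta_{21} \beta_{32} \beta_{43} = .6 + (-.7 * .3) + (0 * 0) + (-.7 * .5 * 0) = .6 - .21 + 0 + 0 = .39$
    $\rho_{42} = \beta_{42} + \beta_{32} \beta_{43} + \beta_{41} \beta_{21} + \beta_{43} \beta_{31} \beta_{21} = .3 + (.5 * 0) + (.6 * -.7) + (0 * 0 * -.7) = .3 + 0 - .42 + 0 = -.12$
    $\rho_{43} = \beta_{43} + \beta_{41} \beta_{31} + \beta_{42} \beta_{32} + \beta_{41} \beta_{21} \beta_{32} + \beta_{42} \beta_{21} \beta_{31} = 0 + (.6 * 0) + (.3 * .5) + (.6 * -.7 * .5) + (.3 * -.7 * 0) = 0 + 0 + .15 - .21 + 0 = -.06$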
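As an arithmetic double-check, the normal equations can be evaluated directly from the four path coefficients. The sketch below uses only the path values given in the problem; the variable names are illustrative.

```python
# Path coefficients from the model (all variables standardized).
b21 = -0.7  # X1 -> X2
b32 = 0.5   # X2 -> X3
b41 = 0.6   # X1 -> X4
b42 = 0.3   # X2 -> X4

# Normal equations: each correlation follows from the structural equation
# of the later variable and correlations that are already known.
r12 = b21
r13 = b32 * r12
r23 = b32
r14 = b41 + b42 * r12
r24 = b41 * r12 + b42
r34 = b41 * r13 + b42 * r23

implied = {"r12": r12, "r13": r13, "r23": r23,
           "r14": r14, "r24": r24, "r34": r34}

for name, value in implied.items():
    print(f"{name} = {value:+.4f}")
```

The printed values should agree with the lower triangle of the uncensored correlation matrix.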

c. (5 pts) Decompose the correlation between X1 and X4 into

    Correlation due to direct effects      .6
    Correlation due to indirect effects   -.21
    Correlation due to common causes       0

d. (5 pts) Suppose the above model is correct, but instead the researcher believed in and estimated the following model:

[Path diagram: X2 → X4, with disturbance w]

What conclusions would the researcher likely draw? In particular, what would the researcher conclude about the effect of changes in X2 on X4? Discuss the consequences of this mis-specification, and in what ways, if any, the results would be misleading. Why would she make these mistakes?

The estimated effect of X2 on X4 would be the same as the correlation between the variables, i.e. -.12. The researcher would therefore conclude that increases in X2 will produce decreases in X4, when in reality the effect of X2 on X4 is positive. The error would occur because there are suppressor effects. The negative correlation produced by the common cause of the omitted X1 gets confounded with the correlation produced by the positive direct effect of X2 on X4. This mistake could cause the researcher to do the exact opposite of what she should do.

III. Group comparisons (25 points). The Catholic Bishops are adamantly opposed to proposals that they feel would mandate that Catholic Institutions pay for contraceptive health care. However, they want to know more about how the Catholic laity feel about these issues. In particular, how do Catholic men and women differ in their beliefs? They have therefore collected information from almost 2,300 US Catholics on the following variables.

    Variable     Description
    contracept   Scale that measures support for government-mandated coverage of
                 contraceptive health care. Scale ranges from 0 to 125, where a
                 higher score indicates greater support.
    female       Coded 1 if female, 0 otherwise
    religious    Scale that measures how religious the respondent is. Original
                 scale ranges from 0 (not religious at all) to 75 (extremely
                 religious). The scale used here has been centered to have a
                 mean of zero.
    femrel       female * religious

The results of the analysis are as follows:

    . * Estimate Models
    . nestreg: reg contracept religious female femrel

Block 1: religious

          Source |       SS       df       MS              Number of obs =    2270
    -------------+------------------------------           F(  1,  2268) =  540.90
           Model |  488051.725     1  488051.725           Prob > F      =  0.0000
        Residual |  2046391.94  2268  902.289213           R-squared     =  0.1926
    -------------+------------------------------           Adj R-squared =  0.1922
           Total |  2534443.66  2269  1116.98707           Root MSE      =  30.038

    ------------------------------------------------------------------------------
      contracept |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       religious |  -3.689038   .1586182   -23.26   0.000    -4.000091   -3.377986
           _cons |   61.28458   .6304635    97.21   0.000     60.04824    62.52093
    ------------------------------------------------------------------------------

Block 2: female

          Source |       SS       df       MS              Number of obs =    2270
    -------------+------------------------------           F(  2,  2267) = 1317.39
           Model |  1362299.24     2  681149.622           Prob > F      =  0.0000
        Residual |  1172144.42  2267    517.0465           R-squared     =  0.5375
    -------------+------------------------------           Adj R-squared =  0.5371
           Total |  2534443.66  2269  1116.98707           Root MSE      =  22.739

    ------------------------------------------------------------------------------
      contracept |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       religious |  -1.359155   .1327702   -10.24   0.000    -1.619519   -1.098791
          female |   43.49509   1.057762    41.12   0.000     41.42081    45.56937
           _cons |   38.09997   .7386994    51.58   0.000     36.65138    39.54857
    ------------------------------------------------------------------------------

Block 3: femrel

          Source |       SS       df       MS              Number of obs =    2270
    -------------+------------------------------           F(  3,  2266) =  879.57
           Model |  1363521.85     3  454507.284           Prob > F      =  0.0000
        Residual |  1170921.81  2266  516.735131           R-squared     =  0.5380
    -------------+------------------------------           Adj R-squared =  0.5374
           Total |  2534443.66  2269  1116.98707           Root MSE      =  22.732

    ------------------------------------------------------------------------------
      contracept |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       religious |  -1.133977   .1976051    -5.74   0.000    -1.521483   -.7464716
          female |   43.60929   1.060046    41.14   0.000     41.53052    45.68805
          femrel |  -.4102893   .2667354    -1.54   0.124    -.9333604    .1127818
           _cons |   37.69189   .7846873    48.03   0.000     36.15311    39.23067
    ------------------------------------------------------------------------------

      +-------------------------------------------------------------+
      |       |          Block  Residual                     Change |
      | Block |       F     df        df   Pr > F       R2    in R2 |
      |-------+-----------------------------------------------------|
      |     1 |  540.90      1      2268   0.0000   0.1926          |
      |     2 | 1690.85      1      2267   0.0000   0.5375   0.3449 |
      |     3 |    2.37      1      2266   0.1241   0.5380   0.0005 |
      +-------------------------------------------------------------+
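The block F statistics in the nestreg summary are incremental F tests, and they can be reproduced from the residual sums of squares printed above. The sketch below is a hand check, not part of the original analysis; the function name is illustrative, and the 3.84 critical value is the usual 5% value for an F with 1 numerator degree of freedom and a very large denominator.

```python
def incremental_f(rss_restricted, rss_full, df_added, df_resid_full):
    """Incremental F statistic for adding df_added terms to a nested model."""
    numerator = (rss_restricted - rss_full) / df_added
    denominator = rss_full / df_resid_full
    return numerator / denominator


# Residual sums of squares and residual df taken from the output above.
rss_block1 = 2046391.94
rss_block2, df_block2 = 1172144.42, 2267
rss_block3, df_block3 = 1170921.81, 2266

f_block2 = incremental_f(rss_block1, rss_block2, 1, df_block2)  # adding female
f_block3 = incremental_f(rss_block2, rss_block3, 1, df_block3)  # adding femrel

CRITICAL_5PCT = 3.84  # approximate 5% critical value, F(1, large df)

print(f"Block 2 F = {f_block2:.2f}, exceeds critical value: {f_block2 > CRITICAL_5PCT}")
print(f"Block 3 F = {f_block3:.2f}, exceeds critical value: {f_block3 > CRITICAL_5PCT}")
```

The Block 3 value also equals the square of the t statistic for femrel (-1.54 squared is about 2.37), as expected when a single term is added.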

    . * Differences by gender
    . ttest contracept, by(female)

    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
        Male |    1060    35.63679    .6628397     21.5805    34.33616    36.93742
      Female |    1210    83.75289     .707922    24.62511      82.364    85.14178
    ---------+--------------------------------------------------------------------
    combined |    2270    61.28458    .7014733    33.42136    59.90899    62.66018
    ---------+--------------------------------------------------------------------
        diff |            -48.1161    .9782482               -50.03446   -46.19775
    ------------------------------------------------------------------------------
        diff = mean(male) - mean(female)                              t = -49.1860
    Ho: diff = 0                                     degrees of freedom =     2268

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

    . ttest religious, by(female)

    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
        Male |    1060     1.81229    .1085763    3.534988    1.599241    2.025339
      Female |    1210   -1.587626       .1049    3.648953   -1.793432   -1.381819
    ---------+--------------------------------------------------------------------
    combined |    2270    2.45e-08    .0834429    3.975598   -.1636324    .1636325
    ---------+--------------------------------------------------------------------
        diff |            3.399915    .1512899                3.103234    3.696596
    ------------------------------------------------------------------------------
        diff = mean(male) - mean(female)                              t =  22.4729
    Ho: diff = 0                                     degrees of freedom =     2268

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

Based on the above results, advise the Bishops on the following. When thinking about your answers, keep in mind the various reasons that two groups can differ on some outcome measure.

a) (5 pts) In block 3, the coefficient for female is 43.60929. Assuming the model is correct, the analyst for the Bishops thinks this means that, when a man and woman are equally religious, the woman is expected to score 43.6 points higher than the man on the contraception measure. Indicate whether you agree or disagree, and explain why.

This is incorrect. The model includes an interaction term, so the expected difference between a man and a woman who are equally religious depends on how religious they are. Since religious is centered, 43.6 is the expected difference between a man and a woman of average religiosity. The more religious the man and woman are, the smaller the gap between them because, according to the model, religious has a greater negative effect on women than on men. By tweaking the problem a little bit we can see the differences displayed graphically.

    . * Redo with factor variables
    . quietly reg contracept religious i.female i.female#c.religious
    . quietly margins female, at(religious=(-50(5)50))
    . marginsplot, noci scheme(sj)
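To make the point in part a concrete, the expected female-minus-male gap under the Block 3 model is the female coefficient plus the femrel coefficient times the (centered) religiosity score. The sketch below simply evaluates that expression at a few illustrative religiosity values; the function name and the chosen values are assumptions, and since the interaction term is not significant (p = 0.124) the variation in the gap should be read cautiously.

```python
# Block 3 coefficients from the regression output.
B_FEMALE = 43.60929
B_FEMREL = -0.4102893


def female_minus_male_gap(religious_centered):
    """Expected gap in contracept between a woman and a man who share
    the same centered religiosity score, under the Block 3 model."""
    return B_FEMALE + B_FEMREL * religious_centered


# Illustrative religiosity values: below average, average, above average.
for score in (-5, 0, 5):
    print(f"religious = {score:+d}: expected gap = {female_minus_male_gap(score):.2f}")
```

Only at the mean of religious (a centered score of 0) does the gap equal the 43.6 reported for female.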

[Figure: Adjusted Predictions of female. Y axis: Linear Prediction; X axis: religious; separate lines for Male and Female.]

b) (10 pts) The researchers begin by estimating a series of models. Which of the models do you think is best, and why? What do these models tell us about how religiosity affects support for mandatory contraceptive care? What ways (if any) do the determinants of support for contraceptive care differ by gender?

Model 2 is best. It is a significant improvement over Model 1, while Model 3 is not a significant improvement over Model 2, i.e. the interaction term is not significant. This model says that the intercepts differ for men and women, but the effects of religious do not. People who are less religious, and also women, tend to have higher levels of support for mandated coverage of contraceptive care (after controlling for the other variable in the model). Put another way, when men and women are equally religious, women tend to support contraceptive care more.

c) (10 pts) The researchers then run a series of t-tests. What do these t-tests tell us about how attitudes differ by gender? What additional insights, if any, do these tests give us as to why support for mandatory contraceptive care differs by gender?

Mandated coverage of contraceptive health care is much more popular with women than with men: women have scores that are 48 points higher on average than do men, and the difference is highly significant. Women also score about 3.4 points lower on the religiosity scale. Since lower levels of religiosity result in higher levels of support for contraceptive care, this compositional difference also contributes to women being more supportive of contraceptive care overall. In short, gender is important for two reasons. First, the intercept is greater for women than it is for men. Second, women tend to be less religious, which in turn causes them to support mandated contraceptive care more.

IV. Short answer. Answer both of the following questions. (15 points each, 30 points total.) Each of the following describes a nonlinear or nonadditive relationship between variables. Draw a scatterplot that illustrates the relationship. Describe the harms that might result if you simply regressed Y on X, e.g. would values be over-estimated, under-estimated, or what? Indicate the model you think should be estimated, e.g. $E(Y) = \alpha + \beta_1 X + \beta_2 X^2$, and/or give the Stata command that would estimate the model. Explain what variables you would need to compute in order to actually estimate the model, e.g. logs of variables, interaction terms. Finally, indicate how you would actually test whether or not nonlinearity or nonadditivity actually was a problem. If you find it helpful, you are welcome to present the Stata commands you would use, but the statistical rationale behind the command still needs to be clear.

a. Mitt Romney is worried that issue positions that are helping him to win Republican votes in the primaries may hurt him in the general election. Specifically, he believes that, the more he opposes Obama's health care plan, the more
Republicans tend to support him. But, at the same time, he fears that the more he opposes Obama's health care plan, the less support he will get from Independents and Democrats.

Interaction effects are called for. The effect of opposition to health care on support for Romney depends on Party Id. Greater opposition has a positive effect on support for Romney among Republicans but a negative effect on support among Democrats and Independents. A graph of the relationship might look something like

[Graph not reproduced.]

Ignoring these differences would result in an estimated effect that might be 0 or small positive or small negative, but whatever it was it would be misleading. In Stata you might give a command like

    regress support opposehc i.partyid i.partyid#c.opposehc

You could do an incremental F test or a Wald test of whether the interaction terms are needed, similar to what was done in Problem III.

b. Obama too is wondering how much to emphasize his health care program. Based on data from past electoral campaigns, his advisors believe that each additional dollar he spends on ads promoting his health care program, up to $10 million, will steadily increase his support among voters. However, their research suggests that any dollars spent above that on health care advertising will have no effect on his support.

Spline effects sound like a good idea. Up to $10 million, each dollar spent promoting health care has a positive effect on support. Any additional amount above $10 million has zero effect. A graph of the relationship might look something like

[Graph not reproduced.]

Failing to take this into account would probably underestimate how effective the first $10 million was and overestimate the impact of amounts above that. In Stata you could use the mkspline command, e.g. if advertising is measured in millions of dollars you could do something like

    mkspline adv1 10 adv2 = adv
    reg support adv1 adv2
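As a rough illustration of what the two spline variables contain, the sketch below builds them by hand for a single knot at 10: the first term is advertising capped at the knot, and the second is the amount above the knot. This mirrors the default (non-marginal) linear spline coding; the function names and the coefficient values are hypothetical and are not estimates from any data.

```python
KNOT = 10.0  # knot location, in millions of dollars


def spline_terms(adv, knot=KNOT):
    """Split advertising into the part up to the knot and the part above it."""
    adv1 = min(adv, knot)        # rises with spending until the knot, then flat
    adv2 = max(adv - knot, 0.0)  # zero until the knot, then rises with spending
    return adv1, adv2


def expected_support(adv, intercept, slope_below, slope_above):
    """Predicted support from a linear spline with a single knot."""
    adv1, adv2 = spline_terms(adv)
    return intercept + slope_below * adv1 + slope_above * adv2


# Hypothetical coefficients: positive slope below the knot, zero slope above it.
for spending in (4.0, 10.0, 15.0):
    pred = expected_support(spending, intercept=40.0, slope_below=1.5, slope_above=0.0)
    print(f"adv = {spending:5.1f}  terms = {spline_terms(spending)}  support = {pred:.1f}")
```

With a zero slope above the knot, predicted support is flat once spending passes $10 million, which is the pattern the advisors describe.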

The coefficient for adv2 would give me the slope for the effect of advertising above $10 million. Given the way the problem is phrased, I would want to test whether the coefficient is 0. I'd of course also want to make sure that the coefficient for adv1 was positive and significant.

Appendix: Stata code used in this exam

    * Exam 2, 2012, Soc 63993
    version 12.1

    * Problem I-3.
    webuse nhanes2f, clear
    reg health weight
    ovtest

    * Problem II, Path Analysis
    clear all
    matrix input corr = (1,-.7,-.35,.39\-.7,1,.5,-.12\-.35,.5,1,-.06\.39,-.12,-.06,1)
    corr2data x1 x2 x3 x4, n(100) corr(corr)
    reg x2 x1
    reg x3 x1 x2
    reg x4 x1 x2 x3
    corr

    * Problem III.
    use "http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta", clear
    keep in 1/2270
    gen female = male==0
    label define female 0 "Male" 1 "Female"
    label values female female
    gen religious = 60 - (female *.3 * ed + ed)
    *** Center religious
    sum religious
    replace religious = religious - r(mean)
    gen femrel = female * religious
    gen contracept = ((female *.04 * warm + warm + female*1.5) - 1) * 25

    *** Results for exam start below
    *** Estimate Models
    nestreg: reg contracept religious female femrel
    *** Differences by gender
    ttest contracept, by(female)
    ttest religious, by(female)
    *** Supplemental - Redo with factor variables
    quietly reg contracept religious i.female i.female#c.religious
    quietly margins female, at(religious=(-50(5)50))
    marginsplot, noci scheme(sj)