Chapter 15 - Multiple Regression

15.1 Predicting Quality of Life:

a. All other variables held constant, a difference of +1 degree in Temperature is associated with a difference of -.01 in perceived Quality of Life. A difference of $1000 in median Income, again all other variables held constant, is associated with a +.05 difference in perceived Quality of Life. A similar interpretation applies to b3 and b4. Since values of 0.00 cannot reasonably occur for all predictors, the intercept has no meaningful interpretation.

b. Ŷ = 5.37 - .01(55) + .05(12) + .003(500) - .01(200) = 4.92

c. Ŷ = 5.37 - .01(55) + .05(12) + .003(100) - .01(200) = 3.72

15.3 The F values for the four regression coefficients would be as follows:

F1 = (b1/s_b1)² = (0.438/0.397)² = 1.22
F2 = (b2/s_b2)² = (0.762/0.252)² = 9.14
F3 = (b3/s_b3)² = (0.081/0.052)² = 2.43
F4 = (b4/s_b4)² = (0.132/0.025)² = 27.88

I would thus delete Temperature, since it has the smallest F, and therefore the smallest semi-partial correlation with the dependent variable.

15.5 a. Envir has the largest semi-partial correlation with the criterion, because it has the largest value of t.

b. The gain in prediction (from r = .58 to R = .697) that we obtain by using all the predictors is more than offset by the loss of power we sustain as p becomes large relative to N.

15.7 As the correlation between two variables decreases, the amount of variance in a third variable that they jointly account for decreases. Thus the possible squared semipartial correlation of each variable with the criterion increases, because each can account for more previously unexplained variation.

15.9 The tolerance column shows us that NumSup and Respon are fairly well correlated with the other predictors, whereas Yrs is nearly independent of them.
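As a quick check of the arithmetic in 15.1(b, c) and 15.3, here is a minimal sketch in Python; the coefficients, predictor values, and standard errors are the ones given above (the labels Temp and Income come from part a; the last two predictors are simply b3 and b4):

    # Exercise 15.1: predicted Quality of Life for parts (b) and (c).
    b = [5.37, -0.01, 0.05, 0.003, -0.01]      # intercept, Temp, Income, b3, b4
    x_b = [55, 12, 500, 200]                   # predictor values in part (b)
    x_c = [55, 12, 100, 200]                   # part (c): third predictor drops to 100
    y_b = b[0] + sum(coef * x for coef, x in zip(b[1:], x_b))
    y_c = b[0] + sum(coef * x for coef, x in zip(b[1:], x_c))
    print(round(y_b, 2), round(y_c, 2))        # 4.92 3.72

    # Exercise 15.3: F for each coefficient is (b / s_b)**2.
    coefs = [0.438, 0.762, 0.081, 0.132]
    ses   = [0.397, 0.252, 0.052, 0.025]
    print([round((bj / sj) ** 2, 2) for bj, sj in zip(coefs, ses)])
    # [1.22, 9.14, 2.43, 27.88] -- Temperature (F1) is the smallest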

15.11 Using Y and Ŷ from Exercise 15.10:

MSresidual = Σ(Y - Ŷ)²/(N - p - 1) = 42.3/(15 - 4 - 1) = 4.23 (also calculated by BMDP in Exercise 15.4)

15.13 Adjusted R² for 15 cases in Exercise 15.1:

R²(0.1234) = .173

est R² = 1 - (1 - R²)(N - 1)/(N - p - 1) = 1 - (1 - .173)(14)/(15 - 4 - 1) = -.158

Since a squared value cannot be negative, we will declare it undefined. This is all the more reasonable in light of the fact that we cannot reject H0: R* = 0.

15.15 Using the first three variables from Exercise 15.4:

a. Figure comparable to Figure 15.1. [Figure not reproduced here.]

b. Ŷ = 0.4067(Respon) + 0.1845(NumSup) + .354

The slope of the plane with respect to the Respon axis (X1) = .4067. The slope of the plane with respect to the NumSup axis (X2) = .1845. The plane intersects the Y axis at .354.

15.17 It has no meaning in that we have the data for the population of interest (the 10 districts).
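The computations in 15.11 and 15.13 can be checked the same way (a sketch, using N = 15 and p = 4 from the exercises above):

    # Exercise 15.11: residual mean square.
    N, p = 15, 4
    ss_residual = 42.3                          # sum of (Y - Yhat)**2 from Exercise 15.10
    print(round(ss_residual / (N - p - 1), 2))  # 4.23

    # Exercise 15.13: adjusted (shrunken) R-squared.
    r2 = 0.173
    adj_r2 = 1 - (1 - r2) * (N - 1) / (N - p - 1)
    print(round(adj_r2, 3))                     # -0.158, negative, hence "undefined"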

15.19 It plays a major role through its correlation with the residual components of the other variables.

15.21 Within the context of a multiple-regression equation, we cannot look at one variable alone. The slope for one variable is only the slope for that variable when all other variables are held constant. The percentage of mothers not seeking care until the third trimester is correlated with a number of other variables.

15.23 Create a set of data and examine the residuals.

15.25 Rerun of Exercise 15.4 adding PVTotal:

b. The value of R² was virtually unaffected. However, the standard error of the regression coefficient for PVLoss increased from 0.105 to 0.178. Tolerance for PVLoss decreased from .981 to .345, whereas VIF increased from 1.019 to 2.900.

c. PVTotal should not be included in the model because it is redundant with the other variables.

15.27 Path diagram showing the relationships among the variables in the model. [Diagram not reproduced; the variables are SuppTotl, PVLoss, DepressT, and AgeAtLoss, with path coefficients -.361, .0837, .4490, .1099, and -.054.]

15.29 Regression diagnostics. Case #104 has the largest value of Cook's D (.137) but not a very large Studentized residual (t = 1.88). When we delete this case, the squared multiple correlation increases slightly. More importantly, the standard error of regression and the standard error of one of the predictors (PVLoss) also decrease slightly. This case is not sufficiently extreme to have a major impact on the data.
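Tolerance and VIF in 15.25 are reciprocals, and the jump in the standard error of PVLoss is roughly what the change in VIF predicts; a minimal check in Python:

    import math

    # Exercise 15.25: VIF = 1 / tolerance.
    tol_before, tol_after = 0.981, 0.345
    vif_before, vif_after = 1 / tol_before, 1 / tol_after
    print(round(vif_before, 3), round(vif_after, 3))   # 1.019 2.899

    # se(b) scales with sqrt(VIF), so se(PVLoss) should grow from 0.105 to about:
    print(round(0.105 * math.sqrt(vif_after / vif_before), 3))   # 0.177 (reported: 0.178)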

15.31 Logistic regression using Harass.dat: The dependent variable (Reporting) is the last variable in the data set. I cannot provide all possible models, so I am including just the most complete. This is a less than optimal model, but it provides a good starting point. This result was given by SPSS.

Block 1: Method = Enter

Omnibus Tests of Model Coefficients

                 Chi-square   df   Sig.
Step 1   Step         35.44    5   .000
         Block        35.44    5   .000
         Model        35.44    5   .000

Model Summary

         -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
Step 1             439.984                   .098                  .131

Classification Table (a)

                                     Predicted
                                REPORT            Percentage
          Observed              No      Yes       Correct
Step 1    REPORT      No        111      63          63.8
                      Yes        77      92          54.4
          Overall Percentage                         59.2

a. The cut value is .500

Variables in the Equation

                   B     S.E.     Wald   df   Sig.   Exp(B)
Step 1(a)
   AGE          -.014    .013    1.126    1   .289     .986
   MARSTAT      -.072    .234     .095    1   .757     .930
   FEMIDEOL      .007    .015     .228    1   .633    1.007
   FREQBEH      -.046    .153     .093    1   .761     .955
   OFFENSIV      .488    .095   26.431    1   .000    1.629
   Constant   -1.732    1.430    1.467    1   .226     .177

a. Variable(s) entered on step 1: AGE, MARSTAT, FEMIDEOL, FREQBEH, OFFENSIV.
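The Wald statistics and Exp(B) values in the table follow directly from each B and its standard error; a sketch in Python using the OFFENSIV row (values as printed above):

    import math

    # Wald chi-square = (B / SE)**2; Exp(B) = e**B is the odds ratio.
    B, SE = 0.488, 0.095
    print(round((B / SE) ** 2, 2))   # 26.39 (SPSS's 26.431 uses unrounded B and SE)
    print(round(math.exp(B), 3))     # 1.629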

From this set of predictors we see that the overall LR chi-square = 35.44, which is significant on 5 df with a p value of .0000 (to 4 decimal places). The only predictor that contributes significantly is the Offensiveness of the behavior, which has a Wald chi-square of 26.43. Exponentiating the coefficient for the frequency of the behavior yields 0.9547, which would suggest that as the frequency of the behavior increases, the likelihood of reporting decreases. That's an odd result. But remember that we have all of the variables in the model. If we simply predict reporting by using Offensiveness, exp(b) = 1.65, which means that a 1-point increase in Offensiveness multiplies the odds of reporting by 1.65. Obviously we have some work to do to make sense of these data. I leave that to you.

15.33 It may well be that the frequency of the behavior is tied in with its offensiveness, which is related to the likelihood of reporting. In fact, the correlation between those two variables is .20, which is significant at p < .001. (I think my explanation would be more convincing if Frequency were a significant predictor when used on its own.)

15.35 BlamPer and BlamBeh are correlated at a moderate level (r = .5), and once we condition on BlamPer by including it in the equation, there is little left for BlamBeh to explain.

15.37 Make up an example.

15.39 This should cause them to pause. It is impossible to change one of the variables without changing the interaction in which that variable plays a role. In other words, I can't think of a sensible interpretation of "holding all other variables constant" in this situation.

15.41 Analysis of results from Feinberg and Willer (2011). The following comes from using the program by Preacher and Leonardelli referred to in the chapter. I calculated the t values from the regression coefficients and their standard errors and then inserted those t values into the program. You can see that the mediated path is statistically significant regardless of which standard error you use for that path.
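For 15.41, the Preacher and Leonardelli program computes a Sobel test for the indirect (mediated) path. A minimal sketch of that computation in Python, with made-up path values purely for illustration (a, s_a, b, s_b below are hypothetical; the exercise's actual estimates are not reproduced here):

    import math

    def sobel_z(a, s_a, b, s_b):
        # z for the indirect effect a*b, given each path coefficient's SE
        return (a * b) / math.sqrt(b**2 * s_a**2 + a**2 * s_b**2)

    # Hypothetical values for illustration only:
    print(round(sobel_z(a=0.50, s_a=0.12, b=0.40, s_b=0.10), 2))
    # 2.89; |z| > 1.96 indicates a significant mediated path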