EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS

Similar documents
EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

LAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION

Chapter 9 - Correlation and Regression

Regression ( Kemampuan Individu, Lingkungan kerja dan Motivasi)

Item-Total Statistics. Corrected Item- Cronbach's Item Deleted. Total

Advanced Quantitative Data Analysis

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Simple Linear Regression

SPSS LAB FILE 1

Multiple Regression. More Hypothesis Testing. More Hypothesis Testing The big question: What we really want to know: What we actually know: We know:

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression

Using SPSS for One Way Analysis of Variance

1 Correlation and Inference from Regression

Regression. Notes. Page 1. Output Created Comments 25-JAN :29:55

Area1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)

1 A Review of Correlation and Regression

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES

Daniel Boduszek University of Huddersfield

Research Design - - Topic 19 Multiple regression: Applications 2009 R.C. Gardner, Ph.D.

Review of Multiple Regression

IT 403 Practice Problems (2-2) Answers

SPSS Output. ANOVA a b Residual Coefficients a Standardized Coefficients

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.

Introduction to Regression

Interactions and Centering in Regression: MRC09 Salaries for graduate faculty in psychology

Module 8: Linear Regression. The Applied Research Center

Sociology 593 Exam 1 February 14, 1997

Regression Diagnostics Procedures

Multiple linear regression S6

1 Introduction to Minitab

Nonlinear Regression. Summary. Sample StatFolio: nonlinear reg.sgp

Regression Model Building

Key Concepts. Correlation (Pearson & Spearman) & Linear Regression. Assumptions. Correlation parametric & non-para. Correlation

x3,..., Multiple Regression β q α, β 1, β 2, β 3,..., β q in the model can all be estimated by least square estimators

STATISTICS 110/201 PRACTICE FINAL EXAM

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

Chapter 10-Regression

*************NO YOGA!!!!!!!************************************.

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel

Correlations. Notes. Output Created Comments 04-OCT :34:52

Correlation and simple linear regression S5

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3

Univariate analysis. Simple and Multiple Regression. Univariate analysis. Simple Regression How best to summarise the data?

Simple Linear Regression: One Quantitative IV

Practical Biostatistics

ASSIGNMENT 3 SIMPLE LINEAR REGRESSION. Old Faithful

Box-Cox Transformations

In Class Review Exercises Vartanian: SW 540

Parametric Test. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 1984.

M&M Exponentials Exponential Function

Topic 1. Definitions

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Sociology 593 Exam 1 February 17, 1995

Multiple Regression and Model Building (cont d) + GIS Lecture 21 3 May 2006 R. Ryznar

9. Linear Regression and Correlation

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

Ridge Regression. Summary. Sample StatFolio: ridge reg.sgp. STATGRAPHICS Rev. 10/1/2014

Appendix A Summary of Tasks. Appendix Table of Contents

y response variable x 1, x 2,, x k -- a set of explanatory variables

Independent Samples ANOVA

Multiple Linear Regression II. Lecture 8. Overview. Readings

Multiple Linear Regression II. Lecture 8. Overview. Readings. Summary of MLR I. Summary of MLR I. Summary of MLR I

SPSS Guide For MMI 409

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur

CHAPTER 10. Regression and Correlation

Sociology Research Statistics I Final Exam Answer Key December 15, 1993

Step 2: Select Analyze, Mixed Models, and Linear.

Unit 6 - Introduction to linear regression

Can you tell the relationship between students SAT scores and their college grades?

STAT 3900/4950 MIDTERM TWO Name: Spring, 2015 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis

Introduction to Regression

Multiple Regression Examples

Multiple Linear Regression II. Lecture 8. Overview. Readings

Multiple Linear Regression II. Lecture 8. Overview. Readings. Summary of MLR I. Summary of MLR I. Summary of MLR I

Taguchi Method and Robust Design: Tutorial and Guideline

SOS3003 Applied data analysis for social science Lecture note Erling Berge Department of sociology and political science NTNU.

Checking model assumptions with regression diagnostics

Linear Regression Measurement & Evaluation of HCC Systems

REVIEW 8/2/2017 陈芳华东师大英语系

Chapter 19: Logistic regression

Steps to take to do the descriptive part of regression analysis:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

MULTIPLE LINEAR REGRESSION IN MINITAB

SPSS and its usage 2073/06/07 06/12. Dr. Bijay Lal Pradhan Dr Bijay Lal Pradhan

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.

Topic 18: Model Selection and Diagnostics

Multicollinearity Exercise

Scatterplots. 3.1: Scatterplots & Correlation. Scatterplots. Explanatory & Response Variables. Section 3.1 Scatterplots and Correlation

Introduction and Single Predictor Regression. Correlation

FIN822 project 2 Project 2 contains part I and part II. (Due on November 10, 2008)

Ref.: Spring SOS3003 Applied data analysis for social science Lecture note

Designing Multilevel Models Using SPSS 11.5 Mixed Model. John Painter, Ph.D.

Washington State Test

Determining the Conductivity of Standard Solutions

Sociology 593 Exam 2 Answer Key March 28, 2002

Investigating Models with Two or Three Categories

Latent Growth Models 1

CAMPBELL COLLABORATION

Statistics 5100 Spring 2018 Exam 1

Polynomial Regression

Transcription:

EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS The data used in this example describe teacher and student behavior in 8 classrooms. The variables are: Y percentage of interventions in a 30 minute period that are punitive (PI) X percentage of 30 one-minute time Periods in which the student should be on task but is off task (OTR) X percentage of 30 one-minute time periods during which students are changing activities or are waiting for teacher directions (TT) The data are used to illustrate multiple regression with two independent variables: or in terms of the variables in the analysis Y X X PI TT OTR See pages 0- of the first section for directions to calculate descriptive statistics. Descriptives Descriptive Statistics OTR TT PI Valid N (listwise) Std. N Minimum Maximum Mean Deviation 8 0 40 9. 7.08 8 0 6 4.89 4.49 7 0 75 9.98 6.44 7

See pages 4-6 of the first section for direction to calculate a correlation matrix Correlations Correlations OTR TT PI OTR Pearson Correlation.000 -.079 -.004 Sig. (-tailed)..394.965 N 8 8 7 TT Pearson Correlation -.079.000.35** Sig. (-tailed).394..000 N 8 8 7 PI Pearson Correlation -.004.35**.000 Sig. (-tailed).965.000. N 7 7 7 **. Correlation is significant at the 0.0 level (-tailed). Regression Analysis. When you do the regression analysis you should save both the predicted values and the Studentized residuals. Both are used to constrict residual plots.. If there are missing data, it may be a good idea to use the regression procedure to calculate descriptive statistics and correlation coefficients, even though these have already been calculated. 3. I also want to calculate partial and semi-partial correlations. (In SPSS a semipartial correlation is called a part correlation.) To calculate the statistics mentioned in points and 3, in addition to following the steps for a regression analysis (see pages 8-9 of the first section) also select the options in the following screen:

3

4 Regression Descriptive Statistics PI OTR TT Std. Mean Deviation N 9.98 6.44 7 9.7 7.09 7 4.88 4.5 7 Correlations Pearson Correlation Sig. (-tailed) N PI OTR TT PI OTR TT PI OTR TT PI OTR TT.000 -.004.35 -.004.000 -.078.35 -.078.000..483.000.483..03.000.03. 7 7 7 7 7 7 7 7 7 Model Variables Entered/Removed b Variables Variables Entered Removed Method TT, OTR a. Enter a. All requested variables entered. b. Dependent Variable: PI Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.353 a.5.09 5.5 a. Predictors: (Constant), TT, OTR b. Dependent Variable: PI

5 Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig. 3905.068 95.534 8.07.00 a 7456.898 4 40.850 336.966 6 a. Predictors: (Constant), TT, OTR b. Dependent Variable: PI Model (Constant) OTR TT a. Dependent Variable: PI Unstandardized Coefficients Standardi zed Coefficien ts Coefficients a Correlations t Sig. Zero-order Partial Part B Std. Error Beta 3.83.909.094.76 5.437E-0.04.03.67.790 -.004.05.03.9.3.354 4.06.000.35.353.353 Case Number 64 Casewise Diagnostics a a. Dependent Variable: PI Std. Residual PI 3.75 75 Note that the correlations are not squared. What SPSS labels a part correlation is a (unsquared) semi-partial correlation. Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: PI Residuals Statistics a Std. Minimum Maximum Mean Deviation N 3.8 38.00 9.98 5.80 7 -.7 4.89.000.000 7.45 7.67.8.98 7.0 4. 9.98 5.79 7-34.3 50.83.E-6 5.38 7 -.05 3.75.000.99 7 -.40 3.38.000.0 7-4. 54.7 6.65E-03 6.0 7 -.474 3.549.006.07 7.07 7.35.983 3.79 7.000.399.04.055 7.000.36.07.03 7

6 Graph Plot of Studentized Residuals Versus Predicted Values 4 3 Studentized Residual 0 - - -3 0 0 0 30 40 Unstandardized Predicted Value Graph Plot of Studentized Residuals Versus OTR 4 3 Studentized Residual 0 - - -3-0 0 0 0 30 40 50 OTR

7 Graph Plot of Studentized Residuals Versus TT 4 3 Studentized Residual 0 - - -3-0 0 0 0 30 TT A Confidence Interval for the Squared Multiple Correlation Coefficient To compute the confidence interval, you must download the file ci.smcc.bisec.sps from the Webcite http://plaze.ufl.edu/algina.index.html. The link to the program is SPSS CI Program for a Squared Multiple Correlation Coefficient. Once you have downloaded the file you run it by following these directions:. Start SPSS.. Click File, Open, Syntax and find the file ci.smcc.bisec.sps It will open into the SPSS syntax editor. The top few lines of the file are comment This program computes a confidence interval for the comment squared multiple correlation coefficient for the design comment in which values of the independent variables are sampled. comment To use the program you input comment n--the sample size comment k--the number of predictors comment rsq--the sample squared multiple correlation coefficient comment conlevel--the confidence level for the interval. NEW file. INPUT PROGRAM.

8 compute n= 7. compute k=. compute rsq=0.45. compute conlev=.95. 3. Click run, all To use the program for other problems, change n, k, and rsq. To change the confidence level, change conlev. Note that a period must end each of these lines and you should not edit any other lines.

9 EDF 7405 Advanced Quantitative Methods in Educational Research POLY.SAS Data are available for 4 participants in a study. Participants practiced a psychomotor task. The number of trials of practice ranged from one to six and was determined by random assignment. Thus four participants had one trial of practice, four had two trials, and so forth. The dependent variable was accuracy on a trial following the last practice trial. See pages 6-9 of the first section for directions to producing case summaries Summarize Case Processing Summary a Cases Included Excluded Total N Percent N Percent N Percent TRIALS 4 00.0% 0.0% 4 00.0% ACCURACY 4 00.0% 0.0% 4 00.0% a. Limited to first 00 cases.

0 Case Summaries a ACCURA TRIALS CY 5 3 9 4 0 5 44 6 4 7 33 8 40 9 3 46 0 3 3 63 3 44 3 4 47 4 4 36 5 4 54 6 4 65 7 5 49 8 5 55 9 5 44 0 5 60 6 5 6 56 3 6 49 4 6 6 Total N 4 4 a. Limited to first 00 cases.

See pages -3 of the first section for directions to construct a scatterplot. Graph 70 Accuracy vs. Trials of Practice 60 50 ACCURACY 40 30 0 0 0-0 0 3 4 5 6 7 TRIALS See pages 8-9 and 3-35 of the first section for directions to conduct a regression analysis. Regression Model Variables Entered/Removed b Variables Variables Entered Removed Method TRIALS a. Enter a. All requested variables entered. b. Dependent Variable: ACCURACY Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.75 a.566.546.80 a. Predictors: (Constant), TRIALS b. Dependent Variable: ACCURACY

Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig. 4698.604 4698.604 8.698.000 a 360.0 63.78 8300.65 3 a. Predictors: (Constant), TRIALS b. Dependent Variable: ACCURACY Model (Constant) TRIALS Unstandardized Coefficients a. Dependent Variable: ACCURACY Coefficients a Standardi zed Coefficien ts B Std. Error Beta t Sig..450 5.956.090.048 8.93.59.75 5.357.000 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: ACCURACY Residuals Statistics a Std. Minimum Maximum Mean Deviation N 0.64 6.6 4.3 4.9 4 -.433.433.000.000 4.7 4.63 3.6.80 4 0.89 63.5 4.44 4. 4-0.64 5.97.00.5 4 -.63.030.000.978 4 -.73.077 -.0.0 4-3.75 7.0 -.3 3.67 4 -.89.64 -.00.055 4.08.054.958.837 4.000.6.047.06 4.004.089.04.036 4

3 Graph 3 Residual Plot: Linear Model Studentized Residual 0 - - 0 3 4 5 6 7 TRIALS Graph 70 Accuracy vs. Trials of Practice 60 50 40 30 0 ACCURACY 0 0-0 0 3 4 5 6 7 TRIALS

4

5 EDF 7405 Advanced Quantitative Methods in Educational Research POLY.SAS In this analysis the quadratic model is used to analyze the psychomotor data. The quadratic model is where Y X X Y is accuracy X is number of trials of practice. To use the quadratic model we must use the TRANSFORM procedure to compute and add it to the data set. We have already seen how to compute new variables and add them to the data set. To review use the COMPUTE option within the TRANSFORM option. X

6

7

8 Once the squared trials term is added to the data set, we analyze the data by using multiple regression analysis. See pages 6-9 of the first section for directions to produce case summaries. Summarize Case Processing Summary a Cases Included Excluded Total N Percent N Percent N Percent TRIALS 4 00.0% 0.0% 4 00.0% ACCURACY 4 00.0% 0.0% 4 00.0% TRIALRD 4 00.0% 0.0% 4 00.0% a. Limited to first 00 cases.

9 Case Summaries a ACCURA TRIALS CY TRIALRD.00 5.00 3 9.00 4 0.00 5 44 4.00 6 4 4.00 7 33 4.00 8 40 4.00 9 3 46 9.00 0 3 9.00 3 63 9.00 3 44 9.00 3 4 47 6.00 4 4 36 6.00 5 4 54 6.00 6 4 65 6.00 7 5 49 5.00 8 5 55 5.00 9 5 44 5.00 0 5 60 5.00 6 5 36.00 6 56 36.00 3 6 49 36.00 4 6 6 36.00 Total N 4 4 4 a. Limited to first 00 cases. Regression Model Variables Entered/Removed b Variables Variables Entered Removed Method TRIALRD, TRIALS a. Enter a. All requested variables entered. b. Dependent Variable: ACCURACY

0 Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.848 a.79.693 0.53 a. Predictors: (Constant), TRIALRD, TRIALS b. Dependent Variable: ACCURACY Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig. 597.568 985.784 6.9.000 a 39.057 0.907 8300.65 3 a. Predictors: (Constant), TRIALRD, TRIALS b. Dependent Variable: ACCURACY Model (Constant) TRIALS TRIALRD Unstandardized Coefficients a. Dependent Variable: ACCURACY Coefficients a Standardi zed Coefficien ts B Std. Error Beta t Sig. -4.800 9.49 -.57.3 8.630 6.6.69 4.646.000 -.90.86 -.97-3.388.003 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: ACCURACY Residuals Statistics a Std. Minimum Maximum Mean Deviation N 0.9 55.36 4.3 6. 4 -.875.883.000.000 4.9 4.77 3.63.83 4 8.8 56.3 4.9 5.86 4 -.8 8.9 4.4E-5 0.06 4 -.66.77.000.956 4 -.75.83 -.003.05 4-5.5 0.05-6.36E-0.37 4 -.557.97 -.03.060 4.808 3.765.97.344 4.000.77.043.049 4.035.64.083.058 4

We want to construct a residual plot. Since I have continued the analysis from the linear analysis, there are now two residual variables in the data set. Sre_ is the residual variable from the quadratic analysis.

Graph Residual plot: Qudratic Model Studentized Residual 0 - - -3 0 3 4 5 6 7 TRIALS It may be useful to have a scatterplot with the quadratic regression curve drawn on it. To do this we follow the steps used to draw a linear regression line on the plot (see pages - 3 of the first section) until we get to the following screen in the SPSS for Windows Chart Editor:

3

4 Graph 70 Scatter Plot with Quadratic Regression Curve 60 50 ACCURACY 40 30 0 0 0-0 0 3 4 5 6 7 TRIALS

5 cubic model is EDF 7405 Advanced Quantitative Methods in Educational Research POLY3.SAS In this analysis the cubic model is used to analyze the psychomotor data. The Y X X X 3 To use the cubic model we must use the COMPUTE procedure to compute X and and add them to the data set (see pages 5-8 of this section for directions to compute new variables). Then we analyze the data by using multiple regression analysis. 3 X See pages 6-9 of the first section for directions to produce case summaries. Summarize Case Processing Summary a Cases Included Excluded Total N Percent N Percent N Percent TRIALS 4 00.0% 0.0% 4 00.0% ACCURACY 4 00.0% 0.0% 4 00.0% TRIALRD 4 00.0% 0.0% 4 00.0% TRIAL3RD 4 00.0% 0.0% 4 00.0% a. Limited to first 00 cases.

6 Case Summaries a ACCURA TRIALS CY TRIALRD TRIAL3RD.00.00 5.00.00 3 9.00.00 4 0.00.00 5 44 4.00 8.00 6 4 4.00 8.00 7 33 4.00 8.00 8 40 4.00 8.00 9 3 46 9.00 7.00 0 3 9.00 7.00 3 63 9.00 7.00 3 44 9.00 7.00 3 4 47 6.00 64.00 4 4 36 6.00 64.00 5 4 54 6.00 64.00 6 4 65 6.00 64.00 7 5 49 5.00 5.00 8 5 55 5.00 5.00 9 5 44 5.00 5.00 0 5 60 5.00 5.00 6 5 36.00 6.00 6 56 36.00 6.00 3 6 49 36.00 6.00 4 6 6 36.00 6.00 Total N 4 4 4 4 a. Limited to first 00 cases. See pages 8-9 and 3-35 of the first section for directions to conduct a regression analysis. Regression Model Variables Entered/Removed b Variables Variables Entered Removed Method TRIAL3RD, TRIALS, TRIALRD a. Enter a. All requested variables entered. b. Dependent Variable: ACCURACY

7 Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.874 a.763.78 9.9 a. Predictors: (Constant), TRIAL3RD, TRIALS, TRIALRD b. Dependent Variable: ACCURACY Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig. 6335.657 3.886.495.000 a 964.968 0 98.48 8300.65 3 a. Predictors: (Constant), TRIAL3RD, TRIALS, TRIALRD b. Dependent Variable: ACCURACY Model (Constant) TRIALS TRIALRD TRIAL3RD Unstandardized Coefficients a. Dependent Variable: ACCURACY Coefficients a Standardi zed Coefficien ts B Std. Error Beta t Sig. -44.667 7.869 -.500.0 66.0 0.360 6.079 3.5.004-5.364 6.55-0.089 -.358.09.85.66 4.850.95.069

8 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: ACCURACY Residuals Statistics a Std. Minimum Maximum Mean Deviation N 7.36 55.43 4.3 6.60 4 -.035.86.000.000 4 3.36 4.86 4.00.64 4 3.68 57.46 4.5 6.63 4-5.66 5.34.96E-5 9.4 4 -.589.548.000.933 4 -.75.645 -.00.007 4-9.00 7.34 -.0E-0 0.8 4-3.403.75 -.0.0 4.688 4.563.875.53 4.000.46.04.056 4.073.98.5.054 4

9 Graph Residual Plot: Cubic Model Studentized Residual 0 - - -3 0 3 4 5 6 7 TRIALS See pages -4 of this section for directions to draw the regression relationship on a scatterplot.

30 Graph 70 Scatter Plot with Cubic Regression Curve 60 50 ACCURACY 40 30 0 0 0-0 0 3 4 5 6 7 TRIALS

3 EDF 7405 Advanced Quantitative Methods in Educational Research RPLOTS.SAS Data available for 9 cities describe the support for social services (SS), the ethnic/racial mix in the city (HI - heterogeneity index), and the degree of migration in and out of the city (MI mobility index). The purpose of the following analysis is to determine whether SS is related to either HI or MI. These data are used to illustrate residual plots in multiple regression. The model is or in terms of the variables in the analysis Y X X SS HI MI See pages 6-9 of the first section for direction to produce case summaries. Summarize Case Processing Summary a Cases Included Excluded Total N Percent N Percent N Percent SS 9 00.0% 0.0% 9 00.0% HI 9 00.0% 0.0% 9 00.0% MI 9 00.0% 0.0% 9 00.0% a. Limited to first 00 cases.

3 Case Summaries a SS HI MI 9 5 7 6 0 3 6 4 4 6 4 5 5 6 7 8 6 5 8 8 7 5 5 8 4 4 4 9 4 9 0 4 3 3 4 40 9 4 3 35 3 4 35 4 4 43 5 3 33 6 6 3 6 4 7 3 9 8 3 6 50 9 46 0 8 7 0 38 0 3 8 3 4 0 9 3 5 0 39 6 9 4 34 7 9 9 3 8 8 7 5 9 7 6 36 Total N 9 9 9 a. Limited to first 00 cases.

33 See pages 8-9 and 3-35 of the first section for directions to conduct a regression analysis. Regression Model Variables Entered/Removed b Variables Variables Entered Removed Method MI, HI a. Enter a. All requested variables entered. b. Dependent Variable: SS Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.593 a.35.30.35 a. Predictors: (Constant), MI, HI b. Dependent Variable: SS Model Regression Residual Total a. Predictors: (Constant), MI, HI b. Dependent Variable: SS ANOVA b Sum of Mean Squares df Square F Sig. 78.06 39.03 7.068.004 a 43.57 6 5.5.633 8 Model (Constant) HI MI a. Dependent Variable: SS Unstandardized Coefficients Coefficients a Standardi zed Coefficien ts B Std. Error Beta t Sig..087.77 9.6.000 -.56.057 -.499 -.70.0 -.94.054 -.664-3.60.00

34 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value Graph a. Dependent Variable: SS Residuals Statistics a Std. Minimum Maximum Mean Deviation N 8.90 6.00.88.67 9 -.378.869.000.000 9.46.30.7. 9 7.0 6.00.79.86 9-4.4 4.0-5.36E-5.6 9 -.880.743.000.964 9 -.958.074.07.07 9-4.79 5.80 8.54E-0.58 9 -.079.6.08.060 9.097 7.587.93.90 9.000.597.050.3 9.003.7.069.069 9 3 Student. Residuals vs. Predicted Values Studentized Residual 0 - - -3 8 0 4 6 8 Unstandardized Predicted Value

35 Graph Studentized Residuals vs. HI 3 Studentized Residual 0 - - -3 0 0 30 40 50 HI Graph Studentized Residuals vs. MI 3 Studentized Residual 0 - - -3 0 0 30 40 50 60 MI

36

37 EDF 7405 Advanced Quantitative Methods in Educational Research RPLOTS.SAS The previous analysis is continued to illustrate the use of polynomials when there is more than one conceptual independent variable. We need to add a squared MI term to the model, which becomes Y X X X 3 or in terms of the variables in the analysis SS HI MI MI 3 To add MI to the data we used the COMPUTE option within the TRANSFORM option (see pages 5-8 of this section for directions to computing new variables). See pages 8-9 and 3-35 of the first section for directions to conduct a regression analysis. Regression Model Variables Entered/Removed b Variables Variables Entered Removed Method MIRD, HI, MI a. Enter a. All requested variables entered. b. Dependent Variable: SS Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.743 a.553.499.99 a. Predictors: (Constant), MIRD, HI, MI b. Dependent Variable: SS

38 Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig..477 3 40.86 0.93.000 a 99.56 5 3.966.633 8 a. Predictors: (Constant), MIRD, HI, MI b. Dependent Variable: SS Model (Constant) HI MI MIRD a. Dependent Variable: SS Unstandardized Coefficients Coefficients a Standardi zed Coefficien ts B Std. Error Beta t Sig. 3.6 3.576 8.74.000 -.84.049 -.588-3.77.00 -.93.5-3.96-4.38.000.6E-0.004.57 3.346.003 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: SS Residuals Statistics a Std. Minimum Maximum Mean Deviation N 9.63 7.44.88.09 9 -.554.83.000.000 9.45.66.70.5 9 9.7 7.8.89.7 9-3.80 3.35-3.34E-5.88 9 -.906.683.000.945 9 -.995.886 -.003.008 9-4.6 4. -.85E-0.5 9 -.3.996 -.00.03 9.495 8.477.897 3.50 9.000.8.036.048 9.08.660.03.6 9

39 Graph Student. Residual vs. Predicted Values Studentized Residual 0 - - -3 8 0 4 6 8 Unstandardized Predicted Value Graph Studentized Residual vs HI Studentized Residual 0 - - -3 0 0 30 40 50 HI

40 Graph Studentized Residual vs MI Studentized Residual 0 - - -3 0 0 30 40 50 60 MI

4 EDF 7405 Advanced Quantitative Methods in Educational Research For 0 students, data are available on grade point average (GPA) for a particular semester, SAT math scores, and average number of hours spent studying weekly (HRS). A researcher is investigating whether SAT ( X ) scores and study time ( X ) interact in predicting GPA (Y). The regression model is Y X X X X 3 or in terms of the abbreviations for the variables: 3 GPA SAT HRS SAT HRS To estimate the model, the TRANFORM procedure must be used to compute the product term (see pages 5-8 of this section for directions to compute new variables). We are going to add collinearity statistics to our results. Follow the steps to estimate a regression equation (see pages 8-9 of the first section) until you get to the following screen:

4 Regression Model Variables Entered/Removed b Variables Variables Entered Removed Method SATXHRS, HRS, SAT a. Enter a. All requested variables entered. b. Dependent Variable: GPA Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.963 a.97.95.638 a. Predictors: (Constant), SATXHRS, HRS, SAT b. Dependent Variable: GPA Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig. 33.4 3.038 4.69.000 a.60 97.68E-0 35.76 00 a. Predictors: (Constant), SATXHRS, HRS, SAT b. Dependent Variable: GPA

43 Model (Constant) SAT HRS SATXHRS a. Dependent Variable: GPA Unstandardized Coefficients Coefficients a Standardi zed Coefficien ts Collinearity Statistics B Std. Error Beta t Sig. Tolerance VIF.86.37.39.03-5.35E-05.00 -.00 -.068.946.037 6.779.968E-0.038.068.5.603.044.7.843E-04.000.9 3.89.000.03 75.750 Collinearity Diagnostics a Model Dimension 3 4 a. Dependent Variable: GPA Condition Variance Proportions Eigenvalue Index (Constant) SAT HRS SATXHRS 3.9.000.00.00.00.00 6.06E-0 7.949.0.00.00.0.608E-0 5.65.00.05.06.00 4.464E-04 93.74.98.95.94.99 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: GPA Residuals Statistics a Std. Minimum Maximum Mean Deviation N.075 3.974.4987.5755 0 -.44.483.000.000 0.98E-0 8.434E-0 3.05E-0.47E-0 0.57 3.933.498.5756 0 -.4559.364 -.76E-6.63 0 -.784.993.000.985 0 -.856.03.00.009 0 -.4798.3988 5.040E-04.696 0 -.969.49.000.08 0.38 5.539.970 3.699 0.000.69.03.03 0.004.55.030.037 0

44 Graph Student. Residual vs. Predicted Values 3 Studentized Residual 0 - - -3.0.5.0.5 3.0 3.5 4.0 Unstandardized Predicted Value Graph Student. Residual vs. SAT 3 Studentized Residual 0 - - -3 00 300 400 500 600 700 800 SAT

45 Graph 3 Student. Residual vs. HRS Studentized Residual 0 - - -3 4 6 8 0 4 6 HRS

46

47 EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS This handout illustrates obtaining predicted values and residuals, as well as outlier, leverage, and influence diagnostics. The IQ-MAGE data are used. Follow the usual steps in the regression analysis (see pages 8-9 of the first section) until you see the following screen:

48 I have not included the printout from the analysis. Rather the following shows SPSS for Windows Data Editor. Not all of the added results fit on the screen. Here is a list of the names of the added variables PRE_ predicted values RES_ residuals COO_ Cook s distance LEV_ leverage value SDF_ DFFITS SDB0_ DFBETA for the intercept SDB_ DFBETA for the slope for the first variable in the model (MAGE in our case) SDB_ DFBETA for the slope for the second variable in the model (SES in our case)

49

50 It may be helpful to sort these results. To do so click Data to obtain I sorted the data by MAGE and SES.

5 EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS These results illustrate the use of multiple regression analysis with more than two quantitative independent variables. The dependent variable is IQ measured at age three. The independent variables are four variables measured during or right after birth: Birthweight (BW) the weight of the new born in grams; very large or very small weights may indicate health problems ( X ). APGAR a quick assessment of overall newborn well being. Low numbers may indicate problems ( X ). Intrapartum factors (IF) a measure of the quality of the delivery. Low numbers indicate problems occurred in the delivery ( X 3 ). Neonatal factors (NF) a measure of the health of the newborn. Low numbers The model is may indicate health problems ( X 4 ). Y X X X X 3 3 4 4 or in terms of the abbreviations for the variables IQ BW APGAR IF NF 3 4 See pages 4-4 for directions to conduct a collinearity analysis and pages 47-50 for directions to conduct an influence analysis.

5 Regression Descriptive Statistics IQ BW APGAR IF NF Std. Mean Deviation N 95.9 9.45 99 74.39 896.39 99 8.9.66 99 6.58 9.7 99.38 4.44 99 Pearson Correlation Sig. (-tailed) N IQ BW APGAR IF NF IQ BW APGAR IF NF IQ BW APGAR IF NF Correlations IQ BW APGAR IF NF.000.30.7 -.49 -.78.30.000.85 -. -.549.7.85.000 -.086 -.54 -.49 -. -.086.000 -.087 -.78 -.549 -.54 -.087.000..00.003.070.039.00..00.35.000.003.00..99.000.070.35.99..95.039.000.000.95. 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 Model Variables Entered/Removed b Variables Variables Entered Removed Method NF, IF, APGAR, BW a. Enter a. All requested variables entered. b. Dependent Variable: IQ

53 Model Model Summary b Std. Error Adjusted of the R R Square R Square Estimate.380 a.44.08 8.37 a. Predictors: (Constant), NF, IF, APGAR, BW b. Dependent Variable: IQ Model Regression Residual Total ANOVA b Sum of Mean Squares df Square F Sig. 5347.70 4 336.793 3.96.005 a 379.335 94 337.440 37066.505 98 a. Predictors: (Constant), NF, IF, APGAR, BW b. Dependent Variable: IQ Model (Constant) BW APGAR IF NF a. Dependent Variable: IQ Unstandardized Coefficients Coefficients a Standardi zed Coefficien ts Collinearity Statistics B Std. Error Beta t Sig. Tolerance VIF 55.69 5.559 3.579.00 5.979E-03.003.76.366.00.67.49.775.35.36.053.043.688.454 -.9.0 -.090 -.908.366.930.075.5.80.093.695.489.508.970 Collinearity Diagnostics a Model Dimension 3 4 5 a. Dependent Variable: IQ Condition Variance Proportions Eigenvalue Index (Constant) BW APGAR IF NF 3.78.000.00.00.00.0.0.676.345.00.00.00.45..548.605.00.0.00.4.8 4.886E-0 8.73.0.75.0.0.07 9.453E-03 9.83.98.3.79.09.5

54 Predicted Value Std. Predicted Value Standard Error of Predicted Value Adjusted Predicted Value Residual Std. Residual Stud. Residual Deleted Residual Stud. Deleted Residual Mahal. Distance Cook's Distance Centered Leverage Value a. Dependent Variable: IQ Residuals Statistics a Std. Minimum Maximum Mean Deviation N 66.94 08. 95.9 7.39 99-3.838.75.000.000 99.3 9.84 3.87.46 99 56.04 09.37 95.33 7.76 99-44.4 47.6-5.74E-5 7.99 99 -.47.573.000.979 99 -.453.60 -.00.009 99-45.7 48.35-3.90E-0 9.4 99 -.5.687.000.09 99.560 7.4 3.960 4.40 99.000.45.03.03 99.006.77.040.045 99 Graph 3 Student. Residuals vs. Predicted Values Studentized Residual 0 - - -3 65 75 85 95 05 5 Unstandardized Predicted Value

55 Graph Studentized Residuals vs. BW 3 Studentized Residual 0 - - -3 0 000 000 3000 4000 5000 BW Graph Studentized Residuals vs. APGAR 3 Studentized Residual 0 - - -3-3 5 7 9 APGAR

56 Graph Studentized Residuals vs. IF 3 Studentized Residual 0 - - -3-5 5 5 5 35 45 IF Graph Studentized Residuals vs. NF 3 Studentized Residual 0 - - -3-0 0 0 0 30 40 50 60 70 NF

57 The following results were copied from the SPSS Windows Data Editor Studentized deleted residuals: 0 most extreme values sdr_ BW APGAR IF NF IQ Residual 800 9 0 0 53 -.5 400 7 40 45 -.946 000 6 0 35 56 -.843 800 7 55 56 -.790 89 cases deleted 760 0 55 94.764 360 9 35 30.900 370 9 5 39. 360 9 0 4.86 300 9 5 4.33 30 0 5 0 49.687 Leverage: 0 largest values lev_ BW APGAR IF NF IQ Leverage 4600 8 0 0 4.086 380 4 0 0 87.09 800 7 30 5 74.094 800 7 55 56.03 3080 9 35 0 89.06 580 3 0 30 64.3 400 7 40 45.5 970 8 0 65 70.8 3560 0 40 9.9 760 0 55 94.77

58 Cook s distance: 0 largest values Coo_ BW APGAR IF NF IQ Cook s 880 5 0 4 58.036 750 9 3 0 67.037 000 5 3 05.039 30 8 30 5 9.044 300 9 5 4.045 360 9 35 30.057 800 7 55 56.080 970 8 0 65 70.083 400 7 40 45.4 760 0 55 94.45 DFFITS: 0 most extreme values Sdf_ BW APGAR IF NF IQ DFFITS 400 7 40 45 -.856 970 8 0 65 70 -.648 800 7 55 56 -.64 800 9 0 0 53 -.43 750 9 3 0 67 -.43 89 cases deleted 000 5 3 05.444 30 8 30 5 9.473 300 9 5 4.486 360 9 35 30.540 760 0 55 94.9

59 The following table is sorted by BW sdb0_ sdb_ sdb_ sdb3_ sdb4_ BW APGAR IF NF IQ DFBET Int DFBET BW DFBET APGAR DFBET IF DFBET NF 760 0 55 94.790 -.05 -.888 -.49 -.083 800 7 55 56.099.8 -.7.063 -.49 900 9 5 35 9 -.004 -.0.03 -.00.00 950 9 0 0 84 -.03.057 -.003.06.030 970 8 0 65 70.90 -.0 -.33.007 -.539 000 5 3 05.37 -.38 -.53.0 -.079 050 7 45 90.000 -.05.008 -.009.06 0 7 6 0 03.95 -.98 -.48 -.089 -.3 340 9 6 0 0.8 -.46.037 -.050 -.8 340 8 5 35 03 -.08 -.063.060 -.005.085 400 7 40 45 -.68.5.0 -.695.38 450 9 0 5 8 -.077.7.003.06.08 480 7 0 40 85.003.003 -.005 -.006 -.07 550 9 0 0 79 -.07.30 -.04.07.068 580 8 5 6 87 -.00.00.000.00.00 580 3 0 30 64 -.67.086.70.09.00 740 0 0 30 05 -.086 -.08.5.040.096 780 0 0 5 70.7.04 -.89 -.49 -.9 880 5 0 4 58 -.5 -.00.98.8 -.36 900 7 5 86.06 -.04 -.0.08 -.0 900 0 0 00 -.00 -.00.0 -.008.0 940 8 0 0 08.053 -.076 -.00 -.073 -.008 960 9 0 0 9.00 -.09.00.06 -.06 000 6 0 35 56 -.5.06.4.33 -.30 00 5 0 0 69 -.0.077.97.03.0 00 9 5 9.004 -.007.000.003 -.006 00 9 0 5 03.030 -.058.0.00 -.043 0 9 0 40 05 -.06.00.06 -.006.088 40 0 5 86.009.046 -.050 -.0.07 40 9 0 0 03.0 -.039.0 -.036 -.0 350 9 5 0 97.00 -.007.004 -.003 -.003 360 9 35 30 -.375.55.337.3.468 400 9 0 5 07.047 -.060.00 -.059 -.055 480 0 6 0 89 -.009.04 -.08.00.035 560 9 0 5 08.07 -.036.06.09 -.036 600 8 30 5 77 -.004.00.00 -.63.09 60 9 0 5 7.064 -.073.003 -.097 -.08 60 9 5 0 83 -.05.054.009.08.074 70 9 0 0 96.00 -.005.003.0 -.00 750 9 3 0 67.049.00 -.055 -.377.04 780 8 0 00.03 -.0 -.07.065 -.043

790 9 5 0 88.05 -.08 -.053 -.006 -.06 800 7 30 5 74.089 -.3 -.0 -.6 -.34 800 9 0 0 53 -.0.64.053.38.76 850 8 0 0 96 -.00.00.00.00.00 880 8 0 0 08 -.008.05.003 -.033.033 880 9 5 0 4 -..076.099.8.5 890 0 0 0 80 -.03.056 -.06.09.076 900 9 0 0 86.00.000 -.0.049.00 900 9 0 0 85 -.05.037.04.066.073 95 9 5 0 05 -.09.09.06.004.033 970 9 0 0 84 -.053.033.06.07.077 3000 8 0 0 09.073 -.05 -.06.0 -.087 3000 9 5 0 08.005 -.00.006.057 -.037 3080 7 5 5.30.004 -.39 -.043 -.05 3080 9 35 0 89.06 -.0 -.00 -.066 -.00 3080 9 0 5 8 -.0.00 -.006.077.04 300 9 0 5 93 -.007 -.00 -.00.08.04 30 9 05 -.07.07.04.086 -.0 30 8 6 5.049.03 -.050 -.05 -.050 30 8 30 5 9 -.057.3 -.08.45.03 330 7 0 0 83.006 -.06.03 -.033 -.04 340 9 0 0 88 -.033.0.0.053.053 370 9 5 39 -.38.38. -.079.45 390 9 5 0 9 -.03.000.004.0.09 300 9 0 5 03.00.00.00 -.00 -.004 300 9 7 5 4 -.080.08.059.87.03 300 9 5 4 -.3.7.43 -.030.385 30 0 5 5 96.05 -.0 -.03.000 -.007 30 0 5 0 49 -.09.03.77 -.034 -.077 350 9 0 5 4.009.030.009 -.090 -.035 380 4 0 0 87.000.000.000.000.000 3300 9 0 5 30.005.048.0 -.0 -.038 3330 9 5 0 09.00.00 -.004 -.0 -.030 3350 7 0 5 84 -.046 -.09.067 -.03.037 3360 9 6 5 97.004 -.009 -.003.000.00 3360 9 0 0 98 -.005 -.00.003.0.00 3360 7 0 6 84 -.076 -.04.09.060.056 3400 0 0 5 96.03 -.0 -.038.05 -.009 3400 9 0 5 4 -.049.058.035 -.06.05 340 9 0 0 00.006 -.009 -.004.007 -.004 3440 8 0 0 86 -.063 -.04.063.06.070 3460 9 0 3 83 -.005 -.043.00.07.033 3560 0 40 9.039.36 -.54.085.070 360 9 0 4.04. -.09 -.3 -.05 3640 9 0 0.007.07 -.008 -.038 -.06 3650 9 0 0.03.055 -.05 -.073 -.050 60

3700 9 0 0 84 -.009 -.063.05.07.047 370 9 0 5 79.04 -. -.04.078 -.008 3750 8 0 5 69.05 -.88.075 -.079.003 3800 9 0 0 89.06 -.067 -.004 -.034.007 3900 9 5 0 8.6 -.87 -.057 -.0 -.0 390 8 0 0 08 -.03.045.003 -.007.03 4040 0 0 5 83.67 -.95 -.4.05 -.094 4060 9 0 5 -.00.049.005 -.07.0 40 8 0 0 98.0 -.054.0.05 -.05 400 9 0 5 00.04 -.056 -.005.06 -.04 460 9 94.039 -.03 -.00.08 -.04 4600 8 0 0 4 -.054.03.006 -.004.065 6

6 Reading Data into SPSS When you start SPSS the SPSS data editor appears: GO TO THE NEXT PAGE

63 Click file, then open, then data GO TO THE NEXT PAGE

64 You will get a Windows dialog box like this Press the down arrow in the Files of type slot and use the slide on the right hand side to locate the *.dat option as shown in the following display. Click on this option

65 This is the result on my computer: Use the down arrow in the Look in slot to find the folder in which you stored the file hsb.dat. On my computer the results looks like GO TO THE NEXT PAGE

66 Click hsb.dat. The result is the text import wizard GO TO THE NEXT PAGE

67 Click next and display GO TO THE NEXT PAGE

68 Click the fixed width radio button and then click next. You get GO TO THE NEXT PAGE

69 Click next again. You get GO TO THE NEXT PAGE

70 Click next again. You get GO TO THE NEXT PAGE

7 Click on the heading of the V column to highlight the column. In the variable name slot, type over V to change the name V to ACHIEVE. If numeric is not displayed in the data format slot, click on the down arrow to the data format slot, use the slide to locate the numeric option, and highlight the numeric option. Next click the heading of the V column to highlight the column. Then type over V to change the name V to LOC. If numeric is not displayed in the data format slot, click on the down arrow to the data format slot, use the slide to locate the numeric option, and highlight the numeric option.. Finally click the heading of the V3 column to highlight the column. Then type over V3 to change the name V to SELF. If numeric is not displayed in the data format slot, click on the down arrow to the data format slot, use the slide to locate the numeric option, and highlight that option.

7 Once you have completed changing the names and, if necessary selecting the numeric option, you will see GO TO THE NEXT PAGE

73 Click next and then click finish. The data editor will look like this Click, file, and then save. A WINDOWS dialog box like the following will be displayed

74 Type HSB in the file name slot and click save. This saves the file in the *.sav format. This format will allow you to open hsb.sav directly into SPSS without going through the process described in this handout. Now you can use SPSS to do the required analysis.