SPECIAL TOPICS IN REGRESSION ANALYSIS


Representing Nominal Scales in Regression Analysis

There are several ways in which a set of G qualitative distinctions on some variable of interest can be represented in regression analysis. All methods share the common feature that (G - 1) independent variables are needed to code the information contained in the G different groups.

(a) dummy variable coding: use (G - 1) dichotomous (0-1) variables to represent membership in the G groups; the group that receives a score of 0 on all the independent variables serves as the reference group; e.g., for G = 4, with group 4 as the reference group:

          X1   X2   X3
    G1     1    0    0
    G2     0    1    0
    G3     0    0    1
    G4     0    0    0

the intercept of the regression is the mean for the reference group ($\bar{y}_{ref}$); the partial regression coefficients $b_i$ indicate by how much the mean for group i ($\bar{y}_i$) differs from the mean of the reference group, i.e.,

    $b_0 = \bar{y}_{ref}$
    $b_i = \bar{y}_i - \bar{y}_{ref}$

(b) effects coding: similar to dummy variable coding, except that the reference group is scored as a string of -1's;

e.g., for G = 4:

          X1   X2   X3
    G1     1    0    0
    G2     0    1    0
    G3     0    0    1
    G4    -1   -1   -1

the intercept of the regression is the mean of the group means; the partial regression coefficients $b_i$ represent contrasts between the mean of group i and the mean of the group means, i.e.,

    $b_0 = \frac{1}{G}\sum_{i} \bar{y}_i = \bar{\bar{y}}$
    $b_i = \bar{y}_i - \bar{\bar{y}}$

(c) (orthogonal) contrast coding: any contrast among a set of G means can be tested, i.e.,

    $c_1\mu_1 + c_2\mu_2 + \dots + c_G\mu_G$   s.t.   $\sum_i c_i = 0$

if possible, use (G - 1) orthogonal contrasts to represent the information contained in the G different groups; e.g., for G = 4 two possible sets of contrasts are:

(1)
          X1    X2    X3
    G1     ½     ½     0
    G2     ½    -½     0
    G3    -½     0     ½
    G4    -½     0    -½

(2)
          X1    X2    X3
    G1     ½     ½     ¼
    G2     ½    -½    -¼
    G3    -½     ½    -¼
    G4    -½    -½     ¼

as in effects coding, the intercept of the regression is the mean of the group means; the partial regression coefficients $b_i$ indicate the difference in means for the groups involved in the contrast.

An example of testing for group differences using different approaches to scaling nominal variables:

The dependent variable of interest is attitude toward using coupons for grocery shopping (AA). Group 1 never or almost never uses coupons, group 2 occasionally uses coupons, and group 3 frequently uses coupons.

    DATA coupon;
      [read in the data for AA and membership in one of the three groups]
      /* dummy coding: group 1 is the reference group */
      if group=2 then d1=1; else d1=0;
      if group=3 then d2=1; else d2=0;
      /* effects coding: group 1 is scored -1 on both variables */
      if group=2 then e1=1; else e1=0;
      if group=3 then e2=1; else e2=0;
      if group=1 then e1=-1;
      if group=1 then e2=-1;
      /* contrast coding: c1 = groups 1 and 2 vs. group 3; c2 = group 1 vs. group 2 */
      if group=1 then c1=1/3; else if group=2 then c1=1/3; else if group=3 then c1=-2/3;
      if group=1 then c2=.5; else if group=2 then c2=-.5; else if group=3 then c2=0;
    proc sort; by group;
    proc means; var AA;
    proc means; var AA; by group;

    proc reg; model AA = D1 D2;
    proc reg; model AA = E1 E2;
    proc reg; model AA = C1 C2;
    run;

[PROC MEANS output: N, mean, standard deviation, minimum, and maximum of AA overall and for groups 1, 2, and 3. The numeric values are not preserved in this transcription.]

(1) Dummy variable coding: [group 1 serves as the reference group; D1 is coded as 1 for group 2, zero otherwise; D2 as 1 for group 3, zero otherwise]

[PROC REG output, dependent variable AA: analysis of variance table (Model, Error, C Total), fit statistics (Root MSE, R-square, Dep Mean, Adj R-sq, C.V.), and parameter estimates for INTERCEP, D1, and D2. The numeric values are not preserved in this transcription.]

(2) Effects coding: [group 1 serves as the reference group and is coded as -1 on both variables; E1 is coded as 1 for group 2, E2 as 1 for group 3]

[PROC REG output, dependent variable AA: analysis of variance table, fit statistics, and parameter estimates for INTERCEP, E1, and E2. The numeric values are not preserved in this transcription.]

(3) Contrast coding: [C1 is the contrast between groups 1 & 2 and group 3, coded as 1/3, 1/3, -2/3; C2 is the contrast between groups 1 and 2, coded as .5, -.5, 0]

[PROC REG output, dependent variable AA: analysis of variance table, fit statistics, and parameter estimates for INTERCEP, C1, and C2. Prob > |T| is <.0001 for all three parameters; the remaining numeric values are not preserved in this transcription.]

Summary of the three analyses:

[Table: N, mean, standard deviation, minimum, and maximum of AA overall, for groups 1 to 3, and for the mean of the group means, followed by the parameter estimates (estimate, standard error, t, Prob > |T|) under dummy variable coding (INTERCEP, D1, D2), effects coding (INTERCEP, E1, E2), and contrast coding (INTERCEP, C1, C2). The numeric values are not preserved in this transcription.]
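The relationships between the three coding schemes and the group means can be checked with a small Python sketch. It is not part of the original SAS example: the scores are hypothetical toy data, and `solve` is an illustrative helper. Because each model uses an intercept plus (G - 1) codes for G groups, the fitted values equal the group means, so the regression coefficients can be obtained by solving the G coding equations exactly instead of running a regression.

```python
from fractions import Fraction as F

def solve(a, y):
    """Solve the square linear system a @ b = y by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(map(F, row)) + [F(v)] for row, v in zip(a, y)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                m[r] = [v - m[r][col] * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Hypothetical attitude scores for the three coupon-usage groups (toy data).
scores = {1: [2, 3, 4], 2: [4, 5, 6], 3: [6, 7, 8, 7]}
means = [F(sum(v), len(v)) for _, v in sorted(scores.items())]  # 3, 5, 7

# Rows are groups 1-3; columns are the intercept and the two code variables,
# coded as in the SAS DATA step (d1/d2, e1/e2, c1/c2).
dummy = [[1, 0, 0], [1, 1, 0], [1, 0, 1]]
effects = [[1, -1, -1], [1, 1, 0], [1, 0, 1]]
contrast = [[1, F(1, 3), F(1, 2)], [1, F(1, 3), F(-1, 2)], [1, F(-2, 3), 0]]

b_dummy = solve(dummy, means)        # [mean1, mean2 - mean1, mean3 - mean1]
b_effects = solve(effects, means)    # [mean of means, mean2 - mm, mean3 - mm]
b_contrast = solve(contrast, means)  # [mean of means, (m1 + m2)/2 - m3, m1 - m2]

for name, b in [("dummy", b_dummy), ("effects", b_effects), ("contrast", b_contrast)]:
    print(name, [str(v) for v in b])
```

With unequal group sizes, as in the toy data, the intercept under effects and contrast coding is the unweighted mean of the group means, not the overall mean of AA.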

Multiplicative models

Consider a model in which a dependent variable Y is a function of two main effects (X1, X2) and their interaction (X1X2). The model is given by

    $Y = a + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2 + e$

where a, b1, b2, and b3 are the estimated regression coefficients. Note that the interpretation of these coefficients is somewhat different from the no-interaction case:

    b1: the effect of X1 on Y when X2 is zero
    b2: the effect of X2 on Y when X1 is zero
    b3: the change in the effect of X1 on Y when X2 changes by one unit (or, equivalently, the change in the effect of X2 on Y when X1 changes by one unit)

If the interaction is significant, the exact nature of the interaction can be investigated by testing the significance of the slope of Y on X1 (X2) for selected values of X2 (X1). For example, let's assume we are interested in the effect of X1 on Y for X2 = x2. An estimate of this effect is given by b1 + b3x2. To get the standard error of this expression, use the fact that

    $\mathrm{Var}(b_1 + b_3 x_2) = \mathrm{Var}(b_1) + x_2^2\,\mathrm{Var}(b_3) + 2 x_2\,\mathrm{Cov}(b_1, b_3)$

Issues in testing for interaction effects:

(1) Mean-center the main effect variables before forming the multiplicative term to reduce potential problems with multicollinearity.
(2) Do not standardize the variables if the invariance of relationships across groups is to be tested.
(3) Be aware of the damaging effect of measurement error on tests of interaction effects.
(4) Investigate the functional form of the interaction.
(5) Make sure the variables are measured on an interval scale.
(6) Check the statistical power of the test of interaction.
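The conditional-effect calculation described above (the slope of Y on X1 at a chosen value of X2, and its standard error) can be sketched in Python as follows. The function name and all numbers are hypothetical; in practice the variances and the covariance would be read from the covariance matrix of the estimates (the COVB option in PROC REG).

```python
import math

def conditional_effect(b1, b3, var_b1, var_b3, cov_b1_b3, x2):
    """Effect of X1 on Y at X2 = x2, with its standard error and t ratio."""
    effect = b1 + b3 * x2
    # Var(b1 + b3*x2) = Var(b1) + x2^2 Var(b3) + 2 x2 Cov(b1, b3)
    variance = var_b1 + x2 ** 2 * var_b3 + 2 * x2 * cov_b1_b3
    se = math.sqrt(variance)
    return effect, se, effect / se

# Hypothetical estimates, for illustration only.
effect, se, t = conditional_effect(
    b1=0.5, b3=0.2, var_b1=0.04, var_b3=0.01, cov_b1_b3=-0.005, x2=2.0
)
print(f"effect={effect:.3f}  se={se:.3f}  t={t:.3f}")
```

The t ratio would be compared against a t distribution with the error degrees of freedom of the regression.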

An example of testing for interactions:

Expectancy-value attitude theory assumes that beliefs (BE) and evaluations (EV) combine multiplicatively to influence attitudes. This hypothesis is tested using beliefs about two positive consequences of using coupons, saving money on the grocery bill and thinking about oneself as a thrifty shopper.

[mean-centered regression]

    DATA coupon; SET coupon;
    /* center BE and EV at their means before forming the product term */
    proc standard m=0 data=coupon out=couponmc; var be ev;
    data couponmc; set couponmc;
      beev=be*ev;
    proc corr; var aa be ev beev;
    proc reg; model aa = be ev beev / stb covb;
    run; quit;

(a) BE and EV coded on 1-7 scales:

[PROC CORR output, N = 250: simple statistics (N, mean, standard deviation, sum, minimum, maximum) and Pearson correlations with Prob > |R| for AA, BE, EV, and BEEV. The numeric values are not preserved in this transcription.]

[PROC REG output, dependent variable AA: analysis of variance table, fit statistics, parameter estimates with standardized estimates for INTERCEP, BE, EV, and BEEV, and the covariance matrix of the estimates (COVB). The numeric values are not preserved in this transcription.]

(b) BE and EV mean-centered:

[PROC CORR output, N = 250: simple statistics and Pearson correlations with Prob > |R| for AA, BE, EV, and BEEV. The numeric values are not preserved in this transcription.]

[PROC REG output, dependent variable AA: analysis of variance table, fit statistics, parameter estimates with standardized estimates for INTERCEP, BE, EV, and BEEV, and the covariance matrix of the estimates (COVB). The numeric values are not preserved in this transcription.]

[Figure: AA as a function of BE for different levels of EV (horizontal axis: BE; vertical axis: AA).]
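Why mean-centering matters for the multicollinearity issue raised earlier can be illustrated with a small Python sketch. The data are a hypothetical balanced 3 x 3 design, not the coupon data, and `corr` is an illustrative helper. In this balanced toy case the correlation between BE and the product term drops to exactly zero after centering; with real data it is typically reduced but not eliminated.

```python
import math

def corr(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical balanced design: every combination of BE and EV in {1, 2, 3}.
be = [b for b in (1, 2, 3) for _ in (1, 2, 3)]
ev = [e for _ in (1, 2, 3) for e in (1, 2, 3)]

# Product term formed from the raw scores.
r_raw = corr(be, [b * e for b, e in zip(be, ev)])

# Product term formed after mean-centering both variables.
be_c = [b - sum(be) / len(be) for b in be]
ev_c = [e - sum(ev) / len(ev) for e in ev]
r_centered = corr(be_c, [b * e for b, e in zip(be_c, ev_c)])

print(f"corr(BE, BE*EV) raw:      {r_raw:.3f}")
print(f"corr(BE, BE*EV) centered: {r_centered:.3f}")
```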

Mediation

Consider three variables X, M, and Y and assume that they are related in the following way:

    X --a--> M --b--> Y
    X --------c-----> Y

Path a is the direct effect of X on M, b is the direct effect of M on Y, and c is the direct effect of X on Y. The total effect of X on Y can be shown to equal c + ab, where c is the direct effect of X on Y and ab is the indirect effect of X on Y (via M).

A mediator M is a variable that accounts for the relation between a predictor and a criterion. In the figure, M mediates the influence of X on Y if it channels at least some of the total effect of X on Y. Total mediation occurs if there is no direct effect of X on Y (i.e., the total effect of X on Y is completely accounted for by the indirect effect of X on Y through the mediator M). Partial mediation occurs if the total effect of X on Y is due to both a direct and an indirect effect of X on Y.

Testing for mediation in the traditional Baron and Kenny (1986) framework:

(1) Show that, in a regression of Y on X, Y is significantly related to X. This establishes that there is an effect that can be mediated. The regression coefficient from this regression is an estimate of the total effect of X on Y. [Note: This step is problematic when there is inconsistent or competitive mediation.]

(2) Show that, in a regression of M on X, M is significantly related to X. If M is to channel the influence of X on Y, it has to be related to X. The regression coefficient from this regression is an estimate of a.

(3) Show that when Y is regressed on both X and M, M affects Y (i.e., the regression coefficient for M, which is an estimate of b, is nonzero) and the direct influence of X on Y (the regression coefficient for X, which is an estimate of c) is smaller than the total effect. For complete mediation, the direct effect of X on Y should be negligible; for partial mediation, the direct effect of X on Y should be smaller than the total effect.
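The three regressions above, and the identity total effect = c + ab, can be sketched in Python with closed-form least-squares formulas. The data and variable names are hypothetical toy values; the sketch only illustrates that the decomposition holds exactly for ordinary least squares estimates and does not perform the significance tests.

```python
import math

def center(v):
    """Subtract the mean from every element."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy data.
x = [1, 2, 3, 4, 5, 6, 7, 8]
m = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1, 3, 2, 5, 4, 7, 6, 9]

xc, mc, yc = center(x), center(m), center(y)
sxx, smm = dot(xc, xc), dot(mc, mc)
sxm, sxy, smy = dot(xc, mc), dot(xc, yc), dot(mc, yc)

total = sxy / sxx                      # step 1: Y on X
a = sxm / sxx                          # step 2: M on X
det = sxx * smm - sxm ** 2             # step 3: Y on X and M
c = (smm * sxy - sxm * smy) / det      # direct effect of X
b = (sxx * smy - sxm * sxy) / det      # effect of M controlling for X

print(f"total={total:.4f}  direct={c:.4f}  indirect={a * b:.4f}")
```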

Testing for mediation based on the significance of the indirect effect:

Since the extent of mediation is defined as the difference between the total effect of X on Y and the direct effect of X on Y, and since this difference is equal to the product of a and b, mediation can also be checked by testing whether ab is different from zero. This test (often called the Sobel test) is given by

    $z = \dfrac{ab}{\sqrt{b^2 s_a^2 + a^2 s_b^2 + s_a^2 s_b^2}}$

where $s_a$ and $s_b$ are the standard errors of a and b. This ratio can be treated as a standard normal variate. Because of problems with the normality assumption, a bootstrap test is preferable.

Issues in testing for mediation:

(1) The mediator M should be neither too close to nor too distant from either the predictor variable X or the criterion variable Y. In particular, although X has to be related to M in order for mediation to occur, the relationship should not be too strong; otherwise the independent variables in the last regression will be collinear.

(2) It is assumed that X, M, and Y are related as shown in the figure. If X is a manipulated variable, it is safe to assume that X influences M and Y and not the other way around. However, M and Y are usually measured variables, and Y should not influence M.

(3) It is assumed that there is no measurement error in the mediator. If this assumption is incorrect, the estimated effects will be biased.

(4) It is assumed that no important influences on M and Y have been omitted from the model specification.

(5) If X is manipulated, it is exogenous; however, in general it is not safe to assume that the errors of M and Y are uncorrelated.

Note: Recent research has (a) investigated under what conditions mediation analyses can be given a causal interpretation; (b) extended mediation to situations in which there are exposure-mediator interactions; and (c) considered the case where the mediator and the outcome variable are not continuous (e.g., binary, counts, etc.). See Valeri and VanderWeele (2013) for details.
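The normal-theory test of the indirect effect described above can be sketched in Python as follows. The function name and the input values are hypothetical. One assumption is labeled in the code: the third variance term is entered with a plus sign; setting `second_order=False` drops that term and gives the first-order version.

```python
import math

def sobel_z(a, b, s_a, s_b, second_order=True):
    """Normal-theory z ratio and two-sided p value for the indirect effect ab."""
    var_ab = b ** 2 * s_a ** 2 + a ** 2 * s_b ** 2
    if second_order:
        # Assumption: the product term s_a^2 * s_b^2 is added, not subtracted.
        var_ab += s_a ** 2 * s_b ** 2
    z = a * b / math.sqrt(var_ab)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal p value
    return z, p

# Hypothetical path estimates and standard errors.
z2, p2 = sobel_z(a=0.5, b=0.4, s_a=0.1, s_b=0.1)
z1, p1 = sobel_z(a=0.5, b=0.4, s_a=0.1, s_b=0.1, second_order=False)
print(f"second-order: z={z2:.3f}, p={p2:.4f}")
print(f"first-order:  z={z1:.3f}, p={p1:.4f}")
```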

An example of testing for mediation:

According to the Theory of Reasoned Action (TRA), behavioral intentions (BI) mediate the effects of attitudes (AA) on behavior (BH). This hypothesis is tested in the context of using coupons for grocery shopping.

(1) Regression analysis:

    %include 'd:\m554\programs\process.sas';
    DATA mediation;
      INFILE 'd:\m554\specreg\sem.dat' PAD;
      INPUT ID BE1 BE2 BE3 BE4 BE5 BE6 BE7 AA1 AA2 AA3 AA4 BI1 BI2 BH;
      aa=(aa1+aa2+aa3+aa4)/4;
      bi=(bi1+bi2)/2;
    proc corr cov; var aa bi bh;
    proc reg; model bh=aa / stb covb;      /* step 1: BH on AA */
    proc reg; model bi=aa / stb covb;      /* step 2: BI on AA */
    proc reg; model bh=aa bi / stb covb;   /* step 3: BH on AA and BI */
    %process (data=mediation,vars=bh bi aa, y=bh, x=aa, m=bi, model=4, normal=1,varorder=2,total=1,effsize=1,boot=10000,conf=95);
    RUN;

[PROC CORR output, N = 250: Pearson correlations with Prob > |R| for AA, BI, and BH. The numeric values are not preserved in this transcription.]

Step 1: Dependent Variable: BH

[PROC REG output: analysis of variance table, fit statistics, and parameter estimates with standardized estimates for INTERCEP and AA. The numeric values are not preserved in this transcription.]

Step 2: Dependent Variable: BI

[PROC REG output: analysis of variance table, fit statistics, and parameter estimates with standardized estimates for INTERCEP and AA. The numeric values are not preserved in this transcription.]

Step 3: Dependent Variable: BH

[PROC REG output: analysis of variance table, fit statistics, and parameter estimates with standardized estimates for INTERCEP, AA, and BI. The numeric values are not preserved in this transcription.]

Output of Process macro:

PROCESS Procedure for SAS Release 2.10

    Model = 4
    Y = BH
    X = AA
    M = BI
    Sample size: 250

[PROCESS output; the numeric values are not preserved in this transcription. The output contains, in order:
- Outcome BI: model summary (R, R-sq, F, df1, df2, p) and coefficients (coeff, se, t, p, LLCI, ULCI) for the constant and AA;
- Outcome BH: model summary and coefficients for the constant, BI, and AA;
- Total effect model, outcome BH: model summary and coefficients for the constant and AA;
- Total, direct, and indirect effects: total effect of X on Y and direct effect of X on Y (Effect, SE, t, p, LLCI, ULCI); indirect effect of X on Y through BI (Effect, Boot SE, BootLLCI, BootULCI);
- Effect sizes for the indirect effect through BI, each with Boot SE, BootLLCI, and BootULCI: partially standardized indirect effect, completely standardized indirect effect, ratio of indirect to total effect, ratio of indirect to direct effect, and R-squared mediation effect size.]

[PROCESS output, continued; the numeric values are not preserved in this transcription: Preacher and Kelley (2011) kappa-squared for BI (Effect, Boot SE, BootLLCI, BootULCI); normal theory test for the indirect effect (Effect, se, Z, p); analysis notes giving the number of bootstrap samples for bias corrected bootstrap confidence intervals and the level of confidence for all confidence intervals (the macro call requests boot=10000 and conf=95).]

(2) Structural equation modeling:

    proc calis data=mediation EFFPART;
      path BI <--- AA = ga11,
           BH <--- BI AA = be21 ga21;
      pvar BI = psi11, BH = psi22, AA = ph11;
    run;

The CALIS Procedure, Covariance Structure Analysis: Maximum Likelihood Estimation

[PROC CALIS output; the numeric values are not preserved in this transcription: PATH list (parameter, estimate, standard error, t value) for bi <=== aa (ga11), BH <=== bi (be21), and BH <=== aa (ga21); variance parameters for the error variances of bi (psi11) and BH (psi22) and the exogenous variance of aa (ph11).]

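[PROC CALIS output, continued; most numeric values are not preserved in this transcription: squared multiple correlations (error variance, total variance, R-square) for BH and bi; total, direct, and indirect effects (effect, standard error, t value, p value) of bi and aa on BH and of aa on bi. Among the values that survive, the total effects of bi and aa on BH and of aa on bi all have p < .0001, the direct effect of aa on bi has p < .0001, and the indirect effect of aa on BH has p < .0001; bi has no indirect effects (entries of 0).]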
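A bootstrap test of the indirect effect, recommended earlier as preferable to the normal-theory test, can be sketched in plain Python. This is an illustration under stated assumptions, not a reimplementation of the PROCESS macro: it uses a simple percentile interval, whereas the macro output refers to bias corrected intervals, and the data, function names, seed, and number of resamples are all hypothetical.

```python
import random

def indirect_effect(x, m, y):
    """Return a*b from the regressions M on X and Y on (X, M); None if degenerate."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    det = sxx * smm - sxm ** 2
    if sxx < 1e-9 or det < 1e-9:       # resample with (near-)collinear predictors
        return None
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    return a * b

def percentile_bootstrap(x, m, y, n_boot=2000, conf=0.95, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        est = indirect_effect([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        if est is not None:
            estimates.append(est)
    estimates.sort()
    tail = (1 - conf) / 2
    return estimates[round(tail * n_boot)], estimates[round((1 - tail) * n_boot) - 1]

# Hypothetical toy data with a strong X -> M -> Y chain.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
m = [2.3, 3.6, 6.1, 7.8, 10.5, 11.5, 14.2, 15.9, 18.4, 19.7]
y = [2.82, 4.59, 7.63, 9.78, 13.01, 14.47, 17.72, 19.91, 22.88, 24.69]

point = indirect_effect(x, m, y)
lo, hi = percentile_bootstrap(x, m, y)
print(f"indirect effect = {point:.3f}, 95% percentile CI = [{lo:.3f}, {hi:.3f}]")
```

Mediation is supported when the interval excludes zero.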

Moderation

A moderator is a variable that affects the direction and/or strength of the relation between a predictor and a criterion.

[Figure: path diagram in which Z moderates the effect of X on Y.]

Testing for moderation:

(1) Both the moderator and the independent variable are categorical: Use ANOVA and check whether the interaction between the moderator and the independent variable is significant.

(2) The moderator is categorical, the independent variable is continuous: Use regression analysis with interaction terms between the independent variable and the moderator. If there is differential measurement error in the independent variable across groups, use structural equation modeling.

(3) The moderator is continuous, the independent variable is categorical or continuous: The appropriate test depends on the nature of the moderator effect.

[Figure: three panels (cases 1 to 3) showing the effect of X on Y as a function of the moderator Z.]

If the effect of X on Y varies linearly as a function of the moderator (case 1), moderation is tested by including the product of X and Z in the regression. If the moderator effect is nonlinear (case 2), higher-order interactions (e.g., XZ²) have to be included in the regression. If the moderator effect takes the form of a step function (case 3), the moderator should be dichotomized at the point where the step occurs.

Issues in testing for moderator effects:

(1) Multicollinearity may be a problem (esp. when higher-order interactions are included). Mean-center the independent variable and the moderator before forming the interaction term(s).

(2) If the interaction is significant, do follow-up tests to investigate the nature of the interaction effect.

(3) It is dangerous to compare correlations across groups because (a) the variance of the independent variable may not be constant across groups and (b) measurement error may not be constant across groups.

Mediator vs. moderator effects:

a. An interest in moderator variables reflects a search for the conditional boundaries of an effect (when question). An interest in mediator variables reflects a concern with the processes underlying an effect (how and why question).

b. Finding a moderator of some relationship may stimulate thinking about why or how this occurs (moderation to mediation). Specifying a mediational mechanism between two variables may have implications for when this effect is likely to occur (mediation to moderation).

c. Moderated mediation: The mediational effect of some variable varies across levels of the moderator.

d. Mediated moderation: A variable mediates the influence of a moderator on another variable.

For details of how to test for moderated mediation or mediated moderation, see Edwards and Lambert (2007) and Preacher, Rucker, and Hayes (2007), as well as some of the other papers listed in the syllabus. Also, see the description of the PROCESS macro for an overview of possible models to be tested.
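Returning to the three forms of a continuous moderator effect (linear, nonlinear, and step function), the corresponding regression models and the implied effect of X on Y can be written out as follows. This is a sketch in assumed notation: the coefficient labels, the inclusion of the lower-order term $Z^2$ in case 2, and the cut point $z_0$ in case 3 are illustrative and do not come from the original notes.

```latex
\begin{align*}
\text{Case 1 (linear):}\quad
  & Y = b_0 + b_1 X + b_2 Z + b_3 XZ + e,
  && \text{effect of } X:\; b_1 + b_3 Z \\[4pt]
\text{Case 2 (nonlinear):}\quad
  & Y = b_0 + b_1 X + b_2 Z + b_3 XZ + b_4 Z^2 + b_5 XZ^2 + e,
  && \text{effect of } X:\; b_1 + b_3 Z + b_5 Z^2 \\[4pt]
\text{Case 3 (step function):}\quad
  & Y = b_0 + b_1 X + b_2 D + b_3 XD + e,
  \quad D = \begin{cases} 1 & \text{if } Z \ge z_0 \\ 0 & \text{otherwise} \end{cases}
  && \text{effect of } X:\; b_1 \ (D = 0),\;\; b_1 + b_3 \ (D = 1)
\end{align*}
```

In each case moderation is present when the coefficients that multiply the moderator in the effect of X ($b_3$, and $b_5$ in case 2) differ from zero.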

An example of testing for moderation:

According to the theory of action control, people differ in their capacity for action control. People with high self-regulatory capacity are called action-oriented; people with low self-regulatory capacity are called state-oriented. Action orientation reflects readiness to act; state orientation reflects inertia to act. Here we test the hypothesis that action-/state-orientation (ASO) moderates the effects of attitudes (AA) and subjective norms (SN) on behavioral intentions (BI). The context is people's usage of coupons for grocery shopping.

    %include 'd:\m554\programs\process.sas';
    OPTIONS LS=100;
    DATA COUPON;
      INFILE 'd:\m554\specreg\coupaso.raw' PAD;
      INPUT obs ASO AA1 AA2 AA3 SN1 SN2 PB BI1 BI2 BH;
      AA=(AA1+AA2+AA3)/3;
      SN=(SN1+SN2)/2;
      BI=(BI1+BI2)/2;
    /* mean-center AA, SN, and ASO */
    PROC STANDARD DATA=COUPON OUT=COUPON M=0; VAR AA SN ASO;
    proc univariate; var aso;
    DATA COUPON; SET COUPON;
      /* dichotomize the centered moderator */
      IF ASO<0.208 THEN DASO=0; ELSE DASO=1;
      AABYASO=AA*ASO; SNBYASO=SN*ASO;
      AABYDASO=AA*DASO; SNBYDASO=SN*DASO;
    proc sort; by daso;
    proc corr; var bi; with aa sn; by daso;
    PROC REG; MODEL BI = AA SN DASO AABYDASO SNBYDASO / STB COVB;
    PROC REG; MODEL BI = AA SN ASO AABYASO SNBYASO / STB COVB;
    run;

    title 'aa as the focal variable, daso as the moderator'; run;
    %process (data=coupon,vars=bi aa sn daso snbydaso,y=bi,x=aa,m=daso,model=1,plot=1); run;
    title 'sn as the focal variable, daso as the moderator'; run;
    %process (data=coupon,vars=bi aa sn daso aabydaso,y=bi,x=sn,m=daso,model=1,plot=1); run;
    title 'aa as the focal variable, aso as the moderator'; run;
    %process (data=coupon,vars=bi aa sn aso snbyaso,y=bi,x=aa,m=aso,model=1,center=1,jn=1,plot=1); run;
    title 'sn as the focal variable, aso as the moderator'; run;
    %process (data=coupon,vars=bi aa sn aso aabyaso,y=bi,x=sn,m=aso,model=1,center=1,jn=1,plot=1); run;
    quit;

(a) Moderator (ASO) as a dichotomous variable:

[PROC UNIVARIATE output for ASO: moments and basic statistical measures, with N = 149 and a mean of 0 after centering. The remaining numeric values are not preserved in this transcription.]

[PROC UNIVARIATE output, continued: quantiles of ASO (maximum, Q3, median, Q1, minimum, and intermediate percentiles). The numeric values are not preserved in this transcription.]

[PROC CORR output by DASO: simple statistics for AA, SN, and BI and Pearson correlations of BI with AA and SN, for DASO=0 (N = 64) and DASO=1 (N = 85). In the DASO=1 group the correlation between BI and AA has p < .0001; the remaining numeric values are not preserved in this transcription.]

[PROC REG output, dependent variable BI: analysis of variance table, fit statistics, parameter estimates with standardized estimates for INTERCEP, AA, SN, DASO, AABYDASO, and SNBYDASO, and the covariance matrix of the estimates (COVB). The numeric values are not preserved in this transcription.]

Output from the Process macro:

PROCESS Procedure for SAS Release 2.10

    Model = 1
    Y = BI
    X = AA
    M = DASO
    Statistical controls: SN SNBYDASO
    Sample size: 149

[PROCESS output, outcome BI; the numeric values are not preserved in this transcription: model summary (R, R-sq, F, df1, df2, p); coefficients (coeff, se, t, p, LLCI, ULCI) for the constant, DASO, AA, INT_1, SN, and SNBYDASO, where INT_1 = AA X DASO; R-square increase due to the interaction (R2-chng, F, df1, df2, p); conditional effect of X on Y at the two values of DASO; data for visualizing the conditional effect of X on Y (AA, DASO, yhat); level of confidence for all confidence intervals.]

    Values for quantitative moderators are the mean and plus/minus one SD from mean.
    Values for dichotomous moderators are the two values of the moderator.
    Estimates in the visualization table are based on setting covariates to their sample means.

PROCESS Procedure for SAS Release 2.10

    Model = 1
    Y = BI
    X = SN
    M = DASO
    Statistical controls: AA AABYDASO
    Sample size: 149

[PROCESS output, outcome BI; the numeric values are not preserved in this transcription: model summary; coefficients for the constant, DASO, SN, INT_1, AA, and AABYDASO, where INT_1 = SN X DASO; R-square increase due to the interaction; conditional effect of X on Y at the two values of DASO; data for visualizing the conditional effect of X on Y (SN, DASO, yhat); level of confidence for all confidence intervals.]

(b) Moderator (ASO) as a continuous variable:

[PROC REG output, dependent variable BI: analysis of variance table, fit statistics, parameter estimates with standardized estimates for INTERCEP, AA, SN, ASO, AABYASO, and SNBYASO, and the covariance matrix of the estimates (COVB). The numeric values are not preserved in this transcription.]

Output from the Process macro:

PROCESS Procedure for SAS Release 2.10

    Model = 1
    Y = BI
    X = AA
    M = ASO
    Statistical controls: SN SNBYASO
    Sample size: 149

[PROCESS output, outcome BI; the numeric values are not preserved in this transcription: model summary; coefficients for the constant, ASO, AA, INT_1, SN, and SNBYASO, where INT_1 = AA X ASO; R-square increase due to the interaction; conditional effect of X on Y at the mean of ASO and plus/minus one SD from the mean; Johnson-Neyman technique: moderator value(s) defining the Johnson-Neyman significance region(s) (value, % below, % above) and the conditional effect of X on Y at values of the moderator (ASO, Effect, se, t, p, LLCI, ULCI); data for visualizing the conditional effect of X on Y (AA, ASO, yhat); level of confidence for all confidence intervals.]

    NOTE: The following variables were mean centered prior to analysis: AA ASO

PROCESS Procedure for SAS Release 2.10

    Model = 1
    Y = BI
    X = SN
    M = ASO
    Statistical controls: AA AABYASO
    Sample size: 149

[PROCESS output, outcome BI; the numeric values are not preserved in this transcription: model summary; coefficients for the constant, ASO, SN, INT_1, AA, and AABYASO, where INT_1 = SN X ASO; R-square increase due to the interaction; conditional effect of X on Y at the mean of ASO and plus/minus one SD from the mean; Johnson-Neyman technique: moderator value(s) defining the significance region(s) and the conditional effect of X on Y at values of the moderator; data for visualizing the conditional effect of X on Y (SN, ASO, yhat); level of confidence for all confidence intervals.]

    NOTE: The following variables were mean centered prior to analysis: SN ASO
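The Johnson-Neyman technique requested with jn=1 finds the moderator values at which the conditional effect of X on Y changes from significant to nonsignificant. A minimal Python sketch of that calculation follows; it is not the PROCESS implementation. The function name, the coefficient and covariance values, and the critical t value of 2.0 are hypothetical (in practice the critical value comes from the t distribution with the error degrees of freedom). The boundaries solve $(b_1 + b_3 z)^2 = t_{crit}^2\,\mathrm{Var}(b_1 + b_3 z)$, which is a quadratic equation in z.

```python
import math

def johnson_neyman(b1, b3, var_b1, var_b3, cov_b1_b3, t_crit):
    """Moderator values z at which |b1 + b3*z| / se(b1 + b3*z) equals t_crit."""
    # Expand (b1 + b3 z)^2 - t^2 (var_b1 + 2 z cov + z^2 var_b3) = 0.
    qa = b3 ** 2 - t_crit ** 2 * var_b3
    qb = 2 * (b1 * b3 - t_crit ** 2 * cov_b1_b3)
    qc = b1 ** 2 - t_crit ** 2 * var_b1
    if qa == 0:                       # equation is linear in z
        return [] if qb == 0 else [-qc / qb]
    disc = qb ** 2 - 4 * qa * qc
    if disc < 0:                      # no boundary: significance never changes
        return []
    root = math.sqrt(disc)
    return sorted([(-qb - root) / (2 * qa), (-qb + root) / (2 * qa)])

# Hypothetical estimates for the focal variable (b1) and the product term (b3).
bounds = johnson_neyman(b1=0.3, b3=0.4, var_b1=0.01, var_b3=0.01,
                        cov_b1_b3=0.0, t_crit=2.0)
print([round(z, 4) for z in bounds])
```

Boundaries that fall outside the observed range of the moderator have no practical meaning and should be ignored.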


More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

data proc sort proc corr run proc reg run proc glm run proc glm run proc glm run proc reg CONMAIN CONINT run proc reg DUMMAIN DUMINT run proc reg

data proc sort proc corr run proc reg run proc glm run proc glm run proc glm run proc reg CONMAIN CONINT run proc reg DUMMAIN DUMINT run proc reg data one; input id Y group X; I1=0;I2=0;I3=0;if group=1 then I1=1;if group=2 then I2=1;if group=3 then I3=1; IINT1=I1*X;IINT2=I2*X;IINT3=I3*X; *************************************************************************;

More information
