Topic 23: Diagnostics and Remedies

Size: px
Start display at page:

Download "Topic 23: Diagnostics and Remedies"

Transcription

1 Topic 23: Diagnostics and Remedies

2 Outline Diagnostics residual checks ANOVA remedial measures

3 Diagnostics Overview We will take the diagnostics and remedial measures that we learned for regression and adapt them to the ANOVA setting Many things are essentially the same Some things require modifications

4 Residuals Predicted values are cell means Yˆ = Y Residuals are the differences between the observed values and their cell means e = Y Yˆ = Y Y ij ij ij ij i ij i

5 Basic plots Plot the responses vs the factor levels (the values of the explanatory variable X) Plot the residuals vs the factor levels Construct a normal quantile plot and/or histogram of the residuals

6 KNNL p 777 KNNL Example Compare 4 brands of rust inhibitor (X has r=4 levels) Response variable is a measure of the effectiveness of the inhibitor There are 10 units per brand (n i =n=10)

7 Fit ANOVA Model proc glm data=a1; class brand; model eff=brand; output out=a2 r=resid; run;

8 Plots Responses versus the factor Residuals versus the factor Normal quantile plot or histogram of the residuals

9 Plots vs the factor symbol1 v=circle i=none; proc gplot data=a2; plot (eff resid)*brand; run;

10 Data vs the factor Means look different common spread in Y s

11 Residuals vs the factor Odd dist of points

12 Snake shape due to odd (lack of and large)spread Can try nonparametric analysis last slides

13 Looks reasonably Normal

14 Response conditionally assumed to be Normal not marginally. This plot noninformative.

15 General Summary Look for Outliers Variance that depends on level Non-normal errors Plot residuals vs time and other variables if available

16 Homogeneity tests Homogeneity of variance (also called homoscedasticity) H 0 : σ 12 = σ 22 = = σ 2 r H 1 : not all σ i2 are equal Several significance tests are available.

17 Homogeneity tests Text discusses Hartley, modified Levene SAS has several including Bartlett s (essentially the likelihood ratio test) and several versions of Levene

18 Homogeneity tests There is a problem with assumptions ANOVA is robust with respect to moderate deviations from Normality ANOVA results can be sensitive to the homogeneity of variance assumption Some homogeneity tests are sensitive to the Normality assumption

19 Levene s Test Do ANOVA on the squared residuals from the original ANOVA Modified Levene s test uses absolute values of the residuals Modified Levene s test is recommended Another quick and dirty rule of thumb ( 2) ( 2) i i Concern if max s / min s > 2

20 KNNL p 785 KNNL Example Compare the strengths of 5 types of solder flux (X has r=5 levels) Response variable is the pull strength, force in pounds required to break the joint There are 8 solder joints per flux (n=8)

21 Scatterplot Equal spread?

22 Levene s Test proc glm data=a1; class type; model strength=type; means type/ hovtest=levene(type=abs); run;

23 ANOVA Table Source DF Sum of Squares Mean Square F Value Pr > F Model <.0001 Error Corrected Total Common variance estimated to be 2.11

24 Output Levene's Test for Homogeneity of strength Variance ANOVA of Absolute Deviations from Group Means Source Sum of DF Squares Mean Square F Value Pr > F type Error We reject the null hypothesis and assume nonconstant variance

25 Means and SDs Level of strength type N Mean Std Dev s / s = >>

26 Remedies Delete outliers Is their removal important? Is their removal valid? Use weights (weighted regression) Transformations Nonparametric procedures

27 What to do here? Not really any obvious outliers Do not see pattern of increasing or decreasing variance or skewed dists Will consider Weighted ANOVA Mixed model ANOVA

28 Weighted least squares We used this with regression Obtain model for how the sd depends on the explanatory variable (plotted absolute value of residual vs x) Then used weights inversely proportional to the estimated variance

29 Weighted Least Squares Here we can 1. Compute the variance for each level of X 2. Use these as weights in PROC GLM We will illustrate with the soldering example

30 Obtain the variances and weights proc means data=a1; var strength; by type; output out=a2 var=s2; data a2; set a2; wt=1/s2; NOTE. Data set a2 has 5 cases

31 Merge and then use the weights in PROC GLM data a3; merge a1 a2; by type; proc glm data=a3; class type; model strength=type; weight wt; lsmeans type / cl; run;

32 Output Source DF Sum of Squares Mean Square F Value Pr > F Model <.0001 Error Corrected Total Data have been standardized so all groups have variance of 1

33 LSMEANS Output Because of weights, standard errors simply based on sample variances of each level(but uses df=35!) type strength LSMEAN Standard Error Pr > t 95% Confidence Limits < < < < <

34 Mixed Model ANOVA Relax the assumption of constant variance rather than including a known weight This involves moving to a mixed model procedure Topic will not be on exam but wanted you to be aware of these model capabilities for your own research

35 SAS Code proc glimmix data=a1; class type; model strength=type / ddfm=kr; random residual / group=type; lsmeans type; run; This allows the variance to differ in each level and a degrees of freedom adjustment is used to account for this

36 GLIMMIX OUTPUT Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) CAIC (smaller is better) HQIC (smaller is better) Generalized Chi-Square Gener. Chi-Square / DF 1.00 Covariance Parameter Estimates Standard Cov Parm Group Estimate Error Residual (VC) type Residual (VC) type Residual (VC) type Residual (VC) type Residual (VC) type Really looks like 3 groups of variances though

37 GLIMMIX OUTPUT Type III Tests of Fixed Effects Effect Num DF Den DF F Value Pr > F type <.0001 type Least Squares Means type Estimate Standard Error DF t Value Pr > t < < < < <.0001 Same estimates but now DF have been adjusted for estimating the variances

38 SAS Code proc glimmix data=a1; class type; model strength=type / ddfm=kr; random residual / group=type1; lsmeans type; run; Variable Type1 was created to identify Type 1 and 2, Type 3, and Type 4 and 5 as 3 different groups

39 GLIMMIX OUTPUT Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) CAIC (smaller is better) HQIC (smaller is better) Generalized Chi-Square Gener. Chi-Square / DF 1.00 Covariance Parameter Estimates Standard Cov Parm Group Estimate Error Residual (VC) Grp Residual (VC) Grp Residual (VC) Grp Better BIC so better fitting model but same general conclusions

40 GLIMMIX OUTPUT Type III Tests of Fixed Effects Effect Num DF Den DF F Value Pr > F type <.0001 type Least Squares Means type Estimate Standard Error DF t Value Pr > t < < < < <.0001 Why are df for some groups 14?

41 Transformation Guides When σ i2 is proportional to μ i, use When σ i is proportional to μ i, use log(y) Y When σ i is proportional to μ i2, use 1/y For proportions, use arcsin( Y ) arsin(sqrt(y)) in a SAS data step Box-Cox transformation

42 Example Consider study on KNNL pg 790 Y: time between computer failures X: three locations data a3; infile 'u:\ch18ta05.txt'; input time location interval; proc gplot; plot time*location; run;

43 Scatterplot Outlier or skewed distribution? Can consider transformation first

44 Box-Cox Transformation Can consider regression log( s ) vs log( y ) i i and 1-b 1 is the power to raise Y by Can try various convenient powers Can use SAS directly to calculate the power

45 E(logsig) = logmu Power should be

46 Using SAS proc transreg data=a3; model boxcox(time / lambda=-2 to 2 by.2) = class(location); run;

47 Box-Cox Transformation Information for time Lambda R-Square Log Like Output * * < * < - Best Lambda * - 95% Confidence Interval + - Convenient Lambda

48 Transforming data in SAS data a3; set a3; transtime = time**0.20; symbol1 v=circle i=none; proc gplot; plot transtime*location; run;

49 Much more constant spread in data!

50 Nonparametric approach Based on ranks See KNNL Section 18.7, p 795 See the SAS procedure NPAR1WAY

51 Rust Inhibitor Analysis Source DF Sum of Squares Mean Square F Value Pr > F Model <.0001 Error Corrected Total Highly significant F test. Even if there is a violation of Normality, the evidence is overwhelming

52 Nonparametric Analysis Wilcoxon Scores (Rank Sums) for Variable eff Classified by Variable abrand abrand N Sum of Scores Expected Under H0 Std Dev Under H0 Mean Score Average scores were used for ties. Kruskal-Wallis Test Chi-Square DF 3 Pr > Chi-Square <.0001

53 Last slide We ve finished most of Chapters 17 and 18. We used program topic23.sas to generate the output.

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013 Topic 20 - Diagnostics and Remedies - Fall 2013 Diagnostics Plots Residual checks Formal Tests Remedial Measures Outline Topic 20 2 General assumptions Overview Normally distributed error terms Independent

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

Topic 32: Two-Way Mixed Effects Model

Topic 32: Two-Way Mixed Effects Model Topic 3: Two-Way Mixed Effects Model Outline Two-way mixed models Three-way mixed models Data for two-way design Y is the response variable Factor A with levels i = 1 to a Factor B with levels j = 1 to

More information

Lecture 4. Checking Model Adequacy

Lecture 4. Checking Model Adequacy Lecture 4. Checking Model Adequacy Montgomery: 3-4, 15-1.1 Page 1 Model Checking and Diagnostics Model Assumptions 1 Model is correct 2 Independent observations 3 Errors normally distributed 4 Constant

More information

One-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X.

One-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X. One-Way Analysis of Variance (ANOVA) Also called single factor ANOVA. The response variable Y is continuous (same as in regression). There are two key differences regarding the explanatory variable X.

More information

Outline. Review regression diagnostics Remedial measures Weighted regression Ridge regression Robust regression Bootstrapping

Outline. Review regression diagnostics Remedial measures Weighted regression Ridge regression Robust regression Bootstrapping Topic 19: Remedies Outline Review regression diagnostics Remedial measures Weighted regression Ridge regression Robust regression Bootstrapping Regression Diagnostics Summary Check normality of the residuals

More information

Topic 28: Unequal Replication in Two-Way ANOVA

Topic 28: Unequal Replication in Two-Way ANOVA Topic 28: Unequal Replication in Two-Way ANOVA Outline Two-way ANOVA with unequal numbers of observations in the cells Data and model Regression approach Parameter estimates Previous analyses with constant

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model Topic 23 - Unequal Replication Data Model Outline - Fall 2013 Parameter Estimates Inference Topic 23 2 Example Page 954 Data for Two Factor ANOVA Y is the response variable Factor A has levels i = 1, 2,...,

More information

Topic 14: Inference in Multiple Regression

Topic 14: Inference in Multiple Regression Topic 14: Inference in Multiple Regression Outline Review multiple linear regression Inference of regression coefficients Application to book example Inference of mean Application to book example Inference

More information

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects Topic 5 - One-Way Random Effects Models One-way Random effects Outline Model Variance component estimation - Fall 013 Confidence intervals Topic 5 Random Effects vs Fixed Effects Consider factor with numerous

More information

Week 7.1--IES 612-STA STA doc

Week 7.1--IES 612-STA STA doc Week 7.1--IES 612-STA 4-573-STA 4-576.doc IES 612/STA 4-576 Winter 2009 ANOVA MODELS model adequacy aka RESIDUAL ANALYSIS Numeric data samples from t populations obtained Assume Y ij ~ independent N(μ

More information

a = 4 levels of treatment A = Poison b = 3 levels of treatment B = Pretreatment n = 4 replicates for each treatment combination

a = 4 levels of treatment A = Poison b = 3 levels of treatment B = Pretreatment n = 4 replicates for each treatment combination In Box, Hunter, and Hunter Statistics for Experimenters is a two factor example of dying times for animals, let's say cockroaches, using 4 poisons and pretreatments with n=4 values for each combination

More information

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013 Topic 19 - Inference - Fall 2013 Outline Inference for Means Differences in cell means Contrasts Multiplicity Topic 19 2 The Cell Means Model Expressed numerically Y ij = µ i + ε ij where µ i is the theoretical

More information

Outline. Topic 22 - Interaction in Two Factor ANOVA. Interaction Not Significant. General Plan

Outline. Topic 22 - Interaction in Two Factor ANOVA. Interaction Not Significant. General Plan Topic 22 - Interaction in Two Factor ANOVA - Fall 2013 Outline Strategies for Analysis when interaction not present when interaction present when n ij = 1 when factor(s) quantitative Topic 22 2 General

More information

Topic 18: Model Selection and Diagnostics

Topic 18: Model Selection and Diagnostics Topic 18: Model Selection and Diagnostics Variable Selection We want to choose a best model that is a subset of the available explanatory variables Two separate problems 1. How many explanatory variables

More information

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR= = -/\<>*; ODS LISTING; dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************; *** Moore, David

More information

Analysis of Covariance

Analysis of Covariance Analysis of Covariance (ANCOVA) Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 10 1 When to Use ANCOVA In experiment, there is a nuisance factor x that is 1 Correlated with y 2

More information

Topic 29: Three-Way ANOVA

Topic 29: Three-Way ANOVA Topic 29: Three-Way ANOVA Outline Three-way ANOVA Data Model Inference Data for three-way ANOVA Y, the response variable Factor A with levels i = 1 to a Factor B with levels j = 1 to b Factor C with levels

More information

One-way ANOVA Model Assumptions

One-way ANOVA Model Assumptions One-way ANOVA Model Assumptions STAT:5201 Week 4: Lecture 1 1 / 31 One-way ANOVA: Model Assumptions Consider the single factor model: Y ij = µ + α }{{} i ij iid with ɛ ij N(0, σ 2 ) mean structure random

More information

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking Analysis of variance and regression Contents Comparison of several groups One-way ANOVA April 7, 008 Two-way ANOVA Interaction Model checking ANOVA, April 008 Comparison of or more groups Julie Lyng Forman,

More information

Outline Topic 21 - Two Factor ANOVA

Outline Topic 21 - Two Factor ANOVA Outline Topic 21 - Two Factor ANOVA Data Model Parameter Estimates - Fall 2013 Equal Sample Size One replicate per cell Unequal Sample size Topic 21 2 Overview Now have two factors (A and B) Suppose each

More information

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study 1.4 0.0-6 7 8 9 10 11 12 13 14 15 16 17 18 19 age Model 1: A simple broken stick model with knot at 14 fit with

More information

In many situations, there is a non-parametric test that corresponds to the standard test, as described below:

In many situations, there is a non-parametric test that corresponds to the standard test, as described below: There are many standard tests like the t-tests and analyses of variance that are commonly used. They rest on assumptions like normality, which can be hard to assess: for example, if you have small samples,

More information

Lec 3: Model Adequacy Checking

Lec 3: Model Adequacy Checking November 16, 2011 Model validation Model validation is a very important step in the model building procedure. (one of the most overlooked) A high R 2 value does not guarantee that the model fits the data

More information

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1 CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to

More information

Assessing Model Adequacy

Assessing Model Adequacy Assessing Model Adequacy A number of assumptions were made about the model, and these need to be verified in order to use the model for inferences. In cases where some assumptions are violated, there are

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Analysis of variance. April 16, Contents Comparison of several groups

Analysis of variance. April 16, Contents Comparison of several groups Contents Comparison of several groups Analysis of variance April 16, 2009 One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

Analysis of variance. April 16, 2009

Analysis of variance. April 16, 2009 Analysis of variance April 16, 2009 Contents Comparison of several groups One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

More information

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010 Part 1 of this document can be found at http://www.uvm.edu/~dhowell/methods/supplements/mixed Models for Repeated Measures1.pdf

More information

Topic 2. Chapter 3: Diagnostics and Remedial Measures

Topic 2. Chapter 3: Diagnostics and Remedial Measures Topic Overview This topic will cover Regression Diagnostics Remedial Measures Statistics 512: Applied Linear Models Some other Miscellaneous Topics Topic 2 Chapter 3: Diagnostics and Remedial Measures

More information

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression10_2/index.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk

More information

Chapter 11. Analysis of Variance (One-Way)

Chapter 11. Analysis of Variance (One-Way) Chapter 11 Analysis of Variance (One-Way) We now develop a statistical procedure for comparing the means of two or more groups, known as analysis of variance or ANOVA. These groups might be the result

More information

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes) Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes) Asheber Abebe Discrete and Statistical Sciences Auburn University Contents 1 Completely Randomized Design

More information

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric

More information

WITHIN-PARTICIPANT EXPERIMENTAL DESIGNS

WITHIN-PARTICIPANT EXPERIMENTAL DESIGNS 1 WITHIN-PARTICIPANT EXPERIMENTAL DESIGNS I. Single-factor designs: the model is: yij i j ij ij where: yij score for person j under treatment level i (i = 1,..., I; j = 1,..., n) overall mean βi treatment

More information

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont. TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted

More information

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution

More information

PARTICLE COUNTS MISLEAD SCIENTISTS: A JMP AND SAS INTEGRATION CASE STUDY INVESTIGATING MODELING

PARTICLE COUNTS MISLEAD SCIENTISTS: A JMP AND SAS INTEGRATION CASE STUDY INVESTIGATING MODELING PARTICLE COUNTS MISLEAD SCIENTISTS: A JMP AND SAS INTEGRATION CASE STUDY INVESTIGATING MODELING Kathleen Kiernan, Diane Michelson, PhD, Annie Zangi, SAS Many researchers and statisticians deal with data

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Checking model assumptions with regression diagnostics

Checking model assumptions with regression diagnostics @graemeleehickey www.glhickey.com graeme.hickey@liverpool.ac.uk Checking model assumptions with regression diagnostics Graeme L. Hickey University of Liverpool Conflicts of interest None Assistant Editor

More information

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~jufo/varianceregressionf2011.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk

More information

Introduction to Crossover Trials

Introduction to Crossover Trials Introduction to Crossover Trials Stat 6500 Tutorial Project Isaac Blackhurst A crossover trial is a type of randomized control trial. It has advantages over other designed experiments because, under certain

More information

Odor attraction CRD Page 1

Odor attraction CRD Page 1 Odor attraction CRD Page 1 dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************;

More information

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek Two-factor studies STAT 525 Chapter 19 and 20 Professor Olga Vitek December 2, 2010 19 Overview Now have two factors (A and B) Suppose each factor has two levels Could analyze as one factor with 4 levels

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression 36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 22, 2015 Lecture 4: Linear Regression TCELL Simple Regression Example Male black wheatear birds carry stones to the nest as a form

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Analysis of Variance

Analysis of Variance 1 / 70 Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression11_2 Marc Andersen, mja@statgroup.dk Analysis of variance and regression for health researchers,

More information

SAS Syntax and Output for Data Manipulation:

SAS Syntax and Output for Data Manipulation: CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Introduction to Linear regression analysis. Part 2. Model comparisons

Introduction to Linear regression analysis. Part 2. Model comparisons Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual

More information

Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA)

Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA) BSTT523 Pagano & Gauvreau Chapter 13 1 Nonparametric Statistics Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA) In particular, data

More information

Chapter 1 Linear Regression with One Predictor

Chapter 1 Linear Regression with One Predictor STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the

More information

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing

More information

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

ANOVA Longitudinal Models for the Practice Effects Data: via GLM Psyc 943 Lecture 25 page 1 ANOVA Longitudinal Models for the Practice Effects Data: via GLM Model 1. Saturated Means Model for Session, E-only Variances Model (BP) Variances Model: NO correlation, EQUAL

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed. EXST3201 Chapter 13c Geaghan Fall 2005: Page 1 Linear Models Y ij = µ + βi + τ j + βτij + εijk This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

More information

Lecture 7 Remedial Measures

Lecture 7 Remedial Measures Lecture 7 Remedial Measures STAT 512 Spring 2011 Background Reading KNNL: 3.8-3.11, Chapter 4 7-1 Topic Overview Review Assumptions & Diagnostics Remedial Measures for Non-normality Non-constant variance

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen Outline Data in wide and long format

More information

Lecture 11 Multiple Linear Regression

Lecture 11 Multiple Linear Regression Lecture 11 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 11-1 Topic Overview Review: Multiple Linear Regression (MLR) Computer Science Case Study 11-2 Multiple Regression

More information

Statistics 512: Applied Linear Models. Topic 9

Statistics 512: Applied Linear Models. Topic 9 Topic Overview Statistics 51: Applied Linear Models Topic 9 This topic will cover Random vs. Fixed Effects Using E(MS) to obtain appropriate tests in a Random or Mixed Effects Model. Chapter 5: One-way

More information

Introduction to SAS proc mixed

Introduction to SAS proc mixed Faculty of Health Sciences Introduction to SAS proc mixed Analysis of repeated measurements, 2017 Julie Forman Department of Biostatistics, University of Copenhagen 2 / 28 Preparing data for analysis The

More information

Overview Scatter Plot Example

Overview Scatter Plot Example Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables

More information

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. Preface p. xi Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p. 6 The Scientific Method and the Design of

More information

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED Maribeth Johnson Medical College of Georgia Augusta, GA Overview Introduction to longitudinal data Describe the data for examples

More information

Answer to exercise: Blood pressure lowering drugs

Answer to exercise: Blood pressure lowering drugs Answer to exercise: Blood pressure lowering drugs The data set bloodpressure.txt contains data from a cross-over trial, involving three different formulations of a drug for lowering of blood pressure:

More information

Covariance Structure Approach to Within-Cases

Covariance Structure Approach to Within-Cases Covariance Structure Approach to Within-Cases Remember how the data file grapefruit1.data looks: Store sales1 sales2 sales3 1 62.1 61.3 60.8 2 58.2 57.9 55.1 3 51.6 49.2 46.2 4 53.7 51.5 48.3 5 61.4 58.7

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01 An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information

Lecture 18 Miscellaneous Topics in Multiple Regression

Lecture 18 Miscellaneous Topics in Multiple Regression Lecture 18 Miscellaneous Topics in Multiple Regression STAT 512 Spring 2011 Background Reading KNNL: 8.1-8.5,10.1, 11, 12 18-1 Topic Overview Polynomial Models (8.1) Interaction Models (8.2) Qualitative

More information

STAT 350. Assignment 4

STAT 350. Assignment 4 STAT 350 Assignment 4 1. For the Mileage data in assignment 3 conduct a residual analysis and report your findings. I used the full model for this since my answers to assignment 3 suggested we needed the

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics. Post hoc Comparisons 13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Post hoc Comparisons In the prior chapter we used ANOVA

More information

Inferences About the Difference Between Two Means

Inferences About the Difference Between Two Means 7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent

More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

Exam details. Final Review Session. Things to Review

Exam details. Final Review Session. Things to Review Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit

More information

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

unadjusted model for baseline cholesterol 22:31 Monday, April 19, unadjusted model for baseline cholesterol 22:31 Monday, April 19, 2004 1 Class Level Information Class Levels Values TRETGRP 3 3 4 5 SEX 2 0 1 Number of observations 916 unadjusted model for baseline cholesterol

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

16.400/453J Human Factors Engineering. Design of Experiments II

16.400/453J Human Factors Engineering. Design of Experiments II J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential

More information

Lecture 4. Random Effects in Completely Randomized Design

Lecture 4. Random Effects in Completely Randomized Design Lecture 4. Random Effects in Completely Randomized Design Montgomery: 3.9, 13.1 and 13.7 1 Lecture 4 Page 1 Random Effects vs Fixed Effects Consider factor with numerous possible levels Want to draw inference

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

Chapter 2 Inferences in Simple Linear Regression

Chapter 2 Inferences in Simple Linear Regression STAT 525 SPRING 2018 Chapter 2 Inferences in Simple Linear Regression Professor Min Zhang Testing for Linear Relationship Term β 1 X i defines linear relationship Will then test H 0 : β 1 = 0 Test requires

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

Modeling Effect Modification and Higher-Order Interactions: Novel Approach for Repeated Measures Design using the LSMESTIMATE Statement in SAS 9.

Modeling Effect Modification and Higher-Order Interactions: Novel Approach for Repeated Measures Design using the LSMESTIMATE Statement in SAS 9. Paper 400-015 Modeling Effect Modification and Higher-Order Interactions: Novel Approach for Repeated Measures Design using the LSMESTIMATE Statement in SAS 9.4 Pronabesh DasMahapatra, MD, MPH, PatientsLikeMe

More information

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1 Psyc 945 Example page Example : Unconditional Models for Change in Number Match 3 Response Time (complete data, syntax, and output available for SAS, SPSS, and STATA electronically) These data come from

More information

Statistical. Psychology

Statistical. Psychology SEVENTH у *i km m it* & П SB Й EDITION Statistical M e t h o d s for Psychology D a v i d C. Howell University of Vermont ; \ WADSWORTH f% CENGAGE Learning* Australia Biaall apan Korea Меяко Singapore

More information

Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA

Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Faculty of Health Sciences Outline Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Lene Theil Skovgaard Sept. 14, 2015 Paired comparisons: tests and confidence intervals

More information

Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of

Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Probability Sampling Procedures Collection of Data Measures

More information

E509A: Principle of Biostatistics. (Week 11(2): Introduction to non-parametric. methods ) GY Zou.

E509A: Principle of Biostatistics. (Week 11(2): Introduction to non-parametric. methods ) GY Zou. E509A: Principle of Biostatistics (Week 11(2): Introduction to non-parametric methods ) GY Zou gzou@robarts.ca Sign test for two dependent samples Ex 12.1 subj 1 2 3 4 5 6 7 8 9 10 baseline 166 135 189

More information

Topic 16: Multicollinearity and Polynomial Regression

Topic 16: Multicollinearity and Polynomial Regression Topic 16: Multicollinearity and Polynomial Regression Outline Multicollinearity Polynomial regression An example (KNNL p256) The P-value for ANOVA F-test is

More information