Split-Plot Designs. David M. Allen University of Kentucky. January 30, 2014


1 Introduction

In this talk we introduce the split-plot design and give an overview of how SAS determines the denominator degrees of freedom for various tests.

2 Drug-Alcohol Study

The drug-alcohol study presented here is based on an actual study. It has been scaled down to facilitate more explicit displays. The responses have been changed because the original data are proprietary. See Allen and Cady [1] for more discussion.

Background

Tranquilizers are one of the most prescribed classes of drugs. Unfortunately, the combination of tranquilizers and alcohol can compromise a driver's ability to operate a motor vehicle. It is desirable to develop a new tranquilizer that serves its intended purpose but does not combine with alcohol to give an undesirable effect. This trial compares the effects of drug, the effects of alcohol, and the effects of their interaction. The drugs are A, a new drug; B, a currently popular drug; and C, a placebo. The response is the subject's performance on a simulated driving test. While multiple response measurements are recorded, the mean deviation (in feet) from the center of the driving lane is used here.

Randomization

Subjects are the whole-plot units. The alcohol and no-alcohol treatments are randomly assigned to the twelve subjects with the restriction that each treatment group has the same number of subjects. Separately for each subject, the order of drugs A, B, and C is randomized. There is an adequate interval of time between administrations of the different drugs to ensure there are no carry-over effects.
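The two-stage randomization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomize_split_plot` and the seed are hypothetical, and the subject codes are reused from the data display purely as labels, so the groups produced here need not match the study's actual assignment.

```python
import random

def randomize_split_plot(subjects, whole_plot_levels, split_plot_levels, seed=None):
    """Assign whole-plot levels to subjects in equal-sized groups, then
    randomize the order of the split-plot levels separately per subject.

    Returns a dict: subject -> (whole-plot level, ordered split-plot levels).
    """
    rng = random.Random(seed)
    if len(subjects) % len(whole_plot_levels) != 0:
        raise ValueError("subjects cannot be divided into equal groups")
    per_group = len(subjects) // len(whole_plot_levels)

    # Stage 1: shuffle the subjects and cut the list into equal groups.
    shuffled = list(subjects)
    rng.shuffle(shuffled)

    plan = {}
    for g, level in enumerate(whole_plot_levels):
        for subject in shuffled[g * per_group:(g + 1) * per_group]:
            # Stage 2: an independent random order of drugs for this subject.
            order = list(split_plot_levels)
            rng.shuffle(order)
            plan[subject] = (level, order)
    return plan

subjects = ["EAS", "JBM", "ARE", "JBH", "WJT", "EEA",
            "JWL", "CJW", "RDF", "RLA", "HW", "AMR"]
plan = randomize_split_plot(subjects, ["Yes", "No"], ["A", "B", "C"], seed=2014)
for subject in subjects:
    alcohol, order = plan[subject]
    print(subject, alcohol, "".join(order))
```

Shuffling the subject list once and cutting it into equal pieces enforces the equal-group-size restriction directly, rather than rejecting unbalanced assignments.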

The data

                  Drugs
Alcohol  Subject  A    B    C
Yes      EAS
Yes      JBM
Yes      ARE
Yes      JBH
Yes      WJT
Yes      EEA
No       JWL
No       CJW
No       RDF
No       RLA
No       HW
No       AMR

The model

The model is

\[ y_{ijk} = \mu + \alpha_i + s_j + \delta_k + (\alpha\delta)_{ik} + \varepsilon_{ijk} \]

where $y_{ijk}$ is the observation on the response variable; $\mu$ is the over-all mean; $\alpha_i$ is the effect of the $i$th level of alcohol; $s_j$ is the effect of the $j$th subject; $\delta_k$ is the effect of the $k$th drug; $(\alpha\delta)_{ik}$ is the effect of the interaction of the $i$th level of alcohol and the $k$th level of drug; and $\varepsilon_{ijk}$ is a random error. We assume $s_j \sim N(0, \sigma_s^2)$, $\varepsilon_{ijk} \sim N(0, \sigma^2)$, and that these effects are mutually independent. All other effects are considered fixed parameters. We have $j = 1, \ldots, 6$ for $i = 1$, and $j = 7, \ldots, 12$ for $i = 2$.

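Symbolic data

                  Drugs
Alcohol  Subject  A                  B                  C                  Mean
Yes      EAS      $y_{1,1,1}$        $y_{1,1,2}$        $y_{1,1,3}$        $y_{1,1,\cdot}$
Yes      JBM      $y_{1,2,1}$        $y_{1,2,2}$        $y_{1,2,3}$        $y_{1,2,\cdot}$
...      ...      ...                ...                ...                ...
Yes      WJT      $y_{1,5,1}$        $y_{1,5,2}$        $y_{1,5,3}$        $y_{1,5,\cdot}$
Yes      EEA      $y_{1,6,1}$        $y_{1,6,2}$        $y_{1,6,3}$        $y_{1,6,\cdot}$
         Mean     $y_{1,\cdot,1}$    $y_{1,\cdot,2}$    $y_{1,\cdot,3}$    $y_{1,\cdot,\cdot}$
No       JWL      $y_{2,7,1}$        $y_{2,7,2}$        $y_{2,7,3}$        $y_{2,7,\cdot}$
No       CJW      $y_{2,8,1}$        $y_{2,8,2}$        $y_{2,8,3}$        $y_{2,8,\cdot}$
...      ...      ...                ...                ...                ...
No       HW       $y_{2,11,1}$       $y_{2,11,2}$       $y_{2,11,3}$       $y_{2,11,\cdot}$
No       AMR      $y_{2,12,1}$       $y_{2,12,2}$       $y_{2,12,3}$       $y_{2,12,\cdot}$
         Mean     $y_{2,\cdot,1}$    $y_{2,\cdot,2}$    $y_{2,\cdot,3}$    $y_{2,\cdot,\cdot}$

A dot in place of a subscript denotes the mean over that subscript.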

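Symbolic analysis of variance

Source        Degrees of  Sum of                 Mean                   Expected
              Freedom     Squares                Square                 Mean Square
Alcohol        1          $SS_\alpha$            $MS_\alpha$            $\sigma^2 + 3\sigma_s^2 + Q(\alpha, (\alpha\delta))$
Subjects      10          $SS_s$                 $MS_s$                 $\sigma^2 + 3\sigma_s^2$
Drugs          2          $SS_\delta$            $MS_\delta$            $\sigma^2 + Q(\delta, (\alpha\delta))$
Alcohol*Drug   2          $SS_{(\alpha\delta)}$  $MS_{(\alpha\delta)}$  $\sigma^2 + Q((\alpha\delta))$
Residual      20          $SS_\varepsilon$       $MS_\varepsilon$       $\sigma^2$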
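The sums of squares and degrees of freedom in this table can be computed directly from a balanced data array. The sketch below is an illustration only: the function name `split_plot_anova` and the simulated responses are assumptions (the study's actual responses are not used), and the formulas apply to balanced data. The expected mean squares indicate the F ratios: Alcohol is tested against Subjects, while Drugs and Alcohol*Drug are tested against Residual.

```python
import random

def split_plot_anova(y):
    """Sums of squares and degrees of freedom for a balanced split-plot layout.

    y[i][j][k] is the response for whole-plot treatment i, subject j
    (numbered within treatment i), and split-plot treatment k.
    """
    a, n, b = len(y), len(y[0]), len(y[0][0])
    grand = sum(v for block in y for row in block for v in row) / (a * n * b)
    grp = [sum(v for row in y[i] for v in row) / (n * b) for i in range(a)]
    subj = [[sum(y[i][j]) / b for j in range(n)] for i in range(a)]
    trt = [sum(y[i][j][k] for i in range(a) for j in range(n)) / (a * n)
           for k in range(b)]
    cell = [[sum(y[i][j][k] for j in range(n)) / n for k in range(b)]
            for i in range(a)]

    ss = {
        "Alcohol": n * b * sum((grp[i] - grand) ** 2 for i in range(a)),
        "Subjects": b * sum((subj[i][j] - grp[i]) ** 2
                            for i in range(a) for j in range(n)),
        "Drugs": a * n * sum((trt[k] - grand) ** 2 for k in range(b)),
        "Alcohol*Drug": n * sum((cell[i][k] - grp[i] - trt[k] + grand) ** 2
                                for i in range(a) for k in range(b)),
        "Residual": sum((y[i][j][k] - subj[i][j] - cell[i][k] + grp[i]) ** 2
                        for i in range(a) for j in range(n) for k in range(b)),
    }
    df = {"Alcohol": a - 1, "Subjects": a * (n - 1), "Drugs": b - 1,
          "Alcohol*Drug": (a - 1) * (b - 1), "Residual": a * (n - 1) * (b - 1)}
    return ss, df

# Simulated (hypothetical) responses: 2 alcohol levels, 6 subjects each, 3 drugs.
rng = random.Random(1)
y = []
for i in range(2):
    group = []
    for j in range(6):
        s = rng.gauss(0, 2)  # random subject effect shared by the three drugs
        group.append([10 + 3 * i + k + s + rng.gauss(0, 1) for k in range(3)])
    y.append(group)

ss, df = split_plot_anova(y)
ms = {source: ss[source] / df[source] for source in ss}
denominator = {"Alcohol": "Subjects", "Drugs": "Residual", "Alcohol*Drug": "Residual"}
for source in ss:
    f = ms[source] / ms[denominator[source]] if source in denominator else float("nan")
    print(f"{source:13s} {df[source]:3d} {ss[source]:10.3f} {ms[source]:10.3f} {f:8.3f}")
```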

Numeric analysis of variance

Source        Degrees of  Sum of   Mean    F-
              Freedom     Squares  Square  statistic
Alcohol
Subjects
Drugs
Alcohol*Drug
Residual

3 Nested factors

A factor B is said to be nested within factor A if the levels of factor B are different within each level of factor A. In this case, we say factor A contains factor B.

An example

To facilitate explicit displays, we use a smaller version of the drug-alcohol study:

                             Drug
Alcohol  Subject  SubWithin  A    B
Yes      dma
Yes      lwh
Yes      rla
No       clw
No       red
No       bbs
No       kmd

The levels of Subjects are completely different for the yes and no levels of Alcohol. We say that Subjects are nested within Alcohol and that Alcohol contains Subjects.

Coding

Sometimes a nested factor is coded such that the levels are unique only within levels of the containing factor. For example, the factor SubWithin in the above display is unique only within levels of Alcohol. The remainder of this section deals with building the Z matrix. We assume Alcohol, Subject, and SubWithin are class variables.

Building the Z matrix

We can build Z by putting Subject in a random statement. We call this the direct method.

SAS notation

We can build Z by putting either of the equivalent terms, Alcohol*SubWithin or SubWithin(Alcohol), in a random statement. We call this the product method.

With the product method, Z has one column for each combination of a level of Alcohol and a level of SubWithin.

An editorial

The product method has little to recommend it:

A variable having a unique subject code must exist, for otherwise the randomization could not have been carried out. Why not use it?

If there are unequal numbers of subjects in the alcohol groups, the product method will put one or more columns of all zeros in the design matrix. This increases computational time.

From the computational point of view, the worst possible specification is to combine the two methods. For example, Subject(Alcohol) would introduce fourteen columns in the design matrix, and one-half of them would be all zeros.

There is an additional consideration: SAS treats models specified by the product and direct methods differently.
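The column counts quoted above can be checked with a small Python sketch that builds the three indicator matrices for the seven-subject example. Several things here are assumptions made for illustration: the SubWithin codes (numbered 1, 2, ... within each alcohol group), the helper names `design` and `zero_columns`, and the convention that a crossed or nested class effect gets one column for every combination of levels, observed or not, which is what the all-zero columns described in the text imply.

```python
from itertools import product

# (Alcohol, Subject, SubWithin); the SubWithin codes are assumed for illustration.
subjects = [("Yes", "dma", 1), ("Yes", "lwh", 2), ("Yes", "rla", 3),
            ("No", "clw", 1), ("No", "red", 2), ("No", "bbs", 3), ("No", "kmd", 4)]
drugs = ["A", "B"]

# One observation per subject and drug: 14 rows in all.
obs = [(alc, subj, sw, drug) for (alc, subj, sw) in subjects for drug in drugs]

def design(observations, columns, label):
    """0/1 matrix with one row per observation and one column per label."""
    return [[1 if label(o) == c else 0 for c in columns] for o in observations]

def zero_columns(matrix):
    """Number of columns that contain no ones."""
    return sum(1 for col in zip(*matrix) if not any(col))

alcohol_levels = ["Yes", "No"]
subject_levels = [subj for _, subj, _ in subjects]
within_levels = sorted({sw for _, _, sw in subjects})

# Direct method: random Subject
z_direct = design(obs, subject_levels, lambda o: o[1])
# Product method: random Alcohol*SubWithin (or SubWithin(Alcohol))
z_product = design(obs, list(product(alcohol_levels, within_levels)),
                   lambda o: (o[0], o[2]))
# Combined specification: random Subject(Alcohol)
z_combined = design(obs, list(product(alcohol_levels, subject_levels)),
                    lambda o: (o[0], o[1]))

for name, z in [("direct", z_direct), ("product", z_product), ("combined", z_combined)]:
    print(name, len(z[0]), "columns,", zero_columns(z), "all zero")
```

Under these assumptions the direct method gives seven columns with none empty, the product method gives eight columns with one empty, and the combined specification gives fourteen columns of which seven are empty.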

4 Satterthwaite procedure

In this section we give the simplest form of the Satterthwaite approximation [3]. This approximation may be thought of as synthesizing a mean square.

The setup

Suppose a model depends on a vector of fixed effects, $\beta$, and two variances, $\sigma_1^2$ and $\sigma_2^2$. Our interest is in a linear function of the fixed effects which we denote by $\delta$. Assume that we have a normally distributed estimator, $\hat\delta$, with variance $c_1\sigma_1^2 + c_2\sigma_2^2$ where $c_1$ and $c_2$ are known constants. Available are $SS_1$ and $SS_2$ such that $SS_1 \sim \sigma_1^2\chi^2(\nu_1)$ and $SS_2 \sim \sigma_2^2\chi^2(\nu_2)$. See the symbolic analysis of variance in Section 2 for an example of $SS_1$ and $SS_2$. $SS_1$, $SS_2$, and $\hat\delta$ are mutually independent. The test statistic for the null hypothesis that $\delta$ is equal to a specified value $\delta_0$ is

\[ t = \frac{\hat\delta - \delta_0}{\sqrt{c_1 SS_1/\nu_1 + c_2 SS_2/\nu_2}}. \]

The question is: what is the distribution of $t$?

Decomposing t

The approach used here is to approximate the distribution of $t$ by a t-distribution. That reduces the problem to finding the degrees of freedom of the approximating t-distribution. Define

\[ Z = \frac{\hat\delta - \delta_0}{\sqrt{c_1\sigma_1^2 + c_2\sigma_2^2}} \]

and

\[ U = \frac{c_1\sigma_1^2\, SS_1}{\nu_1 (c_1\sigma_1^2 + c_2\sigma_2^2)\, \sigma_1^2} + \frac{c_2\sigma_2^2\, SS_2}{\nu_2 (c_1\sigma_1^2 + c_2\sigma_2^2)\, \sigma_2^2}, \]

then $t = Z/\sqrt{U}$. Under the null hypothesis, the distribution of $Z$ is standard normal.

It remains to approximate the distribution of $U$ by a Chi-square divided by its degrees of freedom, i.e. there exists a $\nu$ such that $U \sim \chi^2(\nu)/\nu$ is approximately satisfied.

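Degrees of freedom for approximating distribution

By approximately satisfied we mean $U$ and $\chi^2(\nu)/\nu$ should have the same variance. Now

\[ \operatorname{Var}(U) = \left[\frac{c_1\sigma_1^2}{\nu_1 (c_1\sigma_1^2 + c_2\sigma_2^2)}\right]^2 2\nu_1 + \left[\frac{c_2\sigma_2^2}{\nu_2 (c_1\sigma_1^2 + c_2\sigma_2^2)}\right]^2 2\nu_2 = \frac{2\left(c_1^2\sigma_1^4/\nu_1 + c_2^2\sigma_2^4/\nu_2\right)}{(c_1\sigma_1^2 + c_2\sigma_2^2)^2} \]

and

\[ \operatorname{Var}\left(\chi^2(\nu)/\nu\right) = \frac{2}{\nu}. \]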

Equating these two variances and solving for $\nu$ gives

\[ \nu = \frac{(c_1\sigma_1^2 + c_2\sigma_2^2)^2}{c_1^2\sigma_1^4/\nu_1 + c_2^2\sigma_2^4/\nu_2}. \]
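The formula for $\nu$ translates directly into a few lines of Python. This is a minimal sketch: the function name `satterthwaite_df` is hypothetical, and accepting sequences of any common length is a convenience; the derivation above covers only the two-component case. In practice the unknown variances are replaced by the corresponding mean squares.

```python
def satterthwaite_df(c, variances, nu):
    """Approximate degrees of freedom for sum(c_i * variance_i).

    c         -- known constants c_1, c_2, ...
    variances -- sigma_1^2, sigma_2^2, ... (or mean squares estimating them)
    nu        -- degrees of freedom nu_1, nu_2, ... of the sums of squares
    """
    terms = [ci * vi for ci, vi in zip(c, variances)]
    numerator = sum(terms) ** 2
    denominator = sum(t ** 2 / ni for t, ni in zip(terms, nu))
    return numerator / denominator

# With c = (1/3, 2/3), equal variances, and (10, 20) degrees of freedom,
# the approximating degrees of freedom come out at about 30.
print(satterthwaite_df([1 / 3, 2 / 3], [1.0, 1.0], [10, 20]))
```

Two sanity checks follow from the formula: if $c_2 = 0$ the result is $\nu_1$, and multiplying $c_1$ and $c_2$ by a common constant leaves $\nu$ unchanged.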

5 Estimation with balanced data

Estimators of linear combinations of fixed effects can be categorized in three ways:

1. estimators that are orthogonal to subjects;
2. estimators that involve only subject totals; and
3. other estimators.

We will illustrate a representative estimator from each category. The estimators discussed in this section are defined in terms of the notation given in the symbolic data display of Section 2.

Drug A versus Drug C

A comparison of Drug A with Drug C, averaged over possible interaction effects, is orthogonal to subjects. This is because each drug is used on each subject. The estimator of $\delta_1 - \delta_3$ is $(y_{1,\cdot,1} + y_{2,\cdot,1} - y_{1,\cdot,3} - y_{2,\cdot,3})/2$, and its variance is $\sigma^2/6$. The residual mean square is an estimator of $\sigma^2$ and is distributed proportional to Chi-square. The t-distribution is used in the usual way for tests or confidence intervals. A similar result is true for all contrasts among drug effects or among interaction effects.
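The stated variance can be checked from the model. In the sketch below, $\bar\varepsilon_{i,\cdot,k}$ is notation introduced here for the mean of the six errors in alcohol group $i$ under drug $k$; the alcohol effect and the subject effects cancel within each group because every subject receives both drugs.

```latex
\begin{align*}
y_{i,\cdot,1} - y_{i,\cdot,3}
  &= \delta_1 - \delta_3 + (\alpha\delta)_{i1} - (\alpha\delta)_{i3}
     + \bar\varepsilon_{i,\cdot,1} - \bar\varepsilon_{i,\cdot,3},
  \qquad
  \operatorname{Var}\left(\bar\varepsilon_{i,\cdot,k}\right) = \frac{\sigma^2}{6}, \\
\operatorname{Var}\left(y_{i,\cdot,1} - y_{i,\cdot,3}\right)
  &= \frac{\sigma^2}{6} + \frac{\sigma^2}{6} = \frac{\sigma^2}{3}, \\
\operatorname{Var}\left(\frac{1}{2}\sum_{i=1}^{2}\left(y_{i,\cdot,1} - y_{i,\cdot,3}\right)\right)
  &= \frac{1}{4}\left(\frac{\sigma^2}{3} + \frac{\sigma^2}{3}\right) = \frac{\sigma^2}{6}.
\end{align*}
```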

Alcohol versus no alcohol

A comparison of alcohol with no alcohol, averaged over any interaction effects, involves only subject totals. The estimator of $\alpha_1 - \alpha_2$ is $y_{1,\cdot,\cdot} - y_{2,\cdot,\cdot}$, and its variance is $(3\sigma_s^2 + \sigma^2)/9$. The subject mean square is an estimator of $3\sigma_s^2 + \sigma^2$ and is distributed proportional to Chi-square. The t-distribution is used in the usual way for tests or confidence intervals.
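This variance also follows from the model in two steps. Each group mean averages six independent subject effects and eighteen independent errors, and the two groups involve different subjects, so their means are independent:

```latex
\begin{align*}
\operatorname{Var}\left(y_{i,\cdot,\cdot}\right)
  &= \frac{\sigma_s^2}{6} + \frac{\sigma^2}{18}
   = \frac{3\sigma_s^2 + \sigma^2}{18}, \\
\operatorname{Var}\left(y_{1,\cdot,\cdot} - y_{2,\cdot,\cdot}\right)
  &= 2 \cdot \frac{3\sigma_s^2 + \sigma^2}{18}
   = \frac{3\sigma_s^2 + \sigma^2}{9}.
\end{align*}
```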

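Response with Drug A and Alcohol

The estimated response for a subject on Drug A and Alcohol is $y_{1,\cdot,1}$, and its variance is $(\sigma_s^2 + \sigma^2)/6$, the variance of a mean of six independent observations that each have variance $\sigma_s^2 + \sigma^2$. We estimate $\sigma_s^2 + \sigma^2$ by $\frac{1}{3} MS_s + \frac{2}{3} MS_\varepsilon$. Unfortunately, $\frac{1}{3} MS_s + \frac{2}{3} MS_\varepsilon$ is not distributed proportional to Chi-square, so the usual confidence interval based on the t-distribution is not strictly valid.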

We use the Satterthwaite procedure to find the degrees of freedom of the approximating Chi-square distribution. The correspondence of notation is

\[ \sigma_1^2 = 3\sigma_s^2 + \sigma^2, \quad \sigma_2^2 = \sigma^2, \quad \nu_1 = 10, \quad \nu_2 = 20, \quad c_1 = 1/3, \quad c_2 = 2/3. \]

Since the variances are not known, substitute the corresponding mean squares. The result is $\nu = 15$. We proceed with the inference assuming a t-distribution with fifteen degrees of freedom.
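Written out, substituting this correspondence into the formula for $\nu$ from Section 4, with $MS_s$ in place of $\sigma_1^2$ and $MS_\varepsilon$ in place of $\sigma_2^2$, gives the expression below. The symbol $\hat\nu$ is introduced here to mark that mean squares replace the unknown variances.

```latex
\[
\hat\nu =
\frac{\left(\tfrac{1}{3} MS_s + \tfrac{2}{3} MS_\varepsilon\right)^2}
     {\left(\tfrac{1}{3} MS_s\right)^2 / 10
      + \left(\tfrac{2}{3} MS_\varepsilon\right)^2 / 20}
\]
```

The value always lies between the smaller of the two component degrees of freedom and their sum, here between 10 and 30.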

6 SAS degrees of freedom options

On the estimate statement one may use the df option to specify the denominator degrees of freedom for the approximate t-distribution. However, except for simple tests with balanced data, most people will want SAS to provide the degrees of freedom. In this section we describe five different methods for determining denominator degrees of freedom that are accessible in SAS.

The containment method

The containment method is the default when the RANDOM statement is used. Otherwise, the containment method is invoked with the DDFM = CONTAIN option on the model statement. Denote the fixed effect in question A, and search the RANDOM effect list for the effects that syntactically contain A. Among the random effects that contain A, compute their rank contribution to the [X Z] matrix. The denominator degrees of freedom assigned to A is the smallest of these rank contributions. If A is not contained in any effect on the random statement, the containment method is not invoked, and the denominator degrees of freedom are the residual degrees of freedom.

Note that for a nested model, specified by the direct method, the containment method will not be invoked.
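The containment rule can be mimicked in a short Python sketch. This only illustrates the logic described above; it is not SAS's implementation. The function names are hypothetical, "syntactically contains" is approximated by comparing the sets of variable names in each effect, and the rank contributions and residual degrees of freedom are supplied by the caller rather than computed from [X Z].

```python
import re

def effect_variables(effect):
    """Variable names in an effect such as 'Alcohol*SubWithin' or 'SubWithin(Alcohol)'."""
    return {name.lower() for name in re.findall(r"[A-Za-z_]\w*", effect)}

def containment_ddf(fixed_effect, random_ranks, residual_df):
    """Denominator degrees of freedom under the containment rule.

    random_ranks -- dict mapping each random effect to its rank
                    contribution to [X Z] (assumed known here)
    residual_df  -- residual degrees of freedom, used when no random
                    effect contains the fixed effect
    """
    target = effect_variables(fixed_effect)
    ranks = [rank for effect, rank in random_ranks.items()
             if target <= effect_variables(effect)]
    return min(ranks) if ranks else residual_df

# Balanced drug-alcohol data: subjects contribute rank 10, residual df is 20.
print(containment_ddf("Alcohol", {"Subject": 10}, 20))             # direct method
print(containment_ddf("Alcohol", {"SubWithin(Alcohol)": 10}, 20))  # product method
```

With the direct specification the name Subject does not contain Alcohol, so the residual degrees of freedom are returned; with the product specification the subject rank is returned. This is the sense in which the two specifications are treated differently.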

The between-within method

The DDFM = BETWITHIN option is the default for REPEATED statement specifications (with no RANDOM statements). It is computed by dividing the residual degrees of freedom into between-subject and within-subject portions. PROC MIXED then checks whether a fixed effect changes within any subject. If so, it assigns within-subject degrees of freedom to the effect; otherwise, it assigns the between-subject degrees of freedom to the effect. If there are multiple within-subject effects containing classification variables, the within-subject degrees of freedom are partitioned into components corresponding to the subject-by-effect interactions.

The residual degrees of freedom

The denominator degrees of freedom are the residual degrees of freedom. This will give exact tests for all effects that are orthogonal to the Z matrix; i.e. the split-plot treatment and its interaction with the whole-plot treatment.

The Satterthwaite method

The Satterthwaite method is a generalization of the Satterthwaite procedure described in Section 4. The generalization is discussed in considerable detail in another lecture.

The Kenward-Roger method

The Kenward-Roger method implements the method described in [2]. This method is in SAS starting with Version 8. The Kenward-Roger method uses the Satterthwaite method for determining the denominator degrees of freedom, but it modifies the estimator as well. Calling the Kenward-Roger method a denominator degrees of freedom method is a misnomer.

7 Comparison of degrees of freedom

In Section 5 we looked at three different estimators using traditional methods and taking advantage of the balanced data. In this section, we look at how SAS computes the denominator degrees of freedom for these estimates. We then remove some of the data and repeat the exercise.

Drug-Alcohol data with missing values

                  Drugs
Alcohol  Subject  A    B    C
Yes      HW
Yes      JBM
Yes      JWL
Yes      JBH
Yes      ARE
Yes      EEA
No       DCJ
No       CJW
No       RDF
No       RLA
No       EAS
No       AMR

We have removed seven of the thirty-six observations, or 19.4%. Four are from the alcohol group, and three are from the no-alcohol group. Three observations are removed from each of the Drug A and Drug B groups, and one observation is removed from Drug C.

The SAS code

The SAS code used for this demonstration is

    proc mixed data = balanced;
       classes Alcohol Subject SubWithin Drug;
       model y = Alcohol Drug Alcohol*Drug / ddfm = conta…
       random Subject;
       estimate 1 intercept 1 Alcohol 1 0 Drug 1 Alcoho…
       estimate 2 Alcohol -1 1 ;
       estimate 3 Drug ;
    run;

The highlighted parts of the code are changed from run to run. We use the balanced data and the data with missing observations. We use all five methods of computing the denominator degrees of freedom. We use both the direct and product method of specifying the random effect.

Estimate 1: Drug A with no alcohol

                 Denominator degrees of freedom
Method           Balanced  Missing
Containment
Between-within
Residual
Satterthwaite
Kenward-Roger

Estimate 2: Alcohol versus no alcohol

                 Denominator degrees of freedom
Method           Balanced  Missing
Containment      20(10)    13(10)
Between-within
Residual
Satterthwaite
Kenward-Roger

For the containment method, the first number is for direct specification, and the number in parentheses is for product specification.

Estimate 3: Drug A versus Drug C

                 Denominator degrees of freedom
Method           Balanced  Missing
Containment
Between-within
Residual
Satterthwaite
Kenward-Roger

References

[1] David M. Allen and Foster B. Cady. Analyzing Experimental Data by Regression. VanNostrand-Reinhold, Belmont, California.
[2] M. G. Kenward and J. H. Roger. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 53.
[3] F. E. Satterthwaite. An approximate distribution of estimates of variance components. Biometrics Bulletin, 2.

Randomized Complete Block Designs

Randomized Complete Block Designs Randomized Complete Block Designs David Allen University of Kentucky February 23, 2016 1 Randomized Complete Block Design There are many situations where it is impossible to use a completely randomized

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today s Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Today s Class (or 3): Summary of steps in building unconditional models for time What happens to missing predictors Effects of time-invariant predictors

More information

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information. STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory

More information

A Likelihood Ratio Test

A Likelihood Ratio Test A Likelihood Ratio Test David Allen University of Kentucky February 23, 2012 1 Introduction Earlier presentations gave a procedure for finding an estimate and its standard error of a single linear combination

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building strategies

More information

Sleep data, two drugs Ch13.xls

Sleep data, two drugs Ch13.xls Model Based Statistics in Biology. Part IV. The General Linear Mixed Model.. Chapter 13.3 Fixed*Random Effects (Paired t-test) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch

More information

A discussion on multiple regression models

A discussion on multiple regression models A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value

More information

LOOKING FOR RELATIONSHIPS

LOOKING FOR RELATIONSHIPS LOOKING FOR RELATIONSHIPS One of most common types of investigation we do is to look for relationships between variables. Variables may be nominal (categorical), for example looking at the effect of an

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Topics: Summary of building unconditional models for time Missing predictors in MLM Effects of time-invariant predictors Fixed, systematically varying,

More information

STAT 135 Lab 10 Two-Way ANOVA, Randomized Block Design and Friedman s Test

STAT 135 Lab 10 Two-Way ANOVA, Randomized Block Design and Friedman s Test STAT 135 Lab 10 Two-Way ANOVA, Randomized Block Design and Friedman s Test Rebecca Barter April 13, 2015 Let s now imagine a dataset for which our response variable, Y, may be influenced by two factors,

More information

16.400/453J Human Factors Engineering. Design of Experiments II

16.400/453J Human Factors Engineering. Design of Experiments II J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered) Test 3 Practice Test A NOTE: Ignore Q10 (not covered) MA 180/418 Midterm Test 3, Version A Fall 2010 Student Name (PRINT):............................................. Student Signature:...................................................

More information

OHSU OGI Class ECE-580-DOE :Design of Experiments Steve Brainerd

OHSU OGI Class ECE-580-DOE :Design of Experiments Steve Brainerd Why We Use Analysis of Variance to Compare Group Means and How it Works The question of how to compare the population means of more than two groups is an important one to researchers. Let us suppose that

More information

Topic 21 Goodness of Fit

Topic 21 Goodness of Fit Topic 21 Goodness of Fit Contingency Tables 1 / 11 Introduction Two-way Table Smoking Habits The Hypothesis The Test Statistic Degrees of Freedom Outline 2 / 11 Introduction Contingency tables, also known

More information

Multiple comparisons - subsequent inferences for two-way ANOVA

Multiple comparisons - subsequent inferences for two-way ANOVA 1 Multiple comparisons - subsequent inferences for two-way ANOVA the kinds of inferences to be made after the F tests of a two-way ANOVA depend on the results if none of the F tests lead to rejection of

More information

Chapter 7 Student Lecture Notes 7-1

Chapter 7 Student Lecture Notes 7-1 Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model

More information

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization. 1 Chapter 1: Research Design Principles The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization. 2 Chapter 2: Completely Randomized Design

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

Simple logistic regression

Simple logistic regression Simple logistic regression Biometry 755 Spring 2009 Simple logistic regression p. 1/47 Model assumptions 1. The observed data are independent realizations of a binary response variable Y that follows a

More information

Topic 22 Analysis of Variance

Topic 22 Analysis of Variance Topic 22 Analysis of Variance Comparing Multiple Populations 1 / 14 Outline Overview One Way Analysis of Variance Sample Means Sums of Squares The F Statistic Confidence Intervals 2 / 14 Overview Two-sample

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Time Invariant Predictors in Longitudinal Models

Time Invariant Predictors in Longitudinal Models Time Invariant Predictors in Longitudinal Models Longitudinal Data Analysis Workshop Section 9 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

COMPARING SEVERAL MEANS: ANOVA

COMPARING SEVERAL MEANS: ANOVA LAST UPDATED: November 15, 2012 COMPARING SEVERAL MEANS: ANOVA Objectives 2 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons

More information

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please

More information

STAT 501 EXAM I NAME Spring 1999

STAT 501 EXAM I NAME Spring 1999 STAT 501 EXAM I NAME Spring 1999 Instructions: You may use only your calculator and the attached tables and formula sheet. You can detach the tables and formula sheet from the rest of this exam. Show your

More information

Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm

Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam June 8 th, 2016: 9am to 1pm Instructions: 1. This is exam is to be completed independently. Do not discuss your work with

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances

Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances Preface Power is the probability that a study will reject the null hypothesis. The estimated probability is a function

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

Logistic Regression Analysis

Logistic Regression Analysis Logistic Regression Analysis Predicting whether an event will or will not occur, as well as identifying the variables useful in making the prediction, is important in most academic disciplines as well

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:

More information

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46 BIO5312 Biostatistics Lecture 10:Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 1/46 Outline In this lecture, we will discuss topics

More information

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models (GLM) A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing

More information

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

More information

Mathematical statistics

Mathematical statistics November 15 th, 2018 Lecture 21: The two-sample t-test Overview Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Chapter 10: Inferences based on two samples

Chapter 10: Inferences based on two samples November 16 th, 2017 Overview Week 1 Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 1: Descriptive statistics Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter 8: Confidence

More information

Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

More information

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1 Lecture Simple Linear Regression STAT 51 Spring 011 Background Reading KNNL: Chapter 1-1 Topic Overview This topic we will cover: Regression Terminology Simple Linear Regression with a single predictor

More information

STK4900/ Lecture 3. Program

STK4900/ Lecture 3. Program STK4900/9900 - Lecture 3 Program 1. Multiple regression: Data structure and basic questions 2. The multiple linear regression model 3. Categorical predictors 4. Planned experiments and observational studies

More information
