Stat 511 HW#10 Spring 2003 (corrected)


1. Below is a small set of fake unbalanced 2-way factorial (in factors A and B) data from 2 (unbalanced) blocks.

Level of A  Level of B  Block  Response
1 1 1 9.5
1 1 11.3
1 3 1 8.4
1 3 1 8.5
1 1 6.7
1 9.4
3 1 6.1
1 1 1.0
1 1 1.
1 13.1
1 3 10.7
1 8.9
10.8
3 7.4
3 8.6

Consider an analysis of these data based on the model

y_ijk = µ + α_i + β_j + αβ_ij + τ_k + ε_ijk    (*)

first where (*) specifies a fixed effects model (say under the sum restrictions) and then where the τ_k are iid N(0, σ_τ²) random block effects independent of the iid N(0, σ²) random errors ε_ijk.

Begin by adding the MASS and nlme libraries to your R session.

a) Create vectors A, B, Block, y in R, corresponding to the columns of the data table above. Turn the first 3 of these into objects that R can recognize as factors by typing

> A<-as.factor(A)
> B<-as.factor(B)
> FBlock<-as.factor(Block)

Tell R you wish to use the sum restrictions in linear model computations by typing

> options(contrasts=c("contr.sum","contr.sum"))

Then note that a fixed effects linear model with equation (*) can be fit and some summary statistics viewed in R by typing

> lmf.out<-lm(y~A*B+FBlock)
> summary(lmf.out)
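As a reminder (standard linear model facts, not part of the original assignment), the sum restrictions that the contr.sum choice imposes on the effects in (*) are:

```latex
\sum_i \alpha_i = 0, \qquad \sum_j \beta_j = 0, \qquad \sum_k \tau_k = 0,
\qquad \sum_i \alpha\beta_{ij} = 0 \;\text{for each } j, \qquad
\sum_j \alpha\beta_{ij} = 0 \;\text{for each } i .
```

The coefficients reported by summary(lmf.out) are then effects for all but the last levels of the factors, with last-level effects recoverable as the negatives of the sums of the displayed ones.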

What is the estimate of σ in this model?

b) The linear mixed effects model (*) may be fit and some summaries viewed by typing

> lmef.out<-lme(y~A*B,random=~1|Block)
> summary(lmef.out)

In terms of inferences for the fixed effects, how much difference do you see between the outputs for the analysis with fixed block effects and the analysis with random block effects? How do the estimates of σ compare for the two models? Type

> intervals(lmef.out)

to get confidence intervals for the mixed effects model. Compare a 95% confidence interval for σ in the fixed effects model to what you get from this call. In the fixed effects analysis of a), one gets estimates of the block effects τ_k (with τ₂ = −τ₁ under the sum restrictions). Are these consistent with the estimate of σ_τ produced here? Explain.

c) Type

> coef(lmf.out)
> coef(lmef.out)
> fitted(lmf.out)
> fitted(lmef.out)

What are the BLUE and (estimated) BLUP of µ + α₁ + β₁ + αβ₁₁ + τ₁ in the two analyses? Notice that in the random blocks analysis, it makes good sense to predict a new observation from levels 1 of A and 1 of B, from an as yet unseen third block. This would be µ + α₁ + β₁ + αβ₁₁ + τ₃. What value would you use to predict this? Why does it not make sense to predict a new observation with mean µ + α₁ + β₁ + αβ₁₁ + τ₃ in the fixed effects context?

2. Below is a very small set of fake unbalanced 2-level nested data. Consider the analysis of these under the mixed effects model

y_ijk = µ + α_i + β_ij + ε_ijk

where the only fixed effect is µ, and the random effects are all independent, with the α_i iid N(0, σ_α²), the β_ij iid N(0, σ_β²), and the ε_ijk iid N(0, σ²).

Level of A  Level of B within A  Response
1           1                    6.0, 6.1
1           2                    8.6, 7.1, 6.5, 7.4
2           1                    9.4, 9.9
2           2                    9.5, 7.5
2           3                    6.4, 9.1, 8.7
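For orientation (standard facts about this nested random effects model, not part of the original assignment), independence of the α's, β's and ε's implies that every observation has mean µ and that

```latex
\operatorname{Var}(y_{ijk}) = \sigma_\alpha^2 + \sigma_\beta^2 + \sigma^2, \qquad
\operatorname{Cov}(y_{ijk}, y_{ijk'}) = \sigma_\alpha^2 + \sigma_\beta^2 \quad (k \ne k'), \qquad
\operatorname{Cov}(y_{ijk}, y_{ij'k'}) = \sigma_\alpha^2 \quad (j \ne j') ,
```

with observations from different levels of A independent. This is the covariance structure that lme fits below.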

Again after adding the MASS and nlme libraries to your R session (and specifying the use of the sum restrictions), do the following.

a) Enter your data by typing

> y<-c(6.0,6.1,8.6,7.1,6.5,7.4,9.4,9.9,9.5,7.5,6.4,9.1,8.7)
> A<-c(1,1,1,1,1,1,2,2,2,2,2,2,2)
> B<-c(1,1,2,2,2,2,1,1,2,2,3,3,3)

Run and view some summaries of a random effects/mixed effects analysis of these data by typing

> lmef.out<-lme(y~1,random=~1|A/B)
> summary(lmef.out)

Get approximate confidence intervals for the parameters of the mixed model by typing

> intervals(lmef.out)

Compute an exact confidence interval for σ based on the mean square error (or pooled variance from the 5 samples of sizes 2, 4, 2, 2, and 3). How do these limits compare to what R provides in this analysis?

b) A fixed effects analysis can be made here and some summaries viewed by typing

> AA<-as.factor(A)
> BB<-as.factor(B)
> lmf.out<-lm(y~1+AA/BB)
> summary(lmf.out)

(Note that the estimate of σ produced by this analysis is exactly the one based on the mean square error referred to in a).) Type

> predict(lmef.out)
> predict(lmf.out)

Identify the predictions from the fixed effects model as simple functions of the data values. (What are these predictions?) Notice that the predictions from the mixed effects analysis are substantially different from those based on the fixed effects model. Where do the mixed effects predictions fall relative to the fixed effects predictions and the mixed effects estimate of µ?

3. Consider the scenario of the example beginning on page 1190 of Neter, Wasserman, and friends. This supposedly concerns sales of pairs of athletic shoes under two different advertising campaigns in several test markets. The data from Table 29.10 are below.

Time Period (A)  Ad Campaign (B)  Test Market (C(B))  Locality  Response (Sales?)
1 1 1 1 958
1 1 2 2 1,005
1 1 3 3 351
1 1 4 4 549
1 1 5 5 730
1 2 1 6 780
1 2 2 7 9 1
1 2 3 8 883
1 2 4 9 64
1 2 5 10 375
2 1 1 1 1,047
2 1 2 2 1,1
2 1 3 3 436
2 1 4 4 63
2 1 5 5 784
2 2 1 6 897
2 2 2 7 75
2 2 3 8 964
2 2 4 9 695
2 2 5 10 436
3 1 1 1 933
3 1 2 2 986
3 1 3 3 339
3 1 4 4 51
3 1 5 5 707
3 2 1 6 718
3 2 2 7 0
3 2 3 8 817
3 2 4 9 599
3 2 5 10 351

Enter these data into R as 5 vectors of length 30 called time, ad, test, loc, y. Notice that each combination of ad campaign level and an index value for test market for that campaign is associated with a different level of the locality variable. (Making the locality vector is the only device Vardeman was able to come up with to get the right mixed effects model run.) We consider an analysis of these data based on the split plot / repeated measures mixed effects model

y_ijk = µ + α_i + β_j + αβ_ij + γ_jk + ε_ijk

for y_ijk the response at time i from the kth test market under ad campaign j, where the random effects are the γ_jk's and the ε_ijk's.

Again after adding the MASS and nlme libraries to your R session (and specifying the use of the sum restrictions), do the following:

Type

> TIME<-as.factor(time)
> AD<-as.factor(ad)
> TEST<-as.factor(test)
> LOC<-as.factor(loc)

a) Neter, Wasserman, and friends produce a special ANOVA table for these data. An ANOVA table with the same sums of squares and degrees of freedom can be obtained in R using fixed effects calculations by typing

> lmf1.out<-lm(y~1+TIME*(AD/TEST))
> anova(lmf1.out)

Identify the sums of squares and degrees of freedom in this table by the names used in class (SSA, SSB, SSAB, SSC(B), SSE, etc.). Then run a fictitious 3-way full factorial analysis of these data by typing

> lmf2.out<-lm(y~1+TIME*AD*TEST)
> anova(lmf2.out)

Verify that the sums of squares and degrees of freedom in the first ANOVA table can be gotten from this second table. (Show how.)

b) A mixed model analysis in R can be gotten by typing

> lmef.out<-lme(y~1+TIME*AD,random=~1|LOC)
> summary(lmef.out)
> intervals(lmef.out)
> predict(lmef.out)

What are 95% confidence intervals for σ_γ and σ read from this analysis?

c) Make exact 95% confidence limits for σ based on the chi-square distribution related to MSE. Then use the Cochran-Satterthwaite approximation to make 95% limits for σ_γ. How do these two sets of limits compare to what R produced in part b)?

d) Make 95% confidence limits for the difference in Time 1 and Time 2 main effects. Make 95% confidence limits for the difference in Ad Campaign 1 and Ad Campaign 2 main effects. How can an interval for this second quantity be obtained nearly directly from the intervals(lmef.out) call made above?

e) How would you predict the Time 2 sales result for Ad Campaign 2 in another (untested) market? (What value would you use?)

4. A data set in Statistical Quality Control in the Chemical Laboratory by G. Wernimont (that was reprinted in Quality Engineering in 1989) describes measured copper contents (in %) derived from 2 determinations on each of 2 specimens cut from each of 11 bronze castings. An ANOVA table for the resulting balanced hierarchical data set follows.

Source          SS      df
Castings        3.2031  10
Specimens       1.9003  11
Determinations  0.0351  22
Total           5.1385  43

Use the same mixed effects model as in problem 2 and do the following.

a) Make approximate 95% confidence intervals for each of the 3 standard deviations σ_α, σ_β and σ. Based on these, where do you judge the largest part of variation in measured copper content to come from? (From differences between determinations on a given specimen? From differences between specimens from a given casting? From differences between castings?)

b) The grand average of all 44 copper content measurements was in fact ȳ··· = 85.568 (%). Use this fact and make a 95% confidence interval for the model parameter µ.
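For reference (standard results for a balanced two-stage nested design with 11 castings, 2 specimens per casting and 2 determinations per specimen; not part of the original problem statement), the expected mean squares here are

```latex
E(MS_{\text{Castings}}) = \sigma^2 + 2\sigma_\beta^2 + 4\sigma_\alpha^2, \qquad
E(MS_{\text{Specimens}}) = \sigma^2 + 2\sigma_\beta^2, \qquad
E(MS_{\text{Determinations}}) = \sigma^2 ,
```

so ANOVA-based point estimates are σ̂² = MS_Determinations, σ̂_β² = (MS_Specimens − MS_Determinations)/2 and σ̂_α² = (MS_Castings − MS_Specimens)/4, with approximate intervals for σ_β and σ_α available from the Cochran–Satterthwaite approximation. For b), Var(ȳ···) = (4σ_α² + 2σ_β² + σ²)/44 = E(MS_Castings)/44, so limits of the form ȳ··· ± t(.975; 10)·√(MS_Castings/44) are one natural choice.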