Chapter 6. Supplemental Text Material

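S6-1. Factor Effect Estimates are Least Squares Estimates

We have given heuristic or intuitive explanations of how the estimates of the factor effects are obtained in the textbook. Also, it has been pointed out that in the regression model representation of the $2^k$ factorial, the regression coefficients are exactly one-half the effect estimates. It is straightforward to show that the model coefficients (and hence the effect estimates) are least squares estimates.

Consider a $2^2$ factorial. The regression model is

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{12} x_{i1} x_{i2} + \varepsilon_i$$

The data for the $2^2$ experiment are shown in the following table:

Run, i   x_i1   x_i2   x_i1 x_i2   Response total
1        -1     -1     +1          (1)
2        +1     -1     -1          a
3        -1     +1     -1          b
4        +1     +1     +1          ab

The least squares estimates of the model parameters $\beta$ are chosen to minimize the sum of the squares of the model errors:

$$L = \sum_{i=1}^{4} \left( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \beta_{12} x_{i1} x_{i2} \right)^2$$

It is straightforward to show that the least squares normal equations are

$$\begin{aligned}
4\hat{\beta}_0 + \hat{\beta}_1 \sum x_{i1} + \hat{\beta}_2 \sum x_{i2} + \hat{\beta}_{12} \sum x_{i1} x_{i2} &= (1) + a + b + ab \\
\hat{\beta}_0 \sum x_{i1} + \hat{\beta}_1 \sum x_{i1}^2 + \hat{\beta}_2 \sum x_{i1} x_{i2} + \hat{\beta}_{12} \sum x_{i1}^2 x_{i2} &= -(1) + a - b + ab \\
\hat{\beta}_0 \sum x_{i2} + \hat{\beta}_1 \sum x_{i1} x_{i2} + \hat{\beta}_2 \sum x_{i2}^2 + \hat{\beta}_{12} \sum x_{i1} x_{i2}^2 &= -(1) - a + b + ab \\
\hat{\beta}_0 \sum x_{i1} x_{i2} + \hat{\beta}_1 \sum x_{i1}^2 x_{i2} + \hat{\beta}_2 \sum x_{i1} x_{i2}^2 + \hat{\beta}_{12} \sum x_{i1}^2 x_{i2}^2 &= (1) - a - b + ab
\end{aligned}$$

where every sum runs over $i = 1, \ldots, 4$. Now since $\sum x_{i1} = \sum x_{i2} = \sum x_{i1} x_{i2} = \sum x_{i1}^2 x_{i2} = \sum x_{i1} x_{i2}^2 = 0$ because the design is orthogonal, and $\sum x_{i1}^2 = \sum x_{i2}^2 = \sum x_{i1}^2 x_{i2}^2 = 4$, the normal equations reduce to a very simple form: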

$$\begin{aligned}
4\hat{\beta}_0 &= (1) + a + b + ab \\
4\hat{\beta}_1 &= -(1) + a - b + ab \\
4\hat{\beta}_2 &= -(1) - a + b + ab \\
4\hat{\beta}_{12} &= (1) - a - b + ab
\end{aligned}$$

The solution is

$$\begin{aligned}
\hat{\beta}_0 &= \frac{(1) + a + b + ab}{4} \\
\hat{\beta}_1 &= \frac{-(1) + a - b + ab}{4} \\
\hat{\beta}_2 &= \frac{-(1) - a + b + ab}{4} \\
\hat{\beta}_{12} &= \frac{(1) - a - b + ab}{4}
\end{aligned}$$

These regression model coefficients are exactly one-half the factor effect estimates. Therefore, the effect estimates are least squares estimates. We will show this in a more general manner in Chapter 10.

S6-2. Yates's Method for Calculating Effect Estimates

While we typically use a computer program for the statistical analysis of a $2^k$ design, there is a very simple technique devised by Yates (1937) for estimating the effects and determining the sums of squares in a $2^k$ factorial design. The procedure is occasionally useful for manual calculations, and is best learned through the study of a numerical example.

Consider the data for the $2^3$ design in Example 6-1. These data have been entered in Table 1 below. The treatment combinations are always written down in standard order, and the column labeled "Response" contains the corresponding observation (or total of all observations) at that treatment combination. The first half of column (1) is obtained by adding the responses in adjacent pairs. The second half of column (1) is obtained by changing the sign of the first entry in each of the pairs in the Response column and adding the adjacent pairs. For example, in column (1) we obtain for the fifth entry 5 = -(-4) + 1, for the sixth entry 6 = -(-1) + 5, and so on.

Column (2) is obtained from column (1) just as column (1) is obtained from the Response column. Column (3) is obtained from column (2) similarly. In general, for a $2^k$ design we would construct k columns of this type. Column (3) [in general, column (k)] is the contrast for the effect designated at the beginning of the row. To obtain the estimate of the effect, we divide the entries in column (3) by $n2^{k-1}$ (in our example, $n2^{k-1} = 8$). Finally, the sums of squares for the effects are obtained by squaring the entries in column (3) and dividing by $n2^k$ (in our example, $n2^k = (2)2^3 = 16$).
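The column-by-column procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the text: the function name `yates` and its arguments are hypothetical, and the treatment totals are those of Example 6-1 in standard order, with n = 2 replicates.

```python
def yates(totals, n_reps):
    """Yates's algorithm for 2^k treatment totals listed in standard order.

    Returns the final contrast column, the effect estimates and the sums of
    squares (the last two skip the first row, which holds the grand total).
    """
    size = len(totals)
    if size < 2 or size & (size - 1):
        raise ValueError("number of treatment combinations must be 2^k")
    k = size.bit_length() - 1

    column = list(totals)
    for _ in range(k):
        # First half: sums of adjacent pairs.
        sums = [column[i] + column[i + 1] for i in range(0, size, 2)]
        # Second half: second entry minus first entry of each pair.
        diffs = [column[i + 1] - column[i] for i in range(0, size, 2)]
        column = sums + diffs

    effects = [c / (n_reps * 2 ** (k - 1)) for c in column[1:]]
    sums_of_squares = [c ** 2 / (n_reps * 2 ** k) for c in column[1:]]
    return column, effects, sums_of_squares


# Treatment totals in standard order: (1), a, b, ab, c, ac, bc, abc.
totals = [-4, 1, -1, 5, -1, 3, 2, 11]
contrasts, effects, sums_of_squares = yates(totals, n_reps=2)

labels = ["A", "B", "AB", "C", "AC", "BC", "ABC"]
for label, effect, ss in zip(labels, effects, sums_of_squares):
    print(f"{label:>3}  effect = {effect:5.2f}  SS = {ss:6.2f}")
```

The first entry of `contrasts` is the grand total of the observations; the remaining entries follow the standard order A, B, AB, C, AC, BC, ABC.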

Table 1. Yates's Algorithm for the Data in Example 6-1

Treatment                                          Estimate of Effect   Sum of Squares
Combination   Response   (1)   (2)   (3)   Effect  (3) / n2^(k-1)       (3)^2 / n2^k
(1)             -4        -3     1    16   ---     ---                  ---
a                1         4    15    24   A       3.00                 36.00
b               -1         2    11    18   B       2.25                 20.25
ab               5        13    13     6   AB      0.75                  2.25
c               -1         5     7    14   C       1.75                 12.25
ac               3         6    11     2   AC      0.25                  0.25
bc               2         4     1     4   BC      0.50                  1.00
abc             11         9     5     4   ABC     0.50                  1.00

The estimates of the effects and sums of squares obtained by Yates's algorithm for the data in Example 6-1 are in agreement with the results found there by the usual methods. Note that the entry in column (3) [in general, column (k)] for the row corresponding to (1) is always equal to the grand total of the observations.

In spite of its apparent simplicity, it is notoriously easy to make numerical errors in Yates's algorithm, and we should be extremely careful in executing the procedure. As a partial check on the computations, we may use the fact that the sum of the squares of the elements in the jth column is $2^j$ times the sum of the squares of the elements in the response column. Note, however, that this check cannot detect errors in sign in column j. See Davies (1956), Good (1955, 1958), Kempthorne (1952), and Rayner (1967) for other error-checking techniques.

S6-3. A Note on the Variance of a Contrast

In analyzing $2^k$ factorial designs, we frequently construct a normal probability plot of the factor effect estimates and visually select a tentative model by identifying the effects that appear large. These effect estimates are typically relatively far from the straight line passing through the remaining plotted effects. This method works nicely when (1) there are not many significant effects, and (2) all effect estimates have the same variance. It turns out that all contrasts computed from a $2^k$ design (and hence all effect estimates) have the same variance even if the individual observations have different variances. This statement can be easily demonstrated.

Suppose that we have conducted a $2^k$ design and have responses $y_1, y_2, \ldots, y_{2^k}$, and let the variance of each observation be $\sigma_1^2, \sigma_2^2, \ldots, \sigma_{2^k}^2$, respectively. Now each effect estimate is a linear combination of the observations, say

$$\text{Effect} = \frac{1}{2^k} \sum_{i=1}^{2^k} c_i y_i$$

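where the contrast constants $c_i$ are all either $-1$ or $+1$. Therefore, the variance of an effect estimate is

$$V(\text{Effect}) = \frac{1}{(2^k)^2} \sum_{i=1}^{2^k} c_i^2 V(y_i) = \frac{1}{(2^k)^2} \sum_{i=1}^{2^k} c_i^2 \sigma_i^2 = \frac{1}{(2^k)^2} \sum_{i=1}^{2^k} \sigma_i^2$$

because $c_i^2 = 1$. Therefore, all contrasts have the same variance. If each observation $y_i$ in the above equations is the total of n replicates at each design point, the result still holds.

S6-4. The Variance of the Predicted Response

Suppose that we have conducted an experiment using a $2^k$ factorial design. We have fit a regression model to the resulting data and are going to use the model to predict the response at locations of interest inside the design space $-1 \le x_i \le +1$, $i = 1, 2, \ldots, k$. What is the variance of the predicted response at the point of interest, say $\mathbf{x}' = [x_1, x_2, \ldots, x_k]$? Problem 6-3 asks the reader to answer this question, and while the answer is given in the Instructors Resource CD, we also give the answer here because it is useful information.

Assume that the design is balanced and every treatment combination is replicated n times. Since the design is orthogonal, it is easy to find the variance of the predicted response. We consider the case where the experimenters have fit a main effects only model, say

$$\hat{y}(\mathbf{x}) \equiv \hat{y} = \hat{\beta}_0 + \sum_{i=1}^{k} \hat{\beta}_i x_i$$

Now recall that the variance of a model regression coefficient is

$$V(\hat{\beta}) = \frac{\sigma^2}{n2^k} = \frac{\sigma^2}{N}$$

where N is the total number of runs in the design. The variance of the predicted response is

$$\begin{aligned}
V[\hat{y}(\mathbf{x})] &= V\left( \hat{\beta}_0 + \sum_{i=1}^{k} \hat{\beta}_i x_i \right) \\
&= V(\hat{\beta}_0) + \sum_{i=1}^{k} V(\hat{\beta}_i x_i) \\
&= V(\hat{\beta}_0) + \sum_{i=1}^{k} x_i^2 V(\hat{\beta}_i) \\
&= \frac{\sigma^2}{N} + \frac{\sigma^2}{N} \sum_{i=1}^{k} x_i^2 \\
&= \frac{\sigma^2}{N} \left( 1 + \sum_{i=1}^{k} x_i^2 \right)
\end{aligned}$$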

In the above development we have used the fact that the design is orthogonal, so there are no nonzero covariance terms when the variance operator is applied.

The Design-Expert software program plots contours of the standard deviation of the predicted response; that is, the square root of the above expression. If the design has already been conducted and analyzed, the program replaces $\sigma^2$ with the error mean square, so that the plotted quantity becomes

$$\sqrt{V[\hat{y}(\mathbf{x})]} = \sqrt{\frac{MS_E}{N} \left( 1 + \sum_{i=1}^{k} x_i^2 \right)}$$

If the design has been constructed but the experiment has not been performed, then the software plots (on the design evaluation menu) the quantity

$$\sqrt{\frac{V[\hat{y}(\mathbf{x})]}{\sigma^2}} = \sqrt{\frac{1}{N} \left( 1 + \sum_{i=1}^{k} x_i^2 \right)}$$

which can be thought of as a standardized standard deviation of prediction. To illustrate, consider a $2^2$ with n = 3 replicates, the first example in Section 6-2. The plot of the standardized standard deviation of the predicted response is shown below.

[Figure: DESIGN-EXPERT contour plot of StdErr of Design for the $2^2$ design with n = 3 replicates; X = A: A, Y = B: B, design points marked at the corners.]

The contours of constant standardized standard deviation of predicted response should be exactly circular, and they should be a maximum within the design region at the point $x_1 = \pm 1$ and $x_2 = \pm 1$. The maximum value is

$$\sqrt{\frac{V[\hat{y}(\mathbf{x} = \mathbf{1})]}{\sigma^2}} = \sqrt{\frac{1}{12} \left( 1 + (1)^2 + (1)^2 \right)} = \sqrt{\frac{3}{12}} = 0.5$$

This is also shown on the graph at the corners of the square.

Plots of the standardized standard deviation of the predicted response can be useful in comparing designs. For example, suppose the experimenter in the above situation is considering adding a fourth replicate to the design. The maximum standardized prediction standard deviation in the region now becomes

$$\sqrt{\frac{V[\hat{y}(\mathbf{x} = \mathbf{1})]}{\sigma^2}} = \sqrt{\frac{1}{16} \left( 1 + (1)^2 + (1)^2 \right)} = \sqrt{\frac{3}{16}} = 0.433$$

The plot of the standardized prediction standard deviation is shown below.

[Figure: DESIGN-EXPERT contour plot of StdErr of Design for the $2^2$ design with n = 4 replicates; X = A: A, Y = B: B, design points marked at the corners.]
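The two maximum values quoted above can be reproduced from the expression for the standardized standard deviation of prediction, the square root of $(1 + \sum x_i^2)/N$. The following is a minimal sketch under the same assumptions as the text (main effects only model, orthogonal $2^k$ design); the function name `standardized_pred_sd` is hypothetical.

```python
import math


def standardized_pred_sd(x, n_runs):
    """sqrt(V[y_hat(x)] / sigma^2) for a main effects model in an orthogonal 2^k design."""
    return math.sqrt((1.0 + sum(xi ** 2 for xi in x)) / n_runs)


corner = (1.0, 1.0)   # a corner of the square, where the maximum occurs
center = (0.0, 0.0)   # the design center, where the minimum occurs

sd_three_reps = standardized_pred_sd(corner, n_runs=3 * 2 ** 2)  # N = 12
sd_four_reps = standardized_pred_sd(corner, n_runs=4 * 2 ** 2)   # N = 16

print(f"n = 3: corner {sd_three_reps:.3f}, center {standardized_pred_sd(center, 12):.3f}")
print(f"n = 4: corner {sd_four_reps:.3f}, center {standardized_pred_sd(center, 16):.3f}")
```

Because the value depends on x only through $x_1^2 + x_2^2$, points at equal distance from the center give equal values, which is why the contours are circular.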

Notice that adding another replicate has reduced the maximum standardized prediction variance from $(0.5)^2 = 0.25$ to $(0.433)^2 = 0.1875$. Comparing the two plots shown above reveals that the standardized prediction standard deviation is uniformly lower throughout the design region when an additional replicate is run.

Sometimes we like to compare designs in terms of scaled prediction variance, defined as

$$\frac{N V[\hat{y}(\mathbf{x})]}{\sigma^2}$$

This allows us to evaluate designs that have different numbers of runs. Since adding replicates (or runs) to a design will generally make the prediction variance smaller, the scaled prediction variance allows us to examine the prediction variance on a per-observation basis. Note that for a $2^k$ factorial and the main effects only model we have been considering, the scaled prediction variance is

$$\frac{N V[\hat{y}(\mathbf{x})]}{\sigma^2} = 1 + \sum_{i=1}^{k} x_i^2 = 1 + \rho^2$$

where $\rho^2$ is the squared distance of the design point where prediction is required from the center of the design space ($\mathbf{x} = \mathbf{0}$). Notice that the $2^k$ design achieves this scaled prediction variance regardless of the number of replicates. The maximum value that the scaled prediction variance can have over the design region is

$$\max_{\mathbf{x}} \frac{N V[\hat{y}(\mathbf{x})]}{\sigma^2} = 1 + k$$

It can be shown that no other design over this region can achieve a smaller maximum scaled prediction variance, so the $2^k$ design is in some sense an optimal design. We will discuss optimal designs more in Chapter 11.

S6-5. Using Residuals to Identify Dispersion Effects

We illustrated in Example 6-4 (Section 6-5 on unreplicated designs) that plotting the residuals from the regression model versus each of the design factors was a useful way to check for the possibility of dispersion effects. These are factors that influence the variability of the response, but which have little effect on the mean. A method for computing a measure of the dispersion effect for each design factor and interaction that can be evaluated on a normal probability plot was also given. However, we noted that these residual analyses are fairly sensitive to correct specification of the location model. That is, if we leave important factors out of the regression model that describes the mean response, then the residual plots may be unreliable.
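The dispersion measure mentioned above is not restated in this supplement. As a minimal sketch, assuming it takes the log-ratio form $F^* = \ln[S^2(+)/S^2(-)]$, where $S^2(+)$ and $S^2(-)$ are the sample variances of the residuals at the high and low levels of a factor or interaction column, it could be computed as follows. The function name and the residuals are hypothetical and are not data from the text.

```python
import math


def dispersion_statistic(levels, residuals):
    """Log ratio of residual sample variances at the +1 and -1 levels of one column.

    Assumed form: ln(S^2(+) / S^2(-)). Values far from zero suggest a dispersion effect.
    """
    def sample_variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    high = [e for sign, e in zip(levels, residuals) if sign > 0]
    low = [e for sign, e in zip(levels, residuals) if sign < 0]
    return math.log(sample_variance(high) / sample_variance(low))


# Hypothetical residuals from an 8-run design (illustration only).
column_b = [-1, -1, 1, 1, -1, -1, 1, 1]
residuals = [0.2, -0.1, 1.5, -1.8, -0.3, 0.1, 2.1, -1.7]

f_star = dispersion_statistic(column_b, residuals)
print(f"dispersion statistic for this column: {f_star:.3f}")
```

In this made-up data the residuals spread out much more at the high level of the column, so the statistic is large and positive. The residuals themselves come from the fitted location model, so the statistic is only as reliable as that model.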
To illustrate, reconsider Example 6-4, and suppose that we leave out one of the important factors, C = Resin flow. If we use this incorrect model, then the plots of the residuals versus the design factors look rather different than they did with the original, correct model. In particular, the plot of residuals versus factor D = Closing time is shown below.

[Figure: DESIGN-EXPERT plot of studentized residuals versus closing time; response: Defects.]

This plot indicates that factor D has a potential dispersion effect. However, the normal probability plot of the dispersion statistic $F_i^*$ in Figure 6-8 clearly reveals that factor B is the only factor that has an effect on dispersion. Therefore, if you are going to use model residuals to search for dispersion effects, it is really important to select the right model for the location effects.

S6-6. Center Points versus Replication of Factorial Points

In some design problems an experimenter may have a choice of replicating the corner or cube points in a $2^k$ factorial, or placing replicate runs at the design center. For example, suppose our choice is between a $2^2$ with n = 2 replicates at each corner of the square, or a single replicate of the $2^2$ with $n_c = 4$ center points.

We can compare these designs in terms of prediction variance. Suppose that we plan to fit the first-order or main effects only model

$$\hat{y}(\mathbf{x}) \equiv \hat{y} = \hat{\beta}_0 + \sum_{i=1}^{2} \hat{\beta}_i x_i$$

If we use the replicated design, the scaled prediction variance is (see Section S6-4 above):

$$\frac{N V[\hat{y}(\mathbf{x})]}{\sigma^2} = 1 + \sum_{i=1}^{2} x_i^2 = 1 + \rho^2$$

Now consider the prediction variance when the design with center points is used. We have

$$\begin{aligned}
V[\hat{y}(\mathbf{x})] &= V\left( \hat{\beta}_0 + \sum_{i=1}^{2} \hat{\beta}_i x_i \right) \\
&= V(\hat{\beta}_0) + \sum_{i=1}^{2} V(\hat{\beta}_i x_i) \\
&= V(\hat{\beta}_0) + \sum_{i=1}^{2} x_i^2 V(\hat{\beta}_i) \\
&= \frac{\sigma^2}{8} + \frac{\sigma^2}{4} \sum_{i=1}^{2} x_i^2 \\
&= \frac{\sigma^2}{8} \left( 1 + 2\rho^2 \right)
\end{aligned}$$

Therefore, the scaled prediction variance for the design with center points is

$$\frac{N V[\hat{y}(\mathbf{x})]}{\sigma^2} = 1 + 2\rho^2$$

Clearly, replicating the corners in this example outperforms the strategy of replicating center points, at least in terms of scaled prediction variance. At the corners of the square, the scaled prediction variance for the replicated factorial is

$$1 + \rho^2 = 1 + 2 = 3$$

while for the factorial design with center points it is

$$1 + 2\rho^2 = 1 + 2(2) = 5$$

However, prediction variance might not tell the complete story. If we only replicate the corners of the square, we have no way to judge the lack of fit of the model. If the design has center points, we can check for the presence of pure quadratic (second-order) terms, so the design with center points is likely to be preferred if the experimenter is at all uncertain about the order of the model he or she should be using.
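The comparison in this section can be checked numerically by evaluating $N \mathbf{x}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}$ for each design, which equals the scaled prediction variance for the first-order model. The sketch below is illustrative only: the function name is hypothetical, and it relies on the assumption (checked in the code) that $\mathbf{X}'\mathbf{X}$ is diagonal, so the inverse is taken element by element.

```python
from itertools import product


def scaled_prediction_variance(design, x):
    """N * V[y_hat(x)] / sigma^2 for the model y = b0 + sum(b_i * x_i).

    Assumes the model matrix has orthogonal columns, so X'X is diagonal.
    """
    rows = [(1.0, *run) for run in design]   # model matrix rows: intercept, x1, ..., xk
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    if any(xtx[i][j] != 0 for i in range(p) for j in range(p) if i != j):
        raise ValueError("X'X is not diagonal for this design and model")

    point = (1.0, *x)
    variance = sum(point[j] ** 2 / xtx[j][j] for j in range(p))   # x'(X'X)^{-1}x
    return len(rows) * variance


corners = list(product((-1.0, 1.0), repeat=2))
replicated = corners * 2                      # 2^2 with n = 2 replicates, N = 8
with_centers = corners + [(0.0, 0.0)] * 4     # one replicate plus 4 center points, N = 8

for name, design in (("replicated corners", replicated), ("center points", with_centers)):
    at_corner = scaled_prediction_variance(design, (1.0, 1.0))
    at_center = scaled_prediction_variance(design, (0.0, 0.0))
    print(f"{name}: corner = {at_corner}, center = {at_center}")
```

Both designs give a scaled prediction variance of 1 at the design center; they differ only away from the center, where the center-point design has less information about the slopes.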

S6-7. Testing for Pure Quadratic Curvature using a t-test

In Section 6-6 of the textbook we discuss the addition of center points to a $2^k$ factorial design. This is a very useful idea as it allows an estimate of pure error to be obtained even though the factorial design points are not replicated, and it permits the experimenter to obtain an assessment of model adequacy with respect to certain second-order terms. Specifically, we present an F-test for the hypotheses

$$\begin{aligned}
H_0 &: \beta_{11} + \beta_{22} + \cdots + \beta_{kk} = 0 \\
H_1 &: \beta_{11} + \beta_{22} + \cdots + \beta_{kk} \neq 0
\end{aligned}$$

An equivalent t-statistic can also be employed to test these hypotheses. Some computer software programs report the t-test instead of (or in addition to) the F-test. It is not difficult to develop the t-test and to show that it is equivalent to the F-test.

Suppose that the appropriate model for the response is a complete quadratic polynomial and that the experimenter has conducted an unreplicated full $2^k$ factorial design with $n_F$ design points plus $n_C$ center points. Let $\bar{y}_F$ and $\bar{y}_C$ represent the averages of the responses at the factorial and center points, respectively. Also let $\hat{\sigma}^2$ be the estimate of the variance obtained using the center points. It is easy to show that

$$E(\bar{y}_F) = \frac{1}{n_F} \left( n_F \beta_0 + n_F \beta_{11} + n_F \beta_{22} + \cdots + n_F \beta_{kk} \right) = \beta_0 + \beta_{11} + \beta_{22} + \cdots + \beta_{kk}$$

and

$$E(\bar{y}_C) = \frac{1}{n_C} \left( n_C \beta_0 \right) = \beta_0$$

Therefore,

$$E(\bar{y}_F - \bar{y}_C) = \beta_{11} + \beta_{22} + \cdots + \beta_{kk}$$

and so we see that the difference in averages $\bar{y}_F - \bar{y}_C$ is an unbiased estimator of the sum of the pure quadratic model parameters. Now the variance of $\bar{y}_F - \bar{y}_C$ is

$$V(\bar{y}_F - \bar{y}_C) = \sigma^2 \left( \frac{1}{n_F} + \frac{1}{n_C} \right)$$

Consequently, a test of the above hypotheses can be conducted using the statistic

$$t_0 = \frac{\bar{y}_F - \bar{y}_C}{\sqrt{\hat{\sigma}^2 \left( \dfrac{1}{n_F} + \dfrac{1}{n_C} \right)}}$$

which under the null hypothesis follows a t distribution with $n_C - 1$ degrees of freedom. We would reject the null hypothesis (that is, no pure quadratic curvature) if $|t_0| > t_{\alpha/2,\, n_C - 1}$.

This t-test is equivalent to the F-test given in the book. To see this, square the t-statistic above:

$$t_0^2 = \frac{(\bar{y}_F - \bar{y}_C)^2}{\hat{\sigma}^2 \left( \dfrac{1}{n_F} + \dfrac{1}{n_C} \right)} = \frac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{(n_F + n_C)\, \hat{\sigma}^2}$$

This ratio is computationally identical to the F-test presented in the textbook. Furthermore, we know that the square of a t random variable with (say) v degrees of freedom is an F random variable with 1 numerator and v denominator degrees of freedom, so the t-test for pure quadratic effects is indeed equivalent to the F-test.

Supplemental References

Good, I. J. (1955). The Interaction Algorithm and Practical Fourier Analysis. Journal of the Royal Statistical Society, Series B, Vol. 20, pp. 361-372.

Good, I. J. (1958). Addendum to The Interaction Algorithm and Practical Fourier Analysis. Journal of the Royal Statistical Society, Series B, Vol. 22, pp. 372-375.

Rayner, A. A. (1967). The Square Summing Check on the Main Effects and Interactions in a $2^n$ Experiment as Calculated by Yates's Algorithm. Biometrics, Vol. 23, pp. 571-573.