Chapter 10. Supplemental Text Material

S10-1. The Covariance Matrix of the Regression Coefficients

In Section 10-3 of the textbook, we show that the least squares estimator of \beta in the linear regression model y = X\beta + \epsilon,

\hat{\beta} = (X'X)^{-1}X'y

is an unbiased estimator. We also give the result that the covariance matrix of \hat{\beta} is \sigma^2 (X'X)^{-1} (see Equation 10-18). This last result is relatively straightforward to show. Consider

V(\hat{\beta}) = V[(X'X)^{-1}X'y]

The quantity (X'X)^{-1}X' is just a matrix of constants, and y is a vector of random variables. Now recall that the variance of the product of a scalar constant and a scalar random variable is equal to the square of the constant times the variance of the random variable. The matrix equivalent of this is

V(\hat{\beta}) = V[(X'X)^{-1}X'y] = (X'X)^{-1}X' V(y) [(X'X)^{-1}X']'

Now the variance of y is \sigma^2 I, where I is an n \times n identity matrix. Therefore, this last equation becomes

V(\hat{\beta}) = (X'X)^{-1}X' V(y) [(X'X)^{-1}X']'
= \sigma^2 (X'X)^{-1}X' [(X'X)^{-1}X']'
= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1}
= \sigma^2 (X'X)^{-1}

We have used the result from matrix algebra that the transpose of a product of matrices is the product of the transposes in reverse order, together with the fact that (X'X)^{-1} is symmetric and therefore equal to its own transpose.

S10-2. Regression Models and Designed Experiments

In Examples 10-2 through 10-5 we illustrate several uses of regression methods in fitting models to data from designed experiments. Consider Example 10-2, which presents the regression model for the main effects from a 2^3 factorial design with four center runs. Since the (X'X)^{-1} matrix is diagonal (the design is orthogonal), all covariance terms between the regression coefficients are zero. Furthermore, the variances of the regression coefficients are

V(\hat{\beta}_0) = \sigma^2/12 = 0.0833\sigma^2
V(\hat{\beta}_i) = \sigma^2/8 = 0.125\sigma^2, \quad i = 1, 2, 3
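As a quick numerical check of the result V(\hat{\beta}) = \sigma^2 (X'X)^{-1}, the following sketch builds the X matrix for an orthogonal first-order design of the type discussed above (a 2^3 factorial plus four center runs) and evaluates the covariance matrix directly. This is an illustration written for this supplement, not the textbook's code; Python with NumPy is assumed.

# Sketch: verify V(beta-hat) = sigma^2 (X'X)^{-1} for the orthogonal
# first-order design described above (2^3 factorial plus 4 center runs).
import numpy as np
from itertools import product

corners = np.array(list(product([-1.0, 1.0], repeat=3)))   # 8 factorial runs
centers = np.zeros((4, 3))                                 # 4 center runs
X = np.column_stack([np.ones(12), np.vstack([corners, centers])])

sigma2 = 1.0                                  # take sigma^2 = 1 for illustration
cov_beta = sigma2 * np.linalg.inv(X.T @ X)    # covariance matrix of beta-hat
print(np.round(cov_beta, 4))
# Diagonal elements are 1/12 = 0.0833 and 1/8 = 0.125; every off-diagonal
# (covariance) element is zero because the design is orthogonal.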
In Example 10-3, we reconsider this same problem but assume that one of the original 12 observations is missing. It turns out that the estimates of the regression coefficients do not change very much when the remaining 11 observations are used to fit the first-order model, but the (X'X)^{-1} matrix reveals that the missing observation has had a moderate effect on the variances and covariances of the model coefficients. The variances of the regression coefficients are now larger, and there are some moderately large covariances between the estimated model coefficients. Example 10-4, which investigates the impact of inaccurate design factor levels, exhibits similar results. Generally, as soon as we depart from an orthogonal design, either intentionally or by accident (as in these two examples), the variances of the regression coefficients will increase, and there can be rather large covariances between certain regression coefficients. In both of the examples in the textbook, the covariances are not terribly large and would not likely cause any problems in the interpretation of the experimental results.

S10-3. Adjusted R^2

In several places in the textbook, we have remarked that the adjusted R^2 statistic is preferable to the ordinary R^2, because it is not a monotonically non-decreasing function of the number of variables in the model. From Equation (10-27), note that

R^2_{Adj} = 1 - \frac{SS_E / df_E}{SS_T / df_T} = 1 - \frac{MS_E}{SS_T / df_T}

Now the mean square in the denominator of the ratio is constant, but MS_E will change as variables are added to or removed from the model. In general, the adjusted R^2 will increase when a variable is added to a regression model only if the error mean square decreases. The error mean square will decrease only if the added variable reduces the residual sum of squares by enough to offset the loss of one degree of freedom for error. Thus the added variable must reduce the residual sum of squares by an amount at least equal to the residual mean square of the immediately previous model; otherwise, the new model will have an adjusted R^2 value that is smaller than the adjusted R^2 statistic for the old model.
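To make this condition concrete, the sketch below fits a hypothetical simulated example (not from the textbook) in which the response depends on x1 only, and then adds an irrelevant candidate variable x2. The reduction in SS_E from adding x2 is typically smaller than the previous MS_E, so the adjusted R^2 typically falls; the final line checks that the two statements of the condition always agree.

# Sketch: adjusted R^2 falls exactly when the added variable reduces SS_E
# by less than the previous model's MS_E. Data and names are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 20
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                  # irrelevant candidate variable
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

def fit_stats(X, y):
    """Return (SS_E, MS_E, adjusted R^2) for an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_e = resid @ resid
    df_e = len(y) - X.shape[1]
    ss_t = np.sum((y - y.mean()) ** 2)
    return ss_e, ss_e / df_e, 1 - (ss_e / df_e) / (ss_t / (len(y) - 1))

ss_e1, ms_e1, adj1 = fit_stats(x1.reshape(-1, 1), y)
ss_e2, ms_e2, adj2 = fit_stats(np.column_stack([x1, x2]), y)
print(f"adjusted R^2: {adj1:.4f} -> {adj2:.4f}")
# The two inequalities are logically equivalent, so this always prints True:
print((ss_e1 - ss_e2 < ms_e1) == (adj2 < adj1))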
S10-4. Stepwise and Other Variable-Selection Methods in Regression

In the textbook treatment of regression, we concentrated on fitting the full regression model. Actually, in most applications of regression to data from designed experiments, the experimenter will have a very good idea about the form of the model he or she wishes to fit, either from an ANOVA or from examining a normal probability plot of effect estimates. There are, however, other situations where regression is applied to unplanned studies, in which the data may be observational data collected routinely on some process. The data may also be archival, obtained from some historian or library. These applications of regression frequently involve a moderately large or large set of candidate regressors, and the objective of the analyst is to fit a regression model to the best subset of these candidates. This can be a complex problem, as these unplanned data sets frequently have outliers, strong correlations between subsets of the variables, and other complicating features.

Several techniques have been developed for selecting the best subset regression model. Generally, these methods are either stepwise-type variable selection methods or all possible regressions. Stepwise-type methods build a regression model by adding or removing a variable at each step. The forward selection version of the procedure begins with a model containing none of the candidate variables and sequentially inserts variables into the model one at a time until a final equation is produced (a minimal sketch of this version appears at the end of this section). In backward elimination, the procedure begins with all candidate variables in the equation, and variables are then removed one at a time to produce a final equation. Stepwise regression usually consists of a combination of forward and backward stepping. There are many variations of these basic procedures.

In all possible regressions with K candidate variables, the analyst examines all 2^K possible regression equations to identify the ones with the potential to be useful models. Obviously, as K becomes even moderately large, the number of possible regression models quickly becomes formidably large. Efficient algorithms have been developed that implicitly rather than explicitly examine all of these equations. For more discussion of variable selection methods, see textbooks on regression such as Montgomery and Peck (1992) or Myers (1990).
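The following sketch is a simplified illustration of forward selection using a partial-F "F-to-enter" rule. It is not the textbook's algorithm or any particular package's routine; the threshold value, function names, and data layout are assumptions made for this example.

# Sketch: simplified forward selection with a fixed F-to-enter threshold.
import numpy as np

def sse(X, y):
    """Residual sum of squares for an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

def forward_selection(X_cand, y, f_to_enter=4.0):
    """Repeatedly add the candidate column with the largest partial F."""
    n, K = X_cand.shape
    selected = []
    while len(selected) < K:
        df_e = n - len(selected) - 2          # error df after entering one more term
        if df_e <= 0:
            break
        ss_cur = sse(X_cand[:, selected], y)  # intercept-only model gives SS_T
        best_j, best_f = None, -np.inf
        for j in range(K):
            if j in selected:
                continue
            ss_new = sse(X_cand[:, selected + [j]], y)
            f = (ss_cur - ss_new) / (ss_new / df_e)   # partial F for entering x_j
            if f > best_f:
                best_j, best_f = j, f
        if best_f < f_to_enter:               # no remaining candidate qualifies
            break
        selected.append(best_j)
    return selected

# Hypothetical usage: y depends on columns 0 and 3 of five candidates.
rng = np.random.default_rng(0)
Xc = rng.normal(size=(30, 5))
y = 1 + 2 * Xc[:, 0] - Xc[:, 3] + rng.normal(size=30)
print(forward_selection(Xc, y))               # typically [0, 3]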
S10-5. The Variance of the Predicted Response

In Section 10-5.2 we present Equation (10-40) for the variance of the predicted response at a point of interest x_0' = [1, x_{01}, x_{02}, \ldots, x_{0k}]. The variance is

V[\hat{y}(x_0)] = \sigma^2 x_0'(X'X)^{-1}x_0

where the predicted response at the point x_0 is found from Equation (10-39):

\hat{y}(x_0) = x_0'\hat{\beta}

It is easy to derive the variance expression:

V[\hat{y}(x_0)] = V(x_0'\hat{\beta}) = x_0' V(\hat{\beta}) x_0 = \sigma^2 x_0'(X'X)^{-1}x_0

Design-Expert calculates and displays the confidence interval on the mean of the response at the point x_0 using Equation (10-41) from the textbook. This is displayed by the point prediction option on the optimization menu. The program also uses Equation (10-40) in the contour plots of prediction standard error.

S10-6. Variance of Prediction Error

Section 10-6 of the textbook gives an equation for a prediction interval on a future observation at the point x_0' = [1, x_{01}, x_{02}, \ldots, x_{0k}]. This prediction interval makes use of the variance of prediction error. Now the point prediction of the future observation y_0 at x_0 is

\hat{y}(x_0) = x_0'\hat{\beta}

and the prediction error is

e_p = y_0 - \hat{y}(x_0)

The variance of the prediction error is

V(e_p) = V[y_0 - \hat{y}(x_0)] = V(y_0) + V[\hat{y}(x_0)]

because the future observation y_0 is independent of the point prediction \hat{y}(x_0). Therefore,

V(e_p) = V(y_0) + V[\hat{y}(x_0)]
= \sigma^2 + \sigma^2 x_0'(X'X)^{-1}x_0
= \sigma^2 [1 + x_0'(X'X)^{-1}x_0]

The square root of this quantity, with the estimate \hat{\sigma}^2 = MS_E substituted for \sigma^2, appears in Equation (10-42) defining the prediction interval.
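The sketch below computes both interval half-widths at a point x_0: the confidence interval on the mean response and the prediction interval. It is an illustration assuming a 95% level, simulated first-order data, and Python with NumPy and SciPy; it is not Design-Expert's implementation.

# Sketch: half-widths of the CI on the mean response and of the prediction
# interval at a point x0, for simulated (hypothetical) first-order data.
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(7)
n, k = 15, 2
x = rng.uniform(-1, 1, size=(n, k))
y = 5 + 2 * x[:, 0] - 3 * x[:, 1] + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
ms_e = np.sum((y - X @ beta) ** 2) / (n - k - 1)   # estimate of sigma^2

x0 = np.array([1.0, 0.25, -0.5])        # point of interest (leading 1 = intercept)
v0 = x0 @ XtX_inv @ x0                  # x0'(X'X)^{-1} x0
t_crit = t.ppf(0.975, n - k - 1)

ci_half = t_crit * np.sqrt(ms_e * v0)          # CI on the mean response
pi_half = t_crit * np.sqrt(ms_e * (1 + v0))    # prediction interval: always wider
print(f"y-hat(x0) = {x0 @ beta:.3f}, CI half-width = {ci_half:.3f}, "
      f"PI half-width = {pi_half:.3f}")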
S10-7. Leverage in a Regression Model

In Section 10-7.2 we give a formal definition of the leverage associated with each observation in a design (or, more generally, in a regression data set). Essentially, the leverage for the ith observation is just the ith diagonal element of the hat matrix

H = X(X'X)^{-1}X'

that is,

h_{ii} = x_i'(X'X)^{-1}x_i

where it is understood that x_i' is the ith row of the X matrix.

There are two ways to interpret the leverage values. First, the leverage h_{ii} is a measure of distance reflecting how far each design point is from the center of the design space. For example, in a 2^k factorial all of the cube corners are the same distance \sqrt{k} from the design center in coded units. Therefore, if all points are replicated n times, they will all have identical leverage.

Leverage can also be thought of as the maximum potential influence each design point exerts on the model. In a near-saturated design, many or all design points will have the maximum leverage. The maximum leverage that any point can have is h_{ii} = 1. However, if points are replicated n times, the maximum leverage is 1/n.

High-leverage situations are not desirable, because if the leverage of a point is unity, the model fits that point exactly. Clearly, then, the design and the associated model would be vulnerable to outliers or other unusual observations at that design point. The leverage at a design point can always be reduced by replication of that point.
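The sketch below illustrates both interpretations for a hypothetical 2^2 factorial with the full model (intercept, both main effects, and the interaction). The design is saturated, so every run has the maximum leverage h_{ii} = 1; replicating every run twice reduces the leverage of each point to 1/2, in line with the 1/n bound stated above.

# Sketch: hat-matrix diagonals for a saturated 2^2 factorial, before and
# after replication (hypothetical illustration).
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix H = X (X'X)^{-1} X'."""
    return np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

corners = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
# Columns: intercept, x1, x2, x1*x2 (a saturated model: 4 runs, 4 terms)
X1 = np.column_stack([np.ones(4), corners, corners[:, 0] * corners[:, 1]])
print(leverages(X1))          # [1. 1. 1. 1.]: each point is fit exactly

X2 = np.vstack([X1, X1])      # replicate every design point (n = 2)
print(leverages(X2))          # all 0.5: replication halves the leverage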