Stat 5031 Quadratic Response Surface Methods (QRSM) Sanford Weisberg November 30, 2015

One Variable

x = spacing of plants (4, 6, 8, or 12 inches), and y = plant yield (bushels per acre). Each condition is repeated 4 times, for a total of 16 observations.

    boxplot(y ~ x, data, main="Crop Yield", xlab="Spacing (inches)", ylab="Yield (bu/acre)")

[Figure: boxplots of Yield (bu/acre) by Spacing (inches), titled "Crop Yield"]

    m1 <- lm(y ~ factor(x), data)
    anova(m1)

    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value    Pr(>F)
    factor(x)  3 852.10 284.032  92.059 1.501e-08 ***
    Residuals 12  37.02   3.085
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    tapply(data$y, data$x, mean)

           4        6        8       12
    131.0944 142.6521 149.4833 148.4001

    tapply(data$y, data$x, sd)

           4        6        8       12
    1.649285 1.525152 1.728480 2.075422

Treat the scaled spacing x.scaled as continuous to get a quadratic response surface:

    # center and scale x to the interval [-1, 1]
    data$x.scaled <- 2 * ((data$x - min(data$x))/(max(data$x) - min(data$x))) - 1
    m2 <- lm(y ~ 1 + x.scaled + I(x.scaled^2), data)
    plot(y ~ x.scaled, data, xlab="Scaled Spacing", ylab="Yield")
    new.x <- seq(-1, 1, length=100)
    lines(new.x, predict(m2, data.frame(x.scaled=new.x)), lwd=3)

[Figure: scatterplot of Yield against Scaled Spacing with the fitted quadratic curve]

Where is this estimated response curve maximized?
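The question above has a closed-form answer: a quadratic b0 + b1 x + b2 x^2 with b2 < 0 peaks at x = -b1 / (2 b2). The following is a minimal sketch, not part of the original handout. It uses the coefficients reported by coef(m2) in this handout and assumes that 4 and 12 inches are the minimum and maximum spacings used in the scaling. The names b, x.star, spacing.star, and y.star are illustrative only.

```r
# Coefficients copied from the coef(m2) output in the handout
b <- c(149.452409, 8.663128, -9.710327)   # intercept, linear, quadratic

# Vertex of the parabola on the scaled axis (a maximum because b[3] < 0)
x.star <- -b[2] / (2 * b[3])

# Undo the scaling x.scaled = 2 * (x - min) / (max - min) - 1
x.min <- 4; x.max <- 12                   # assumed range of the spacing levels
spacing.star <- x.min + (x.star + 1) / 2 * (x.max - x.min)

# Fitted yield at the maximum
y.star <- b[1] + b[2] * x.star + b[3] * x.star^2

c(x.star = x.star, spacing = spacing.star, yield = y.star)
```

Under these assumptions the maximum falls near 0.45 on the scaled axis, or about 9.8 inches, which is inside the range of the observed spacings.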

    coef(m2)

      (Intercept)      x.scaled I(x.scaled^2)
       149.452409      8.663128     -9.710327

What is the minimum number of distinct values of x required to fit a quadratic?

Two variable QRSM

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon$$

If $\beta_{11}, \beta_{22}$ are negative and $\beta_{12}^2 < 4\beta_{11}\beta_{22}$, the flat spot is a maximum.

If $\beta_{11}, \beta_{22}$ are positive and $\beta_{12}^2 < 4\beta_{11}\beta_{22}$, the flat spot is a minimum.

Otherwise, the flat spot is a saddle point, meaning the surface has a maximum along one direction and a minimum along another.

We hope one of the first two cases comes up when we are trying to maximize or minimize y; if not, the optimum will be off in the wild blue yonder.

Sometimes the QRSM suggests that the place we want to be is a long way from where we gathered the data. In that case, we can use the QRSM to suggest a direction in which to move to a new region, and then go and gather fresh and more relevant data.

Special cases:

$\beta_{12} = 0$

$\beta_{11} = \beta_{22} = 0$

If $\beta_{11} = \beta_{22} = 0$ but $\beta_{12} \neq 0$, the model is bilinear. (This does not come up much.)

Estimation and design

Estimate via OLS.

Possible design: $3^2$ factorial. It uses 9 design points for a model with 1 intercept + 2 linear terms + 2 quadratics + 1 interaction = 6 parameters.

Replication is required for a pure-error estimate of the error variance, but one could replicate only the center point if desired.

An alternative is a central composite design. It also uses 9 design points, and it also requires replication, often of just the center point.

As elsewhere, we need to check constancy of variance as best we can, and normality, and may want to transform y.

Example 14.2 from the Textbook (description: Example 13.8)

Etch Rate

    infile <- "http://stat.umn.edu/hawkins/5031/example_14_2_qrsm.txt"
    getdata <- read.table(infile, header=TRUE)
    getdata
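The two designs named above can be written out explicitly. The sketch below is an illustration rather than the handout's own code: the column names x1 and x2, the object names fact3 and ccd, and the axial distance alpha = sqrt(2) are assumptions (the last chosen because it matches the 1.414 values in the etch-rate data listed next, which also has four center-point replicates).

```r
# 3^2 factorial: every combination of three coded levels for two factors
fact3 <- expand.grid(x1 = c(-1, 0, 1), x2 = c(-1, 0, 1))

# Central composite design: 4 factorial points, 4 axial points, 1 center point
alpha <- sqrt(2)                          # assumed axial distance
ccd <- rbind(
  expand.grid(x1 = c(-1, 1), x2 = c(-1, 1)),
  data.frame(x1 = c(-alpha, alpha, 0, 0),
             x2 = c(0, 0, -alpha, alpha)),
  data.frame(x1 = 0, x2 = 0)
)

# Both designs have 9 distinct points and support all 6 QRSM parameters
qrsm <- ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2
rank.fact3 <- qr(model.matrix(qrsm, fact3))$rank
rank.ccd   <- qr(model.matrix(qrsm, ccd))$rank
c(points.fact3 = nrow(fact3), points.ccd = nrow(ccd),
  rank.fact3 = rank.fact3, rank.ccd = rank.ccd)
```

With 9 distinct points and 6 parameters, neither design leaves degrees of freedom for pure error unless some points are replicated, which is why the center point is usually repeated.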

       line                 ETCH UNIFORM
    1     1 -1.000 -1.000   1054      80
    2     2  1.000 -1.000    936      81
    3     3 -1.000  1.000   1179      79
    4     4  1.000  1.000   1417      98
    5     5 -1.414  0.000   1049      76
    6     6  1.414  0.000   1287      88
    7     7  0.000 -1.414    927      79
    8     8  0.000  1.414   1345      92
    9     9  0.000  0.000   1151      90
    10   10  0.000  0.000   1150      88
    11   11  0.000  0.000   1177      89
    12   12  0.000  0.000   1196      90

    etchfit <- lm(ETCH ~ + + I(^2) + I(^2) + *, getdata)
    summary(etchfit)

    Call:
    lm(formula = ETCH ~ + + I(^2) + I(^2) + *, data = getdata)

    Residuals:
        Min      1Q  Median      3Q     Max
    -35.548 -20.870   2.748  23.405  41.043

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 1168.501     17.595  66.411 7.84e-10 ***
                  57.075     12.442   4.587  0.00374 **
                 149.654     12.442  12.028 2.00e-05 ***
    I(^2)         -1.625     13.913  -0.117  0.91085
    I(^2)        -17.629     13.913  -1.267  0.25207
    :             89.000     17.595   5.058  0.00231 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 35.19 on 6 degrees of freedom
    Multiple R-squared:  0.9698, Adjusted R-squared:  0.9447
    F-statistic: 38.58 on 5 and 6 DF,  p-value: 0.000174

Strange model: ETCH looks to be effectively bilinear, with no curvature.

The fitted surface is two-dimensional and can be examined in a contour plot using the contour function. The contour function requires 3 arguments: a grid of n values for the x-axis (x1.new), a grid of m values for the y-axis (x2.new), and a matrix of values of the function to be plotted, so that the value in the i-th row and j-th column of the matrix is the fitted value at (x1.new[i], x2.new[j]). Usually n = m, and here I've set n = m = 21:

    x1.new <- x2.new <- seq(-2, 2, length=21) # values for x and y axis

The expand.grid function will create a data.frame with two columns of length mn that consists of all possible combinations of levels of x1.new and x2.new.

    new.data <- expand.grid(=x1.new, =x2.new)
    head(new.data)

    1 -2.0 -2
    2 -1.8 -2
    3 -1.6 -2
    4 -1.4 -2
    5 -1.2 -2
    6 -1.0 -2

Predictions are then computed at each of these nm values of the predictors using the predict function, and the result is arranged as an n × m matrix using the matrix function.

    fit <- matrix(predict(etchfit, new.data), nrow=21)
    par(mfrow=c(1, 2))
    contour(x1.new, x2.new, fit, main="Montgomery etch rate", xlab="", ylab="")
    persp(x1.new, x2.new, fit, main="Montgomery etch rate", xlab="", ylab="", zlab="Predicted")

[Figure: contour plot (left) and perspective plot (right) of the fitted surface, both titled "Montgomery etch rate"; vertical axis of the perspective plot labeled "Predicted"]

Example: Suppose the target ETCH rate were 1100.
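The maximum / minimum / saddle rule from the two-variable section can be checked numerically for the ETCH fit. The sketch below is an illustration, not the handout's code. It copies the estimates from summary(etchfit), labels the two predictors x1 and x2 (assumed names, following the notation of the QRSM equation), and uses the eigenvalues of the matrix of second-order coefficients, which is an equivalent way to state the sign conditions.

```r
# Estimates copied from summary(etchfit); x1 and x2 are assumed predictor names
b <- c(57.075, 149.654)                 # linear coefficients
B <- matrix(c(-1.625, 89 / 2,
              89 / 2, -17.629), nrow = 2)  # quadratics on the diagonal,
                                           # half the interaction off-diagonal

# Flat spot: the gradient b + 2 B x equals zero
x.s <- solve(B, -b / 2)

# All eigenvalues negative -> maximum, all positive -> minimum, mixed -> saddle
ev <- eigen(B, symmetric = TRUE)$values
kind <- if (all(ev < 0)) "maximum" else if (all(ev > 0)) "minimum" else "saddle point"

list(stationary.point = x.s, eigenvalues = ev, kind = kind)
```

The eigenvalues have opposite signs, so the fitted ETCH surface has a saddle point, which is consistent with the remark that the fit is effectively bilinear.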

Uniformity

    unifit <- update(etchfit, UNIFORM ~ .)
    summary(unifit)

    Call:
    lm(formula = UNIFORM ~ + + I(^2) + I(^2) + :, data = getdata)

    Residuals:
        Min      1Q  Median      3Q     Max
    -1.2499 -0.4252  0.1027  0.5292  1.0524

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  89.2499     0.4889 182.543 1.82e-12 ***
                  4.6217     0.3457  13.367 1.08e-05 ***
                  4.2984     0.3457  12.432 1.65e-05 ***
    I(^2)        -3.4381     0.3866  -8.893 0.000113 ***
    I(^2)        -1.6875     0.3866  -4.365 0.004745 **
    :             4.5000     0.4889   9.204 9.28e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.9779 on 6 degrees of freedom
    Multiple R-squared:  0.9882, Adjusted R-squared:  0.9784
    F-statistic: 100.8 on 5 and 6 DF,  p-value: 1.054e-05

    fit <- matrix(predict(unifit, new.data), nrow=21)
    par(mfrow=c(1, 2))
    # without x and y arguments, contour() labels both axes on a 0 to 1 scale
    contour(fit, main="Montgomery uniformity", xlab="", ylab="")
    persp(x1.new, x2.new, fit, main="Montgomery uniformity", xlab="", ylab="", zlab="Predicted")

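[Figure: contour plot (left, both axes on the default 0 to 1 scale) and perspective plot (right) of the fitted surface, both titled "Montgomery uniformity"; vertical axis of the perspective plot labeled "Predicted"]

The contour for which ETCH ≈ 1100:

    contour(x1.new, x2.new, fit, xlab="", ylab="", main="Montgomery uniformity with Etch rate contour")
    # Solve the fitted ETCH equation for the first predictor at the target,
    # dropping the two negligible quadratic terms and rounding the coefficients
    target <- 1100
    tarx1 <- (target - 1168.5 - 150*x2.new) / (57 + 89*x2.new)
    sele <- abs(tarx1) < 2   # keep only points inside the plotted region
    lines(tarx1[sele], x2.new[sele], lty=2, lwd=4)

[Figure: contours of fitted uniformity with the ETCH target contour overlaid as a dashed line, titled "Montgomery uniformity with Etch rate contour"]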
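The overlay can also be read off numerically by evaluating the fitted uniformity at points on the ETCH target contour. This is a hedged sketch rather than part of the handout: the coefficients are copied from the two summary tables, x1 and x2 are assumed predictor names, the ETCH fit is replaced by its bilinear approximation (the two quadratic terms are dropped), and the object names etch, unif, and on.contour are hypothetical.

```r
# Estimates copied from summary(etchfit) and summary(unifit); names are assumed
etch <- c(b0 = 1168.501, b1 = 57.075, b2 = 149.654, b12 = 89.000)
unif <- c(b0 = 89.2499, b1 = 4.6217, b2 = 4.2984,
          b11 = -3.4381, b22 = -1.6875, b12 = 4.5000)

target <- 1100
x2 <- seq(-2, 2, length = 21)

# Bilinear approximation b0 + b1 x1 + b2 x2 + b12 x1 x2 = target, solved for x1
x1 <- (target - etch[["b0"]] - etch[["b2"]] * x2) /
      (etch[["b1"]] + etch[["b12"]] * x2)

# Fitted uniformity from the full quadratic model at those points
pred.unif <- unif[["b0"]] + unif[["b1"]] * x1 + unif[["b2"]] * x2 +
             unif[["b11"]] * x1^2 + unif[["b22"]] * x2^2 +
             unif[["b12"]] * x1 * x2

keep <- abs(x1) < 2                       # stay inside the plotted region
on.contour <- data.frame(x1 = x1[keep], x2 = x2[keep],
                         uniformity = pred.unif[keep])
on.contour
```

The handout does not say whether larger or smaller uniformity values are preferred, so the table is left unsorted. Points beyond the design range of about ±1.414 are extrapolations and should be read with caution.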