Response Surface Methods


3.12.2014

Goals of Today's Lecture

See how a sequence of experiments can be performed to optimize a response variable. Understand the difference between first-order and second-order response surfaces.

Introductory Example: Antibody Production

Large amounts of antibodies are obtained in biotechnological processes. Mice produce antibodies. Among other factors, pre-treatment by radiation and the injection of an oil increase yield. Of course we want to maximize yield. Due to the complexity of the underlying process we cannot simply set up a (non)linear model. We can only say that
$$Y_i = h(x_i^{(1)}, x_i^{(2)}) + E_i,$$
where $x^{(1)}$ = radiation dose and $x^{(2)}$ = waiting time between radiation and injection of the oil. However, we do not have a parametric form for the function $h$.

Remarks

In general, we first have to identify those factors that really have an influence on yield. If we have many factors, one of the easiest approaches is to use two levels for each factor (without replicates). This is also known as a $2^k$-design, where $k$ is the number of factors. If this is still too much effort, only a fraction of all possible combinations can be considered, leading to so-called fractional designs. Identification of relevant factors can be done using analysis of variance (ANOVA) or regression. We will not discuss this here.

In the following, we assume that we have already identified the relevant factors.

Back to our example: the simplest model with an optimum would be a quadratic function. If we knew $h$, we could identify the optimal setting $[x_0^{(1)}, x_0^{(2)}]$ of the process factors. $h(x^{(1)}, x^{(2)})$ is also called the response surface. Typically, $h$ must be estimated from data.

Hypothetical Example: Reaction Analysis

Assume the yield [grams] of a process depends on reaction time [min] and temperature [°C]. An experiment is performed as follows: for a temperature of T = 220 °C, different reaction times are used: 80, 90, 100, 110 and 120 minutes. The maximum seems to be at 100 minutes (see plot on the next slide). Hence, we fix the reaction time at 100 minutes and vary the temperature at 180, 200, 220, 240 and 260 °C.

The optimum seems to be at 220 °C. Can we now conclude that the (global) optimal value is at [100 min, 220 °C]?

[Figure: yield [grams] vs. reaction time at T = 220 °C (left) and yield vs. temperature at reaction time t = 100 minutes (right).]

[Figure: contour plot of the true response surface, yield as a function of time [min] and temperature [°C].]

We miss the global optimum if we simply vary the variables one by one. The reason is that the maximal value of one variable is usually not independent of the other one!

First-Order Response Surfaces

As we do not know the true response surface, we need to get an idea about it by doing experiments at the right design points. Where do we start? Knowledge of the process may tell us that reasonable values are a temperature of 140 °C and a reaction time of 60 minutes. Moreover, we need to have an idea about a reasonable increment, say we want to vary the reaction time by 10 minutes and the temperature by 20 °C (this comes from your understanding of the process).

We start with a so-called first-order response surface. The idea is to approximate the response surface with a plane, i.e.
$$Y_i = \theta_0 + \theta_1 x_i^{(1)} + \theta_2 x_i^{(2)} + E_i.$$
This is a linear regression problem! We need some data to estimate the model parameters. By choosing the design points right, we get the best possible parameter estimates (with respect to accuracy).

Here, the first-order design is as follows:

Run   Temperature [°C]   Time [min]   Temperature (coded)   Time (coded)   Yield Y [grams]
 1          120              50               -1                 -1               52
 2          160              50               +1                 -1               62
 3          120              70               -1                 +1               60
 4          160              70               +1                 +1               70
 5          140              60                0                  0               63
 6          140              60                0                  0               65

First-order design and measurement results for the example Reaction Analysis. The single experiments (runs) were performed in random order: 5, 4, 2, 6, 1, 7, 3.
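As a minimal sketch, the plane can be fitted by ordinary least squares in a few lines. Python with NumPy is an assumption here (the slides show no code); the coded design and yields are taken from the table above.

```python
import numpy as np

# Sketch: fit the first-order model
#   Y = theta0 + theta1 * x1 + theta2 * x2 + E
# by ordinary least squares, using the coded design from the table above.
x1 = np.array([-1.0, 1, -1, 1, 0, 0])     # coded temperature
x2 = np.array([-1.0, -1, 1, 1, 0, 0])     # coded time
y = np.array([52.0, 62, 60, 70, 63, 65])  # yield [grams]

X = np.column_stack([np.ones_like(x1), x1, x2])  # design matrix with intercept
theta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares estimates
print(theta)  # approx. [62, 5, 4], matching the fitted surface shown below
```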

A visualization in coded units illustrates the special layout of the design points.

[Figure: design points in coded units, a square with corners at (±1, ±1) and center points at the origin.]

The fitted plane, the so-called first-order response surface, is given by
$$\hat{y} = \theta_0 + \theta_1 x^{(1)} + \theta_2 x^{(2)}.$$
This is only an approximation of the real response surface. Note: the plane cannot give us an optimal value. But it enables us to determine the direction of steepest ascent, which we denote by $[c^{(1)}, c^{(2)}]^T$. This will be the fastest way to get large values of the response variable. Hence, we should follow that direction.

This means we should perform additional experiments along
$$\begin{bmatrix} x_0^{(1)} \\ x_0^{(2)} \end{bmatrix} + j \begin{bmatrix} c^{(1)} \\ c^{(2)} \end{bmatrix}, \quad j = 1, 2, \ldots,$$
where $[x_0^{(1)}, x_0^{(2)}]^T$ is the point in the center of our experimental design.

Remarks: Observations in the center of the design have no influence on the parameter estimates of the slopes $\theta_1$ and $\theta_2$ (and hence no influence on the gradient). This is due to the special arrangement of the other points.

Why is it useful to have multiple observations in the center? They can be used to estimate the measurement error without relying on any assumptions. They also make it possible to detect curvature: if there is no (local) curvature, the average of the response values in the center and the average of the response values of the other design points should be more or less equal. It is possible to perform a statistical test to check for curvature.

If there is (significant) curvature, we don't want to use a plane but a quadratic function (see later). Otherwise we move along the direction of the gradient.

Randomization

Randomization is crucial for all experiments. Hence, we should really perform the experiments in a random sequence, even though it would usually be more convenient to, e.g., do the two experiments in the center right after each other.

Back to our example: the first-order response surface is
$$\hat{y} = 62 + 5\,\tilde{x}^{(1)} + 4\,\tilde{x}^{(2)} = 3 + 0.25\,x^{(1)} + 0.4\,x^{(2)},$$
where $[\tilde{x}^{(1)}, \tilde{x}^{(2)}]^T$ are the coded x-values (taking values ±1). The steepest-ascent direction is given by the gradient. If we calculate the gradient for the coded variables, this leads to the direction vector $[5, 4]^T$, which corresponds to $[100, 40]^T$ in original units.

We would get a different result if we directly took the gradient with respect to the original variables. The reason is the coordinate transformation. Working with coded units generally makes sense because the scale of the predictors is chosen such that increasing any variable by one unit has more or less the same effect on the response.
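A short sketch of this remark (the 20 °C / 10 min scaling is taken from the design above):

```python
import numpy as np

# Sketch: the steepest-ascent direction depends on the coordinate system.
theta = np.array([5.0, 4.0])    # slopes in coded units
scale = np.array([20.0, 10.0])  # 1 coded unit = 20 degrees C resp. 10 min

# Moving [5, 4] in coded units corresponds to [100, 40] in original units:
step_from_coded = theta * scale
# The gradient taken directly in original units points elsewhere:
grad_original = theta / scale

print(step_from_coded)  # [100.  40.] -> proportional to the step [25, 10]
print(grad_original)    # [0.25  0.4] -> a different direction!
```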

Further experiments are now performed along
$$\begin{bmatrix} 140 \\ 60 \end{bmatrix} + j \begin{bmatrix} 25 \\ 10 \end{bmatrix}, \quad j = 1, 2, \ldots,$$
which corresponds to the steepest-ascent direction with respect to the coded x-values. This leads to new data:

Run   Temperature [°C]   Time [min]   Y
 1          165              70       72
 2          190              80       77
 3          215              90       79
 4          240             100       76
 5          265             110       70

Experimental design and measurement results for experiments along the steepest-ascent direction for the example Reaction Analysis.

We reach a maximum at a temperature of 215 °C and a reaction time of 90 minutes. Now we could do a further first-order design around these new optimal values to get a new gradient, and then iterate many times. In general, though, the hope is that we are already close to the optimal solution. We therefore use a quadratic approach to model a peak.

Second-Order Response Surfaces

Once we are close to the optimal solution, the plane will be rather flat. We expect that the optimal solution will be somewhere in our experimental set-up (a peak). Hence, we expect curvature. We can directly model such a peak by using a second-order polynomial
$$Y_i = \theta_0 + \theta_1 x_i^{(1)} + \theta_2 x_i^{(2)} + \theta_{11} (x_i^{(1)})^2 + \theta_{22} (x_i^{(2)})^2 + \theta_{12}\, x_i^{(1)} x_i^{(2)} + E_i.$$
Remember: this is still a linear model! We now have more parameters, hence we need more observations to fit the model.

Two Examples of Second-Order Response Surfaces

[Figure: two contour plots in $x^{(1)}$ and $x^{(2)}$. Left: situation with a maximum. Right: saddle.]

We therefore expand our design by adding additional points.

[Figure: design points in coded units, the factorial corners at (±1, ±1) plus axial points and the center.]

This is a so-called rotatable central composite design. All points have the same distance from the center.
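A small sketch of the coded coordinates of such a design (for k = 2 factors):

```python
import numpy as np

# Sketch: rotatable central composite design for k = 2 factors, coded units.
r2 = np.sqrt(2.0)
factorial = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]])  # corner points
axial = np.array([[-r2, 0], [r2, 0], [0, -r2], [0, r2]])    # axis points
center = np.array([[0.0, 0.0]])
design = np.vstack([factorial, axial, center])

# All non-center points lie on a circle of radius sqrt(2): rotatable.
print(np.linalg.norm(design[:-1], axis=1))  # all approx. 1.414
```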

Hence, we can simply add the points on the axes (they have coordinates $\pm\sqrt{2}$) to our first-order design. Again, by choosing the points that way the model has some nice theoretical properties. The new data can be seen on the next slide. We use this data and fit our second-order polynomial model (with linear regression!), leading to the second-order response surface
$$\hat{y} = -278 + 2.0\,x^{(1)} + 3.2\,x^{(2)} - 0.0060\,(x^{(1)})^2 - 0.026\,(x^{(2)})^2 + 0.006\,x^{(1)} x^{(2)}.$$

New Data

Run   Temperature [°C]   Time [min]   Temperature (coded)   Time (coded)   Yield Y [grams]
 1          195              80               -1                 -1               78
 2          235              80               +1                 -1               76
 3          195             100               -1                 +1               72
 4          235             100               +1                 +1               75
 5          187              90              -√2                  0               74
 6          243              90              +√2                  0               76
 7          215              76                0                -√2               77
 8          215             104                0                +√2               72
 9          215              90                0                  0               80

Rotatable central composite design and measurement results for the example Reaction Analysis.
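A sketch of the corresponding least-squares fit in coded units; the expanded design matrix is all that changes compared to the first-order fit, since the quadratic model is still linear in the parameters.

```python
import numpy as np

# Sketch: fit the second-order polynomial by least squares, coded units,
# using the rotatable central composite design from the table above.
r2 = np.sqrt(2.0)
x1 = np.array([-1, 1, -1, 1, -r2, r2, 0, 0, 0.0])  # coded temperature
x2 = np.array([-1, -1, 1, 1, 0, 0, -r2, r2, 0.0])  # coded time
y = np.array([78, 76, 72, 75, 74, 76, 77, 72, 80.0])

# Columns: 1, x1, x2, x1^2, x2^2, x1*x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)  # [theta0, theta1, theta2, theta11, theta22, theta12]
```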

Given the second-order response surface, it's possible to find an analytical solution for the optimal value by taking partial derivatives and solving the resulting linear equation system for $x_0^{(1)}$ and $x_0^{(2)}$, leading to
$$x_0^{(1)} = \frac{\theta_{12}\theta_2 - 2\theta_{22}\theta_1}{4\theta_{11}\theta_{22} - \theta_{12}^2}, \qquad x_0^{(2)} = \frac{\theta_{12}\theta_1 - 2\theta_{11}\theta_2}{4\theta_{11}\theta_{22} - \theta_{12}^2}$$
(see lecture notes for the derivation).
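Equivalently, the stationary point can be obtained by solving the 2×2 linear system numerically; a minimal sketch:

```python
import numpy as np

def stationary_point(t1, t2, t11, t22, t12):
    """Solve grad(y_hat) = 0 for the fitted quadratic surface:
        2*t11*x1 +   t12*x2 = -t1
          t12*x1 + 2*t22*x2 = -t2
    Equivalent to the closed-form expressions above."""
    A = np.array([[2 * t11, t12],
                  [t12, 2 * t22]])
    return np.linalg.solve(A, np.array([-t1, -t2]))
```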

Summary Picture

[Figure: contour plot of the fitted second-order response surface, yield as a function of temperature [°C] and time [min].]

Back to our Original Example

A radiation dose of 200 rads and a waiting time of 14 days are used as the starting point. Around this point we use a first-order design.

Run   RadDos [rads]   Time [days]   RadDos (coded)   Time (coded)   Y
 1         100              7             -1              -1        207
 2         100             21             -1              +1        257
 3         300              7             +1              -1        306
 4         300             21             +1              +1        570
 5         200             14              0               0        630
 6         200             14              0               0        528
 7         200             14              0               0        609

First-order design and measurement results for the example Antibody Production.

What can we observe? Yield decreases at the borders! If we check for significant curvature, we get a 95% confidence interval for the difference $\mu_c - \mu_f$ (center vs. factorial points) of
$$\bar{Y}_c - \bar{Y}_f \pm t_{n_c - 1;\,0.975} \sqrt{s^2 \left( 1/n_c + 1/2^k \right)} = 589 - 335 \pm 4.30 \cdot 41.1 = [77, 431].$$
As the confidence interval doesn't cover zero, there is significant curvature. We therefore expand our design to a rotatable central composite design by doing additional measurements.
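The interval can be reproduced with a few lines (numbers from the table above; SciPy's t quantile gives the factor 4.30):

```python
import numpy as np
from scipy import stats

# Sketch: 95% confidence interval for (mean of center runs) minus
# (mean of factorial runs), with the error variance estimated from
# the replicated center runs only.
y_center = np.array([630.0, 528, 609])     # n_c = 3 center runs
y_fact = np.array([207.0, 257, 306, 570])  # 2^k = 4 factorial runs
n_c, n_f = len(y_center), len(y_fact)

diff = y_center.mean() - y_fact.mean()     # 589 - 335 = 254
s2 = y_center.var(ddof=1)                  # variance from center runs
se = np.sqrt(s2 * (1 / n_c + 1 / n_f))     # approx. 41.1
q = stats.t.ppf(0.975, df=n_c - 1)         # approx. 4.30
print(diff - q * se, diff + q * se)        # approx. [77, 431]
```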

New data:

Run   RadDos [rads]   Time [days]   RadDos (coded)   Time (coded)   Y
 8         200              4              0             -√2        315
 9         200             24              0             +√2        154
10          59             14            -√2               0        100
11         341             14            +√2               0        513

Rotatable central composite design and measurement results for the example Antibody Production.

The estimated response surface is
$$\hat{Y} = -608.4 + 5.237\,\text{RadDos} + 77.0\,\text{Time} - 0.0127\,\text{RadDos}^2 - 3.243\,\text{Time}^2 + 0.0764\,\text{RadDos} \cdot \text{Time}.$$
We can identify the optimal conditions in the contour plot (see next slide) or analytically: $\text{RadDos}_{\text{opt}} \approx 250$ rads and $\text{time}_{\text{opt}} \approx 15$ days.
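Plugging these coefficients into the stationary-point computation sketched earlier reproduces this optimum:

```python
import numpy as np

# Coefficients of the fitted antibody surface (original units, from above).
t1, t2 = 5.237, 77.0                  # linear terms: RadDos, Time
t11, t22, t12 = -0.0127, -3.243, 0.0764

# Solve grad(Y_hat) = 0, as in the closed-form expressions above.
A = np.array([[2 * t11, t12],
              [t12, 2 * t22]])
x_opt = np.linalg.solve(A, np.array([-t1, -t2]))
print(x_opt)  # approx. [250 rads, 15 days]
```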

Contour Plot of Second-Order Response Surface

[Figure: contour plot of the fitted surface, yield as a function of RadDos [rads] and time [days], with a maximum of about 610 near 250 rads and 15 days.]

Summary

Using a sequence of experiments, we can find the optimal setting of our variables. As long as we are far away from the optimum, we use first-order designs, leading to a first-order response surface (a plane). As soon as we are close to the optimum, a rotatable central composite design can be used to estimate the second-order response surface. The optimum on that response surface can be determined analytically.