
Environmetrics II (Part 1: DOE)

Environmetrics is one of the newest so-called metrics sciences, e.g. psychometrics, econometrics, chemometrics, and (for some reason) biometry. These are methodological sciences for analysing measurement data and for designing experimental set-ups in different fields of science. Environmetrics thus deals with measurement data of all environmental sciences, and data-analytical problems related to monitoring our environment are typical material of environmetrical publications. In environmental engineering, however, the emphasis is different: designing new systems or processes, or monitoring existing ones, plays the key role. As a consequence, only a selection of environmetrical tools is treated in this course. Very many of them are similar to the ones in technometrics or chemometrics.

Useful books:
1. Grady Hanrahan, Environmental Chemometrics, CRC Press.
2. Statistics for Environmental Engineers, CRC Press LLC, 2002.

The course is divided into two parts. In the first part we shall study the problem of how to design experiments so that the number of experiments remains reasonable and, at the same time, we get reliable information about the system under study. In the second part, we shall learn the elements of multivariate data analysis, which is typically needed in monitoring or fault detection problems, or in any problem with a large number of variables. As an example of the latter, consider a spectral measurement of an environmental sample. The measured spectrum actually consists of typically thousands of variables, because each measured absorbance can be considered a single variable.

Note that the lab exercises and examples carried out using Excel, Matlab or R are an essential part of this course. It is also recommended to print out e.g. R history files or R command files.

Statistical design of experiments (DOE)

The role of experimentation

Experiments are needed if we want to find out something that cannot be found out otherwise. For example, building a new wastewater treatment unit doesn't necessarily require experiments. There is plenty of both theoretical and empirical knowledge of common wastewater treatment units, and using this knowledge it is possible to build a new one and to expect it to work. However, if we are going to make something completely new, the only alternatives are to use either theoretical knowledge or experimentation. Using theoretical knowledge in design, scale-up or optimization in engineering means simulation based on mathematically described models converted into computer programs. In contrast, one might expect that experimentation could be modelling-free. That is not the case, and hopefully this course will explain why.

The objectives of part 1 are
- to show why proper design of experiments is crucial in getting reliable and meaningful results from experiments
- to give understanding of the basic principles and concepts of DOE
- to introduce some of the most common types of designs
- to give understanding of the complexity of real applications

The most important of the objectives listed above is the first one. It is not realistic to expect anyone to become a specialist in the field after an introductory course. DOE is as much an art as a science, and its mastery requires a lot of experience and continuous study. However, it is possible to understand its importance and main principles by going through some typical simple examples. In this introduction, unnecessary mathematics is avoided, but basic knowledge of engineering mathematics is needed. Also basic concepts of probability and statistics, e.g. normality, repeatability or measurement uncertainty, covered in Environmetrics I, are essential in understanding the principles of DOE.

Many commercial software packages which facilitate making good designs and analysing the results are available. This introduction is independent of such packages, and all the calculations of the examples are carried out using Matlab, R or Excel. However, for an infrequent user of DOE tools, commercial software (e.g. JMP, MODDE, SPSS, ...) can be of great benefit.

Exercise 1
Suppose we are planning experiments in order to see how pressure and temperature affect the quality of the product. The experiments are very expensive and we want to do as few of them as possible. Below are three designs for a case where the design variables are pressure (p) and temperature (T):

Design 1        Design 2        Design 3
 p    T          p    T          p    T

Which is the worst and which is the best design of these three, and why?

1. The interplay between learning and experimentation

All scientific knowledge is based on successive steps of deductive and inductive reasoning. In the beginning, there is a hypothesis (a theory) about the problem to be studied. This hypothesis allows us to conclude the expected outcome of our experiments, which is deductive reasoning. After the experiments have been conducted, the results either confirm or contradict the hypothesis. If there is a contradiction, we have to reject the original hypothesis, and there is a need for a new hypothesis to explain the contradiction. The process of finding new possible hypotheses is inductive reasoning. The new hypothesis allows new deductions and the planning of new experiments to check its validity. This cycle can be repeated until the theory is satisfactory. It is easy to see the role of such cycles e.g. in the development from Newtonian mechanics to the theory of relativity. It is good to realize the importance of such cycles in all experimental research, and especially in DOE.

2. Experimental error

Whenever experiments are carried out, the role of experimental error has to be considered. Any real-life experiment involves several steps that influence the final result. In practice, none of these steps can be exactly repeated, due to inevitable random variation in the conditions related to the step. Therefore the results of repeated experiments will always vary, even if one tries to keep the conditions as constant as possible. Thus we shall always encounter the problem of distinguishing systematic variation from random variation. One of the key issues in DOE is to get an estimate of the magnitude of the random variation; reliable conclusions are impossible without this knowledge.

For meaningful conclusions, we need to know the basic laws of random variation. The most important such laws are the central limit theorem, the law of large numbers and the propagation of independent random errors. The first of these states that an error that is a sum of independent random errors obeys the normal (Gaussian) distribution. Its importance relies on the fact that the total error of any well conducted experiment is a sum of small independent errors from different sources, e.g. weighing, dilutions, control of different variables such as temperature or pressure, and many others of human origin or due to the equipment.

The normal distribution is characterized by two parameters: the expected value μ, which is the theoretical mean value, and the variance σ² (or its square root, the standard deviation σ). The nature of the normal distribution is best realized by recalling the following rules of thumb:

68.3% of normally distributed measurement results are in the interval μ ± σ
95.4% of normally distributed measurement results are in the interval μ ± 2σ
99.7% of normally distributed measurement results are in the interval μ ± 3σ

As a consequence, if a result deviates more than 3σ from the expected value, it is reasonable to suspect that the true expected value of this result is not μ. Rather, there may be an error, or a systematic change has occurred. This is the basic logic of e.g. statistical control charts and statistical tests.

We shall introduce only the most important rules of propagation of errors. First of all, it should be noted that these rules are valid only for statistically independent measurements. The first of them states that the standard deviation of a mean (x̄) of n results is smaller than the standard deviation of the individual results. To be more exact:

σ(x̄) = σ/√n.  (1)

The most important consequence of this fact is that we can increase the accuracy simply by increasing the number of replicate measurements. The other rule states that the standard deviation of the difference of two replicate measurements, say x and y, is larger than the standard deviation of the individual results. To be more exact:

σ(x − y) = √2 · σ.  (2a)

If x and y have different standard deviations, i.e. they are not replicates but are still independent, the equation is

σ(x − y) = √(σx² + σy²).  (2b)

This is a fact that we have to take into account whenever we compare two measurements that are assumed to have the same expected value, i.e. in statistical tests concerning differences between mean values. Rough conclusions about experimental results can be obtained using these simple rules. In real cases, however, we have to take into account the uncertainty of our estimate of σ, because in practice we have to use the sample standard deviation S instead of σ. We come to this point later.

Example 1

Now let us take a simple example. Six experiments, three with catalyst 1 and three with catalyst 2, have been made in random order and the purity of the product has been recorded for each run. The question is whether we can consider catalyst 1 to be better than catalyst 2 or not. The mean of the purities obtained with catalyst 1 is 8.8 units higher than the mean of the purities obtained with catalyst 2.

The logic of the comparison goes in the following way: If we knew the repeatability, i.e. the standard deviation σ, of the measurements, we could calculate the standard deviation of the difference of the means (8.8) using the rules given above. Assuming that there is no difference between the catalysts, the expected value of the difference would be zero. After this we could simply use the rules of thumb of the normal distribution for estimating the plausibility of the observed difference under this assumption. If the observed difference were highly improbable, we would conclude that there must be a difference between the catalysts.

The problem is that we don't know the standard deviation. However, we can estimate it; actually we can get two estimates from the two sets of three replicates. These are S1 = 2.4 and S2 = 2.6. Considering the small difference between these two estimates, we can assume that the type of the catalyst doesn't affect the repeatability. Therefore we can get a more reliable estimate by averaging them. One of the rules of propagation of errors states that standard deviations must be averaged quadratically, i.e. S = √((S1² + S2²)/2) ≈ 2.5.
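The quadratic averaging and the standard deviation of the difference of the two means can be checked with a couple of lines of R; a minimal sketch using only the figures quoted above (S1 = 2.4, S2 = 2.6, three runs per catalyst):

> s1 <- 2.4; s2 <- 2.6; n <- 3    # replicate standard deviations, 3 runs per catalyst
> s <- sqrt((s1^2 + s2^2)/2)      # quadratic (pooled) average of the two estimates, approx. 2.5
> s.diff <- s*sqrt(1/n + 1/n)     # standard deviation of the difference of the two means (Eqs. 1 and 2b), approx. 2.0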

Now the standard deviation of the difference of the two means is, using the rules given above, 2.5·√(1/3 + 1/3) ≈ 2.0. Using the rules of thumb, we can conclude that the observed difference is more than three times this standard deviation and, consequently, it is plausible that the catalysts differ and the first one gives better purities. The problem with this reasoning is that we did not take into account the uncertainty of using S instead of the true σ. To overcome this, we could make a so-called permutation test or a formal statistical test.

Two sample permutation test (applied to the example above)

If the catalysts performed equally, the only variation in the data would be random variation. Therefore labelling the experiments by 1 or 2 should have no effect, other than a random one, on the difference between what label 1 and label 2 mean. The six labels can be assigned to the 6 experiments in 6! = 720 ways. Many of these give the same sequence, and there are actually only 20 different permutations of 1,1,1,2,2,2 (the design was the randomized order in which the labels were actually assigned to the runs). Only one of these different permutations gives a difference of means that is greater than 8. If this were only due to chance, we were quite lucky, and maybe it is more reasonable to think that there was a cause for such a high difference, namely the different catalysts.

Carrying out permutation tests in practice requires programming skills in Matlab, Octave, R or any similar software. If the number of experiments is high, it is impossible to go through all permutations (e.g. 11! = 39 916 800). In such cases a good alternative is to make enough, say one million, randomly chosen permutations and gather the statistics from them. However, if we can assume the experimental errors to be statistically independent and approximately normally distributed, we can carry out formal statistical tests.

Two sample t-test (and some general concepts related to statistical tests)

In statistical tests, we have to make two alternative hypotheses: the null hypothesis and the alternative (research) hypothesis. The null hypothesis assumes that there is no difference between the two subjects to be compared. The alternative hypothesis is the opposite of this. However, the alternative can be one-sided or two-sided. The two-sided alternative states that there is a difference; the one-sided alternative states that one particular subject gives better results than the other. It should be noted that one should use a one-sided hypothesis only if there is a reason for it prior to the experiments. If we assume that no such prior knowledge exists, the hypotheses of Example 1 are H0: μ1 = μ2 and H1: μ1 ≠ μ2.

After this, we have to decide the maximum risk of an erroneous rejection of the null hypothesis. This is called the level of significance (α), and the erroneous rejection of the null hypothesis is called a type I error. Typically α is set to 0.05, 0.01 or 0.001. The more severe the consequences of a type I error would be, the smaller we should select α. Note the kind of contradiction in the terminology: a small α means a high level of significance. Let us choose α = 0.05.

Next we have to calculate the so-called test statistic. This is a quantity whose statistical distribution is known under the null hypothesis, letting us calculate the probability of a type I error. In this case the test statistic is calculated as

t = (x̄1 − x̄2) / (S·√(1/3 + 1/3)).

A more general formula for two samples X and Y would be

t = (x̄ − ȳ) / (Sp·√(1/nx + 1/ny)),  where  Sp² = ((nx − 1)Sx² + (ny − 1)Sy²) / (nx + ny − 2).

Now we have to know the statistical distribution of t under the null hypothesis. It is the so-called Student's t-distribution, which has only one parameter, the so-called degrees of freedom, which in this case are ν = n1 + n2 − 2 = 4. Now we can calculate the probability of a type I error assuming that the null hypothesis is true. This is called the p-value of the test. It can be calculated e.g. using MS Excel's TDIST function. After this, making the conclusion is easy: if the p-value is smaller than α, the null hypothesis is rejected. Actually, one doesn't have to carry out the calculations above in practice, because Excel contains a macro for this particular test, and we shall show its results. It is even simpler to carry out the test in R using the R function t.test. But before that, we have to consider a few more important facts about statistical tests.

Let us consider the case where the null hypothesis is not rejected. It is important to understand that this doesn't mean that we have proved the null hypothesis to be true. It simply means that there is not enough evidence to reject it. A good analogy of statistical testing is a trial: if a person is announced not guilty, it does not prove that he or she hasn't committed the crime. It simply means that there isn't enough evidence ("beyond any reasonable doubt") to show that he or she is guilty. In the case of statistical testing, the question is related to two important concepts: practical significance and the power of a test.

The power of a test is defined as the probability of rejecting the null hypothesis when the alternative is true. It is one minus the probability of the type II error, i.e. the error that is made when the null hypothesis is not rejected although the alternative is true. The figure below depicts the probabilities of type I and type II errors, i.e. the level of significance and the power. If H0 were true and the true mean were 8, then the shaded red area gives the level of significance. If H1 were true and the true mean were 12, the green shaded area gives the probability of the type II error (remember that the power is 1 − P(type II error)). In both cases we assume that 10 is the rejection limit. Now it is easy to see that moving the rejection limit towards the null hypothesis will increase the type I error probability and consequently lower the level of significance, but simultaneously it will increase the power. Therefore, deciding the rejection limit is always a compromise between the level of significance and the power.
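Both the t-test and the permutation test described above are easy to do in R. The purity values below are made up for illustration only; they merely mimic Example 1 (means differing by about 8.8, replicate standard deviations close to 2.4 and 2.6). A minimal sketch:

> cat1 <- c(93.0, 88.2, 91.2)            # hypothetical purities, catalyst 1
> cat2 <- c(84.5, 79.3, 82.2)            # hypothetical purities, catalyst 2
> t.test(cat1, cat2, var.equal = TRUE)   # two-sample t-test assuming equal variances
> # permutation test: compare the observed difference of means with the
> # differences obtained under all 20 possible relabellings of the runs
> y <- c(cat1, cat2)
> labels <- combn(6, 3)                  # all 20 ways to choose which 3 runs get label 1
> perm.diffs <- apply(labels, 2, function(i) mean(y[i]) - mean(y[-i]))
> mean(abs(perm.diffs) >= abs(mean(cat1) - mean(cat2)))   # two-sided permutation p-value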

In analysing designed process experiments, we tend to use lower levels of significance than e.g. in pharmaceutical testing. The reason is that we want good power for detecting possible improvements. However, common sense should be used, and if the rejection of the null hypothesis would introduce great financial or human risks, a small enough level of significance must be used.

(Figure: the distributions of the test statistic when H0 is true and when H1 is true, with the rejection limit at 10.)

In order to be able to calculate the power, we need to know the true difference between the subjects to be compared, denoted by δ. In practice we never know the true difference. Therefore, it is reasonable to calculate the power assuming the difference to be the smallest one that has any practical significance. Practical and statistical significance are two different concepts. The former means that the difference has e.g. financial or other practical relevance, whereas the latter means that we can be confident that such a difference exists in reality, however small it is. Consequently, if we detect a practically significant difference that is not statistically significant, i.e. the null hypothesis is not rejected, the reasonable course of action is to make more experiments so that the test has enough power to detect such a difference. An approximate formula (assuming α to be 0.05 and the power to be 0.95) for estimating the number of experiments needed to achieve the wanted power is

n = (4kσ/δ)²,

where k is the number of subjects to be compared (2 in our example). Accurate formulae exist for specific tests (found in most statistical software), but this approximation is good enough if n is large enough. Here n means the total number of experiments, i.e. n/2 with both catalysts.
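As a sketch, both the approximation and the corresponding exact power calculation can be written in R; power.t.test solves the per-group sample size of a two-sample t-test, and the values σ = 2.5 and δ = 2 are the ones used in the calculation that follows:

> wheeler.n <- function(k, s, delta) (4*k*s/delta)^2                  # approximate total number of experiments
> wheeler.n(k = 2, s = 2.5, delta = 2)                                # gives 100, i.e. 50 runs per catalyst
> power.t.test(delta = 2, sd = 2.5, sig.level = 0.05, power = 0.95)   # exact calculation: about 41 runs per group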

Let us use the formula in our example, assuming that the smallest practically significant difference is 2 %-units. The formula gives (4·2·2.5/2)² = 100, i.e. 50 experiments with both catalysts. If we assume that the true difference is the same as the observed difference (8.8), the formula gives (4·2·2.5/8.8)² ≈ 5.2, i.e. about 3 experiments with both catalysts, which is actually the number of experiments made. If the calculations are carried out exactly, e.g. in R, we get 41 and 4 experiments with both catalysts. Thus the approximate formula, called Wheeler's formula, over-estimates the number of experiments when it is high and under-estimates it when it is low. In spite of its approximate nature, the formula gives the correct order of magnitude of the number of experiments needed.

Now, let us perform the test using Excel. After the data has been typed in, you should click Tools/Data Analysis.../t-Test: Two-Sample Assuming Equal Variances. For the conclusion, we need from the resulting output only the p-value of the two-sided test, which is well below 0.05. Thus we reject the null hypothesis at the 0.05 level of significance.

Though we have so far considered only one statistical test, it is good to know that the basic principles behind all statistical tests are exactly the same. The only things that vary are the calculation of the test statistic and its p-value. Later we shall see other applications of statistical tests. It is also good to know that all tests are based on some general assumptions. Practically all tests require that the experimental errors are statistically independent and approximately normally distributed. The latter can normally be safely assumed, if there are no gross errors, due to the central limit theorem. The former, in turn, can usually be guaranteed by randomizing the order of the experiments. Randomization is crucial in good experimentation!

Exercise 2
Carry out the same test using R or Matlab.

Experimental error in DOE

Wheeler's formula clearly shows how the power of the test, with a fixed number of experiments, gets lower as the standard error of the replicates gets higher. Therefore, it is essential to know the degree of repeatability in order to make any meaningful (significant) conclusions. By experimental error we mean the total uncertainty related to the result of an experiment. The only objective way of getting information about the mean experimental error is to make replicate experiments.

Consider experiments where we have varied the pH of a reaction and recorded the yield at each pH. If we knew that the mean experimental error is negligible, we would expect that the optimal yield is obtained at a pH between 8 and 9. However, if we knew that the mean experimental error is ca. 3, it might be quite possible that the yield would be even higher at higher pH values, i.e. a linear trend would be quite plausible. In this case, fitting a quadratic model to the data would mean over-fitting. On the other hand, if the mean experimental error were, say, 1, a linear model (a straight line) would suffer from lack-of-fit. Consequently, without knowledge of the mean experimental error it is impossible to assess the reliability of a model describing the dependencies in the data. Therefore, if there is no prior knowledge of the mean experimental error, a good design has to contain replicated experiments. A good model is neither over-fitted, nor does it suffer from lack-of-fit. If the mean experimental error is known, the degree of lack-of-fit can be statistically tested.

In summary, the use of empirical models can lead to totally erroneous conclusions if we don't keep in mind the experimental error lurking behind every experimental result. We shall further elucidate this important fact with the following simulated example of a known model for the yield of a chemical process. Suppose that the standard deviation of the error is 5 and that the errors are statistically independent and normally distributed. The four figures below show four independent series of three measurements at pH's 6.0, 6.5 and 7.0.

(Figure: the four replicate series, Replicate 1 - Replicate 4, plotted separately.)

If one made conclusions based on non-replicated measurements, they might be completely wrong, especially if they rely on over-fitted models. A typical example of an over-fitted model would be fitting a parabola, i.e. a quadratic model, to the plots above. The fourth replicate series would give completely wrong predictions about the direction of improvement. The figure below shows the same data with added linear models (straight lines) and quadratic models (parabolas):

(Figure: the four replicate series, Replicate 1 - Replicate 4, each with a fitted straight line and a fitted parabola.)

Note that none of the linear models is completely wrong: they all show positive slopes, the smallest being about 7.7. The reason is that in fitting a straight line the residuals have one degree of freedom, whereas in fitting the parabola the residuals have zero degrees of freedom. In general, the more degrees of freedom are left for the residuals, the more reliable the model is in a statistical sense.

If the experiment had been designed to have four replicates, the four series could have been plotted in a single graph:

(Figure: all replicates in one graph; legend: 1st, 2nd, 3rd, 4th, mean values, "All replicates".)

Now, the straight line above, fitted to the mean values of the replicates, almost coincides with the true straight line, showing beautifully the statistical law of central tendency, i.e. the uncertainty of a mean is √n times smaller than the uncertainty of a single measurement (n is the number of independent replicate measurements). Note also that the best way to guarantee the statistical independence of the errors is to randomize the order of the experiments; in the case above, for example, the twelve runs would have been carried out in a random order. Next we shall discuss how to model dependencies.

Exercise 3
Study how to randomize the rows of a table (matrix) in Excel (use RAND and Sort), in R (use sample) and in Matlab (use randperm). A small R sketch is given below.
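As a starting point for Exercise 3, randomizing a run order in R can be done with sample; a minimal sketch (the run list below is simply the twelve pH levels of the example above):

> runs <- rep(c(6.0, 6.5, 7.0), times = 4)   # the twelve planned runs: 3 pH levels, 4 replicates each
> runs[sample(length(runs))]                 # the same runs in a random execution order

In Matlab the function randperm plays the same role.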

3. Empirical mathematical models

In this introductory course, we shall discuss only so-called empirical mathematical models. It is easier to understand the concept if we first consider their opposite, mechanistic models (also called theoretical or first-principles models). A mechanistic model is one whose functional form can be deduced from physical or chemical theories. A simple example would be the Arrhenius equation, whose basis is in thermodynamics. In principle, one should use a mechanistic model whenever it is possible. However, in many practical situations the theory is either too complicated, or not even known, to describe the phenomenon under study. The rational approach then is to approximate the underlying unknown functional relationship using some convenient function. The basic principle is that any function, not close to its optimum, is approximately linear in a limited region. Close to an optimum, a quadratic function is a good approximation. For these reasons, the following functional forms are the most common approximations used. Such models are called empirical models:

y = b0 + b1x1 + b2x2 + ... + bNxN
y = b0 + b1x1 + ... + bNxN + b12x1x2 + b13x1x3 + ... (all pair-wise products)
y = b0 + b1x1 + ... + bNxN + b12x1x2 + ... + b11x1² + ... + bNNxN²

These are called linear, linear+interactions and quadratic models. The term interaction needs some clarification. The product terms in the second and third model are called pair-wise interactions. The interpretation of an interaction between two variables, say x1 and x2, is that the effect of x1 on y, the response variable, is affected by x2. This means that the slope with respect to x1 depends on the value of x2. An interaction can be antagonistic or synergistic depending on whether the other variable decreases or increases the slope with respect to the first one. It should be noted that interactions are very common in chemistry (just think about the ideal gas law!).

It is easy to see that the number of unknown parameters increases quickly with the number of independent variables (also called explanatory or design variables), especially if pair-wise interactions are included in the model. Naturally, higher order interactions may exist as well, but luckily these are seldom significant. Interactions play an important role in DOE. Their existence is the reason why so-called one-variable-at-a-time (OVAT) designs fail in finding optima. This is best illustrated graphically. You can try it with the simulated yield surface of a chemical reaction on p. 27: just start at any point and maximize first in the time direction and then in the temperature direction (or vice versa), and see what happens. Actually, OVAT designs are inefficient for another reason as well, but we shall come to that point later.
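In R, these three model types correspond directly to model formulas; a small sketch (x1, x2 and y are placeholder names for two design variables and the response):

> f.lin  <- y ~ x1 + x2                     # linear (main effects only)
> f.int  <- y ~ x1 + x2 + x1:x2             # linear + pair-wise interaction (equivalently y ~ x1*x2)
> f.quad <- y ~ x1*x2 + I(x1^2) + I(x2^2)   # quadratic model
> # any of these can be fitted with lm(), e.g. lm(f.int, data = mydata)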

Variable types

Before going into experimental designs, we have to consider the different variable types. In typical experimental set-ups we have two types of variables: categorical (also called qualitative) and continuous. A categorical variable has discrete values assigning the object to a category. Typical examples would be the type of a catalyst, the type of an impeller, etc. All ordinary variables, such as temperature, pressure or concentration, are continuous variables. Quite often categorical variables are called factors, especially in connection with the so-called analysis of variance (ANOVA). The type of a variable dictates the type of model that we can apply. Naturally, e.g. powers or logarithms have no meaning for categorical variables. In general, if the model contains only categorical variables, the experiments are analysed by ANOVA, and if the model contains only continuous variables, the experiments are analysed by regression analysis. Actually, ANOVA models can be considered a special case of regression where the categorical variables are coded by so-called binary coding, a topic that is outside the scope of this course. In the sequel, we shall focus on models with continuous variables, but before that we'll introduce the subclasses of factors.

Factors are classified on two different bases: 1) fixed vs. random, 2) crossed vs. nested. Though designs with qualitative factors are beyond our scope, it is important to understand these concepts. One must know that different kinds of combinations of factor subclasses require different ANOVA tests. This is important, because a wrong test typically leads to wrong conclusions. Therefore, anybody who needs to analyse designs with such factors needs to study more about ANOVA, or consult a statistician. In addition, these terms appear quite often in environmental research, e.g. as in the following excerpt: "These factors are sludge type (fixed factor, qualitative, 3 terms) and seasonal evolution (fixed factor, qualitative, 4 terms). The geographical factor (random factor, qualitative, 4 terms) could not have been statistically tested because of the absence of repetition." (from Biotechnol. Agron. Soc. Environ (2))

Fixed vs. random

The levels (values) of a fixed factor are exactly the ones that we are interested in. The focus of interest is in the differences between the factor levels, not in the overall variance caused by variation in the levels. The levels of a random factor represent a random sample of a larger set of possible level values. The focus of interest is the overall variance caused by variation in the factor levels. Comparing different stirrers in some process development problem would be a typical case of a fixed factor (the type of the stirrer). In studying the effect of raw material variation in some process, the raw material batch would be a typical random factor.

Crossed vs. nested

If we have at least two factors, we have to consider their pair-wise relationships. A factor is said to be nested within another factor if the levels of this factor may have different interpretations depending on the values of the other factor. If the interpretation of the factor

levels is unequivocal, the factors are called crossed. This may need some further explanation. Consider a case where 3 analysts have each taken 3 samples of the same material. Now the samples are nested within the analysts, because for example sample #1 of analyst #1 is not physically the same sample as sample #1 of analyst #2. If the experiment had been organised so that 3 samples had first been taken, and then each of the 3 analysts had analysed each sample, the factors (Analyst and Sample) would have been crossed.

Modelling with factors

Models with (qualitative) factors look different, because for example 2*Analyst doesn't make any sense. There are different ways of writing models with factors, but we'll get acquainted only with the most common way, which uses indexed variables. Let us take Example 1, where we had two catalysts, i.e. the factor is Catalyst. We could model the yield by

y_ij = μ + τ_i + ε_ij,

where i refers to the catalyst (i = 1, 2) and j refers to the replicate (j = 1, 2, 3). μ refers to the mean yield and the τ_i's refer to the deviations from the mean caused by the i'th catalyst. The epsilon term refers to the experimental error of the j'th replicate experiment with the i'th catalyst. For example, y_12 = μ + τ_1 + ε_12. After the parameters have been estimated, usually using the least squares principle and analysis of variance (ANOVA), we can estimate the true values of the yield (the fitted values) by

ŷ_ij = μ̂ + τ̂_i,

where the hats denote estimated values. Naturally, if we have more factors, we need more indices. For example, for two factors the model would be

y_ijk = μ + τ_i + β_j + ε_ijk   or   y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk.

In the latter, the fourth term is called the interaction between the factors.

Analysing models with (qualitative) factors only (ANOVA)

Models that contain factors only are analysed by the so-called analysis of variance (ANOVA). We are not going to study the mathematics behind ANOVA, but in spite of that we can learn to apply ANOVA in analysing environmental data. For that we need some knowledge of 1) the models behind ANOVA, 2) the logic of statistical (hypothesis) testing, and 3) the use of R (or some other software). We shall go through a couple of examples in the labs.
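As a preview of the lab examples, a one-factor ANOVA of the catalyst model above can be run in R with aov; a minimal sketch (the purity values are the same hypothetical numbers used in the earlier t-test sketch, not the real data of Example 1):

> purity   <- c(93.0, 88.2, 91.2, 84.5, 79.3, 82.2)   # hypothetical purities, 3 + 3 runs
> catalyst <- factor(rep(c(1, 2), each = 3))          # the qualitative factor
> fit <- aov(purity ~ catalyst)                       # fits y_ij = mu + tau_i + eps_ij by least squares
> summary(fit)                                        # ANOVA table with the F-test for the catalyst effect

For a single factor with two levels this F-test is equivalent to the two-sample t-test (F = t²).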

Factorial designs

Consider a case where we have two independent variables (pH and T) and one response variable (yield). 18 experiments have been made at different levels of pH and T. In the figure below, the yield has been plotted against pH and T:

(Figure: yield plotted against pH, and yield plotted against T.)

Can we conclude that the yield increases when pH increases and T increases? At first glance the question may sound odd, and without proper thinking many of us would answer immediately in the affirmative. This example shows the kind of problems that arise when the experiments are not designed factorially or, more generally, orthogonally. A factorial design is one in which all possible combinations of all variables at fixed levels are included in the design. For example, consider 3 catalysts (A, B and C) and two concentration levels (1 and 2 ppm). A factorial design would be:

Catalyst  Concentration (ppm)
   A            1
   A            2
   B            1
   B            2
   C            1
   C            2

In order to guarantee the independence of the results, these experiments should be carried out in random order! Note that the catalyst is a qualitative (categorical) variable and the concentration is a quantitative (and also continuous) variable. Factorial designs are applicable to both categorical and continuous design variables (as above). Factorial designs allow the estimation of empirical models that contain linear terms (also called main effects) and interactions up to the order equal to the number of variables. The problem with factorial designs is that the number of experiments increases rapidly with the number of variables. For example, if we have 5 variables with

3, 4, 2, 3 and 5 levels, we need altogether 3·4·2·3·5 = 360 experiments. In R, it is easy to construct factorial designs using the function expand.grid. For example, if factor A has levels 6, 7 and 8, and factor B has levels low, medium and high, the design is obtained by giving the following R commands:

> A <- 6:8; B <- c('low','medium','high')
> design <- expand.grid(A,B)
> design
  Var1   Var2
1    6    low
2    7    low
3    8    low
4    6 medium
5    7 medium
6    8 medium
7    6   high
8    7   high
9    8   high
>

In Matlab you can use the function mton from the Data Analysis toolbox. In Matlab it is easiest to use numerical codes for all factors. In Excel you have to learn rather complicated expressions, or just use copying, to construct factorial tables.

Exercise 4
a) Suppose we are making a factorial design with the following variables: 1) stirring speed at levels 200 and 300, 2) temperature at levels 40, 50 and 60, and 3) pH at levels 6, 6.5 and 7. Construct a factorial design, both using Excel and using R or Matlab.
b) Write down a model assuming all variables are qualitative factors and each combination of variables is replicated twice.

A huge number of experiments is a typical problem with factorial designs. However, for continuous design variables there is a remedy. Namely, if we assume that we are far from a possible optimum, the dependencies can be assumed to be well described by linear and interaction effects only. For a continuous variable, a linear effect can be estimated using only two levels, as the determination of a slope requires only two data points. For this reason, the two-level factorial designs, i.e. 2^N-designs, are the basic designs for continuous variables, and using only two levels keeps the number of experiments reasonable for moderate numbers of variables.

Two level factorial designs (2^N-designs)

Of the different factorial designs, the two-level designs play the most important role. The reason is that these designs are very cost-effective. They can be used for both qualitative and quantitative variables, and the results can be analysed by regression in both cases. However, many extensions of 2^N-designs, e.g. adding centre points or axial points, are applicable only to quantitative variables (these topics will be discussed later in the text).

2^N-designs are planned and analysed using so-called coded units, i.e. the lower level of any variable is denoted by -1 and the upper level by +1. There are two good reasons for doing so: 1) the design can be tabulated without knowing the actual variables, and 2) a model based on coded variables allows meaningful calculation of the direction of maximal improvement (the gradient). Of course the table in coded units must be transformed into physical units before carrying out the experiments. The formulae for transforming from coded to physical units and vice versa are

x_i = x̄_i + X_i · (Δ_i/2)   (3a)
X_i = (x_i − x̄_i) / (Δ_i/2)   (3b)

where capital letters denote coded variable levels (±1's), Δ_i denotes the difference between the two actual levels of variable i, and the bar denotes the average of the two levels.

Now, let us take a simple example. Suppose we want to maximize the yield of a batch reaction with respect to the reaction time (t) and temperature (T), and that we have decided the variable levels to be 100 and 160 min for t and 60 and 80 °C for T. The design in coded units would be:

 x1   x2
 -1   -1
 -1   +1
 +1   -1
 +1   +1

In physical units the table would be:

 t (min)   T (°C)
   100       60
   100       80
   160       60
   160       80
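The coding formulae (3a)-(3b) are easy to implement as helper functions; a sketch in R using the time variable above (levels 100 and 160 min):

> to.coded    <- function(x, lo, hi) (x - (lo + hi)/2) / ((hi - lo)/2)   # Eq. (3b): physical -> coded
> to.physical <- function(X, lo, hi) (lo + hi)/2 + X*(hi - lo)/2         # Eq. (3a): coded -> physical
> to.coded(c(100, 130, 160), lo = 100, hi = 160)                         # gives -1, 0, +1
> to.physical(-2, lo = 100, hi = 160)                                    # coded -2 corresponds to 70 min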

Again, the experiments should be carried out in random order. Now, suppose that we have obtained the following yields:

 t (min)   T (°C)     y
   100       60      55.3
   100       80      83.4
   160       60      67.7
   160       80      83.2

Before estimating a linear+interactions model, let us look at the data by plotting the yield against time and temperature:

(Figure: yield plotted against temperature, separately for t = 100 and t = 160.)

The figure shows that the yield increases with temperature for both reaction times. However, there seems to be a clear interaction between the variables, as the slope is clearly smaller with the longer reaction time. We also see that the yield increases with reaction time at the lower temperature, but not at the higher temperature. An extrapolation would suggest better yields when the temperature is increased at the shorter reaction time. However, a better view of the dependencies is obtained when we estimate the model. This can be accomplished using ordinary linear regression analysis, which can be carried out also in Excel. But before that, we must remember that at this point we have no idea about the repeatability.

Exercise 5
Make a similar plot using R or Matlab.

So, before the regression, let us carry out some replicate experiments. We shall place them at the centre point of the design, i.e. t = 130 and T = 70 (0 and 0 in coded units). There are good reasons to place the replicates at the centre point, and we shall come to this point later. The results of these experiments are the following:

 t (min)   T (°C)     y
   130       70      76.0
   130       70      75.6
   130       70      76.2
   130       70      77.8
   130       70      75.1

The standard deviation of the replicate yields is ca. 1.0, from which we can conclude that the changes in yield within the original design are large compared with the experimental error. Before the regression analysis we shall add these runs to the design in coded units and also calculate the product column corresponding to the interaction term. Thus, the table for the regression analysis looks like:

 X1   X2   X1X2     y
 -1   -1    +1    55.3
 -1   +1    -1    83.4
 +1   -1    -1    67.7
 +1   +1    +1    83.2
  0    0     0    76.0
  0    0     0    75.6
  0    0     0    76.2
  0    0     0    77.8
  0    0     0    75.1

The regression is then carried out with Excel's regression macro.
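The same fit can also be reproduced in R with lm; a minimal sketch using the table above (the Excel macro reports the corresponding statistics):

> X1 <- c(-1, -1,  1,  1, 0, 0, 0, 0, 0)                   # coded reaction time
> X2 <- c(-1,  1, -1,  1, 0, 0, 0, 0, 0)                   # coded temperature
> y  <- c(55.3, 83.4, 67.7, 83.2, 76, 75.6, 76.2, 77.8, 75.1)
> fit <- lm(y ~ X1 + X2 + I(X1*X2))                        # linear + interaction model in coded units
> summary(fit)                                             # R^2, residual standard error, coefficients, p-values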

The important figures in this output are the following, and next we shall give them meaningful interpretations.

R² (coefficient of determination)
This is usually expressed as a percentage, i.e. 94%. It tells the proportion of the variance of the yield explained by the model.

Adjusted R² (adjusted coefficient of determination)
Also this is usually expressed as a percentage, i.e. 90%. It tells the proportion of the variance of the yield explained by the model, taking into account the degrees of freedom. In general, it is a more realistic estimate of the goodness of fit (cf. e.g. Wikipedia).

Standard error
The title used in Excel is not very good; a better one would be the standard error of the residuals. In a good model this is close to the repeatability standard error. A value clearly larger than the repeatability standard error is a symptom of lack-of-fit, and a clearly smaller value is a symptom of over-fitting. In this example 2.7 is higher than 1.0, showing some lack of fit. However, its significance should be tested using a lack-of-fit test.

Significance F
Again the title is not a good one. This is actually the p-value of a test whose null hypothesis is that the yields vary randomly around a mean value. In our case this hypothesis would be rejected even at the 0.01 level of significance. Thus we can conclude that the variables have a significant effect on the yield.

Coefficients
These are the least squares estimates of the model parameters. Thus our model is

ŷ = 74.5 + 3.05·X1 + 10.9·X2 − 3.15·X1·X2.

The model supports our graphical interpretations, as can be seen just by looking at the signs of the slopes. Note also that the interaction effect is larger than the slope of the (coded) time.

P-values
These are the p-values of tests whose null hypothesis is that the regression coefficient (slope) in question is zero. Only the intercept and the slope of the temperature are significant at the 0.05 level of significance. However, 0.07 is quite close to 0.05, and it is a matter of taste whether to keep these terms in the model or not.

Any model of two design variables can be represented as a 3-dimensional surface or as a contour plot. Let us see what kind of a surface our model is.

(Figures: the fitted surface plotted as a 3-D surface and as a contour plot of yield against X1 and X2.)

This kind of a surface is called a saddle surface. Note that the surface has been plotted with extrapolated values. The model suggests impossibly high yields, which is quite common with empirical models. But this is not important; the important thing is whether the model tells the right direction of improvement or not! Now, it is easy to see that the model suggests that the best results are achieved in the upper left corner, i.e. with a shorter reaction time and a higher temperature. The point (-2, 2) is (70 min, 90 °C) in physical units, so let us make a new experiment with these values. The result is 86.5%, which is slightly better than any of the previous yields, so the model was not totally wrong. Let us proceed in the same direction and make an experiment at (40 min, 100 °C). Now we get 86.0%, which is poorer, but not significantly so. So let us make one more experiment in the same direction: (10 min, 110 °C). Now we get 67.7%, which means a significantly lower yield. At this point, we have to plan a new design, taking into account that the surface is now known to be curved.

We could have estimated the degree of curvature already around the original design, thanks to the centre point replicates we added to the design. This is based on a mathematical fact: the height of a linear surface (a plane in 3-D) at the centre point of a set of points is equal to the mean value of the heights at these points. In our case the mean value of the yields at the corner points is 72.4 and the yield at the centre point (taken as the average of the replicates) is 76.1, and thus the difference is 3.7. Now let us use the rules for the propagation of errors. According to Eq. 1, the standard deviation of the corner-point mean is 1.0/√4 = 0.50 and the standard deviation of the centre-point mean is 1.0/√5 ≈ 0.45, and the standard deviation of the difference (Eq. 2b) is √(0.50² + 0.45²) ≈ 0.67.

Now the difference is over 5 times larger than its standard error. According to the rules of thumb of the normal distribution, we can be pretty sure that such a big difference cannot be explained by normal random variation. Of course we could have used a formal t-test as well, but in clear cases like this one it is not necessary.

Now let us introduce some designs that allow the estimation of the parameters of quadratic models. These designs are needed whenever we know that the underlying model is a (significantly) curved surface. It should be noted that we use the word surface rather liberally; with more than 2 design variables we should, in mathematical terms, talk about hyper-surfaces. However, in DOE these will be called response surfaces. We shall see, in one of our lab sessions, how well an OVAT design would have performed in our example.

In R, the regression analysis, using the DOE tools by VMT, is carried out in the following way:

> X = mton(2,2,5) # 2^2-design with 5 centre points
> y = c(55.3,83.4,67.7,83.2,76,75.6,76.2,77.8,75.1) # yields
> model = quad.model.fit(X,y,1.5)
> args(quad.model.fit)
function (X, y, opt = 1, terms = NULL, model = NULL, blockvar = NULL)
NULL
> summary(model)

Call:
y ~ X1 + X2 + I(X1 * X2)

Residuals:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                 e-09 ***
X1
X2                                               ***
I(X1 * X2)
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05

Residual standard error: on 5 degrees of freedom
Multiple R-squared: 0.94, Adjusted R-squared:
F-statistic: on 3 and 5 DF, p-value:
>

In addition to what is shown by the summary function, the function quad.model.fit calculates some additional useful statistics:

> model[13:17]
$nonlin.test
      t       p

$LOF.test
   FLOF    plof

$CV
 CVpred  CVresid  CV.Stud.resid

$Q2
[1]

The first of these (the 13th field of the object "model") gives the test statistic and the p-value of the so-called non-linearity test, which is possible only if the design contains centre points. The p-value suggests that we have to reject the null hypothesis of linearity within the experimental region. The second gives the so-called lack-of-fit test statistic and the corresponding p-value. Again the null hypothesis is rejected, i.e. the model suffers from significant lack-of-fit.

The next two fields (= list elements) give cross-validation information from the so-called leave-one-out cross-validation. The idea of leave-one-out cross-validation is simple: each observation in turn is left out of the data, and a model is fitted using the rest. Then, using this model, the value of the left-out response is predicted, and the prediction is recorded. This is repeated for all observations. Such predictions resemble true predictions more closely, because the observation being predicted is not used in building the model. Therefore, cross-validated predictions usually give a more realistic picture of the reliability of the model. The results of this example clearly show that this model should not be used for predictions. The Q² value is the R² value calculated from the cross-validated predictions, whereas R² is calculated using the fitted values.

You can easily make contour plots using R as well. The function for this is quad.plot. See the help in DOE_functions_v4.pdf and try it out.

Second and higher order designs

Obviously, optimization is not possible with linear or linear+interactions models, because any optimal point means curvature in its neighbourhood. Linear models can detect directions towards possible optima, but locating an optimum requires higher order models, and consequently designs that contain more than two levels. We shall next study the most common of such designs.

A very natural approach to multi-level factorial designs would be to extend the 2^N-designs to 3^N-designs (or M^N-designs). The problem is that the number of experiments becomes unbearable very quickly; for example, for 5 variables a 3^5-design has 243 experiments. Yet another approach would be to use several superimposed 2^N-designs, e.g. one using coded values -2 and 2 and another using coded values -1 and 1. Unfortunately, it can be shown that such designs do not permit the estimation of quadratic models. For these reasons, other designs like central composite designs and Box-Behnken designs are used. Now we shall illustrate the use of second order response surfaces and the most popular second order design, the central composite design (CC). The structure of CC designs is very simple: the design consists of a 2^N-design plus a centre point (possibly replicated) and so-called axial points.

The axial points are points all of whose coordinates are at the centre, i.e. zero, except for one. Thus there are 2N axial points for N variables. The most common choice for the non-zero coordinate value is 2^(N/4). We shall use the same example as above, and simply complement the original 2^2-design with 4 axial points. In our example the non-zero coordinate value in coded units, i.e. the distance from the origin, is 2^(2/4) = √2 ≈ 1.414. Thus the table of axial points in coded units is

   x1        x2
 -1.414       0
 +1.414       0
    0      -1.414
    0      +1.414

and in physical units

 t (min)   T (°C)
   87.6      70
  172.4      70
  130        55.9
  130        84.1

If we add the axial points, together with the previously made replicated centre point, into our design and make the experiments, we obtain a table with columns t, T and y containing all thirteen runs and their yields.
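If the mton function is not available, the full central composite design in coded units can also be assembled directly in base R; a minimal sketch (the axial distance √2 follows from the 2^(N/4) rule above):

> corners <- expand.grid(X1 = c(-1, 1), X2 = c(-1, 1))        # the original 2^2-design
> a <- 2^(2/4)                                                # axial distance, approx. 1.414
> axial   <- data.frame(X1 = c(-a, a, 0, 0), X2 = c(0, 0, -a, a))
> centre  <- data.frame(X1 = 0, X2 = 0)
> ccd <- rbind(corners, axial, centre[rep(1, 5), ])           # 2^2 corners + axial points + 5 centre replicates
> ccd                                                         # the central composite design in coded units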

Now, let us estimate and analyse a quadratic model in coded units for the yield. In Excel, one has to add two new columns for the quadratic terms to the table for the regression analysis, after which the regression macro is run just as before.

Note that the R² value is very good, the overall significance is high and all coefficients are significant at the 0.05 level of significance, the quadratic term of the temperature being the weakest. The extrapolated response surface of this quadratic model is shown in the two figures below.

(Figures: the extrapolated response surface of the quadratic model, as a 3-D surface and as a contour plot of yield against time and temperature.)

The figure clearly shows that we should have raised the temperature more than we did on the basis of the 2^2-design. According to the figure, a good guess in coded units would have been (-2, 3), i.e. (70 min, 100 °C) in physical units. This, indeed, would have given a better yield, but we shall not give the results, in order not to spoil our hands-on exercise. The same analyses using R are made in the labs.

Response surface analysis

The response surface above shows a clear rising ridge. The direction of the ridge can be derived by so-called canonical analysis, but this needs some more advanced mathematical techniques, namely the use of so-called eigenvalues, which is beyond the scope of this introductory course. However, it is important to realize that graphical tools have severe limitations when the number of variables is higher than 2. In such cases we have to rely on computational tools. Another computational tool for analysing response surfaces is the calculation of the so-called stationary point. A stationary point of a quadratic surface, if it exists, can be either a minimum, a maximum or a saddle point. Mathematically it is the point where the gradient, i.e. the vector of partial derivatives of the response variable with respect to the design variables, is a vector of zeros.

Let us calculate the stationary point of the quadratic model above. The partial derivatives of the model with respect to the coded variables are linear functions of X1 and X2. Finding the stationary point simply means solving the pair of equations obtained by setting both derivatives to zero. This is easily done by hand, or in Excel using its matrix functions. The stationary point turns out to be physically impossible, because the reaction time cannot be negative. However, we have calculated the predicted value at the stationary point to verify that it is a maximum point. (A rigorous proof would require the use of eigenvalues.)

In cases like this, there is no sense in making experiments at the stationary point. Instead, we have two alternatives: 1) to design points along the so-called gradient path, or 2) to use an optimizer to find the best points outside the design area. Naturally, if the stationary point is a maximum and lies inside the design region, one should make an experiment at the stationary point in order to verify whether the model predicts the best point correctly. For some reason, usually only the second alternative is available in commercial DOE software. It would be possible in Excel as well, using the Solver tool. However, in Excel it is faster to get new points using the gradient path method. We shall illustrate the gradient path approach. First it must be noted that the gradient path can be curved. Therefore we must calculate the points densely enough, and then use only more sparsely selected points. Note that both methods work also for linear and linear+interactions models!
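For completeness, the same stationary-point calculation is a couple of lines in R. The sketch below writes the fitted quadratic as y = b0 + b'x + x'Bx, where B holds the quadratic coefficients on its diagonal and half of the interaction coefficient off-diagonal; the coefficient values used here are placeholders, not the fitted values of our example:

> b <- c(3, 11)                                 # placeholder linear coefficients (b1, b2)
> B <- matrix(c(-1, -0.5, -0.5, -1.5), 2, 2)    # placeholder quadratic part (diagonal) and half-interaction (off-diagonal)
> xs <- -0.5 * solve(B, b)                      # stationary point: solves the gradient equation b + 2*B*x = 0
> xs                                            # stationary point in coded units
> 75 + sum(b*xs) + drop(t(xs) %*% B %*% xs)     # predicted value there (75 is a placeholder intercept)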

It is easy to understand why we aim at processes that operate at optimal conditions. However, there is an aspect that is not always realized: around the optimum, the process is less sensitive to changes in the process variables. A process operating at optimal conditions does not only produce the best product, but also the least varying product.

The scaling factor in the above calculations means that the scaled gradient is 0.25 times the original gradient vector. This is then added to the previous point, and the calculations are repeated for each new point. The physical units are obtained using the coding formulae. If we plot the path, it looks like:

(Figure: the gradient path plotted in the time-temperature plane.)
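The same stepping scheme can be sketched in a few lines of R (again with the placeholder coefficients used above; only the idea of repeatedly adding 0.25 times the gradient is taken from the text):

> b <- c(3, 11); B <- matrix(c(-1, -0.5, -0.5, -1.5), 2, 2)   # placeholder model coefficients
> grad <- function(x) b + 2 * drop(B %*% x)                   # gradient of y = b0 + b'x + x'Bx
> path <- matrix(0, nrow = 21, ncol = 2)                      # row 1 is the design centre (0, 0)
> for (i in 2:21) path[i, ] <- path[i - 1, ] + 0.25 * grad(path[i - 1, ])
> plot(path, type = "b", xlab = "coded time", ylab = "coded temperature")   # the gradient path in coded units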


More information

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur Lecture No. # 36 Sampling Distribution and Parameter Estimation

More information

Introducing Proof 1. hsn.uk.net. Contents

Introducing Proof 1. hsn.uk.net. Contents Contents 1 1 Introduction 1 What is proof? 1 Statements, Definitions and Euler Diagrams 1 Statements 1 Definitions Our first proof Euler diagrams 4 3 Logical Connectives 5 Negation 6 Conjunction 7 Disjunction

More information

Finite Mathematics : A Business Approach

Finite Mathematics : A Business Approach Finite Mathematics : A Business Approach Dr. Brian Travers and Prof. James Lampes Second Edition Cover Art by Stephanie Oxenford Additional Editing by John Gambino Contents What You Should Already Know

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

Uncertainty, Error, and Precision in Quantitative Measurements an Introduction 4.4 cm Experimental error

Uncertainty, Error, and Precision in Quantitative Measurements an Introduction 4.4 cm Experimental error Uncertainty, Error, and Precision in Quantitative Measurements an Introduction Much of the work in any chemistry laboratory involves the measurement of numerical quantities. A quantitative measurement

More information

Using Microsoft Excel

Using Microsoft Excel Using Microsoft Excel Objective: Students will gain familiarity with using Excel to record data, display data properly, use built-in formulae to do calculations, and plot and fit data with linear functions.

More information

Just Enough Likelihood

Just Enough Likelihood Just Enough Likelihood Alan R. Rogers September 2, 2013 1. Introduction Statisticians have developed several methods for comparing hypotheses and for estimating parameters from data. Of these, the method

More information

Lecture 10: F -Tests, ANOVA and R 2

Lecture 10: F -Tests, ANOVA and R 2 Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that β 1 0 using the statistic ( β 1 0)/ŝe. (Although I also mentioned that confidence intervals are generally

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Lecture 2: Linear regression

Lecture 2: Linear regression Lecture 2: Linear regression Roger Grosse 1 Introduction Let s ump right in and look at our first machine learning algorithm, linear regression. In regression, we are interested in predicting a scalar-valued

More information

Treatment of Error in Experimental Measurements

Treatment of Error in Experimental Measurements in Experimental Measurements All measurements contain error. An experiment is truly incomplete without an evaluation of the amount of error in the results. In this course, you will learn to use some common

More information

A Scientific Model for Free Fall.

A Scientific Model for Free Fall. A Scientific Model for Free Fall. I. Overview. This lab explores the framework of the scientific method. The phenomenon studied is the free fall of an object released from rest at a height H from the ground.

More information

Chemometrics Unit 4 Response Surface Methodology

Chemometrics Unit 4 Response Surface Methodology Chemometrics Unit 4 Response Surface Methodology Chemometrics Unit 4. Response Surface Methodology In Unit 3 the first two phases of experimental design - definition and screening - were discussed. In

More information

1 Least Squares Estimation - multiple regression.

1 Least Squares Estimation - multiple regression. Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1

More information

Regression Analysis IV... More MLR and Model Building

Regression Analysis IV... More MLR and Model Building Regression Analysis IV... More MLR and Model Building This session finishes up presenting the formal methods of inference based on the MLR model and then begins discussion of "model building" (use of regression

More information

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Jurusan Teknik Industri Universitas Brawijaya Outline Introduction The Analysis of Variance Models for the Data Post-ANOVA Comparison of Means Sample

More information

A-Level Notes CORE 1

A-Level Notes CORE 1 A-Level Notes CORE 1 Basic algebra Glossary Coefficient For example, in the expression x³ 3x² x + 4, the coefficient of x³ is, the coefficient of x² is 3, and the coefficient of x is 1. (The final 4 is

More information

WISE Regression/Correlation Interactive Lab. Introduction to the WISE Correlation/Regression Applet

WISE Regression/Correlation Interactive Lab. Introduction to the WISE Correlation/Regression Applet WISE Regression/Correlation Interactive Lab Introduction to the WISE Correlation/Regression Applet This tutorial focuses on the logic of regression analysis with special attention given to variance components.

More information

Harvard University. Rigorous Research in Engineering Education

Harvard University. Rigorous Research in Engineering Education Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected

More information

( )( b + c) = ab + ac, but it can also be ( )( a) = ba + ca. Let s use the distributive property on a couple of

( )( b + c) = ab + ac, but it can also be ( )( a) = ba + ca. Let s use the distributive property on a couple of Factoring Review for Algebra II The saddest thing about not doing well in Algebra II is that almost any math teacher can tell you going into it what s going to trip you up. One of the first things they

More information

Uncertainty. Michael Peters December 27, 2013

Uncertainty. Michael Peters December 27, 2013 Uncertainty Michael Peters December 27, 20 Lotteries In many problems in economics, people are forced to make decisions without knowing exactly what the consequences will be. For example, when you buy

More information

The First Derivative Test

The First Derivative Test The First Derivative Test We have already looked at this test in the last section even though we did not put a name to the process we were using. We use a y number line to test the sign of the first derivative

More information

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression Recall, back some time ago, we used a descriptive statistic which allowed us to draw the best fit line through a scatter plot. We

More information

Introduction to Design of Experiments

Introduction to Design of Experiments Introduction to Design of Experiments Jean-Marc Vincent and Arnaud Legrand Laboratory ID-IMAG MESCAL Project Universities of Grenoble {Jean-Marc.Vincent,Arnaud.Legrand}@imag.fr November 20, 2011 J.-M.

More information

DISTRIBUTIONS USED IN STATISTICAL WORK

DISTRIBUTIONS USED IN STATISTICAL WORK DISTRIBUTIONS USED IN STATISTICAL WORK In one of the classic introductory statistics books used in Education and Psychology (Glass and Stanley, 1970, Prentice-Hall) there was an excellent chapter on different

More information

Statistics Primer. A Brief Overview of Basic Statistical and Probability Principles. Essential Statistics for Data Analysts Using Excel

Statistics Primer. A Brief Overview of Basic Statistical and Probability Principles. Essential Statistics for Data Analysts Using Excel Statistics Primer A Brief Overview of Basic Statistical and Probability Principles Liberty J. Munson, PhD 9/19/16 Essential Statistics for Data Analysts Using Excel Table of Contents What is a Variable?...

More information

STAT 350 Final (new Material) Review Problems Key Spring 2016

STAT 350 Final (new Material) Review Problems Key Spring 2016 1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX,

More information

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Chapter 10 Regression Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Scatter Diagrams A graph in which pairs of points, (x, y), are

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Using SPSS for One Way Analysis of Variance

Using SPSS for One Way Analysis of Variance Using SPSS for One Way Analysis of Variance This tutorial will show you how to use SPSS version 12 to perform a one-way, between- subjects analysis of variance and related post-hoc tests. This tutorial

More information

EC4051 Project and Introductory Econometrics

EC4051 Project and Introductory Econometrics EC4051 Project and Introductory Econometrics Dudley Cooke Trinity College Dublin Dudley Cooke (Trinity College Dublin) Intro to Econometrics 1 / 23 Project Guidelines Each student is required to undertake

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice

The Model Building Process Part I: Checking Model Assumptions Best Practice The Model Building Process Part I: Checking Model Assumptions Best Practice Authored by: Sarah Burke, PhD 31 July 2017 The goal of the STAT T&E COE is to assist in developing rigorous, defensible test

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

3 Non-linearities and Dummy Variables

3 Non-linearities and Dummy Variables 3 Non-linearities and Dummy Variables Reading: Kennedy (1998) A Guide to Econometrics, Chapters 3, 5 and 6 Aim: The aim of this section is to introduce students to ways of dealing with non-linearities

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Design of Engineering Experiments Part 5 The 2 k Factorial Design

Design of Engineering Experiments Part 5 The 2 k Factorial Design Design of Engineering Experiments Part 5 The 2 k Factorial Design Text reference, Special case of the general factorial design; k factors, all at two levels The two levels are usually called low and high

More information

Uncertainty and Graphical Analysis

Uncertainty and Graphical Analysis Uncertainty and Graphical Analysis Introduction Two measures of the quality of an experimental result are its accuracy and its precision. An accurate result is consistent with some ideal, true value, perhaps

More information

Introduction to Computer Tools and Uncertainties

Introduction to Computer Tools and Uncertainties Experiment 1 Introduction to Computer Tools and Uncertainties 1.1 Objectives To become familiar with the computer programs and utilities that will be used throughout the semester. To become familiar with

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE

More information

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Lecture - 24 Radial Basis Function Networks: Cover s Theorem

Lecture - 24 Radial Basis Function Networks: Cover s Theorem Neural Network and Applications Prof. S. Sengupta Department of Electronic and Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture - 24 Radial Basis Function Networks:

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis.

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis. Hypothesis Testing Today, we are going to begin talking about the idea of hypothesis testing how we can use statistics to show that our causal models are valid or invalid. We normally talk about two types

More information

Chapter 5: HYPOTHESIS TESTING

Chapter 5: HYPOTHESIS TESTING MATH411: Applied Statistics Dr. YU, Chi Wai Chapter 5: HYPOTHESIS TESTING 1 WHAT IS HYPOTHESIS TESTING? As its name indicates, it is about a test of hypothesis. To be more precise, we would first translate

More information

RESPONSE SURFACE MODELLING, RSM

RESPONSE SURFACE MODELLING, RSM CHEM-E3205 BIOPROCESS OPTIMIZATION AND SIMULATION LECTURE 3 RESPONSE SURFACE MODELLING, RSM Tool for process optimization HISTORY Statistical experimental design pioneering work R.A. Fisher in 1925: Statistical

More information

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 04 Basic Statistics Part-1 (Refer Slide Time: 00:33)

More information

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Intro to Learning Theory Date: 12/8/16

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Intro to Learning Theory Date: 12/8/16 600.463 Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Intro to Learning Theory Date: 12/8/16 25.1 Introduction Today we re going to talk about machine learning, but from an

More information

Chapter 26: Comparing Counts (Chi Square)

Chapter 26: Comparing Counts (Chi Square) Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces

More information

Ordinary Least Squares Linear Regression

Ordinary Least Squares Linear Regression Ordinary Least Squares Linear Regression Ryan P. Adams COS 324 Elements of Machine Learning Princeton University Linear regression is one of the simplest and most fundamental modeling ideas in statistics

More information

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures 12.1 Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures 12.9 Repeated measures analysis Sometimes researchers make multiple measurements on the same experimental unit. We have

More information

Quadratic Equations Part I

Quadratic Equations Part I Quadratic Equations Part I Before proceeding with this section we should note that the topic of solving quadratic equations will be covered in two sections. This is done for the benefit of those viewing

More information

Hypothesis Testing. ) the hypothesis that suggests no change from previous experience

Hypothesis Testing. ) the hypothesis that suggests no change from previous experience Hypothesis Testing Definitions Hypothesis a claim about something Null hypothesis ( H 0 ) the hypothesis that suggests no change from previous experience Alternative hypothesis ( H 1 ) the hypothesis that

More information

Orthogonal, Planned and Unplanned Comparisons

Orthogonal, Planned and Unplanned Comparisons This is a chapter excerpt from Guilford Publications. Data Analysis for Experimental Design, by Richard Gonzalez Copyright 2008. 8 Orthogonal, Planned and Unplanned Comparisons 8.1 Introduction In this

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Econometrics. 4) Statistical inference

Econometrics. 4) Statistical inference 30C00200 Econometrics 4) Statistical inference Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Confidence intervals of parameter estimates Student s t-distribution

More information

Design of Experiments SUTD - 21/4/2015 1

Design of Experiments SUTD - 21/4/2015 1 Design of Experiments SUTD - 21/4/2015 1 Outline 1. Introduction 2. 2 k Factorial Design Exercise 3. Choice of Sample Size Exercise 4. 2 k p Fractional Factorial Design Exercise 5. Follow-up experimentation

More information

In the previous chapter, we learned how to use the method of least-squares

In the previous chapter, we learned how to use the method of least-squares 03-Kahane-45364.qxd 11/9/2007 4:40 PM Page 37 3 Model Performance and Evaluation In the previous chapter, we learned how to use the method of least-squares to find a line that best fits a scatter of points.

More information

Expected Value II. 1 The Expected Number of Events that Happen

Expected Value II. 1 The Expected Number of Events that Happen 6.042/18.062J Mathematics for Computer Science December 5, 2006 Tom Leighton and Ronitt Rubinfeld Lecture Notes Expected Value II 1 The Expected Number of Events that Happen Last week we concluded by showing

More information

psyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests

psyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests psyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests last lecture: introduction to factorial designs next lecture: factorial between-ps ANOVA II: (effect sizes and follow-up tests) 1 general

More information

Introduction to Basic Proof Techniques Mathew A. Johnson

Introduction to Basic Proof Techniques Mathew A. Johnson Introduction to Basic Proof Techniques Mathew A. Johnson Throughout this class, you will be asked to rigorously prove various mathematical statements. Since there is no prerequisite of a formal proof class,

More information

This module focuses on the logic of ANOVA with special attention given to variance components and the relationship between ANOVA and regression.

This module focuses on the logic of ANOVA with special attention given to variance components and the relationship between ANOVA and regression. WISE ANOVA and Regression Lab Introduction to the WISE Correlation/Regression and ANOVA Applet This module focuses on the logic of ANOVA with special attention given to variance components and the relationship

More information

SCIENTIFIC INQUIRY AND CONNECTIONS. Recognize questions and hypotheses that can be investigated according to the criteria and methods of science

SCIENTIFIC INQUIRY AND CONNECTIONS. Recognize questions and hypotheses that can be investigated according to the criteria and methods of science SUBAREA I. COMPETENCY 1.0 SCIENTIFIC INQUIRY AND CONNECTIONS UNDERSTAND THE PRINCIPLES AND PROCESSES OF SCIENTIFIC INQUIRY AND CONDUCTING SCIENTIFIC INVESTIGATIONS SKILL 1.1 Recognize questions and hypotheses

More information

MBF1923 Econometrics Prepared by Dr Khairul Anuar

MBF1923 Econometrics Prepared by Dr Khairul Anuar MBF1923 Econometrics Prepared by Dr Khairul Anuar L4 Ordinary Least Squares www.notes638.wordpress.com Ordinary Least Squares The bread and butter of regression analysis is the estimation of the coefficient

More information

Math 016 Lessons Wimayra LUY

Math 016 Lessons Wimayra LUY Math 016 Lessons Wimayra LUY wluy@ccp.edu MATH 016 Lessons LESSON 1 Natural Numbers The set of natural numbers is given by N = {0, 1, 2, 3, 4...}. Natural numbers are used for two main reasons: 1. counting,

More information

16. . Proceeding similarly, we get a 2 = 52 1 = , a 3 = 53 1 = and a 4 = 54 1 = 125

16. . Proceeding similarly, we get a 2 = 52 1 = , a 3 = 53 1 = and a 4 = 54 1 = 125 . Sequences When we first introduced a function as a special type of relation in Section.3, we did not put any restrictions on the domain of the function. All we said was that the set of x-coordinates

More information

INTRODUCTION TO ANALYSIS OF VARIANCE

INTRODUCTION TO ANALYSIS OF VARIANCE CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two

More information

Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture I: Linear Correlation and Regression

Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture I: Linear Correlation and Regression Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture I: Linear Correlation and Regression Correlation Linear correlation and linear regression are often confused, mostly

More information

Basic methods to solve equations

Basic methods to solve equations Roberto s Notes on Prerequisites for Calculus Chapter 1: Algebra Section 1 Basic methods to solve equations What you need to know already: How to factor an algebraic epression. What you can learn here:

More information

Chapter 1 Review of Equations and Inequalities

Chapter 1 Review of Equations and Inequalities Chapter 1 Review of Equations and Inequalities Part I Review of Basic Equations Recall that an equation is an expression with an equal sign in the middle. Also recall that, if a question asks you to solve

More information

Advanced Experimental Design

Advanced Experimental Design Advanced Experimental Design Topic Four Hypothesis testing (z and t tests) & Power Agenda Hypothesis testing Sampling distributions/central limit theorem z test (σ known) One sample z & Confidence intervals

More information