ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

4-1 Statistical Inference

The field of statistical inference consists of those methods used to make decisions or draw conclusions about a population. These methods use the information contained in a random sample from the population in drawing conclusions.

Statistical inference is divided into two areas: a) parameter estimation and b) hypothesis testing.

Parameter Estimation
- Parameters are descriptive measures of an entire population.
- Their values are usually unknown because it is infeasible to measure an entire population.
- Instead, a random sample is taken from the population in order to obtain parameter estimates.
- Statistical analysis deals with finding estimates of the population parameters along with the amount of error associated with these estimates. These estimates are also known as sample statistics.

There are several types of parameter estimates:
- Point estimates are the single most likely value of a parameter. For example, the point estimate of the population mean (the parameter) is the sample mean (the parameter estimate).
- Confidence intervals are a range of values likely to contain the population parameter.

As an example of parameter estimates, consider a spark plug manufacturer studying a problem with the spark plug gap. It would be too costly to measure every spark plug that is made. Instead, a random sample of 100 spark plugs is collected and the mean gap is measured to be 9.2 mm; this is the point estimate of the population mean μ. Additionally, a 95% confidence interval for μ is determined to be (8.8, 9.6) mm. This means that, with 95% confidence, the true average gap for all the spark plugs is between 8.8 mm and 9.6 mm.

4-3 Hypothesis Testing

4-3.1 Statistical Hypothesis

Many engineering problems require deciding whether to accept or reject a statement about a given parameter. The statement is called a hypothesis, and the decision-making procedure about the hypothesis is called hypothesis testing. Statistical hypothesis testing is the data analysis stage of a comparative experiment, in which we might be interested in, for example, comparing the mean of a population to a specified value.

In this chapter we consider comparative experiments involving one population, with a focus on testing hypotheses concerning the parameters of the population (mean, standard deviation).

A statistical hypothesis can arise from physical laws, theoretical knowledge, past experience, or external considerations, such as engineering requirements. Since probability distributions are used to represent populations, a statistical hypothesis can be stated in terms of the probability distribution of a random variable. The hypothesis usually involves one or more parameters of this distribution.

For example, suppose that we are interested in the burning rate of a solid propellant used to power aircrew escape systems. The burning rate is a random variable that can be described by a probability distribution. We are interested in the mean burning rate (a parameter of this distribution). Specifically, we want to determine whether or not the mean burning rate is 50 cm³/s. This can be expressed as

\[ H_0: \mu = 50\ \text{cm}^3/\text{s}, \qquad H_1: \mu \neq 50\ \text{cm}^3/\text{s} \]

The statements H0 and H1 are called the null hypothesis and the alternative hypothesis, respectively.

In this particular case, since the alternative hypothesis states H1: μ ≠ 50 cm³/s, H1 is called a two-sided alternative hypothesis. A one-sided alternative hypothesis would be stated as, for example in this case:

\[ H_0: \mu = 50\ \text{cm}^3/\text{s}, \qquad H_1: \mu < 50\ \text{cm}^3/\text{s} \]

or

\[ H_0: \mu = 50\ \text{cm}^3/\text{s}, \qquad H_1: \mu > 50\ \text{cm}^3/\text{s} \]

Hypotheses are always statements about the population or distribution under study, not statements about the sample.

The value of the population parameter in the null hypothesis is usually determined in one of three ways:
a) From past experience or knowledge of the process. In this case the hypothesis statement is about whether the parameter has changed.
b) From some theory or model regarding the process under study. In this case the hypothesis is about verifying the theory or model.
c) From external considerations, such as design or engineering requirements.

A procedure leading to a decision about a particular hypothesis is called a test of a hypothesis. Hypothesis-testing procedures rely on using the information in a random sample from the population of interest.

If the information in the sample is consistent with the hypothesis, the hypothesis is not rejected and is treated as true. If the information is inconsistent with the hypothesis, we conclude that the hypothesis is false. Neither conclusion is certain, because only a sample of the population has been examined.

In general, what is tested is the null hypothesis; rejecting the null hypothesis leads to accepting the alternative hypothesis. Null hypotheses are always stated so that an exact value of the parameter is specified, for example H0: μ = 50 cm³/s.

An alternative hypothesis allows the parameter to take several values, for example H1: μ < 50 cm³/s. Testing a hypothesis requires taking a random sample, computing a test statistic from the sample data, and then using the test statistic to make a decision about the null hypothesis.

4-3.2 Testing Statistical Hypotheses

Consider the burning rate problem used previously:

\[ H_0: \mu = 50\ \text{cm}^3/\text{s}, \qquad H_1: \mu \neq 50\ \text{cm}^3/\text{s} \]

A sample of 10 specimens is taken and the sample mean x̄ is determined (x̄ is an estimate of μ). If x̄ is close to the hypothesized value of 50, there is evidence to support H0. If x̄ is considerably different from 50, there is evidence to support H1 instead. Since the sample mean x̄ can take many values, a range around the hypothesized value is determined, such that if x̄ falls within this range H0 is not rejected (that is, H0 is accepted). Otherwise, if x̄ falls outside this pre-established range, H0 is rejected (that is, H1 is accepted).

Consider the case discussed, where

\[ H_0: \mu = 50\ \text{cm}^3/\text{s}, \qquad H_1: \mu \neq 50\ \text{cm}^3/\text{s} \]

Assume that the decision rule is: do not reject H0 if 48.5 ≤ x̄ ≤ 51.5. Values of x̄ less than 48.5 or larger than 51.5 then constitute the critical region for the test. The values that define the critical region (48.5 and 51.5) are called critical values.

This decision process can result in two erroneous conclusions:
a) The true mean burning rate might be 50 cm³/s, but the randomly selected test specimens give an x̄ within the critical region, resulting in the rejection of H0 in favor of H1. This type of erroneous conclusion is called a type I error.
b) The true mean might satisfy μ ≠ 50 cm³/s, but x̄ falls outside the critical region. In this case we would fail to reject H0 (that is, H0 is accepted) when H0 is false. This type of erroneous conclusion is called a type II error.

These observations are summarized in the following table:

Decision                        H0 is true      H0 is false
Fail to reject H0 (accepted)    No error        Type II error
Reject H0                       Type I error    No error

The probability of making a type I error is represented by α:

α = P(type I error) = P(reject H0 when H0 is true)

In the propellant burning example, a type I error occurs if μ = 50 but x̄ < 48.5 or x̄ > 51.5.

Assume now that σ = 2.5 cm³/s. If H0: μ = 50 cm³/s is true, the distribution of the sample mean X̄ is approximately normal with mean μ = 50 and standard deviation σ/√n = 2.5/√10 = 0.79. Thus, the probability of making a type I error is:

α = P(X̄ < 48.5 when μ = 50) + P(X̄ > 51.5 when μ = 50)

The z values corresponding to the critical values 48.5 and 51.5 are

\[ z_1 = \frac{48.5 - 50}{0.79} = -1.90, \qquad z_2 = \frac{51.5 - 50}{0.79} = 1.90 \]

which results in

\[ \alpha = P(Z < -1.90) + P(Z > 1.90) = 0.0287 + 0.0287 = 0.0574 \]

This result implies that 5.74% of all random samples will lead to rejection of H0 when the true mean is actually 50. That is, a type I error is expected 5.74% of the time, provided that the true mean is actually 50.

α can be reduced by using critical values that are farther from the hypothesized mean, for example 48 and 52:

\[ \alpha = P\left(Z < \frac{48 - 50}{0.79}\right) + P\left(Z > \frac{52 - 50}{0.79}\right) = P(Z < -2.53) + P(Z > 2.53) = 0.0057 + 0.0057 = 0.0114 \]

α can also be reduced by increasing the sample size, say to n = 16 instead of 10. Then σ/√n = 2.5/√16 = 0.625 and

\[ \alpha = P\left(Z < \frac{48.5 - 50}{0.625}\right) + P\left(Z > \frac{51.5 - 50}{0.625}\right) = P(Z < -2.4) + P(Z > 2.4) = 0.0082 + 0.0082 = 0.0164 \]
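The three type I error calculations for the burning rate example (limits 48.5 and 51.5 with n = 10, limits 48 and 52 with n = 10, and limits 48.5 and 51.5 with n = 16) can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `type_i_error` is illustrative and not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def type_i_error(lower, upper, mu0, sigma, n):
    """Return P(reject H0 | H0 true) when H0 is kept for lower <= xbar <= upper."""
    # Under H0 the sample mean is (approximately) N(mu0, sigma / sqrt(n)).
    xbar_dist = NormalDist(mu=mu0, sigma=sigma / sqrt(n))
    # Probability mass in the two tails outside the acceptance range.
    return xbar_dist.cdf(lower) + (1.0 - xbar_dist.cdf(upper))


# Burning rate example: mu0 = 50, sigma = 2.5.
alpha_base = type_i_error(48.5, 51.5, 50.0, 2.5, 10)  # original limits, n = 10
alpha_wide = type_i_error(48.0, 52.0, 50.0, 2.5, 10)  # wider limits, n = 10
alpha_n16 = type_i_error(48.5, 51.5, 50.0, 2.5, 16)   # original limits, n = 16

print(round(alpha_base, 4), round(alpha_wide, 4), round(alpha_n16, 4))
```

The computed values agree with the hand calculations to about three decimal places; the small differences come from rounding the standard error to 0.79 and the z values to two decimals in the table lookup.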

It is also important to study the probability of a type II error, β, that is, the probability that H0 is accepted when H0 is false:

β = P(type II error) = P(fail to reject H0 when H0 is false)

To find β it is necessary to have a specific alternative hypothesis, that is, a particular value of μ. Suppose that it is important to reject H0: μ = 50 when the mean burning rate is greater than 52 or less than 48. We could calculate the probability of a type II error β for the values μ = 48 and μ = 52 and use this result to draw some conclusions about the test procedure.

That is, how would the test procedure perform if it is required to detect (that is, to reject H0) a true mean of μ = 48 or μ = 52? Because of symmetry, it is only necessary to evaluate one of the two cases, say finding the probability of not rejecting H0: μ = 50 when the true mean is μ = 52.

[Figure: two normal curves. The normal distribution on the left is the distribution of the test statistic X̄ when H0: μ = 50 is true. The normal distribution on the right is the distribution of X̄ when H1 is true and μ = 52.]

A type II error will occur when the sample mean x̄ falls between 48.5 and 51.5 (the critical region boundaries) when μ = 52.

This is the probability that 48.5 ≤ X̄ ≤ 51.5 when the true mean is μ = 52, which is the shaded area under the normal distribution on the right. Therefore:

β = P(48.5 ≤ X̄ ≤ 51.5 when μ = 52)

The z values corresponding to 48.5 and 51.5 when μ = 52 are

\[ z_1 = \frac{48.5 - 52}{0.79} = -4.43, \qquad z_2 = \frac{51.5 - 52}{0.79} = -0.63 \]

resulting in

\[ \beta = P(-4.43 \le Z \le -0.63) = P(Z \le -0.63) - P(Z \le -4.43) = 0.2643 - 0.0000 = 0.2643 \]

In this case, testing H0: μ = 50 against H1: μ ≠ 50 with n = 10, critical values of 48.5 and 51.5, and a true mean of 52 results in a probability of 0.2643 of failing to reject (that is, accepting) a false H0. By symmetry, if the true value of the mean is μ = 48, then β = 0.2643 as well.

The probability of making a type II error increases rapidly as the true value of μ approaches the hypothesized value. For example, if the true value of μ is 50.5 and H0: μ = 50, then

β = P(48.5 ≤ X̄ ≤ 51.5 when μ = 50.5)

\[ z_1 = \frac{48.5 - 50.5}{0.79} = -2.53, \qquad z_2 = \frac{51.5 - 50.5}{0.79} = 1.27 \]

which results in

\[ \beta = P(-2.53 \le Z \le 1.27) = P(Z \le 1.27) - P(Z \le -2.53) = 0.8980 - 0.0057 = 0.8923 \]

The type II error probability also depends on the sample size. The following conclusions can be drawn regarding type I and type II errors:
a) The size of the critical region, and thus the probability of a type I error α, can be reduced by adjusting the critical values.
b) Type I and type II errors are related. A decrease in the probability of one type of error results in an increase in the probability of the other (if n remains constant).
c) An increase in the sample size will reduce both α and β if the critical values are held constant.
d) When H0 is false, β increases as the true value of the parameter approaches the value hypothesized in H0.
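The type II error calculations for true means of 52 and 50.5 can be reproduced the same way, and the effect of sample size in conclusion c) can be seen by repeating the μ = 52 case with n = 16. This is a minimal sketch under the example's assumptions (σ = 2.5, acceptance range 48.5 to 51.5); the function name `type_ii_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(lower, upper, mu_true, sigma, n):
    """Return P(lower <= xbar <= upper) when the true mean is mu_true."""
    # Distribution of the sample mean under the specific alternative.
    xbar_dist = NormalDist(mu=mu_true, sigma=sigma / sqrt(n))
    # H0 is (wrongly) kept whenever xbar lands inside the acceptance range.
    return xbar_dist.cdf(upper) - xbar_dist.cdf(lower)


beta_52 = type_ii_error(48.5, 51.5, 52.0, 2.5, 10)      # true mean 52, n = 10
beta_50_5 = type_ii_error(48.5, 51.5, 50.5, 2.5, 10)    # true mean 50.5, n = 10
beta_52_n16 = type_ii_error(48.5, 51.5, 52.0, 2.5, 16)  # true mean 52, n = 16

print(round(beta_52, 4), round(beta_50_5, 4), round(beta_52_n16, 4))
```

The first two values match the hand results (0.2643 and 0.8923) to within rounding of the standard error, and the n = 16 value is smaller than the n = 10 value for the same true mean.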

The type I error probability α is controlled through the selection of the critical values. In general, α = 0.05 is used unless there is information indicating that this value is inappropriate.

4-3.3 P-Values in Hypothesis Testing

The P-value is the probability that the sample average x̄ will take on a value at least as extreme as the observed value when H0 is true. That is, the P-value conveys information about the weight of evidence against H0. The smaller the P-value, the greater the evidence against H0. If the P-value is small enough, H0 is rejected in favor of H1. The P-value approach allows a decision maker to draw conclusions at any level of significance that is appropriate.

The P-value measures the plausibility of H0: it is the probability of obtaining a sample at least as extreme as the one observed in the data, assuming that H0 is true.

Calculation and Interpretation of the P-Value

Consider the propellant burning rate example, with σ = 2.5 cm³/s:

\[ H_0: \mu = 50\ \text{cm}^3/\text{s}, \qquad H_1: \mu \neq 50\ \text{cm}^3/\text{s} \]

Suppose that a random sample with n = 10 and x̄ = 51.8 cm³/s is collected. For this example

\[ z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{51.8 - 50}{0.79} = 2.28 \]

and thus P(Z > 2.28) = 0.0113. Because H1 is two-sided, it is also necessary to consider the negative value z = −2.28, corresponding to x̄ = 48.2. By symmetry, P(Z < −2.28) = 0.0113. Therefore, the P-value for this hypothesis test is

P = 0.0113 + 0.0113 = 0.0226

The P-value indicates how plausible H0 is in light of the data; it does not state whether H0 is true. In this case, the probability of getting a random sample whose mean is at least as far from 50 as 51.8 (or 48.2) is 0.0226, which is small. A random sample with a mean of 51.8 is therefore rare if the actual mean is 50. Using a level of significance of 0.05 (a confidence level of 95%), H0 would be rejected in this case.
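The two-sided P-value for the burning rate sample (n = 10, x̄ = 51.8, σ = 2.5, μ0 = 50) can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 50.0, 2.5, 10, 51.8

# Standardize the observed sample mean under H0.
z0 = (xbar - mu0) / (sigma / sqrt(n))

# Two-sided P-value: probability of a |Z| at least as large as |z0|.
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z0)))

print(round(z0, 2), round(p_value, 4))
print("reject H0 at alpha = 0.05" if p_value < 0.05 else "fail to reject H0")
```

The result differs from the hand value of 0.0226 only in the fourth decimal place, because the hand calculation rounds the standard error to 0.79 and z0 to 2.28.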

In a practical application, once the P-value is computed, it is compared to a predetermined level of significance to reach a conclusion regarding H0. Typically the level of significance used is 0.05 (a confidence level of 95%). Thus, the P-value provides a measure of the credibility of H0 by measuring the weight of evidence against H0.

Example. Problem 4-16 (set n = 8); Problem 4-25 c) (find the P-value if x̄ = 9.09 V).

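4-4 Inference on the Mean of a Population, Variance Known

Under the assumptions that X1, X2, ..., Xn is a random sample from a population that is normally distributed (or for which the central limit theorem applies) with known variance σ², the quantity

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]

has a standard normal distribution.

4-4.1 Hypothesis Testing on the Population Mean (μ)

Consider the case in which:

\[ H_0: \mu = \mu_0, \qquad H_1: \mu \neq \mu_0 \]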

Suppose a random sample X1, X2, ..., Xn is collected from the population. Using the z-test, the P-value for this sample can be determined provided that the variance σ² of the population is known. The test statistic for the z-test is defined by:

\[ Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]

X̄ is the mean of the sample, and σ/√n is known as the standard error of the mean (S.E.M.). How the P-value is determined depends on the form of H1.

- If H1 is two-sided, H1: μ ≠ μ0, then P = 2[1 − Φ(|z0|)]
- If H1 is upper-tailed, H1: μ > μ0, then P = 1 − Φ(z0)
- If H1 is lower-tailed, H1: μ < μ0, then P = Φ(z0)

Here Φ is the standard normal cumulative distribution function and z0 is the computed value of the test statistic.
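The three P-value rules can be collected into one small helper. This is a sketch, not a library routine: the function name `p_value_from_z` and the labels "two-sided", "upper" and "lower" are choices made here for illustration.

```python
from statistics import NormalDist


def p_value_from_z(z0, alternative="two-sided"):
    """P-value of a z-test given the observed statistic z0 and the form of H1."""
    phi = NormalDist().cdf  # standard normal CDF, written as Phi in the text
    if alternative == "two-sided":   # H1: mu != mu0
        return 2.0 * (1.0 - phi(abs(z0)))
    if alternative == "upper":       # H1: mu > mu0
        return 1.0 - phi(z0)
    if alternative == "lower":       # H1: mu < mu0
        return phi(z0)
    raise ValueError("alternative must be 'two-sided', 'upper' or 'lower'")


# Using z0 = 2.28 from the burning rate example:
for kind in ("two-sided", "upper", "lower"):
    print(kind, round(p_value_from_z(2.28, kind), 4))
```

Note that the same z0 gives very different P-values depending on H1: a large positive z0 is strong evidence for an upper-tailed alternative and no evidence at all for a lower-tailed one.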

To use fixed significance level testing with the z-test, it is only necessary to determine the critical region for H1, whether H1 is two-sided or one-sided. If H0: μ = μ0 is true:
a) H1 is two-sided: the probability that Z0 falls between −z_{α/2} and z_{α/2} is 1 − α. A value Z0 < −z_{α/2} or Z0 > z_{α/2} therefore occurs with only a small probability α when H0 is true, so H0 is rejected when Z0 falls in either tail.
b) H1 is one-sided: the probability that Z0 ≤ z_α (upper-tailed H1) or that Z0 ≥ −z_α (lower-tailed H1) is 1 − α. A value Z0 > z_α (upper-tailed H1) or Z0 < −z_α (lower-tailed H1) therefore occurs with only a small probability α when H0 is true, so H0 is rejected when Z0 falls in that tail.

[Table: summary of the z-test criteria for two-sided and one-sided H1; not recovered]

The significance level α is taken to be 0.05 unless otherwise stated. Thus, if the P-value is less than 0.05, H0 must be rejected in favor of H1. In summary, using the P-value criterion:
- If P > 0.05, accept (fail to reject) H0.
- If P < 0.05, reject H0.

In summary, using the fixed significance level criterion:
- For H1: μ ≠ μ0, reject H0 if Z0 < −1.96 or Z0 > 1.96 (α/2 = 0.05/2 = 0.025).
- For H1: μ > μ0, reject H0 if Z0 > 1.645 (α = 0.05).
- For H1: μ < μ0, reject H0 if Z0 < −1.645 (α = 0.05).
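The critical values 1.96 and 1.645 are the upper 0.025 and 0.05 percentage points of the standard normal distribution, so the fixed-level rule can be written for any α. The sketch below assumes the standard rejection regions listed above; the names `z_critical` and `reject_h0` are illustrative.

```python
from statistics import NormalDist


def z_critical(alpha, two_sided):
    """Upper percentage point z_{alpha/2} (two-sided) or z_{alpha} (one-sided)."""
    tail = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)


def reject_h0(z0, alpha=0.05, alternative="two-sided"):
    """Fixed-significance-level decision for the z-test."""
    if alternative == "two-sided":   # reject in either tail
        return abs(z0) > z_critical(alpha, two_sided=True)
    if alternative == "upper":       # reject only for large positive z0
        return z0 > z_critical(alpha, two_sided=False)
    if alternative == "lower":       # reject only for large negative z0
        return z0 < -z_critical(alpha, two_sided=False)
    raise ValueError("alternative must be 'two-sided', 'upper' or 'lower'")


print(round(z_critical(0.05, True), 3), round(z_critical(0.05, False), 3))
print(reject_h0(2.28))                       # burning rate example, two-sided
print(reject_h0(1.70), reject_h0(1.70, alternative="upper"))
```

The last line illustrates why the form of H1 must be fixed before looking at the data: z0 = 1.70 does not reach the two-sided limit of 1.96 but does exceed the one-sided limit of 1.645.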

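4-4.2 Type II Error and Choice of Sample Size

Consider the case in which the alternative hypothesis is two-sided:

\[ H_0: \mu = \mu_0, \qquad H_1: \mu \neq \mu_0 \]

In this case the probability of a type II error is

\[ \beta = \Phi\left(z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) \]

where δ = μ_true − μ_hypothesized and z_{0.05/2} = 1.96.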

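Consider now the case in which H1 is one-sided, upper-tailed:

\[ H_0: \mu = \mu_0, \qquad H_1: \mu > \mu_0 \]

In this case:

\[ \beta = \Phi\left(z_{\alpha} - \frac{\delta\sqrt{n}}{\sigma}\right) \]

where δ = μ_true − μ_hypothesized and z_{α=0.05} = 1.645.

Finally, if H1: μ < μ0, the probability of a type II error β is:

\[ \beta = 1 - \Phi\left(-z_{\alpha} - \frac{\delta\sqrt{n}}{\sigma}\right) \]

where δ = μ_true − μ_hypothesized and z_{α=0.05} = 1.645.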

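Determination of sample size

If it is necessary to determine the sample size n that reduces the probability of a type II error to a given value β, then, for H1: μ ≠ μ0:

\[ n \approx \frac{(z_{\alpha/2} + z_{\beta})^2\,\sigma^2}{\delta^2} \]

where z_β is the upper β percentage point of the standard normal distribution and δ = μ_true − μ_hypothesized.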
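As an illustration of the two-sided β expression and the sample size approximation, the sketch below uses the standard forms β = Φ(z_{α/2} − δ√n/σ) − Φ(−z_{α/2} − δ√n/σ) and n ≈ (z_{α/2} + z_β)²σ²/δ². The numbers (δ = 2, σ = 2.5, target β = 0.10) are taken from or chosen around the burning rate example as an assumption for illustration, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_two_sided(delta, sigma, n, alpha=0.05):
    """Type II error probability of the two-sided z-test for a mean shift delta."""
    z_half = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = delta * sqrt(n) / sigma  # mean of Z0 when the true mean is mu0 + delta
    return STD_NORMAL.cdf(z_half - shift) - STD_NORMAL.cdf(-z_half - shift)


def sample_size_two_sided(delta, sigma, beta, alpha=0.05):
    """Approximate n needed so the two-sided z-test has type II error <= beta."""
    z_half = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = STD_NORMAL.inv_cdf(1.0 - beta)
    return ceil((z_half + z_beta) ** 2 * sigma ** 2 / delta ** 2)


beta_n10 = beta_two_sided(delta=2.0, sigma=2.5, n=10)
n_required = sample_size_two_sided(delta=2.0, sigma=2.5, beta=0.10)
beta_at_n_required = beta_two_sided(delta=2.0, sigma=2.5, n=n_required)

print(round(beta_n10, 4), n_required, round(beta_at_n_required, 4))
```

The result is rounded up because n must be an integer and rounding down would leave β slightly above the target. Note that β for n = 10 here differs from the 0.2643 computed earlier because this test uses α = 0.05 (limits ±1.96) rather than the limits 48.5 and 51.5.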

4-4.3 Large-Sample Test

The test procedure developed for the null hypothesis H0: μ = μ0 assumed that the variance of the population, σ², is known. In most practical situations this is not the case; that is, σ² is unknown. However, if the sample size is n ≥ 30, the sample variance s² will be close to σ² for most samples, so s can be substituted for σ in the test procedures without any significant effect. The appropriate approach for testing H0 when σ² is unknown and the sample is small is discussed in Section 4-5.

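4-4.5 Confidence Interval on the Mean

In many situations, when making a decision about the mean, H0: μ = μ0, it is more practical to have an interval than a point estimate. One way of finding this interval is by determining a confidence interval (CI). For a two-sided alternative hypothesis H1: μ ≠ μ0, a confidence interval is defined from

\[ P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1 - \alpha \]

where

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]

resulting in

\[ P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha \]

which can be rearranged as

\[ P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \]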

For the two-sided case with a confidence level of 1 − α = 95%, z_{α/2} = 1.96, and the previous equation becomes

\[ \bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}} \]

If the alternative hypothesis is lower-tailed, H1: μ < μ0, the upper confidence bound is determined as

\[ \mu \le \bar{X} + z_{\alpha}\frac{\sigma}{\sqrt{n}} \]

If the alternative hypothesis is upper-tailed, H1: μ > μ0, the lower confidence bound is determined as

\[ \bar{X} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \le \mu \]

For a confidence level (1 − α) of 95%, z_α = 1.645.
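The two-sided interval and the one-sided bounds can be computed directly from x̄, σ and n. The sketch below applies them to the burning rate sample (x̄ = 51.8, σ = 2.5, n = 10) as an illustration; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided CI on the mean when sigma is known."""
    z_half = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)  # z_{alpha/2}
    half_width = z_half * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def z_lower_bound(xbar, sigma, n, confidence=0.95):
    """One-sided lower confidence bound (pairs with H1: mu > mu0)."""
    z_alpha = NormalDist().inv_cdf(confidence)  # z_{alpha}
    return xbar - z_alpha * sigma / sqrt(n)


def z_upper_bound(xbar, sigma, n, confidence=0.95):
    """One-sided upper confidence bound (pairs with H1: mu < mu0)."""
    z_alpha = NormalDist().inv_cdf(confidence)
    return xbar + z_alpha * sigma / sqrt(n)


ci_low, ci_high = z_confidence_interval(51.8, 2.5, 10)
lower_bound = z_lower_bound(51.8, 2.5, 10)

print(round(ci_low, 2), round(ci_high, 2), round(lower_bound, 2))
```

The hypothesized value μ0 = 50 lies outside the two-sided 95% interval, which is consistent with rejecting H0 at α = 0.05 in the P-value calculation for the same sample.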

4-5 Inference on the Mean of a Population, Variance Unknown

Consider a normally distributed population for which the mean μ and the standard deviation σ are unknown. Assume that it is necessary to test the two-sided alternative hypothesis:

\[ H_0: \mu = \mu_0, \qquad H_1: \mu \neq \mu_0 \]

For the cases in which the sample size is large, n ≥ 30, the test statistic is very similar to that of the case in which σ is known:

\[ Z_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \]

In this case S is the standard deviation of the sample (Eqn. 4-39). That is, for large samples, σ ≈ S.

Example. Problem 4-29 (a) and (b); Problem 4-32; Problem 4-37.

When the sample is small (n < 30) and σ² is unknown, testing a hypothesis on the mean μ is performed using the t-test. The t distribution depends on the sample size, so the t table is indexed by the number of degrees of freedom:

k = ν = n − 1

where k (or ν) is the number of degrees of freedom and n is the number of items sampled.

The t distribution is symmetric about zero. Unlike the standard normal table, the probability values α provided in the t table correspond to the right (upper) tail of the curve. A t distribution table is presented in Table II (Appendix A).

t_{α,k} is the value of the random variable T with k degrees of freedom above which we find a probability α. Since the t distribution is symmetric, t_{1−α,k} = −t_{α,k}.

Example. Problem 4-48 (a), (e); Problem 4-49 (d).

Finally, a summary of hypothesis testing on the mean of a normal distribution when the standard deviation of the population is unknown and the sample size is n < 30 is presented next.

[Table: summary of the t-test criteria; not recovered]
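A small-sample t-test can be sketched as follows, assuming the usual test statistic T0 = (X̄ − μ0)/(S/√n) with n − 1 degrees of freedom and the two-sided rejection rule |t0| > t_{α/2,n−1}. The ten burning rate readings are hypothetical data invented for illustration, and the critical value 2.262 is t_{0.025,9} as read from a standard t table (the Python standard library has no t distribution).

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """Return (t0, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    return (mean(sample) - mu0) / standard_error, n - 1


# Hypothetical burning rate readings (cm^3/s); not from the course material.
burning_rates = [50.2, 51.9, 49.1, 52.4, 50.8, 51.3, 48.7, 52.0, 51.1, 50.5]

T_CRIT = 2.262  # t_{0.025, 9} from a t table, for alpha = 0.05 two-sided

t0, dof = t_statistic(burning_rates, mu0=50.0)
reject_h0 = abs(t0) > T_CRIT

print(round(t0, 3), dof, reject_h0)
```

With these made-up readings the statistic falls inside ±2.262, so H0: μ = 50 is not rejected at α = 0.05. The critical value must be looked up again whenever n or α changes.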

4-5.2 Type II Error and Choice of Sample Size

Recall that β is the probability of a type II error, that is, of failing to reject H0 when H0 is false. To determine β for problems involving the t-test, a set of charts called operating characteristic (OC) charts has been compiled (Appendix A, charts a, b, c, d). β is a function of the type of alternative hypothesis (two-sided, right-tailed or left-tailed), the level of significance α, and a scale factor d.

For a two-sided alternative hypothesis:

\[ d = \frac{|\mu_1 - \mu_0|}{\sigma} \]

Use chart a) if α = 0.05, and chart b) if α = 0.01.

If H1: μ > μ0, then

\[ d = \frac{\mu_1 - \mu_0}{\sigma} \]

If H1: μ < μ0, then

\[ d = \frac{\mu_0 - \mu_1}{\sigma} \]

For either one-sided case, use chart c) if α = 0.05, and chart d) if α = 0.01.

μ1 is the true mean value and μ0 is the hypothesized value.

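4-5.3 Confidence Interval on the Mean

The probability that the test statistic T satisfies −t_{α/2,n−1} ≤ T ≤ t_{α/2,n−1} is (1 − α), which can be written as

\[ P\left(-t_{\alpha/2,\,n-1} \le T \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha \]

where

\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]

thus resulting in

\[ P\left(-t_{\alpha/2,\,n-1} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha \]

which after rearranging yields

\[ P\left(\bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}\right) = 1 - \alpha \]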

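If H1: μ > μ0, then the lower confidence bound is

\[ \bar{X} - t_{\alpha,\,n-1}\frac{S}{\sqrt{n}} \le \mu \]

If H1: μ < μ0, then the upper confidence bound is

\[ \mu \le \bar{X} + t_{\alpha,\,n-1}\frac{S}{\sqrt{n}} \]

Example. Problem 4-54; Problem 4-61.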
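The t-based interval has the same structure as the z-based one, with S in place of σ and a t percentage point in place of z. In the sketch below the summary statistics (x̄ = 50.8, s = 1.2156, n = 10) are hypothetical, and the critical value t_{0.025,9} = 2.262 is supplied by the caller from a t table because the Python standard library does not provide t quantiles.

```python
from math import sqrt


def t_confidence_interval(xbar, s, n, t_crit):
    """Two-sided CI on the mean; t_crit is t_{alpha/2, n-1} from a t table."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical summary statistics for a sample of 10 readings.
ci_low, ci_high = t_confidence_interval(xbar=50.8, s=1.2156, n=10, t_crit=2.262)

print(round(ci_low, 2), round(ci_high, 2))
```

For a one-sided bound the same function body applies with t_{α,n−1} (1.833 for α = 0.05 and 9 degrees of freedom) and only one of the two limits retained.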

4-6 Inference on the Variance of a Normal Population

4-6.1 Hypothesis Testing on the Variance of a Normal Population

Consider the null and alternative hypotheses to be:

\[ H_0: \sigma^2 = \sigma_0^2, \qquad H_1: \sigma^2 \neq \sigma_0^2 \]

The test statistic for this type of problem is:

\[ X_0^2 = \frac{(n-1)S^2}{\sigma_0^2} \]

This test statistic is called the chi-square statistic.

The chi-square distribution is defined as follows:

\[ f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2 - 1} e^{-x/2}, \qquad x > 0 \]

where k is the number of degrees of freedom and Γ is the gamma function.

The chi-square distribution is not symmetric, and the shape of the curve is a function of the number of degrees of freedom, k = ν = n − 1.

The null and alternative hypotheses for the χ² test are expressed as:
- Null hypothesis: H0: σ² = σ0²
- Two-sided alternative hypothesis: H1: σ² ≠ σ0²
- Upper-sided alternative hypothesis: H1: σ² > σ0²
- Lower-sided alternative hypothesis: H1: σ² < σ0²

As for hypothesis tests on the mean, there are three ways to evaluate the null hypothesis H0. The P-value and fixed-level criteria are shown next.

[Table: P-value and fixed-level criteria for the χ² test; not recovered]

Additionally, the confidence interval criterion can also be applied to test H0 on the variance. For H1: σ² ≠ σ0²:

\[ \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} \]

If H1: σ² > σ0²:

\[ \frac{(n-1)S^2}{\chi^2_{\alpha,\,n-1}} \le \sigma^2 \]

If H1: σ² < σ0²:

\[ \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha,\,n-1}} \]

Example. Problem 4-66; Problem 4-68.

Skip Sections 4.7 through 4.9.
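A test on the variance can be sketched with the standard chi-square statistic (n − 1)S²/σ0². The example below tests H0: σ² = 2.5² against the lower-sided alternative H1: σ² < 2.5² using hypothetical readings invented for illustration; the critical value 3.33 is χ²_{0.95,9} as read from a chi-square table, since the Python standard library has no chi-square quantile function.

```python
from statistics import variance


def chi_square_variance_statistic(sample, sigma0_sq):
    """Return ((n - 1) * s^2 / sigma0^2, degrees of freedom)."""
    n = len(sample)
    # statistics.variance is the sample variance (n - 1 divisor).
    return (n - 1) * variance(sample) / sigma0_sq, n - 1


# Hypothetical burning rate readings (cm^3/s); not from the course material.
readings = [50.2, 51.9, 49.1, 52.4, 50.8, 51.3, 48.7, 52.0, 51.1, 50.5]

CHI2_LOWER_CRIT = 3.33  # chi-square_{0.95, 9} from a table (alpha = 0.05, lower tail)

chi0, dof = chi_square_variance_statistic(readings, sigma0_sq=2.5 ** 2)
reject_h0 = chi0 < CHI2_LOWER_CRIT  # lower-sided test rejects for small values

print(round(chi0, 3), dof, reject_h0)
```

Because the chi-square distribution is not symmetric, the lower and upper critical values must be looked up separately; an upper-sided test would instead reject for chi0 greater than χ²_{0.05,9} = 16.92.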

4-10 Testing for Goodness of Fit

The hypothesis-testing procedures discussed so far are designed for problems in which the population or probability distribution is known and the hypotheses involve parameters of the distribution. If instead it is necessary to determine whether a given sample can be considered to follow a given probability distribution, the test statistic used is:

\[ X_0^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i} \]

where O is the observed count and E is the expected count.
- E can be found using the probability distribution assumed for the population, for example a Poisson distribution or a normal distribution.
- E can also be estimated from historical counts (previous records).

The χ² table (Table III, p. 489) requires the number of degrees of freedom (DoF) as entry value:

k = ν = n − p − 1

where n is the number of classes (categories) being compared and p is the number of distribution parameters estimated from the sample.

The decision on H0 is made using the P-value criterion, for a level of significance of 0.05:
- If the P-value > 0.05, H0 is accepted (fail to reject).
- If the P-value < 0.05, H0 is rejected.

Example. Problem 4-97.

Example. A car dealer wants to know whether this year's sales by color of a particular car model followed the expected trend based on historical data, so that it can make a proper estimate when placing next year's order.

Car Color    Observed    Expected    (O − E)²/E
Brown        8           6.5
Silver       6           7
Red          9           6.5
Blue         10          12
Black        13          10
Green        4           8
Total        50
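The (O − E)²/E column and the test statistic for the car color example can be computed as follows. This is a sketch under two stated assumptions: no distribution parameters are estimated from the sample (p = 0, so DoF = 6 − 0 − 1 = 5), and the decision is made by comparing against the tabulated critical value χ²_{0.05,5} = 11.07 rather than by computing a P-value, because the Python standard library has no chi-square distribution.

```python
# Observed and expected sales counts by color, from the example table.
observed = {"Brown": 8, "Silver": 6, "Red": 9, "Blue": 10, "Black": 13, "Green": 4}
expected = {"Brown": 6.5, "Silver": 7, "Red": 6.5, "Blue": 12, "Black": 10, "Green": 8}


def gof_contributions(obs, exp):
    """Per-category (O - E)^2 / E terms of the goodness-of-fit statistic."""
    return {color: (obs[color] - exp[color]) ** 2 / exp[color] for color in obs}


contributions = gof_contributions(observed, expected)
chi0 = sum(contributions.values())   # goodness-of-fit test statistic

PARAMETERS_ESTIMATED = 0             # assumption: expected counts come from history
dof = len(observed) - PARAMETERS_ESTIMATED - 1

CHI2_CRIT = 11.07                    # chi-square_{0.05, 5} from a table
reject_h0 = chi0 > CHI2_CRIT

for color, term in contributions.items():
    print(f"{color:<8}{term:.3f}")
print(round(chi0, 3), dof, reject_h0)
```

Since the statistic is well below 11.07, the P-value is greater than 0.05 and H0 is not rejected: under these assumptions the observed sales are consistent with the historical color trend.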