INSY 7300 6 F01
Reference: Chapter 2 of Montgomery's 8th Edition
By S. Maghsoodloo

Point Estimation

As an example, consider the Bond Strength data in Table 2.1, atop page 26 of Montgomery's 8th edition, on Modified Mortar (the experimental group). The two most important sample statistics for the response variable, $y_1$, of the experimental group are

$$\bar{y}_1 = \sum_{j=1}^{10} y_{1j}/n_1 = 16.764, \qquad S_1^2 = \frac{1}{9}\sum_{j=1}^{n_1}(y_{1j}-\bar{y}_1)^2 = 0.10014,$$

where $n_1 = 10$ and the unit of measurement is kgf/cm$^2$.

Before sampling, $\bar{y}_1$ is an unbiased estimator of $\mu_1$, i.e., $E(\bar{y}_1) = \mu_1$, and $S_1^2$ is an unbiased estimator of $\sigma_1^2$ iff the population is infinite, in which case $E(S_1^2) = \sigma_1^2$. If the population is finite, $E(S_1^2) \ne \sigma_1^2$. For nearly all underlying distributions, $E(S) \ne \sigma$, i.e., $S$ is a biased estimator of the population standard deviation. For a normal universe, $E(S) = c_4\sigma$, where $0 < c_4 < 1$, and

$$c_4(n) = \sqrt{\frac{2}{n-1}}\;\frac{\Gamma(n/2)}{\Gamma[(n-1)/2]}.$$

The operator $E$ is linear because (1): $E(CY) = C\,E(Y)$, and (2): $E(Y_1 + Y_2) = E(Y_1) + E(Y_2)$, where $C$ is any constant. The operator $V$ is nonlinear because $V(CY) \ne C\,V(Y)$. In fact, $V(CY) = C^2 V(Y)$ and $V(Y_1 \pm Y_2) = V(Y_1) + V(Y_2) \pm 2\,COV(Y_1, Y_2)$. If $Y_1$ and $Y_2$ are independent, then $COV(Y_1, Y_2) = \sigma_{12} = E[(Y_1 - \mu_1)(Y_2 - \mu_2)] = 0$. The converse of this is not generally true.

Now, consider the numerator of $S^2 = \sum_{j=1}^{n}(y_j - \bar{y})^2/(n-1) = S_{yy}/(n-1) = CSS/(n-1)$:

$$S_{yy} = \sum_{j=1}^{n}(y_j - \bar{y})^2 = \sum_{j=1}^{n} y_j^2 - 2\bar{y}\sum_{j=1}^{n} y_j + n\bar{y}^2 = \sum_{j=1}^{n} y_j^2 - \Big(\sum_{j=1}^{n} y_j\Big)^2/n,$$

so that $S_{yy} = CSS = USS - CF$, where $USS = \sum y_j^2$ is the uncorrected sum of squares and $CF = (\sum y_j)^2/n$ is the correction factor. Degrees of freedom (df): $\nu = n - 1$.
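The point estimates above can be checked numerically. The following is a minimal sketch, assuming the ten modified-mortar observations of Table 2.1 as listed later in these notes; the variable names and the helper `c4` are illustrative choices, not part of the original notes.

```python
import math
from statistics import mean, variance

# Modified-mortar (experimental group) bond strengths in kgf/cm^2,
# assumed to be the Table 2.1 values listed later in these notes.
y1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]

n1 = len(y1)
ybar1 = mean(y1)                      # sample mean

uss = sum(v * v for v in y1)          # uncorrected sum of squares
cf = sum(y1) ** 2 / n1                # correction factor
css = uss - cf                        # corrected sum of squares, S_yy
s1_sq = css / (n1 - 1)                # unbiased sample variance


def c4(n):
    """Bias constant for S under normality: E(S) = c4(n) * sigma."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


print(f"ybar1 = {ybar1:.4f}, CSS = {css:.5f}, S1^2 = {s1_sq:.8f}")
print(f"c4({n1}) = {c4(n1):.4f}")
```

The shortcut $CSS = USS - CF$ agrees with `statistics.variance`, which computes the same $(n-1)$-divisor variance directly.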
In general, if a random variable (rv), $Y$, has $V(Y) = \sigma_y^2$, then $E(CSS/\text{df}) = \sigma_y^2$.

Interval Estimation

There are 3 types of quality characteristics (QCH): Smaller The Better (STB), Larger The Better (LTB), and Nominal The Best (NTB).

STB examples: tire eccentricity, loudness of a compressor, rate of wear, braking distance, etc. Ideal target = 0 and only a single upper spec limit, USL = $y_u$.

LTB examples: welding strength, TTF (time to failure), efficiency, yield, etc. Ideal target = $\infty$ and a single lower spec limit, LSL = $y_L$.

NTB examples: clearance, chemical content level, output voltage, and % asphalt in a hot mix asphalt (HMA), which generally ranges within 3.5 to 8.00%. Ideal target = $m$, and there are always an LSL = $m - \Delta_1$ and a USL = $m + \Delta_2$.

Generally, a CI (confidence interval) for an STB parameter should be upper one-sided, a CI for an LTB-type parameter should be lower one-sided, and a CI for an NTB parameter should always be 2-sided.

Example 1. A company manufactures ropes for climbing purposes. The consumers' LSL for breaking strength $y$ is $y_L = 100$ psi, and $y \sim N(\mu,\ 676\ \text{psi}^2)$. Obtain a proper 95% CI for the parameter $\mu$ using the average of a random sample of size $n = 25$, where $\bar{y} = 115$ psi.

[Figure 1. The sampling distribution of $\bar{y}$, centered at $\mu$ with standard error $\sigma_{\bar{y}} = \sigma/\sqrt{n}$; the area between $\mu$ and $\mu + Z_{0.05}\,\sigma_{\bar{y}}$ is 0.45 and the upper-tail area is 0.05.]
The sampling distribution (SMD) of $\bar{y}$ in Figure 1 shows that

$$\Pr(\bar{y} \le \mu + 1.645 \times 26/\sqrt{n}) = 0.95 \;\Rightarrow\; 115 - 1.645(5.2) \le \mu < \infty \;\Rightarrow\; 106.446 \le \mu < \infty,$$

where $\mu_L = 106.446$ and $Z_{0.05} \approx 1.645$. If a CI is lower one-sided, then the corresponding test of hypothesis must be right-tailed, i.e., for the above CI we should test $H_0: \mu = \mu_0$ psi versus $H_1: \mu > \mu_0$ (the alternative $H_1: \mu < \mu_0$ will lead to a contradiction when $H_0$ is rejected). Typical values are $\mu_0$ = 105, 108, 110, etc.

[Figure 2. The SMD of $\bar{y}$ given that $H_0: \mu = 105$ is true, with standard error $\sigma/\sqrt{n}$ and upper-tail area 0.05 beyond $A_U = \bar{y}_U$.]

The nominal level of significance is generally set at $\alpha = 0.05$. $A_U$ = Upper Acceptance Limit = $105 + Z_{0.05} \times 26/\sqrt{25} = 113.554 = \bar{y}_U$, so the AI (Acceptance Interval) is $0 \le \bar{y} \le 113.554$. The test statistic $\bar{y} = 115 > \bar{y}_U$; therefore, reject $H_0$ at the LOS $\alpha = 0.05$ and conclude that $\mu > 105$. Note that an upper one-sided CI for the above test is given by $\mu < 115 + 8.554 = 123.554$, which includes the hypothesized value of $\mu = 105$ and is hence contradictory to the rejection of $H_0$.

Assignment 1.
(a) Work problem 2.17 on page 60 of your text. ANS: (c) P value = 0.0549, (d) [799.75, 824.25].
(b) Work problem 2.20, p. 61. ANS: $n \ge 139$.
(c) Work problem 2.25.
(d) Work problem 2.22, but change part (a) to determining whether the population mean repair time is less than 250 hours. Note that in this problem you will have to use the Student's t distribution.
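The Example 1 calculations (lower one-sided 95% confidence bound and the right-tailed test of $H_0: \mu = 105$) can be reproduced with a short sketch. It assumes $\sigma = 26$ psi (from $\sigma^2 = 676$), $n = 25$ and $\bar{y} = 115$; the variable names are illustrative. Because the code uses the exact normal quantile 1.64485 rather than the rounded 1.645, its results differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Example 1 inputs (sigma^2 = 676 psi^2, so sigma = 26 psi)
sigma, n, ybar = 26.0, 25, 115.0
mu0, alpha = 105.0, 0.05

z = NormalDist().inv_cdf(1 - alpha)   # upper 5% point of N(0, 1), about 1.645
se = sigma / sqrt(n)                  # standard error of ybar = 5.2

# Lower one-sided 95% confidence bound: mu_L <= mu < infinity
mu_lower = ybar - z * se

# Right-tailed test of H0: mu = mu0 versus H1: mu > mu0
upper_accept = mu0 + z * se           # A_U, the upper acceptance limit
reject_h0 = ybar > upper_accept

print(f"mu_L = {mu_lower:.3f}, A_U = {upper_accept:.3f}, reject H0: {reject_h0}")
```

The lower confidence bound exceeds 105 exactly when $\bar{y}$ exceeds $A_U$, which is why the lower one-sided CI and the right-tailed test always agree.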
Test of Hypothesis

In conducting any test of hypothesis, only one of the 4 circumstances given in Table 1 will occur, where Pr denotes probability.

Table 1. The four circumstances that may occur when testing $H_0$

| | Reject $H_0$ | Accept $H_0$ |
|---|---|---|
| $H_0$ is true | Type I error (or False Positive); Occurrence Pr = $\alpha$ | Correct decision (True Negative); Specificity of the test = Occurrence Pr = $1 - \alpha$ |
| $H_0$ is false | Correct decision (True Positive); Occurrence Pr = $1 - \beta$ = Power or Sensitivity of the test | Type II error (or False Negative); Occurrence Pr = $\beta$ |

$\beta$ = the Pr of accepting $H_0$ at a specified value of the parameter tested under $H_0$.

[Figure 3. The SMD of $\bar{y}$ when $\mu = 110$, with $\sigma_{\bar{y}} = 5.20$; $\beta$ is the area to the left of $A_U = 113.554$.]

Note that in order to commit a type II error, $H_0$ must be false. In reference to Example 1, this implies that $\mu$ must differ from 105 psi, say $\mu = 110$. From Figure 3, $Z_1 = (113.554 - 110)/5.2 = 0.68346$, so that $\beta(\text{at } \mu = 110) = \Phi(0.68346) = 0.752844$.
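The type II error probability at $\mu = 110$ can be verified with a minimal sketch. It assumes the acceptance limit $A_U = 113.554$ and standard error 5.2 from Example 1; the function name `beta` and its default arguments are illustrative.

```python
from statistics import NormalDist


def beta(mu, accept_limit=113.554, se=5.2):
    """Pr(accept H0) = Pr(ybar <= A_U) when the true mean is mu."""
    return NormalDist(mu, se).cdf(accept_limit)


beta_110 = beta(110.0)        # type II error Pr at mu = 110
power_110 = 1.0 - beta_110    # power (sensitivity) at mu = 110

print(f"beta(110) = {beta_110:.6f}, power(110) = {power_110:.6f}")
```

At $\mu = 105$ the same function returns $1 - \alpha = 0.95$, since accepting a true $H_0$ is the complement of a type I error.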
Assignment 2. Compute the values of $\beta$ for values of $\mu$ = 105, 113.554, 115, 118 and 125. Graph $\beta$ as a function of $\mu$. This graph of the type II error Pr versus the parameter under $H_0$ is called the OC (Operating Characteristic) curve.

If the population variance $\sigma^2$ is unknown, then statistical inference (i.e., estimation and test of hypothesis) on $\mu$ cannot be conducted using the statistic $Z_0 = (\bar{y} - \mu_0)\sqrt{n}/\sigma$; rather, resort has to be made to the sampling distribution of the statistic $(\bar{y} - \mu_0)\sqrt{n}/S$, which has (W. S. Gosset's) Student's t distribution with $(n - 1)$ df, such as in problem 2.20. (Also see Example 2.2 on pp. 51-52 of Montgomery's 8th edition, and Problems 2.32, 2.33 & 2.34, all on the paired t test.)

Statistical Inference on $\sigma^2$

We use the fact that the SMD (sampling distribution) of the random variable $(n - 1)S^2/\sigma^2$ from a normal (or Laplace-Gaussian) universe follows a chi-square ($\chi^2$) distribution with $(n - 1)$ df. As an example, consider problem 2.31 on page 63 of Montgomery's 8th edition.

Data statistics: $n = 20$, $\sum_{j=1}^{20} y_j = 116.56$, $\bar{y} = 5.828$, $USS = \sum_{j=1}^{20} y_j^2 = 694.3302$, $CSS = 694.3302 - 116.56^2/20 = 15.0185$, and $S^2 = 0.79045$.

Figure 4 shows that $\chi^2_{0.95,19} = 10.1170$, so that

$$\Pr(\chi^2_{19} \ge 10.1170) = 0.95 \;\Rightarrow\; \Pr[(n-1)S^2/\sigma^2 \ge 10.1170] = 0.95 \;\Rightarrow\; 0 < \sigma^2 \le 19(0.79045)/10.1170 \;\Rightarrow\; 0 < \sigma^2 \le 1.48448 \;\Rightarrow\; 0 < \sigma \le 1.21839.$$

We are 95% confident that the process variance $\sigma^2$ lies within the interval (0, 1.4845]. The Pr that this last interval includes $\sigma^2$ is 0 or 1. The above CI implies that we cannot reject the null hypothesis $H_0: \sigma^2 = 1.0$ versus the alternative $H_1: \sigma^2 < 1.0$, i.e., we cannot conclude that $\sigma^2 < 1.0$ at the 5% level; however, we can reject the null hypothesis $H_0: \sigma^2 = 1.60$ versus the alternative $H_1: \sigma^2 < 1.60$ because 1.60 is outside the 95% CI $0 < \sigma^2 \le 1.48448$. Note that 10.1170 represents the 95th percentage point of chi-square, i.e., $10.1170 = \chi^2_{0.95,19}$, or its 0.05 quantile.
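The upper one-sided bound on $\sigma^2$ can be reproduced as sketched below. The critical value 10.1170 is taken from the notes as a tabulated input rather than computed. The helper `chi2_cdf` is a hypothetical series implementation of the chi-square cdf, written only so the check runs on the Python standard library; it confirms that 10.1170 cuts off a lower-tail area of about 0.05.

```python
import math

n, s_sq = 20, 0.79045      # sample size and S^2 from problem 2.31 as given above
chi2_crit = 10.1170        # tabulated chi-square_{0.95,19} (0.05 quantile), from the notes


def chi2_cdf(x, df):
    """Chi-square cdf via the series for the regularized lower incomplete gamma."""
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= h / (a + k)            # next series term
        total += term
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))


var_upper = (n - 1) * s_sq / chi2_crit   # 95% upper confidence bound on sigma^2
sd_upper = math.sqrt(var_upper)          # corresponding bound on sigma
lower_tail = chi2_cdf(chi2_crit, n - 1)  # should be close to 0.05

print(f"sigma^2 <= {var_upper:.5f}, sigma <= {sd_upper:.5f}, "
      f"Pr(chi2_19 <= {chi2_crit}) = {lower_tail:.4f}")
```

Hypothesized values such as 1.0 fall inside the interval and 1.60 falls outside it, which matches the two left-tailed test conclusions above.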
[Figure 4. The $\chi^2_{19}$ density; the modal point is MO = 17.]

Statistical Inference on Two Normal Population Parameters

Montgomery covers the test of equality of two variances at the end of Chapter 2 (pp. 58-59). However, a pretest on $\sigma_1 = \sigma_2$ is judicious in order to determine whether to pool the variances from 2 independent populations, so we will first cover inferences about the variances of two normal populations, followed by inferences on two independent population means.

Statistical Inference on Two Variances

Sir Ronald A. Fisher's F statistic describes the sampling distribution of the ratio of 2 scaled $\chi^2$ distributions given below:

$$F_{\nu_1,\nu_2} = \frac{\chi^2_{\nu_1}/\nu_1}{\chi^2_{\nu_2}/\nu_2} = \frac{[(n_1-1)S_1^2/\sigma_1^2]/(n_1-1)}{[(n_2-1)S_2^2/\sigma_2^2]/(n_2-1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2},$$

where $\nu_1 = n_1 - 1$ is the df of the numerator and $\nu_2 = n_2 - 1$ is the df of the denominator. The modal point of an F distribution is roughly 1 for $\nu_1$ and $\nu_2 > 8$ (but always less than 1).

[Figure 5. The $F_{11,9}$ density, with upper 5% point $F_{0.05,11,9} = 3.1025$.]

Now, consider Example 2.3 on pages 58-59 of Montgomery: $n_1 = 12$, $S_1^2 = 14.5$, $n_2 = 10$, and $S_2^2 = 10.8$. Figure 5 above clearly shows that $\Pr(F_{11,9} \le 3.1025) = 0.95$, i.e., the cdf of the rv $F_{11,9}$ at $F_{0.05,11,9} = 3.1025$ is equal to 0.95. Put differently, the 5 percentage point of the F distribution, or the 0.95 quantile, with $\nu_1 = 11$ and $\nu_2 = 9$ is given by $F_{0.05,11,9} = 3.1025$. Consequently,

$$\Pr(F_{11,9} \le 3.1025) = 0.95 \;\Rightarrow\; \Pr\!\left(\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \le 3.1025\right) = 0.95 \;\Rightarrow\; \Pr\!\left(\frac{S_1^2}{3.1025\,S_2^2} \le \frac{\sigma_1^2}{\sigma_2^2} < \infty\right) = 0.95 \;\Rightarrow\; 0.4327 \le \sigma_1^2/\sigma_2^2 < \infty.$$

The above CI is consistent with testing $H_0: \sigma_1^2/\sigma_2^2 = 1$ vs the alternative $H_1: \sigma_1^2/\sigma_2^2 > 1$ because the CI encloses the null hypothesized value of $\sigma_1^2/\sigma_2^2 = 1$.

However, suppose we had $n_1 = 12$, $S_1^2 = 14.5$, $n_2 = 10$ but $S_2^2 = 4.42$. Now the test statistic $F_0 = 3.2805 > F_{0.05,11,9} = 3.1025$ leads to the rejection of $H_0$ at the 5% level; however, the 95% 2-sided CI $0.8386 \le \sigma_1^2/\sigma_2^2 \le 11.7703$, where $0.8386 = (S_1^2/S_2^2)/F_{0.025,11,9}$ and $F_{0.025,11,9} = 3.9121$, contains $\sigma_1^2/\sigma_2^2 = 1$, which is contradictory to the rejection of $H_0$! The correct one-sided CI, $1.0574 \le \sigma_1^2/\sigma_2^2 < \infty$, excludes the null hypothesized value of $\sigma_1^2/\sigma_2^2 = 1$ as required. Note that for the right-tailed alternative $H_1: \sigma_1^2/\sigma_2^2 > 1$, we must obtain a lower one-sided CI because the upper one-sided 95% CI,

$$0 < \sigma_1^2/\sigma_2^2 \le \frac{S_1^2/S_2^2}{F_{0.95,11,9}} = \frac{3.2805}{0.3453} \;\Rightarrow\; 0 < \sigma_1^2/\sigma_2^2 \le 9.501,$$

includes the hypothesized value of $\sigma_1^2/\sigma_2^2 = 1$, which contradicts the rejection of $H_0$.

Testing the Equality of Two Independent Population Means

(a) Case of $H_0: \sigma_1 = \sigma_2 = \sigma$ not rejected at the 20% level.

Consider the 2-sided hypothesis $H_0: \mu_1 - \mu_2 = \delta$ versus $H_1: \mu_1 - \mu_2 \ne \delta$. Then, the test statistic is

$$t_0 = \frac{(\bar{y}_1 - \bar{y}_2) - \delta}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},$$

where $S_p^2 = (CSS_1 + CSS_2)/\text{df}$ with df $= n_1 + n_2 - 2$. Values of $|t_0| > t_{\alpha/2,\,n_1+n_2-2}$ lead to the rejection of $H_0$, i.e., the rejection region is $(-\infty, -t_{\alpha/2,\,n_1+n_2-2}) \cup (t_{\alpha/2,\,n_1+n_2-2}, \infty)$. See the example on pp. 38-39 of Montgomery, and problems 2.26 and 2.27.

Note that a pretest on $H_0: \sigma_1 = \sigma_2$ at the 20% level yields $F_0 = S_1^2/S_2^2 = 0.10013777/0.061462 = 1.6293$, which yields a P value = 0.4785 > 0.20, and hence the null hypothesis $H_0: \sigma_1 = \sigma_2$ is tenable. Thus, we may use the pooled t statistic to test $H_0: \mu_1 - \mu_2 = \delta$, i.e., the P value of the pretest has exceeded 20%, providing convincing evidence in favor of pooling variances. Note that $H_0$ declares that $\sigma_1$ and $\sigma_2$ have a common value of $\sigma$.

Analysis of Data in Table 2.1 on page 26 of Montgomery's 8th Edition

Y = Tension Bond Strength is measured in kgf/cm$^2$.

Experimental Group: $y_{1j}$: 16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57
$\bar{y}_1 = 16.7640$; $USS_1 = 2811.2182$, $CF_1 = 167.64^2/10 = 2810.31696$, so that $CSS_1 = 0.90124$, $S_1^2 = 0.10013777$, and $S_1 = 0.31645$.
Control Group: $y_{2j}$: 16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27
$\bar{y}_2 = 17.042$; $USS_2 = 2904.85080$, $CF_2 = 170.42^2/10 = 2904.29764$, so that $CSS_2 = 0.55316$, $S_2^2 = 0.061462$, and $S_2 = 0.24792$.

As a result,

$$S_p^2 = \frac{\nu_1 S_1^2 + \nu_2 S_2^2}{\nu_1 + \nu_2} = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1 + n_2 - 2}.$$

Only when $n_1 = n_2 = n$ does this reduce to $S_p^2 = (S_1^2 + S_2^2)/2 = 0.08080$, so that $S_p = \sqrt{0.0808} = 0.2842534$.

Assuming that $H_0: \sigma_1 = \sigma_2$ is not rejected at the 20% level, then $t_0 = [(\bar{y}_1 - \bar{y}_2) - \delta]/se$, where $se = se(\bar{y}_1 - \bar{y}_2) = S_p\sqrt{(1/n_1) + (1/n_2)} = 0.2842534\,(0.10 + 0.10)^{0.5} = 0.12712$. Hence $t_0 = [-0.2780 - 0]/0.12712 = -2.1869$, while $t_{0.025,18} = 2.1009$. Reject $H_0: \mu_1 - \mu_2 = 0$ at the LOS $\alpha = 0.05$ because $|t_0| > t_{0.025,18}$. The P value $= 2\Pr(T_{18} \ge 2.1869) = 2 \times 0.021097343 = 0.042194685 < 0.05$, consistent with the rejection of $H_0$ at the 5% level. See Table 2.2 on p. 41 of Montgomery.

(b) Case of $\sigma_1 \ne \sigma_2$

Test statistic:

$$t_0 = \frac{(\bar{y}_1 - \bar{y}_2) - \delta}{\sqrt{(S_1^2/n_1) + (S_2^2/n_2)}},$$

but the df $\nu$ is given by equation (2.32) on page 48 of Montgomery's 8th edition, and generally $\min(n_1, n_2) - 1 \le \nu \le n_1 + n_2 - 2$. A simplified version of Eq. (2.32) of Montgomery is given by

$$\nu = \frac{\nu_1\nu_2\,[v(\bar{y}_1) + v(\bar{y}_2)]^2}{\nu_2\,[v(\bar{y}_1)]^2 + \nu_1\,[v(\bar{y}_2)]^2} = \frac{\nu_1\nu_2\,(F_0R_n + 1)^2}{\nu_2\,(F_0R_n)^2 + \nu_1},$$

where $v(\bar{y}_1) = S_1^2/n_1$, $v(\bar{y}_2) = S_2^2/n_2$, $R_n = n_2/n_1$ and $F_0 = S_1^2/S_2^2$. For equal sample sizes, the above formula reduces to $\nu = (n-1)(F_0 + 1)^2/(F_0^2 + 1)$.

For the sake of illustration we assume that $\sigma_1 \ne \sigma_2$, so that for the data of Montgomery's Table 2.1, $se(\bar{y}_1 - \bar{y}_2) = [(S_1^2/n_1) + (S_2^2/n_2)]^{0.50} = 0.12712$ and $t_0 = -2.1869$; $\nu$ is also given by the formula in Table 2.4 on page 52 of Montgomery's 8th edition:

$$\nu = \frac{9(0.01616)^2}{(0.01001377)^2 + (0.0061462)^2} = 17.025.$$

The P value $= 2\Pr(T_{17.025} \ge 2.1869) = 2 \times 0.0214981006771 = 0.042996$ (see Table 2.2, p. 41), which is a bit more conservative than the pooled t test P value. Note that $F_0 = 1.62926$ and $\nu = 9(F_0 + 1)^2/(F_0^2 + 1)$ leads to the same answer of $\nu = 17.025$.

Further, Montgomery also covers the paired t test on pp. 53-57 and Problems 2.32, 2.33 & 2.34. The pertinent hardness example with data in Table 2.6 will be discussed in class.

The Relative Efficiency in Hypothesis Testing

The relative efficiency (RELEFF) of an $\alpha$-level statistical test $T_1$ to an $\alpha$-level test $T_2$ is given by $n_2/n_1$ iff both tests have identical values of type II error probability $\beta$. As an example, if $T_1$ requires a sample of size $n_1 = 20$ and has $\alpha = 0.05$, $\beta = 0.10$, but $T_2$ requires an $n_2 = 25$ to attain the same $\alpha = 0.05$ and $\beta = 0.10$, then the efficiency of $T_1$ relative to $T_2$ is given by 25/20 = 125%, or the RELEFF(of $T_2$ to $T_1$) = 20/25 = 80%. On the other hand, if the 5% level tests $T_1$ and $T_2$ both use the same random sample of size $n = n_1 = n_2 = 25$, but $\beta(T_1) = 0.10$ while $\beta(T_2) = 0.125$, then the RELEFF of $T_1$ to $T_2$ is given by 0.125/0.10 = 125%. Further, suppose the RELEFF($T_1$, $T_2$) = 125%, both having the same $\alpha$ and $\beta$, and $T_2$ has a sample size $n_2 = 30$. Then the sample size for $T_1$ must be obtained from RELEFF($T_1$, $T_2$) $= 1.25 = n_2/n_1 = 30/n_1$, so that $n_1 = 30/1.25 = 24$.

Errata for Chapter 2 of Montgomery's 8th Edition

1. Page 33, in Figure 2.5 change to..
2. Page 37, atop the page in the description of Figure 2.10, change the terminology "critical region" to either "critical values" or "critical limits" (or possibly "rejection thresholds").
3. Page 59, the 2-sided CI on the variance ratio $\sigma_1^2/\sigma_2^2$ in Eq. (2.50) should appropriately be changed to $0.4331 \le \sigma_1^2/\sigma_2^2 < \infty$, because the test on $\sigma_1^2/\sigma_2^2 = 1$ is right tailed.
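The pooled and unequal-variance (Welch) calculations for the Table 2.1 data, worked earlier in these notes, can be reproduced with the sketch below. It assumes the two samples as listed in the data analysis above and a hypothesized difference of 0; the variable names are illustrative. P values are omitted because the Python standard library has no Student's t cdf.

```python
from math import sqrt
from statistics import mean, variance

# Table 2.1 bond strengths (kgf/cm^2), assumed as listed in these notes
modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27]

n1, n2 = len(modified), len(unmodified)
s1_sq, s2_sq = variance(modified), variance(unmodified)
diff = mean(modified) - mean(unmodified)          # ybar1 - ybar2

# (a) Pooled t statistic, df = n1 + n2 - 2
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
t_pooled = diff / sqrt(sp_sq * (1 / n1 + 1 / n2))

# (b) Unequal-variance statistic with approximate df nu
v1, v2 = s1_sq / n1, s2_sq / n2                   # estimated variances of the means
t_welch = diff / sqrt(v1 + v2)
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Simplified df formula, valid only for equal sample sizes
f0 = s1_sq / s2_sq
nu_equal_n = (n1 - 1) * (f0 + 1) ** 2 / (f0 ** 2 + 1)

print(f"Sp^2 = {sp_sq:.5f}, t0 (pooled) = {t_pooled:.4f}")
print(f"F0 = {f0:.5f}, nu = {nu:.3f}, nu (equal-n formula) = {nu_equal_n:.3f}")
```

With equal sample sizes the two standard errors coincide, so both versions give the same $t_0$; only the df changes (18 versus about 17.025).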