
INSY 7300-6 F01    Reference: Chapter 2 of Montgomery's 8th Edition    By S. Maghsoodloo

Point Estimation

As an example, consider the Bond Strength data in Table 2.1, atop page 26 of Montgomery's 8th edition, on Modified Mortar (the experimental group). The two most important sample statistics for the response variable, y1, of the experimental group are

    ȳ1 = Σ(j=1 to 10) y1j / n1 = 16.764,    S1² = [1/(n1 − 1)] Σ(j=1 to 10) (y1j − ȳ1)² = 0.10014,

where n1 = 10 and the unit of measurement is kgf/cm². Before sampling, ȳ1 is an unbiased estimator of μ1, i.e., E(ȳ1) = μ1, and S1² is an unbiased estimator of σ1² iff the population is infinite, in which case E(S1²) = σ1². If the population is finite, E(S1²) ≠ σ1². For nearly all underlying distributions, E(S) ≠ σ, i.e., S is a biased estimator of the population standard deviation. For a normal universe, E(S) = c4·σ, where 0 < c4 < 1 and

    c4(n) = √[2/(n − 1)] × Γ(n/2) / Γ[(n − 1)/2].

The operator E is linear because (1) E(CY) = C·E(Y), and (2) E(Y1 + Y2) = E(Y1) + E(Y2), where C is any constant. The operator V is nonlinear because (1) V(CY) ≠ C·V(Y); in fact V(CY) = C²·V(Y), and V(Y1 ± Y2) = V(Y1) + V(Y2) ± 2COV(Y1, Y2). If Y1 and Y2 are independent, then COV(Y1, Y2) = σ12 = E[(Y1 − μ1)(Y2 − μ2)] = 0. The converse of this is not generally true.

Now, consider the numerator of S² = Σ(j=1 to n) (yj − ȳ)²/(n − 1) = Syy/(n − 1) = CSS/(n − 1):

    Syy = Σ(yj − ȳ)² = Σ yj² − 2ȳ Σ yj + n ȳ² = Σ yj² − (Σ yj)²/n,

so that Syy = CSS = USS − CF. Degrees of freedom (df): n − 1 (here n1 − 1 = 9).
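These point estimates, and the c4 bias factor just quoted, are easy to verify numerically; the short Python sketch below is my own check (standard library only), using the modified-mortar values of Table 2.1 that are listed later in these notes.

```python
import math
import statistics

# Modified-mortar (experimental group) bond strengths from Table 2.1, in kgf/cm^2
y1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]

n1 = len(y1)
ybar1 = statistics.mean(y1)        # unbiased estimator of mu_1
S1_sq = statistics.variance(y1)    # (n-1) divisor, unbiased for sigma_1^2
S1 = math.sqrt(S1_sq)              # biased estimator of sigma_1

# c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2); for a normal universe E(S) = c4*sigma
c4 = math.sqrt(2 / (n1 - 1)) * math.gamma(n1 / 2) / math.gamma((n1 - 1) / 2)

print(round(ybar1, 4), round(S1_sq, 6), round(S1, 5), round(c4, 4))
# 16.764  0.100138  0.31645  0.9727
```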

In general, if a random variable (rv) Y has V(Y) = σy², then E(CSS/df) = σy².

Interval Estimation

There are 3 types of QCH (quality characteristics): Smaller The Better (STB), LTB, and Nominal The Best (NTB).

STB examples: tire eccentricity, loudness of a compressor, rate of wear, braking distance, etc. Ideal target = 0 and only a single upper spec limit, USL = yu.

LTB (Larger The Better) examples: welding strength, TTF (time to failure), efficiency, yield, etc. Ideal target = ∞ and a single lower spec limit, LSL = yL.

NTB examples: clearance, chemical content level, output voltage, % asphalt in a hot-mix asphalt (HMA), which generally ranges within 3.5 to 8.00%. Ideal target = m, and there are always an LSL = m − Δ1 and a USL = m + Δ2.

Generally, a CI (confidence interval) for an STB parameter should be upper one-sided, a CI for an LTB-type parameter should be lower one-sided, and a CI for an NTB parameter should always be 2-sided.

Example 1. A company manufactures ropes for climbing purposes. The consumers' LSL for breaking strength y is yL = 100 psi, and y ~ N(μ, 676 psi²). Obtain a 95% proper CI for the parameter μ using the average of a random sample of size n = 25, where ȳ = 115 psi.

Figure 1. The SMD of ȳ (standard error σȳ = σ/√n), showing a tail area of 0.05 beyond μ + Z0.05·σȳ.
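Since breaking strength is an LTB characteristic, the proper 95% CI in Example 1 is lower one-sided. A minimal Python sketch of the computation, assuming SciPy is available (the variable names are mine):

```python
from scipy import stats

# Example 1: y ~ N(mu, sigma^2 = 676 psi^2), so sigma = 26; n = 25, ybar = 115 psi
sigma, n, ybar, alpha = 26.0, 25, 115.0, 0.05

z = stats.norm.ppf(1 - alpha)     # Z_0.05 ~ 1.645
se = sigma / n ** 0.5             # sigma_ybar = 26/5 = 5.2
mu_L = ybar - z * se              # lower 95% confidence limit for mu

print(round(z, 4), round(se, 2), round(mu_L, 3))
# 1.6449  5.2  106.447  (the notes round Z_0.05 to 1.645, giving 106.446)
```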

The sampling distribution (SMD) of ȳ in Figure 1 shows that Pr(ȳ − 1.645(26)/√n < μ) = 0.95 ⇒ 115 − 1.645(5.2) < μ ⇒ 106.446 < μ < ∞, where μL = 106.446 and Z0.05 ≅ 1.645. If a CI is lower one-sided, then the corresponding test of hypothesis must be right-tailed, i.e., for the above CI we should test H0: μ = μ0 psi versus H1: μ > μ0 (the alternative H1: μ < μ0 will lead to a contradiction when H0 is rejected). Typical values of μ0 = 105, 108, 110, etc.

Figure 2. The SMD of ȳ given that H0: μ = 105 is true (σȳ = σ/√n = 5.2), with the upper acceptance limit AU = ȳU cutting off an upper-tail area of 0.05.

The nominal level of significance is generally set at α = 0.05.

    AU = Upper Acceptance Limit = 105 + Z0.05 × 26/√25 = 113.554 = ȳU
    AI (Acceptance Interval): 0 ≤ ȳ < 113.554.

The test statistic ȳ = 115 > ȳU ⇒ reject H0 at the LOS α = 0.05 and conclude that μ > 105. Note that an upper one-sided CI for the above test is given by μ < 115 + 8.554 = 123.554, which includes the hypothesized value of μ = 105 and hence is contradictory to the rejection of H0.

Assignment 1. (a) Work problem 2.17 on page 60 of your text. ANS: (c) P-value = 0.0549, (d) [799.75, 824.25]. (b) Work problem 2.20, p. 61. ANS: n ≅ 139. (c) Work problem 2.25. (d) Work problem 2.22, but change part (a) to determining if the population mean repair time is less than 250 hours. Note that in this problem you will have to use Student's t distribution.
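The acceptance limit and the right-tailed decision above can be reproduced as follows (again a sketch assuming SciPy; the P-value line is my addition and is not computed in the notes):

```python
from scipy import stats

# Right-tailed z test of H0: mu = 105 vs H1: mu > 105 for Example 1 (sigma = 26 known)
sigma, n, ybar, mu0, alpha = 26.0, 25, 115.0, 105.0, 0.05

se = sigma / n ** 0.5                       # 5.2
A_U = mu0 + stats.norm.ppf(1 - alpha) * se  # upper acceptance limit
z0 = (ybar - mu0) / se                      # standardized test statistic
p_value = stats.norm.sf(z0)                 # right-tail P-value

print(round(A_U, 3), round(z0, 4), round(p_value, 4))
# 113.553  1.9231  0.0272  ->  ybar = 115 > A_U, so reject H0 at alpha = 0.05
# (the notes round Z_0.05 to 1.645, giving A_U = 113.554)
```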

Test of Hypothesis

In conducting any test of hypothesis, only one of the 4 circumstances given in Table 1 will occur, where Pr denotes probability.

Table 1. The four circumstances that may occur when testing H0

    H0 is true,  Reject H0:  Type I error (or False Positive); occurrence Pr = α.
    H0 is true,  Accept H0:  Correct decision (True Negative); occurrence Pr = 1 − α = Specificity of the test.
    H0 is false, Reject H0:  Correct decision (True Positive); occurrence Pr = 1 − β = Power (or Sensitivity) of the test.
    H0 is false, Accept H0:  Type II error (or False Negative); occurrence Pr = β = the Pr of accepting H0 at a specified value of the parameter under H1.

Figure 3. The SMD of ȳ at μ = 110 (σȳ = 5.2), with AU = 113.554 marked.

Note that in order to commit a type II error, H0 must be false. In reference to Example 1, this implies that μ must differ from 105 psi, say μ = 110. From Figure 3, Z = (113.554 − 110)/5.2 = 0.68346 ⇒ β(at μ = 110) = Φ(0.68346) = 0.752844.
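The β computation generalizes to any true mean μ; the sketch below (my own illustration, assuming SciPy) evaluates β(μ) = Φ[(AU − μ)/σȳ] at several values of μ:

```python
from scipy import stats

# Type II error probability beta(mu) = Phi((A_U - mu)/sigma_ybar) for the test of
# H0: mu = 105 with acceptance limit A_U = 113.554 and sigma_ybar = 5.2
A_U, se = 113.554, 5.2

for mu in (105, 110, 113.554, 115, 118, 125):
    beta = stats.norm.cdf((A_U - mu) / se)
    print(mu, round(beta, 4))
# beta(110) ~ 0.7528, matching the Figure 3 computation; beta(105) ~ 0.95 = 1 - alpha,
# and beta falls toward 0 as the true mean moves farther above A_U.
```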

Assignment. Compute the values of for values of = 105, 113.554, 115, 118 and 15. Graph as a function of. This graph of type II error Pr versus the parameter under H 0 is called the OC (Operating Characteristic) curve. If the population variance is unknown, then statistical inference (i.e., estimation & test of hypothesis) on cannot be conducted using the statistic Z 0 =(y 0 ) n /, rather resort has to be made to the sampling distribution of the statistic (y 0) n / S which has (W. S. Gosset s) Student s t distribution with (n 1) df, such as in problem.0. (Also see the Example. on pp. 51 5 of Montgomery s 8 th edition, and Problems.3,.33 &.34 all on the paired t test). Statistical Inference on We use the fact that the SMD (sampling distribution) of the random variable (n 1)S / from a normal (or Laplace Gaussian) universe follows a Chi square ( ) distribution with (n 1) df. As an example, consider the problem.31 on page 63 of Montgomery s 8 th edition. Data Statistics: 0 j=1 j j y = 116.56, y = 5.88, USS = y = 694.330, S = 0.79045 CSS = 694.330 116.56 /0 = 15.0185; Figure 4 atop the next page shows that χ 0.95,19 =10.1170 Pr( χ 10.1170 ) = 0.95 Pr[(n 1)S / 10.1170] = 0.95 19 0 < 19(0.79045)/10.1170 0 < 1.48448 0 < 1.1839 We are 95% confident that the process variance lies within the interval (0, 1.4845]. The Pr that this last interval includes is 0 or 1. The above CI implies that we cannot reject the null hypothesis H 0 : = 1.0 versus the alternative H 1 : < 1.0, i.e., we cannot conclude that < 1.0 at the 5% level; however, we can reject the null hypothesis H 0 : = 1.60 versus the alternative H 1 : < 1.60 because 1.60 is outside the 95% CI (0 < 1.48448]. Note that 10.1170 represents the 95 th percentage point of Chi square, i.e., 10.1170 = χ 0.95,19, or its 0.05 quantile. 5

Figure 4. The χ² density with 19 df; its modal point is MO = 19 − 2 = 17, and χ²0.95,19 = 10.1170 cuts off a lower-tail area of 0.05.

Statistical Inference on Two Normal Population Parameters

Although Montgomery covers the test of equality of two variances at the end of Chapter 2 (pp. 58-59), and a pretest on σ1² = σ2² is judicious in order to determine whether to pool the variances from independent populations, we will first cover inferences about the variances of two normal populations, followed by inferences on two independent population means.

Statistical Inference on Two Variances

Sir Ronald A. Fisher's F statistic describes the sampling distribution of the ratio of the two scaled χ² variables given below:

    F = {[(n1 − 1)S1²/σ1²]/(n1 − 1)} / {[(n2 − 1)S2²/σ2²]/(n2 − 1)} = (S1²/σ1²) / (S2²/σ2²),

where ν1 = n1 − 1 is the df of the numerator and ν2 = n2 − 1 is the df of the denominator. The modal point of an F distribution is roughly 1 for ν1 and ν2 > 8 (but always less than 1).
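As a quick Monte Carlo illustration of this sampling-distribution result (my own sketch, assuming NumPy and SciPy; the sample sizes match Example 2.3 below, while the σ values are arbitrary):

```python
import numpy as np
from scipy import stats

# Simulate (S1^2/sigma1^2)/(S2^2/sigma2^2) from two normal populations and compare
# its empirical 95th percentile with F_{0.05, n1-1, n2-1}
rng = np.random.default_rng(1)
n1, n2, sigma1, sigma2, reps = 12, 10, 2.0, 3.0, 100_000

S1_sq = rng.normal(0.0, sigma1, (reps, n1)).var(axis=1, ddof=1)
S2_sq = rng.normal(0.0, sigma2, (reps, n2)).var(axis=1, ddof=1)
ratio = (S1_sq / sigma1**2) / (S2_sq / sigma2**2)

print(round(float(np.quantile(ratio, 0.95)), 3),           # empirical 95th percentile
      round(float(stats.f.ppf(0.95, n1 - 1, n2 - 1)), 3))  # F_{0.05,11,9} ~ 3.10
```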

Now, consider Example 2.3 on pages 58-59 of Montgomery: n1 = 12, S1² = 14.5, n2 = 10, and S2² = 10.8.

Figure 5. The F density with ν1 = 11 and ν2 = 9 df, with F0.05,11,9 = 3.105 marked.

Figure 5 above clearly shows that Pr(F11,9 ≤ 3.105) = 0.95, i.e., the cdf of the rv F11,9 at F0.05,11,9 = 3.105 is equal to 0.95. Put differently, the upper 5 percentage point of the F distribution with ν1 = 11 and ν2 = 9, or its 0.95 quantile, is given by F0.05,11,9 = 3.105. Consequently,

    Pr(F11,9 ≤ 3.105) = 0.95 ⇒ Pr[(S1²/σ1²)/(S2²/σ2²) ≤ 3.105] = 0.95 ⇒ Pr[S1²/(3.105·S2²) ≤ σ1²/σ2² < ∞] = 0.95 ⇒ 0.4324 ≤ σ1²/σ2² < ∞.

The above CI is consistent with testing H0: σ1²/σ2² = 1 versus the alternative H1: σ1²/σ2² > 1, because the CI encloses the null hypothesized value of σ1²/σ2² = 1. However, suppose we had n1 = 12, S1² = 14.5, n2 = 10, but S2² = 4.42. Now the test statistic F0 = 3.2805 > F0.05,11,9 = 3.105 leads to the rejection of H0 at the 5% level; however, the 95% 2-sided CI, 0.8386 ≤ σ1²/σ2² ≤ 11.7703, where 0.8386 = S1²/(S2²·F0.025,11,9) and F0.025,11,9 = 3.9121, contains σ1²/σ2² = 1, which is contradictory to the rejection of H0! The correct lower one-sided CI, 1.0574 ≤ σ1²/σ2² < ∞, excludes the null hypothesized value σ1²/σ2² = 1, as required. Note that for the right-tailed alternative H1: σ1²/σ2² > 1 we must obtain a lower one-sided CI, because the upper one-sided 95% CI, 0 < σ1²/σ2² ≤ S1²/(S2²·F0.95,11,9) = 3.2805/0.3453 = 9.501, includes the hypothesized value σ1²/σ2² = 1, which contradicts the rejection of H0.
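SciPy reproduces these critical values and interval limits directly; the sketch below uses the modified values n1 = 12, S1² = 14.5, n2 = 10, S2² = 4.42 as reconstructed above:

```python
from scipy import stats

# F test of H0: sigma1^2/sigma2^2 = 1 vs H1: sigma1^2/sigma2^2 > 1
n1, n2, S1_sq, S2_sq, alpha = 12, 10, 14.5, 4.42, 0.05

F0 = S1_sq / S2_sq                                      # test statistic ~ 3.28
F_crit = stats.f.ppf(1 - alpha, n1 - 1, n2 - 1)         # F_{0.05,11,9} ~ 3.10 (table: 3.105)

# Lower one-sided 95% confidence limit, the CI that matches the right-tailed test
lower = F0 / F_crit                                     # ~ 1.06, excludes 1

# Two-sided 95% CI, shown only to illustrate the contradiction discussed above
lo2 = F0 / stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)   # ~ 0.84
hi2 = F0 * stats.f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)   # ~ 11.8

print(round(F0, 4), round(F_crit, 3), round(lower, 4), round(lo2, 4), round(hi2, 2))
```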

Testing the Equality of Two Independent Population Means

(a) Case of H0: σ1² = σ2² = σ² not rejected at the 20% level. Consider the 2-sided hypothesis H0: μ1 = μ2 versus H1: μ1 ≠ μ2. Then the test statistic is

    t0 = [(ȳ1 − ȳ2) − 0] / [Sp·√(1/n1 + 1/n2)],

where Sp² = (CSS1 + CSS2)/df and df = n1 + n2 − 2. Values of |t0| ≥ tα/2,n1+n2−2 lead to the rejection of H0, i.e., the rejection region is (−∞, −tα/2,n1+n2−2) ∪ (tα/2,n1+n2−2, ∞). See the example on pp. 38-39 of Montgomery, and problems 2.26 and 2.27. Note that a pretest on H0: σ1² = σ2² at the 20% level yields F0 = S1²/S2² = 0.10013777/0.0614622 = 1.6293, which yields a P-value = 0.4785 > 0.20, and hence the null hypothesis H0: σ1² = σ2² is tenable. Thus, we may use the pooled t statistic to test H0: μ1 = μ2; i.e., the P-value of the pretest has exceeded 20%, providing convincing evidence in favor of pooling variances. Note that H0 declares that σ1² and σ2² have a common value of σ².

Analysis of Data in Table 2.1 on page 26 of Montgomery's 8th Edition

Y = Tension Bond Strength, measured in kgf/cm².

Experimental Group: y1j: 16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57
    ȳ1 = 16.7640; USS1 = 2811.2182, CF1 = 167.64²/10 = 2810.31696 ⇒ CSS1 = 0.90124 ⇒ S1² = 0.10013777 ⇒ S1 = 0.31645

Control Group: y2j: 16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27
    ȳ2 = 17.0420; USS2 = 2904.85080, CF2 = 170.42²/10 = 2904.297640 ⇒ CSS2 = 0.5531600 ⇒ S2² = 0.0614622 ⇒ S2 = 0.24792 (control group),

and as a result Sp² = [(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2). Only when n1 = n2 = n does this reduce to Sp² = (S1² + S2²)/2 = 0.08080 ⇒ Sp = √0.0808 = 0.2842534.

Assuming that H0: σ1² = σ2² is not rejected at the 20% level, then t0 = [(ȳ1 − ȳ2) − 0]/se, where

    se = se(ȳ1 − ȳ2) = Sp·√(1/n1 + 1/n2) = 0.2842534(0.10 + 0.10)^0.5 = 0.12712
    ⇒ t0 = [−0.2780 − 0]/0.12712 = −2.1869;  t0.025,18 = 2.1009

⇒ Reject H0: μ1 − μ2 = 0 at the LOS α = 0.05 because |t0| > t0.025,18. The P-value = 2·Pr(T18 ≥ 2.1869) = 2(0.0210973) = 0.0421947, which is less than 0.05, consistent with the rejection of H0 at the 5% level. See Table 2.2 on p. 41 of Montgomery.
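The variance pretest and the pooled t test can be reproduced in a few lines (a sketch assuming SciPy; scipy.stats.ttest_ind with equal_var=True performs the pooled test):

```python
import statistics
from scipy import stats

# Table 2.1 bond-strength data (kgf/cm^2), as listed above
y1 = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]  # modified
y2 = [16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27]  # unmodified

# Pretest of H0: sigma1^2 = sigma2^2 at the 20% level via the F ratio
F0 = statistics.variance(y1) / statistics.variance(y2)      # ~ 1.629
p_pretest = 2 * stats.f.sf(F0, len(y1) - 1, len(y2) - 1)    # ~ 0.48 > 0.20 -> pooling tenable

# Pooled-variance t test of H0: mu1 = mu2
t0, p_value = stats.ttest_ind(y1, y2, equal_var=True)       # t0 ~ -2.187, P ~ 0.042
print(round(F0, 4), round(p_pretest, 4), round(t0, 4), round(p_value, 4))
```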

Control Group: y j : 16.6, 16.75, 17.37, 17.1, 16.98, 16.87, 17.34, 17.0, 17.08, 17.7 y = 17.04, USS = 904.85080, CF = 170.4 /10 = 904.976400 CSS = 0.5531600 0.06146 S = 0.479 (Control group), and as a result S p Only when n 1 = n = n, then 1S1 S 1 = (n11) S1 (n1) S n1n S p ( S 1 +S )/ = 0.08080 S p = 0.0808 = 0.84534. Assuming that 1 is not rejected at the 0% level,, then t 0 = [( y1 y) ]/se, where se = se( y1 y) = S p (1/n 1)+(1/n ) = 0.84534(0.10 + 0.10)0.5 = 0.171 t 0 = S = [ 0.780 0]/0.171 =.1869; t 0.05,18 =.10094 Reject H 0 : 1 = 0 at the LOS = 0.05 because t 0 > t 0.05,18. The P value = Pr(T 18.1869) = 0.01097343 = 0.04194685 < 0.05 because H 0 was rejected at the 5% level. See Table. on p. 41 of Montgomery. () Case of 1 Test statistics: t 0 = [(y1y ) ] / ( S1 / n 1) ( S / n ), but the df is given by in equation (.3) on page 48 of Montgomery s 8 th edition and generally min(n 1, n ) < n 1 + n. A simplified version of that Eq. (.3) of Montgomery is given by = νν[v(y 1 1) + v(y )] 1 1 ν (v(y )) + ν (v(y )) = 1 ( FR 0 n 1) ( FR 0 n) 1, where v(y 1) = 1 / 1 S n, R n = n /n 1 and F 0 = S 1 / S. For equal sample sizes, the above formula reduces to = (n 1)(F 1) (F ) 1 0 0. For the sake of illustration we assume that 1, so that for the data of Montgomery s Table.1, the se( y1 y) = [( S 1 /n 1 ) + ( S /n )] 0.50 = 0.171, t 0 =.1869, and is also given by the formula in Table.4 on page 5 9

For the sake of illustration we assume that σ1² ≠ σ2², so that for the data of Montgomery's Table 2.1, se(ȳ1 − ȳ2) = [(S1²/n1) + (S2²/n2)]^0.50 = 0.12712 and t0 = −2.1869, and ν is also given by the formula in Table 2.4 on page 52 of Montgomery's 8th edition:

    ν = 9(0.016160)² / [(0.0100138)² + (0.0061462)²] = 17.025
    ⇒ P-value = 2·Pr(T17.025 ≥ 2.1869) = 2(0.0214981) = 0.0429962 (see Table 2.2, p. 41),

which is a bit more conservative than the pooled t test P-value. Note that F0 = 1.6293 and ν = 9(F0 + 1)²/(F0² + 1) lead to the same answer of ν = 17.025. Further, Montgomery also covers the paired t test on pp. 53-57 and in Problems 2.32, 2.33 & 2.34. The pertinent hardness example, with data in Table 2.6, will be discussed in class.

The Relative Efficiency in Hypothesis Testing

The relative efficiency (RELEFF) of an α-level statistical test T1 to an α-level test T2 is given by n2/n1 iff both tests have identical values of the type II error probability β. As an example, if T1 requires a sample of size n1 = 20 and has α = 0.05, β = 0.10, but T2 requires an n2 = 25 to attain the same α = 0.05 and β = 0.10, then the efficiency of T1 relative to T2 is given by 25/20 = 125%, or RELEFF(of T2 to T1) = 20/25 = 80%. On the other hand, if the 5%-level tests T1 and T2 both use the same random sample of size n = n1 = n2 = 25, but β(T1) = 0.10 while β(T2) = 0.125, then the RELEFF of T1 to T2 is given by 0.125/0.10 = 125%. Further, suppose RELEFF(T1, T2) = 125%, both tests having the same α & β, and T2 has a sample size n2 = 30. Then the sample size for T1 must be obtained from RELEFF(T1, T2) = 1.25 = n2/n1 = 30/n1 ⇒ n1 = 30/1.25 = 24.

Errata for Chapter 2 of Montgomery's 8th Edition

1. Page 33, in Figure 2.5, change … to … .
2. Page 37, atop the page in the description of Figure 2.10, change the terminology "critical region" to either "critical values" or "critical limits" (or possibly "rejection thresholds").
3. Page 59, the 2-sided CI on the variance ratio σ1²/σ2² in Eq. (2.50) should appropriately be changed to 0.4331 ≤ σ1²/σ2² < ∞, because the test on σ1²/σ2² = 1 is right-tailed.
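Finally, a tiny check of the relative-efficiency arithmetic in the examples above (my own sketch; the numbers are the ones quoted in the text):

```python
# RELEFF(T1, T2) = n2/n1 when both alpha-level tests attain the same beta
n1, n2 = 20, 25
releff_T1_to_T2 = n2 / n1            # 1.25 -> 125%
releff_T2_to_T1 = n1 / n2            # 0.80 ->  80%

# Given RELEFF(T1, T2) = 1.25 and n2 = 30, the sample size T1 needs:
n1_required = 30 / 1.25              # 24
print(releff_T1_to_T2, releff_T2_to_T1, n1_required)
```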