Inference About Means and Proportions with Two Populations

Contents
Inferences About the Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Known
Inferences About the Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Unknown
Inferences About the Difference Between Two Population Means: Matched Samples
Inferences About the Difference Between Two Population Proportions

Inferences About the Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Known
Interval Estimation of $\mu_1 - \mu_2$
Hypothesis Tests About $\mu_1 - \mu_2$

Estimating the Difference Between Two Population Means
Let $\mu_1$ equal the mean of population 1 and $\mu_2$ equal the mean of population 2. The difference between the two population means is $\mu_1 - \mu_2$. To estimate $\mu_1 - \mu_2$, we will select a simple random sample of size $n_1$ from population 1 and a simple random sample of size $n_2$ from population 2. Let $\bar{x}_1$ equal the mean of sample 1 and $\bar{x}_2$ equal the mean of sample 2. The point estimator of the difference between the means of populations 1 and 2 is $\bar{x}_1 - \bar{x}_2$.

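Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
Expected Value: $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
Standard Deviation (Standard Error): $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
where:
$\sigma_1$ = standard deviation of population 1
$\sigma_2$ = standard deviation of population 2
$n_1$ = sample size from population 1
$n_2$ = sample size from population 2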

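Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Interval Estimate: $\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
where: $1 - \alpha$ is the confidence coefficient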

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Example: Par, Inc.
Par, Inc. is a manufacturer of golf equipment and has developed a new golf ball designed to provide extra distance. In a test of driving distance using a mechanical driving device, a sample of Par golf balls was compared with a sample of golf balls made by Rap, Ltd., a competitor. The sample statistics appear on the next slide.

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Example: Par, Inc.
               Sample #1 (Par, Inc.)   Sample #2 (Rap, Ltd.)
Sample Size    120 balls               80 balls
Sample Mean    275 yards               258 yards
Based on data from previous driving distance tests, the two population standard deviations are known, with $\sigma_1$ = 15 yards and $\sigma_2$ = 20 yards.

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Example: Par, Inc.
Let us develop a 95% confidence interval estimate of the difference between the mean driving distances of the two brands of golf ball.

Estimating the Difference Between Two Population Means
Population 1: Par, Inc. golf balls; $\mu_1$ = mean driving distance of Par golf balls
Population 2: Rap, Ltd. golf balls; $\mu_2$ = mean driving distance of Rap golf balls
$\mu_1 - \mu_2$ = difference between the mean distances
Simple random sample of $n_1$ Par golf balls; $\bar{x}_1$ = sample mean distance for the Par golf balls
Simple random sample of $n_2$ Rap golf balls; $\bar{x}_2$ = sample mean distance for the Rap golf balls
$\bar{x}_1 - \bar{x}_2$ = point estimate of $\mu_1 - \mu_2$

Point Estimate of $\mu_1 - \mu_2$
Point estimate of $\mu_1 - \mu_2$ = $\bar{x}_1 - \bar{x}_2$ = 275 - 258 = 17 yards
where:
$\mu_1$ = mean distance for the population of Par, Inc. golf balls
$\mu_2$ = mean distance for the population of Rap, Ltd. golf balls

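Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = 17 \pm 1.96 \sqrt{\frac{(15)^2}{120} + \frac{(20)^2}{80}}$
17 ± 5.14, or 11.86 yards to 22.14 yards
We are 95% confident that the difference between the mean driving distances of Par, Inc. balls and Rap, Ltd. balls is 11.86 to 22.14 yards.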
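
The interval above can be re-derived numerically. The following is a minimal Python sketch, assuming the Par, Inc. figures ($n_1$ = 120, $n_2$ = 80, $\bar{x}_1$ = 275, $\bar{x}_2$ = 258, $\sigma_1$ = 15, $\sigma_2$ = 20); the function name `z_interval_two_means` is illustrative and does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_two_means(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Interval estimate of mu1 - mu2 when sigma1 and sigma2 are known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)            # z_{alpha/2}
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # standard error of x1bar - x2bar
    margin = z * std_error
    diff = x1 - x2                                     # point estimate
    return diff - margin, diff + margin, margin


# Par, Inc. (sample 1) versus Rap, Ltd. (sample 2)
lower, upper, margin = z_interval_two_means(275, 258, 15, 20, 120, 80)
print(f"17 +/- {margin:.2f}: {lower:.2f} to {upper:.2f} yards")
```

Using the exact normal quantile instead of the rounded table value 1.96 changes the margin only beyond the second decimal place.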

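Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Hypotheses
Left-tailed: $H_0: \mu_1 - \mu_2 \ge D_0$, $H_a: \mu_1 - \mu_2 < D_0$
Right-tailed: $H_0: \mu_1 - \mu_2 \le D_0$, $H_a: \mu_1 - \mu_2 > D_0$
Two-tailed: $H_0: \mu_1 - \mu_2 = D_0$, $H_a: \mu_1 - \mu_2 \ne D_0$
Test Statistic: $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$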

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Example: Par, Inc.
Can we conclude, using $\alpha$ = .01, that the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls?

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
p Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0: \mu_1 - \mu_2 \le 0$
$H_a: \mu_1 - \mu_2 > 0$
where:
$\mu_1$ = mean distance for the population of Par, Inc. golf balls
$\mu_2$ = mean distance for the population of Rap, Ltd. golf balls
2. Specify the level of significance. $\alpha$ = .01

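Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
p Value and Critical Value Approaches
3. Compute the value of the test statistic.
$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{(275 - 258) - 0}{\sqrt{\frac{(15)^2}{120} + \frac{(20)^2}{80}}} = \frac{17}{2.62} = 6.49$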

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
p Value Approach
4. Compute the p value. For z = 6.49, the p value < .0001.
5. Determine whether to reject $H_0$. Because p value $\le \alpha$ = .01, we reject $H_0$.
At the .01 level of significance, the sample evidence indicates the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls.

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Known
Critical Value Approach
4. Determine the critical value and rejection rule. For $\alpha$ = .01, $z_{.01}$ = 2.33. Reject $H_0$ if $z \ge 2.33$.
5. Determine whether to reject $H_0$. Because z = 6.49 $\ge$ 2.33, we reject $H_0$.
The sample evidence indicates the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls.
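
Both approaches to this test can be checked with a short Python sketch, assuming the same Par, Inc. figures ($n_1$ = 120, $n_2$ = 80, $\bar{x}_1$ = 275, $\bar{x}_2$ = 258, $\sigma_1$ = 15, $\sigma_2$ = 20). The helper name `z_test_two_means` is hypothetical. Without rounding the standard error to 2.62, the statistic comes out near 6.48 rather than 6.49, which does not change the conclusion.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_means(x1, x2, sigma1, sigma2, n1, n2, d0=0.0):
    """Right-tailed z test of H0: mu1 - mu2 <= d0 with known sigmas."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (x1 - x2 - d0) / std_error
    p_value = 1 - NormalDist().cdf(z)   # upper-tail area
    return z, p_value


alpha = 0.01
z, p_value = z_test_two_means(275, 258, 15, 20, 120, 80)
z_crit = NormalDist().inv_cdf(1 - alpha)   # critical value z_{.01}

print(f"z = {z:.2f}, p value = {p_value:.2e}, critical value = {z_crit:.2f}")
print("reject H0" if z >= z_crit else "do not reject H0")
```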

Inferences About the Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Unknown
Interval Estimation of $\mu_1 - \mu_2$
Hypothesis Tests About $\mu_1 - \mu_2$

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
When $\sigma_1$ and $\sigma_2$ are unknown, we will:
use the sample standard deviations $s_1$ and $s_2$ as estimates of $\sigma_1$ and $\sigma_2$, and
replace $z_{\alpha/2}$ with $t_{\alpha/2}$.

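Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
Interval Estimate: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where the degrees of freedom for $t_{\alpha/2}$ are:
$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$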

Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Unknown
Example: Specific Motors
Specific Motors of Detroit has developed a new automobile known as the M car. 24 M cars and 28 J cars (from Japan) were road tested to compare miles-per-gallon (mpg) performance. The sample statistics are shown on the next slide.

Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Unknown
Example: Specific Motors
                   Sample #1 (M Cars)   Sample #2 (J Cars)
Sample Size        24 cars              28 cars
Sample Mean        29.8 mpg             27.3 mpg
Sample Std. Dev.   2.56 mpg             1.81 mpg

Difference Between Two Population Means: $\sigma_1$ and $\sigma_2$ Unknown
Example: Specific Motors
Let us develop a 90% confidence interval estimate of the difference between the mpg performances of the two models of automobile.

Point Estimate of $\mu_1 - \mu_2$
Point estimate of $\mu_1 - \mu_2$ = $\bar{x}_1 - \bar{x}_2$ = 29.8 - 27.3 = 2.5 mpg
where:
$\mu_1$ = mean miles-per-gallon for the population of M cars
$\mu_2$ = mean miles-per-gallon for the population of J cars

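Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
The degrees of freedom for $t_{\alpha/2}$ are:
$df = \frac{\left(\frac{(2.56)^2}{24} + \frac{(1.81)^2}{28}\right)^2}{\frac{1}{24 - 1}\left(\frac{(2.56)^2}{24}\right)^2 + \frac{1}{28 - 1}\left(\frac{(1.81)^2}{28}\right)^2} \approx 40.6$, rounded to 41
With $\alpha/2$ = .05 and df = 41, $t_{.05}$ = 1.683.

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 29.8 - 27.3 \pm 1.683 \sqrt{\frac{(2.56)^2}{24} + \frac{(1.81)^2}{28}}$
2.5 ± 1.051, or 1.449 to 3.551 mpg
We are 90% confident that the difference between the miles-per-gallon performances of M cars and J cars is 1.449 to 3.551 mpg.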
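
The degrees-of-freedom formula is easy to mis-evaluate by hand, so a small Python sketch may help. It assumes the Specific Motors statistics ($n_1$ = 24, $n_2$ = 28, $s_1$ = 2.56, $s_2$ = 1.81), for which the formula evaluates to roughly 40.6. The standard library has no t distribution, so the critical value 1.683 (upper-tail area .05, df = 41) is entered by hand as a table value; the function names are illustrative.

```python
from math import sqrt


def unequal_variance_df(s1, s2, n1, n2):
    """Degrees of freedom for the t interval when sigma1 and sigma2 are unknown."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def t_interval_two_means(x1, x2, s1, s2, n1, n2, t_crit):
    """Interval estimate of mu1 - mu2 using sample standard deviations."""
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin, margin


df = unequal_variance_df(2.56, 1.81, 24, 28)

T_05_DF41 = 1.683   # assumption: t table value for upper-tail area .05, df = 41
lower, upper, margin = t_interval_two_means(29.8, 27.3, 2.56, 1.81, 24, 28, T_05_DF41)

print(f"df = {df:.2f}")
print(f"2.5 +/- {margin:.3f}: {lower:.3f} to {upper:.3f} mpg")
```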

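Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
Hypotheses
Left-tailed: $H_0: \mu_1 - \mu_2 \ge D_0$, $H_a: \mu_1 - \mu_2 < D_0$
Right-tailed: $H_0: \mu_1 - \mu_2 \le D_0$, $H_a: \mu_1 - \mu_2 > D_0$
Two-tailed: $H_0: \mu_1 - \mu_2 = D_0$, $H_a: \mu_1 - \mu_2 \ne D_0$
Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$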

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
Example: Specific Motors
Can we conclude, using a .05 level of significance, that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars?

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
p Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0: \mu_1 - \mu_2 \le 0$
$H_a: \mu_1 - \mu_2 > 0$
where:
$\mu_1$ = mean mpg for the population of M cars
$\mu_2$ = mean mpg for the population of J cars

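Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
p Value and Critical Value Approaches
2. Specify the level of significance. $\alpha$ = .05
3. Compute the value of the test statistic.
$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(29.8 - 27.3) - 0}{\sqrt{\frac{(2.56)^2}{24} + \frac{(1.81)^2}{28}}} = 4.003$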

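Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
p Value Approach
4. Compute the p value. The degrees of freedom for t are:
$df = \frac{\left(\frac{(2.56)^2}{24} + \frac{(1.81)^2}{28}\right)^2}{\frac{1}{24 - 1}\left(\frac{(2.56)^2}{24}\right)^2 + \frac{1}{28 - 1}\left(\frac{(1.81)^2}{28}\right)^2} \approx 40.6$, rounded to 41
Because t = 4.003 > $t_{.005}$ = 2.701, the p value < .005.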

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
p Value Approach
5. Determine whether to reject $H_0$. Because p value $\le \alpha$ = .05, we reject $H_0$.
We are at least 95% confident that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars.

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown
Critical Value Approach
4. Determine the critical value and rejection rule. For $\alpha$ = .05 and df = 41, $t_{.05}$ = 1.683. Reject $H_0$ if $t \ge 1.683$.
5. Determine whether to reject $H_0$. Because 4.003 $\ge$ 1.683, we reject $H_0$.
We are at least 95% confident that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars.
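
The test statistic and the critical-value decision can be reproduced with the Python sketch below. It assumes the Specific Motors statistics (24 M cars with mean 29.8 and standard deviation 2.56; 28 J cars with mean 27.3 and standard deviation 1.81). The critical value 1.683 is copied from the slide as a table value, and `t_statistic_unequal_variances` is a hypothetical helper name.

```python
from math import sqrt


def t_statistic_unequal_variances(x1, x2, s1, s2, n1, n2, d0=0.0):
    """Test statistic for mu1 - mu2 when sigma1 and sigma2 are unknown."""
    std_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2 - d0) / std_error


T_05_DF41 = 1.683   # assumption: table value for alpha = .05, df = 41 (right-tailed)

t_stat = t_statistic_unequal_variances(29.8, 27.3, 2.56, 1.81, 24, 28)
reject_h0 = t_stat >= T_05_DF41

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```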

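Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
When $\sigma_1$ and $\sigma_2$ are unknown but assumed equal, we will:
use the pooled variance $s_p^2$ as an estimate of $\sigma_1^2$ (= $\sigma_2^2$), and
replace $z_{\alpha/2}$ with $t_{\alpha/2}$.
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$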

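Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
Interval Estimate: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
where the degrees of freedom for $t_{\alpha/2}$ are: $df = n_1 + n_2 - 2$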

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
A coffee manufacturer is interested in estimating the difference in the average daily coffee consumption of regular-coffee drinkers and decaffeinated-coffee drinkers. Its researcher randomly selects 13 regular-coffee drinkers and asks how many cups of coffee per day they drink. He randomly locates 15 decaffeinated-coffee drinkers and asks how many cups of coffee per day they drink.

Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
The average for the regular-coffee drinkers is 4.35 cups, with a standard deviation of 1.20 cups. The average for the decaffeinated-coffee drinkers is 6.84 cups, with a standard deviation of 1.42 cups. The researcher assumes, for each population, that the daily consumption is normally distributed. Construct a 95% confidence interval to estimate the difference in the averages of the two populations.

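Interval Estimation of $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
The confidence interval estimate, with regular-coffee drinkers as population 1 and decaffeinated-coffee drinkers as population 2, is
$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (4.35 - 6.84) \pm 2.056 \sqrt{\frac{(1.20)^2(12) + (1.42)^2(14)}{13 + 15 - 2}} \sqrt{\frac{1}{13} + \frac{1}{15}} = -2.49 \pm 1.03$
$-3.52 \le \mu_1 - \mu_2 \le -1.46$
The researcher is 95% confident that the difference in population average daily consumption of cups of coffee between regular- and decaffeinated-coffee drinkers is between 1.46 cups and 3.52 cups. The point estimate for the difference in population means is 2.49 cups, with an error of 1.03 cups.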
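
The pooled-variance interval for the coffee example can be recomputed with the Python sketch below. It assumes the sample figures 13 regular-coffee drinkers (mean 4.35, standard deviation 1.20) and 15 decaffeinated-coffee drinkers (mean 6.84, standard deviation 1.42). The value 2.056 is the table value $t_{.025}$ for df = 26 used on the slide, entered by hand; the function names are illustrative.

```python
from math import sqrt


def pooled_std_dev(s1, s2, n1, n2):
    """Pooled standard deviation s_p, assuming sigma1 = sigma2."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return sqrt(pooled_var)


def pooled_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Interval estimate of mu1 - mu2 using the pooled variance."""
    margin = t_crit * pooled_std_dev(s1, s2, n1, n2) * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin, margin


T_025_DF26 = 2.056   # assumption: table value for upper-tail area .025, df = 13 + 15 - 2

lower, upper, margin = pooled_t_interval(4.35, 6.84, 1.20, 1.42, 13, 15, T_025_DF26)
print(f"-2.49 +/- {margin:.2f}: {lower:.2f} to {upper:.2f} cups")
```

Both limits are negative because sample 1 (regular coffee) has the smaller mean; the slide reports the same interval in absolute terms.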

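Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
Hypotheses
Left-tailed: $H_0: \mu_1 - \mu_2 \ge D_0$, $H_a: \mu_1 - \mu_2 < D_0$
Right-tailed: $H_0: \mu_1 - \mu_2 \le D_0$, $H_a: \mu_1 - \mu_2 > D_0$
Two-tailed: $H_0: \mu_1 - \mu_2 = D_0$, $H_a: \mu_1 - \mu_2 \ne D_0$
Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
where the degrees of freedom for t are: $df = n_1 + n_2 - 2$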

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
Is there a difference in the way Chinese cultural values affect the purchasing strategies of industrial buyers in Taiwan and mainland China? A study by researchers at the National Chiao-Tung University in Taiwan attempted to determine whether there is a significant difference in the purchasing strategies of industrial buyers between Taiwan and mainland China based on the cultural dimension labeled integration.

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
Integration is being in harmony with one's self, family, and associates. For the study, 46 Taiwanese buyers and 26 mainland Chinese buyers were contacted and interviewed. Buyers were asked to respond to 35 items using a 9-point scale with possible answers ranging from no importance (1) to extreme importance (9). The resulting statistics for the two groups are shown on the next slide.

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
Taiwanese Buyers: $n_1$ = 46, $\bar{x}_1$ = 5.42, $s_1^2 = (.58)^2 = .3364$
Mainland Chinese Buyers: $n_2$ = 26, $\bar{x}_2$ = 5.04, $s_2^2 = (.49)^2 = .2401$
$df = n_1 + n_2 - 2 = 46 + 26 - 2 = 70$
Using $\alpha$ = .01, test to determine whether there is a significant difference between buyers in Taiwan and buyers in mainland China on integration.

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
1. Develop the hypotheses.
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \ne 0$
where:
$\mu_1$ = mean score for the population of Taiwanese buyers
$\mu_2$ = mean score for the population of mainland Chinese buyers
2. Assuming both populations are normally distributed with equal unknown variances, use the pooled-variance t test.

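Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
3. Specify the level of significance. $\alpha$ = .01
4. Reject $H_0$ if $|t| \ge t_{.005,70} = 2.648$.
5. Compute the value of the test statistic.
$t = \frac{(5.42 - 5.04) - 0}{\sqrt{\frac{(.3364)(45) + (.2401)(25)}{46 + 26 - 2}} \sqrt{\frac{1}{46} + \frac{1}{26}}} = 2.82$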

Hypothesis Tests About $\mu_1 - \mu_2$: $\sigma_1$ and $\sigma_2$ Unknown and $\sigma_1 = \sigma_2$
6. Because t = 2.82 $\ge$ 2.648, we reject $H_0$. The Taiwan industrial buyers scored significantly higher than the mainland China industrial buyers on integration.
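
As a check on steps 4 to 6, the following Python sketch recomputes the pooled-variance test statistic. It assumes the study statistics $n_1$ = 46, $\bar{x}_1$ = 5.42, $s_1$ = .58 for Taiwanese buyers and $n_2$ = 26, $\bar{x}_2$ = 5.04, $s_2$ = .49 for mainland Chinese buyers. The critical value 2.648 is the table value from the slide, and `pooled_t_statistic` is a hypothetical name.

```python
from math import sqrt


def pooled_t_statistic(x1, x2, s1, s2, n1, n2, d0=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = d0."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return (x1 - x2 - d0) / std_error


T_005_DF70 = 2.648   # assumption: table value for upper-tail area .005, df = 70

t_stat = pooled_t_statistic(5.42, 5.04, 0.58, 0.49, 46, 26)
reject_h0 = abs(t_stat) >= T_005_DF70   # two-tailed rejection rule

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```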

Inferences About the Difference Between Two Population Means: Matched Samples
With a matched-sample design, each sampled item provides a pair of data values. This design often leads to a smaller sampling error than the independent-sample design because variation between sampled items is eliminated as a source of sampling error.

Dependent Samples
Before-and-after measurements on the same individual
Studies of twins
Studies of spouses

Inferences About the Difference Between Two Population Means: Matched Samples
Example: Express Deliveries
A Chicago-based firm has documents that must be quickly distributed to district offices throughout the U.S. The firm must decide between two delivery services, UPX (United Parcel Express) and INTEX (International Express), to transport its documents.

Inferences About the Difference Between Two Population Means: Matched Samples
Example: Express Deliveries
In testing the delivery times of the two services, the firm sent two reports to a random sample of its district offices, with one report carried by UPX and the other report carried by INTEX. Do the data on the next slide indicate a difference in mean delivery times for the two services? Use a .05 level of significance.

Inferences About the Difference Between Two Population Means: Matched Samples
Delivery Time (Hours)
District Office   UPX   INTEX   Difference
Seattle            32      25            7
Los Angeles        30      24            6
Boston             19      15            4
Cleveland          16      15            1
New York           15      13            2
Houston            18      15            3
Atlanta            14      15           -1
St. Louis          10       8            2
Milwaukee           7       9           -2
Denver             16      11            5

Inferences About the Difference Between Two Population Means: Matched Samples
p Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0: \mu_d = 0$
$H_a: \mu_d \ne 0$
Let $\mu_d$ = the mean of the difference values for the two delivery services for the population of district offices.

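Inferences About the Difference Between Two Population Means: Matched Samples
p Value and Critical Value Approaches
2. Specify the level of significance. $\alpha$ = .05
3. Compute the value of the test statistic.
$\bar{d} = \frac{\sum d_i}{n} = \frac{7 + 6 + \cdots + 5}{10} = 2.7$
$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{76.1}{9}} = 2.9$
$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{2.7 - 0}{2.9 / \sqrt{10}} = 2.94$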

Inferences About the Difference Between Two Population Means: Matched Samples
p Value Approach
4. Compute the p value. For t = 2.94 and df = 9, the p value is between .02 and .01. (This is a two-tailed test, so we double the upper-tail areas of .01 and .005.)
5. Determine whether to reject $H_0$. Because p value $\le \alpha$ = .05, we reject $H_0$.
We are at least 95% confident that there is a difference in mean delivery times for the two services.

Inferences About the Difference Between Two Population Means: Matched Samples
Critical Value Approach
4. Determine the critical value and rejection rule. For $\alpha$ = .05 and df = 9, $t_{.025}$ = 2.262. Reject $H_0$ if $|t| \ge 2.262$.
5. Determine whether to reject $H_0$. Because t = 2.94 $\ge$ 2.262, we reject $H_0$.
We are at least 95% confident that there is a difference in mean delivery times for the two services.
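
The matched-sample computation reduces to a one-sample t test on the differences, which the Python sketch below illustrates. It assumes the ten UPX and INTEX delivery times from the Express Deliveries table (UPX: 32, 30, 19, 16, 15, 18, 14, 10, 7, 16; INTEX: 25, 24, 15, 15, 13, 15, 15, 8, 9, 11). The critical value 2.262 is the table value $t_{.025}$ for df = 9, entered by hand; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Delivery times in hours, one pair per district office
upx   = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
intex = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

# Each office contributes one difference d_i = UPX - INTEX
differences = [u - i for u, i in zip(upx, intex)]

n = len(differences)
d_bar = mean(differences)          # sample mean of the differences
s_d = stdev(differences)           # sample standard deviation (n - 1 divisor)
t_stat = (d_bar - 0) / (s_d / sqrt(n))

T_025_DF9 = 2.262                  # assumption: table value for alpha/2 = .025, df = 9
reject_h0 = abs(t_stat) >= T_025_DF9

print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.1f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```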

Inferences About the Difference Between Two Population Proportions
Interval Estimation of $p_1 - p_2$
Hypothesis Tests About $p_1 - p_2$

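Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
Expected Value: $E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$
Standard Deviation (Standard Error): $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
where:
$n_1$ = size of sample taken from population 1
$n_2$ = size of sample taken from population 2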

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
If the sample sizes are large, the sampling distribution of $\bar{p}_1 - \bar{p}_2$ can be approximated by a normal probability distribution. The sample sizes are sufficiently large if all of these conditions are met:
$n_1 p_1 \ge 5$
$n_1 (1 - p_1) \ge 5$
$n_2 p_2 \ge 5$
$n_2 (1 - p_2) \ge 5$
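
The four conditions can be checked mechanically, as in the Python sketch below. The population proportions are unknown in practice, so this sketch assumes the sample proportions are used as stand-ins; the numbers are the Market Research Associates figures from the example later in this section (250 households with proportion .48, 150 households with proportion .40). The function name `normal_approximation_ok` is hypothetical.

```python
def normal_approximation_ok(n1, p1, n2, p2, threshold=5):
    """Return True when n*p and n*(1 - p) are at least `threshold` for both samples."""
    expected_counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    return all(count >= threshold for count in expected_counts)


# Sample proportions substituted for the unknown p1 and p2 (an assumption)
large_samples = normal_approximation_ok(250, 0.48, 150, 0.40)

# A deliberately small case: 10 * 0.1 = 1 falls below the threshold
small_sample = normal_approximation_ok(10, 0.10, 150, 0.40)

print(large_samples, small_sample)
```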

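Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
[Figure: normal curve for the sampling distribution of $\bar{p}_1 - \bar{p}_2$, centered at $p_1 - p_2$, with standard error $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$]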

Interval Estimation of $p_1 - p_2$
Interval Estimate: $\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$

Interval Estimation of $p_1 - p_2$
Example: Market Research Associates
Market Research Associates is conducting research to evaluate the effectiveness of a client's new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households aware of the client's product. The new campaign has been initiated with TV and newspaper advertisements running for three weeks.

Interval Estimation of $p_1 - p_2$
Example: Market Research Associates
A survey conducted immediately after the new campaign showed 120 of 250 households aware of the client's product. Do the data support the position that the advertising campaign has provided an increased awareness of the client's product?

Point Estimator of the Difference Between Two Population Proportions
$p_1$ = proportion of the population of households aware of the product after the new campaign
$p_2$ = proportion of the population of households aware of the product before the new campaign
$\bar{p}_1$ = sample proportion of households aware of the product after the new campaign
$\bar{p}_2$ = sample proportion of households aware of the product before the new campaign
$\bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08$

Interval Estimation of $p_1 - p_2$
For $\alpha$ = .05, $z_{.025}$ = 1.96:
$.48 - .40 \pm 1.96 \sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}}$
.08 ± 1.96(.0510)
.08 ± .10
Hence, the 95% confidence interval for the difference in before and after awareness of the product is -.02 to +.18.
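
The interval can be reproduced from the raw survey counts with the Python sketch below, assuming 120 of 250 households aware after the campaign and 60 of 150 aware before it. The function name `proportion_diff_interval` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Interval estimate of p1 - p2 from success counts x1, x2 and sample sizes n1, n2."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)     # z_{alpha/2}
    std_error = sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    diff = p1_bar - p2_bar
    return diff - z * std_error, diff + z * std_error, std_error


# Sample 1: after the campaign; sample 2: before the campaign
lower, upper, std_error = proportion_diff_interval(120, 250, 60, 150)
print(f"standard error = {std_error:.4f}, interval = {lower:.2f} to {upper:.2f}")
```

Because the interval contains zero, it does not by itself establish that awareness increased.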

Hypothesis Tests About $p_1 - p_2$
Hypotheses
We focus on tests involving no difference between the two population proportions (i.e., $p_1 = p_2$).
Left-tailed: $H_0: p_1 - p_2 \ge 0$, $H_a: p_1 - p_2 < 0$
Right-tailed: $H_0: p_1 - p_2 \le 0$, $H_a: p_1 - p_2 > 0$
Two-tailed: $H_0: p_1 - p_2 = 0$, $H_a: p_1 - p_2 \ne 0$

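Hypothesis Tests About $p_1 - p_2$
Standard Error of $\bar{p}_1 - \bar{p}_2$ when $p_1 = p_2 = p$: $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
Pooled Estimator of p when $p_1 = p_2 = p$: $\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$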

Hypothesis Tests About $p_1 - p_2$
Test Statistic: $z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Hypothesis Tests About $p_1 - p_2$
Example: Market Research Associates
Can we conclude, using a .05 level of significance, that the proportion of households aware of the client's product increased after the new advertising campaign?

Hypothesis Tests About $p_1 - p_2$
p Value and Critical Value Approaches
1. Develop the hypotheses.
$H_0: p_1 - p_2 \le 0$
$H_a: p_1 - p_2 > 0$
$p_1$ = proportion of the population of households aware of the product after the new campaign
$p_2$ = proportion of the population of households aware of the product before the new campaign

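Hypothesis Tests About $p_1 - p_2$
p Value and Critical Value Approaches
2. Specify the level of significance. $\alpha$ = .05
3. Compute the value of the test statistic.
$\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$
$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514$
$z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$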

Hypothesis Tests About $p_1 - p_2$
p Value Approach
4. Compute the p value. For z = 1.56, the p value = .0594.
5. Determine whether to reject $H_0$. Because p value > $\alpha$ = .05, we cannot reject $H_0$.
We cannot conclude that the proportion of households aware of the client's product increased after the new campaign.

Hypothesis Tests About $p_1 - p_2$
Critical Value Approach
4. Determine the critical value and rejection rule. For $\alpha$ = .05, $z_{.05}$ = 1.645. Reject $H_0$ if $z \ge 1.645$.
5. Determine whether to reject $H_0$. Because 1.56 < 1.645, we cannot reject $H_0$.
We cannot conclude that the proportion of households aware of the client's product increased after the new campaign.
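
The pooled test can be traced end to end with the Python sketch below, assuming the survey counts 120 of 250 (after) and 60 of 150 (before). The helper name `pooled_proportion_z_test` is hypothetical. The slide's p value of .0594 comes from the rounded z = 1.56; carrying full precision gives a p value of roughly .060, and the decision is the same.

```python
from math import sqrt
from statistics import NormalDist


def pooled_proportion_z_test(x1, n1, x2, n2):
    """Right-tailed z test of H0: p1 - p2 <= 0 using the pooled estimator of p."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)      # same as (n1*p1_bar + n2*p2_bar) / (n1 + n2)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_bar - p2_bar) / std_error
    p_value = 1 - NormalDist().cdf(z)     # upper-tail area
    return p_pooled, std_error, z, p_value


alpha = 0.05
p_pooled, std_error, z, p_value = pooled_proportion_z_test(120, 250, 60, 150)
z_crit = NormalDist().inv_cdf(1 - alpha)  # critical value z_{.05}

print(f"pooled p = {p_pooled:.2f}, z = {z:.2f}, p value = {p_value:.4f}")
print("reject H0" if z >= z_crit else "cannot reject H0")
```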