SMAM 314 Exam 42 Name

1. Mark the following statements True (T) or False (F). (10 points)

F A. The line that best fits points whose X and Y values are negatively correlated should have a positive slope.
As x increases, y decreases. Thus the slope is negative.

T B. When doing a t test on two independent samples it is appropriate to pool the standard deviations when the sample standard deviation of one sample is about the same as that of the other sample.
In this case the standard deviations are pooled, and all of the observations from both samples are used to estimate the common standard deviation.

T C. The average golf scores of the same ten players in Genesee Valley Park and Oakdale country club are being compared. A t test on the differences between the golf scores is appropriate because this is an example of paired data.
The data are paired because both sets of scores come from the same players.

T D. The p value of a statistical test is .025. Therefore you would reject H0 at α = .05.
The p value is less than the level of significance, so you would reject H0.

T E. The correlation coefficient for two random variables is .962. This means that there is probably a strong linear relationship between the random variables.
There is a strong linear correlation because the correlation coefficient is close to one.

2. A random sample of 500 adult residents of Maripoca County found that 385 were in favor of increasing the highway speed limit to 75 miles per hour. Another sample of 400 residents of Pima county indicated that 267 were in favor of the increased speed limit.

A. Find a 98% confidence interval on the difference in the proportion of people in the two counties that favor the increased speed limit. (10 points)

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{m} + \frac{\hat{p}_2(1-\hat{p}_2)}{n}}$$

$$\hat{p}_1 = \frac{385}{500} = .77, \qquad \hat{p}_2 = \frac{267}{400} = .6675$$

$$.7700 - .6675 \pm 2.33\sqrt{\frac{(.77)(.23)}{500} + \frac{(.6675)(.3325)}{400}}$$

$$.1025 \pm .0702$$

(.0323, .1727)
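The interval in part A can be checked numerically. The following is a minimal sketch, not part of the original solution; the function name `two_proportion_ci` is hypothetical, and it uses the exact normal quantile (about 2.326) rather than the table value 2.33, so its endpoints may differ from the hand calculation in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    # Two-sided critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin


# Counts taken from the problem statement: 385 of 500 and 267 of 400.
low, high = two_proportion_ci(385, 500, 267, 400, 0.98)
print(f"98% CI for p1 - p2: ({low:.4f}, {high:.4f})")
```

Both endpoints are positive, which is the fact the explanation in part B relies on.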
B. Based on the confidence interval in A would you conclude at α = .02 that there is a difference of opinion on the matter of the increased speed limit? Explain. (5 points)

The number zero is not contained in the 98% confidence interval. Thus, at α = .02, I would conclude that there is a difference of opinion between the two counties on the matter of the increased speed limit.

3. Seven adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially and three months after participating in an aerobic exercise program and following a low fat diet. Do an appropriate test of hypothesis to determine whether the low fat diet and aerobic exercise are of value in producing a mean reduction in blood cholesterol levels. Use α = .01. (20 points)

Blood Cholesterol Level
Subject  Before  After  Difference
1        265     229    36
2        240     231     9
3        258     227    31
4        295     240    55
5        251     238    13
6        245     241     4
7        287     234    53

Hypotheses
$H_0: \mu_d = 0$
$H_1: \mu_d > 0$

Assumptions
Paired data. Differences are normally distributed.

Region of rejection
Reject H0 at α = .01 if T > 3.143 (t distribution with 6 degrees of freedom).

Test statistic
$$T = \frac{\bar{x}_d - \mu_d}{s_d/\sqrt{n}}, \qquad \bar{x}_d = 28.7, \quad s_d = 20.75$$
$$T = \frac{28.7}{20.75/\sqrt{7}} = 3.66$$

Reject or do not reject H0
Reject H0 at α = .01, since 3.66 > 3.143.

Conclusion
There is highly significant evidence that exercise and diet help reduce cholesterol.
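The paired t statistic in problem 3 can be reproduced from the before and after readings. This is a small sketch for checking the arithmetic only; the critical value 3.143 is copied from the solution (a t table value for 6 degrees of freedom) rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Cholesterol readings from the table in problem 3.
before = [265, 240, 258, 295, 251, 245, 287]
after = [229, 231, 227, 240, 238, 241, 234]

# Paired data: work with the per-subject differences (before minus after).
diffs = [b - a for b, a in zip(before, after)]
d_bar = mean(diffs)   # sample mean of the differences
s_d = stdev(diffs)    # sample standard deviation (n - 1 divisor)
t_stat = d_bar / (s_d / sqrt(len(diffs)))

T_CRIT = 3.143  # table value quoted in the solution, alpha = .01, 6 df
reject_h0 = t_stat > T_CRIT

print(f"mean diff = {d_bar:.2f}, s_d = {s_d:.2f}, T = {t_stat:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```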
4. An experiment was conducted to compare the filling capability of the packaging equipment at two different wineries. Ten bottles of the same brand of wine from Ridgecrest Vineyards and from Valley View Vineyards were randomly selected and measured. The volumes of wine in milliliters are given below.

Row  Ridgecrest  Valley View
1    755         756
2    751         754
3    752         757
4    753         756
5    753         755
6    753         756
7    753         756
8    754         755
9    752         755
10   751         756

A. Based on the normal probability plots below is it appropriate to assume that the samples are normally distributed? Explain. (3 points)

The data do not quite lie on a straight line, so there may be reason to doubt the normality assumption.
B. Based on the results of the test for equality of variances given below is it appropriate to do a t test on independent samples with the assumption that the variances of the two populations are equal? Explain. (3 points)

F-Test (normal distribution)
Test Statistic: 2.203
P-Value : 0.55

The p value is .55, so there is insufficient evidence to doubt the claim of equal variances.

C. Here are the results of the t test assuming equality of variance. Answer the questions below it.

Two-sample T for Ridgecrest vs Valley View

            N     Mean    StDev  SE Mean
Ridgecre   10   752.70     1.25     0.40
Valley V   10  755.600    0.843     0.27

Difference = mu Ridgecrest - mu Valley View
Estimate for difference: -2.900
95% CI for difference: (-3.903, -1.897)
T-Test of difference = 0 (vs not =): T-Value = -6.08 P-Value = 0.000 DF = 18
Both use Pooled StDev = 1.07

(1) What is the null and alternative hypothesis? (2 points)
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$

(2) What is the value of the test statistic and the p value? (2 points)
T = -6.08, p value = 0.000 (to three decimal places)

D. At α = .01 is there a significant difference in the filling capacity of the equipment at the two different wineries? Explain your answer. (5 points)

There is a highly significant difference between the filling capacity at the two wineries. T = -6.08 < -2.878, the 0.5th percentile of the t distribution with 18 degrees of freedom (the lower critical value for a two-sided test at α = .01). Also the p value is 0.000 to three decimal places, which is < .01.
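The Minitab figures in problem 4 (the variance ratio, pooled standard deviation, and t statistic) can be recomputed from the raw volumes. The sketch below is an illustrative check using the standard pooled-variance formulas; it does not compute p values, and the variable names are not from the original.

```python
from math import sqrt
from statistics import mean, variance

# Volumes in milliliters from the table in problem 4.
ridgecrest = [755, 751, 752, 753, 753, 753, 753, 754, 752, 751]
valley_view = [756, 754, 757, 756, 755, 756, 756, 755, 755, 756]

n1, n2 = len(ridgecrest), len(valley_view)
var1, var2 = variance(ridgecrest), variance(valley_view)  # sample variances

# Ratio of sample variances, the statistic reported by the F test.
f_ratio = var1 / var2

# Pooled standard deviation, weighting each variance by its degrees of freedom.
df = n1 + n2 - 2
pooled_sd = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)

# Two-sample t statistic for H0: mu1 - mu2 = 0 with equal variances assumed.
diff = mean(ridgecrest) - mean(valley_view)
t_stat = diff / (pooled_sd * sqrt(1 / n1 + 1 / n2))

print(f"F = {f_ratio:.3f}, pooled SD = {pooled_sd:.2f}")
print(f"difference = {diff:.3f}, T = {t_stat:.2f}, DF = {df}")
```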
A. From the scatterplot alone does there appear to be a linear relationship between height and weight? Explain your answer. (3 points)

B. Using your hand held calculator to do the computations complete the following statements. (1)-(5) 1 point each, (6) and (7) 3 points each

(1) The slope of the least squares regression line is 5.8425.
(2) The y intercept of the least squares regression line is -245.75.
(3) The equation of the least squares regression line is y = 5.8425x - 245.75.
(4) The correlation coefficient is .7827.
(5) The regression line accounts for 61.2% of the variation.
(6) The regression equation would predict that a 67 inch tall man weighs 145.7 lbs.
(7) The residual (difference between the observed and the predicted value) when x = 71 is -10.06.

C. Based on the above information is a linear model appropriate for this data? Explain your answer. (4 points)

A linear model would not be appropriate for this data. The points in the scatterplot do not lie near a straight line. The correlation coefficient is only .7827, and only 61.2% of the variation is accounted for.

6. The grams of solids removed from a material, y, is thought to be related to the drying time, x. Ten observations obtained from an experimental study with a scatterplot follow.

Data Display
Row  y    x
1    4.3  2.5
2    1.5  3.0
3    1.8  3.5
4    4.9  4.0
5    4.2  4.5
6    4.8  5.0
7    5.8  5.5
8    6.2  6.0
9    7.0  6.5
10   7.9  7.0
A. Based on the scatterplot do the x and y values appear to be positively or negatively correlated? Explain. (2 points)

The points are positively correlated because as x increases y increases.

B. (1) Based on the scatterplot is a linear equation appropriate? (2 points)

With the possible exception of two points, a linear equation would be appropriate.

(2) Would the fit of a linear equation be better if some of the points were not used? Explain. (2 points)

It might be better if the first and the fourth points were removed.
Regression Analysis: y versus x

The regression equation is
y = - 0.70 + 1.17 x

Predictor     Coef  SE Coef      T      P
Constant    -0.699    1.213  -0.58  0.580
x           1.1661   0.2445   4.77  0.001

S = 1.110   R-Sq = 74.0%   R-Sq(adj) = 70.7%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  28.044  28.044  22.75  0.001
Residual Error   8   9.860   1.233
Total            9  37.904

Unusual Observations
Obs     x      y    Fit  SE Fit  Residual  St Resid
  1  2.50  4.300  2.216   0.653     2.084     2.32R

R denotes an observation with a large standardized residual

C. Fill in the blanks below.
(1) The slope is 1.1661. (1 point)
(2) The y intercept is -0.699. (1 point)
(3) The regression equation is y = 1.1661x - 0.699. (1 point)
(4) The correlation coefficient is .86. (1 point)
(5) The percentage of the variation accounted for by the linear model is 74.0%. (1 point)

D. Based on the p values for the t statistics

(1) Is the slope significantly different from zero? Explain. (2 points)
The slope is significantly different from zero because the p value is .001 < .05.

(2) Is the y intercept significantly different from zero? Explain. (2 points)
The y intercept is not significantly different from zero because the p value is .580 > .05.

E. A confidence interval for the slope would be given by $\hat{\beta}_1 \pm t_{\alpha/2}\, s_{\hat{\beta}_1}$. Find a 99% confidence interval on the slope using the estimate of the standard error given in the Minitab output. (5 points)

1.166 ± 3.355(.2445) = 1.166 ± .8203

(.3457, 1.986)
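The regression quantities in problem 6 can be recomputed from the ten observations as a cross-check on the output and on the interval in part E. This is a sketch using the usual least squares formulas; the critical value 3.355 is copied from the solution (a t table value for 8 degrees of freedom), and because the code carries the unrounded slope, the interval endpoints may differ slightly from the hand calculation in the last digit.

```python
from math import sqrt
from statistics import mean

# Drying time x and grams of solids removed y, from the Data Display.
x = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
y = [4.3, 1.5, 1.8, 4.9, 4.2, 4.8, 5.8, 6.2, 7.0, 7.9]

n = len(x)
x_bar, y_bar = mean(x), mean(y)

# Corrected sums of squares and cross products.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

sse = syy - slope * sxy          # residual (error) sum of squares
s = sqrt(sse / (n - 2))          # residual standard deviation
se_slope = s / sqrt(sxx)         # standard error of the slope
r_squared = 1 - sse / syy        # proportion of variation explained

T_CRIT = 3.355  # table value quoted in the solution, 99% interval, 8 df
ci_low = slope - T_CRIT * se_slope
ci_high = slope + T_CRIT * se_slope

print(f"y = {intercept:.3f} + {slope:.4f} x")
print(f"S = {s:.3f}, R-Sq = {100 * r_squared:.1f}%, SE(slope) = {se_slope:.4f}")
print(f"99% CI for slope: ({ci_low:.4f}, {ci_high:.4f})")
```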