Point Estimation and Confidence Interval

Chapter 8 Point Estimation and Confidence Interval

8.1 Point estimator

The purpose of point estimation is to use a function of the sample data to estimate an unknown parameter.

Definition 8.1 A parameter is a constant that describes the population. A statistic is a random variable that can be computed from the sample data without making use of any unknown parameters. A statistic \(\hat\Theta\) used to estimate an unknown parameter \(\theta\) is called a point estimator of \(\theta\). A point estimate is the value of \(\hat\Theta\) calculated from the observed sample values.

Sample mean (mean of the sample):
\[ \bar X = \frac{1}{n}\sum_{i=1}^{n} X_i \]

Sample variance (variance of the sample):
\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)^2 \]

Sample standard deviation (standard deviation of the sample):
\[ S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)^2} \]

Recall: A statistic \(\hat\Theta\) is said to be an unbiased estimator of \(\theta\) if \(E(\hat\Theta) = \theta\). \(\bar X\) is an unbiased estimator of \(\mu\), and \(S^2\) is an unbiased estimator of \(\sigma^2\).

Warning: An unbiased estimator is not unique. For example, the single observation \(X_1\) also satisfies \(E(X_1) = \mu\).

Definition 8.2 Among all unbiased estimators of \(\theta\), the one with the smallest variance is called the most efficient estimator of \(\theta\).
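The three formulas above translate directly into code. The sketch below uses a small hypothetical sample (the numbers are not from the text) and compares hand-written versions with Python's standard `statistics` module, whose `variance` and `stdev` functions also divide by n − 1.

```python
import math
import statistics


def sample_mean(xs):
    """Sample mean: (1/n) * sum of the observations."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def sample_sd(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


# Hypothetical data, chosen only to illustrate the formulas.
data = [2.0, 4.0, 4.0, 6.0]

xbar = sample_mean(data)
s2 = sample_variance(data)
s = sample_sd(data)

print("mean:", xbar)
print("variance:", s2, "(statistics.variance:", statistics.variance(data), ")")
print("sd:", s, "(statistics.stdev:", statistics.stdev(data), ")")
```

These values are point estimates: each is a single number computed from one observed sample.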

8.2 Interval estimation

We have proved that the sample mean is an unbiased estimator of the population mean. Suppose a sample of size n is taken from a Poisson distribution and it is found that \(\bar x = 3.8\). Then 3.8 is a point estimate of \(\lambda\). But what exactly does this tell us about the true value of \(\lambda\)? Can we feel reasonably certain, for example, that \(\lambda\) lies somewhere close to \(\bar x\), say in the interval from 3.7 to 3.9? Or, on the other hand, is \(\bar X\) so variable that there is a good chance that \(|\bar X - \lambda|\) is fairly large? To address this uncertainty we turn from point estimation to a technique known as interval estimation.

Interval estimation is exactly what the name implies. We want to find two statistics, \(\hat\Theta_1\) and \(\hat\Theta_2\), that generate an interval of real numbers that we hope contains the true value of the parameter \(\theta\) being estimated.

Definition 8.3 A \(100(1-\alpha)\%\) confidence interval for a parameter \(\theta\) is an interval of the form \([\hat\theta_1, \hat\theta_2]\), where \(\hat\theta_1\) and \(\hat\theta_2\) are the observed values of statistics \(\hat\Theta_1\) and \(\hat\Theta_2\) such that
\[ P(\hat\Theta_1 \le \theta \le \hat\Theta_2) = 1 - \alpha. \]

8.3 Confidence interval for µ when σ is known

Theorem 1 If \(\bar x\) is the value of the sample mean of a random sample of size n from a normal population with the known variance \(\sigma^2\), then a \(100(1-\alpha)\%\) confidence interval for \(\mu\) is
\[ \bar x \pm z_{\alpha/2}\frac{\sigma}{\sqrt n} = \left(\bar x - z_{\alpha/2}\frac{\sigma}{\sqrt n},\ \bar x + z_{\alpha/2}\frac{\sigma}{\sqrt n}\right). \]
The quantity \(z_{\alpha/2}\,\sigma/\sqrt n\) is called the margin of error.

Example 8.1 What is the average price of statistics books? Below is a random sample of prices:

40 53 39 37 22 35 66 80 95 35

What is a 95% confidence interval for \(\mu\), assuming \(\sigma = 20\)?

Here \(\bar x = 50.2\) and \(s = 23.13\). Since \(\sigma = 20\) is assumed known, the interval uses \(\sigma\) rather than \(s\):
\[ 50.2 \pm 1.96\,\frac{20}{\sqrt{10}} = 50.2 \pm 12.4 = (37.8,\ 62.6). \]
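The calculation in Example 8.1 can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the function name `z_interval` is an illustrative choice, and the critical value comes from `statistics.NormalDist` (about 1.95996) rather than the rounded table value 1.96.

```python
import math
from statistics import NormalDist, mean


def z_interval(xbar, sigma, n, conf=0.95):
    """Confidence interval for mu when sigma is known (Theorem 1).

    Returns (lower, upper, margin_of_error).
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin, margin


# Book prices from Example 8.1; sigma = 20 is the value assumed in the example.
prices = [40, 53, 39, 37, 22, 35, 66, 80, 95, 35]
low, high, margin = z_interval(mean(prices), sigma=20, n=len(prices))

print(f"x-bar = {mean(prices):.1f}, margin = {margin:.1f}")
print(f"95% CI: ({low:.1f}, {high:.1f})")
```

Passing `conf=0.99` widens the interval, since the critical value grows from about 1.96 to about 2.576.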

The interval (37.8, 62.6) is called a 95% confidence interval for \(\mu\). We are 95% confident that the unknown \(\mu\) lies between $37.80 and $62.60. We got this interval by a method that gives correct results 95% of the time.

Caution: There are only two possibilities:
(1) The interval (37.8, 62.6) contains the true \(\mu\).
(2) This random sample is one of the few samples for which \(\bar x\) is not within $12.40 of the true \(\mu\). Only 5% of all samples give such inaccurate results.

Example 8.1 You have measured the systolic blood pressure of a random sample of 25 students in UST. A 95% confidence interval for the mean systolic blood pressure for all students in UST is (122, 138). Which of the following statements gives a valid interpretation of this interval?
(a) 95% of the sample of students have a systolic blood pressure between 122 and 138.
(b) 95% of the population of students have a systolic blood pressure between 122 and 138.
(c) The probability that the population mean blood pressure is between 122 and 138 is 0.95.
(d) If the procedure were repeated many times, 95% of the resulting confidence intervals would contain the population mean systolic blood pressure.
(e) If the procedure were repeated many times, 95% of the sample means would be between 122 and 138.

Example 8.2 The Bureau of Labor Statistics (BLS) conducts surveys each month to collect information on the labor market. According to one recent survey, the average hourly earnings of workers employed in the manufacturing industries edged up 1 cent in September to $12.86. The survey is based on a random sample of 390,000 workers. Suppose that the standard deviation of hourly earnings is $1.875. Find 95% and 99% confidence intervals for the mean hourly earnings of all workers employed in the manufacturing industries in September 1998. What are the margins of error for 95% and 99% confidence?

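With \(n = 390000\), \(\bar x = 12.86\), \(\sigma = 1.875\):

95%: \(\bar x \pm z_{.025}\dfrac{\sigma}{\sqrt n} = 12.86 \pm 1.96\,\dfrac{1.875}{\sqrt{390000}} = 12.86 \pm 0.0059 = (12.8541,\ 12.8659)\), so the margin of error is \(m = .0059\).

99%: \(\bar x \pm z_{.005}\dfrac{\sigma}{\sqrt n} = 12.86 \pm 2.576\,\dfrac{1.875}{\sqrt{390000}} = 12.86 \pm 0.0077 = (12.8523,\ 12.8677)\), so the margin of error is \(m = .0077\).

Example 8.3 Refer to Example 8.2. Suppose that the sample size is 1000. Find 95% and 99% confidence intervals for the mean hourly earnings in September 1998. What are the margins of error for 95% and 99% confidence?

With \(n = 1000\), \(\bar x = 12.86\), \(\sigma = 1.875\):

95%: \(\bar x \pm z_{.025}\dfrac{\sigma}{\sqrt n} = 12.86 \pm 1.96\,\dfrac{1.875}{\sqrt{1000}} = 12.86 \pm .116 = (12.744,\ 12.976)\).

99%: \(\bar x \pm z_{.005}\dfrac{\sigma}{\sqrt n} = 12.86 \pm 2.576\,\dfrac{1.875}{\sqrt{1000}} = 12.86 \pm .153 = (12.707,\ 13.013)\).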
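The statement that a 95% interval comes from "a method that gives correct results 95% of the time" can be checked by simulation. The sketch below is an illustration only: the population values (µ = 50, σ = 20), the sample size n = 10, the number of repetitions and the random seed are all hypothetical choices, not taken from the text.

```python
import math
import random
from statistics import NormalDist


def coverage(mu, sigma, n, conf=0.95, reps=4000, seed=0):
    """Fraction of simulated known-sigma intervals that contain the true mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sigma / math.sqrt(n)  # same margin for every sample
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


rate = coverage(mu=50, sigma=20, n=10)
print(f"share of intervals containing mu: {rate:.3f}")
```

Each simulated interval either contains µ or it does not; the confidence level describes the long-run share of intervals that do, which should be close to 0.95 here.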

8.4 Confidence interval for µ when σ is unknown

Recall that for a random sample of size n from \(N(\mu, \sigma^2)\),
\[ Z = \frac{\bar X - \mu}{\sigma/\sqrt n} \sim N(0, 1). \]
What if \(\sigma\) is unknown?

Student-t statistic:
\[ T = \frac{\bar X - \mu}{S/\sqrt n}, \]
where S is the sample standard deviation.

Theorem 2 (Student-t Distributions) For a random sample of size n from \(N(\mu, \sigma^2)\),
\[ T = \frac{\bar X - \mu}{S/\sqrt n} \sim t(n-1), \]
where \(t(n-1)\) is the t-distribution with \(n-1\) degrees of freedom.

To understand the t-distribution, we first introduce the chi-square distribution.

Definition 8.4 X is said to have the chi-square distribution with v degrees of freedom, denoted by \(X \sim \chi^2(v)\), if its pdf is given by
\[ f(x) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2-1} e^{-x/2}, & x > 0, \\ 0, & \text{elsewhere.} \end{cases} \]

Definition 8.5 If Y and Z are independent with \(Z \sim N(0, 1)\) and \(Y \sim \chi^2(v)\), then
\[ T = \frac{Z}{\sqrt{Y/v}} \]
has pdf
\[ f(t) = \frac{\Gamma((v+1)/2)}{\sqrt{\pi v}\,\Gamma(v/2)} \left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}, \quad -\infty < t < \infty, \]
and it is called the t distribution with v degrees of freedom.

Let \(X_1, X_2, \ldots, X_n\) be a random sample of size n from \(N(\mu, \sigma^2)\), and let \(\bar X\) and \(S^2\) be the mean and variance of the random sample. Then

\(\sum_{i=1}^{n} \left(\dfrac{X_i - \mu}{\sigma}\right)^2 \sim \chi^2(n)\);

\(\dfrac{(n-1)S^2}{\sigma^2}\) has a chi-square distribution with \(n-1\) degrees of freedom;

\(\bar X\) and \(S^2\) are independent;

\(\sqrt n(\bar X - \mu)/S\) has a t-distribution with \(n-1\) degrees of freedom.

Properties of t distributions:
Symmetric about 0.
Bell-shaped, similar to the N(0,1) curve, but with heavier tails.
As k increases, the t(k) distribution approaches the N(0,1) distribution.

The t confidence interval

If \(\bar x\) and s are the values of the mean and the standard deviation of a random sample of size n from a normal population, then a \(100(1-\alpha)\%\) confidence interval for \(\mu\) is
\[ \bar x \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt n}, \]
or \(\bar x \pm\) margin of error, where
\[ \text{margin of error} = t_{\alpha/2,\,n-1}\frac{s}{\sqrt n}. \]

Example 8.4 Refer to Table A.4. What t critical value would you use for a C.I. for \(\mu\)?
(a) A 95% C.I. based on n = 10
(b) A 99% C.I. based on n = 20

Example 8.5 Great discoveries in science tend to be made by persons who are quite young. Listed below are 12 major scientific breakthroughs from the middle of the sixteenth century to the early part of the twentieth century.

| Discovery | Discoverer | Date | Age |
|---|---|---|---|
| Earth goes around sun | Copernicus | 1543 | 40 |
| Telescope, basic laws of astronomy | Galileo | 1600 | 34 |
| Principles of motion, gravitation, calculus | Newton | 1665 | 23 |
| Nature of electricity | Franklin | 1746 | 40 |
| Burning is uniting with oxygen | Lavoisier | 1774 | 31 |
| Earth evolved by gradual processes | Lyell | 1830 | 33 |
| Evidence for natural selection controlling evolution | Darwin | 1858 | 49 |
| Field equations for light | Maxwell | 1864 | 33 |
| Radioactivity | Curie | 1896 | 33 |
| Quantum theory | Planck | 1901 | 43 |
| Special theory of relativity, \(E = mc^2\) | Einstein | 1905 | 26 |
| Mathematical foundations for probability theory | Kolmogorov | 1933 | 30 |

Let \(\mu\) denote the true average age at which great scientific discoveries are made. Construct a 95% confidence interval for \(\mu\).

\(n = 12\), \(\bar x = 34.58\), \(s = 7.3045\), and \(t_{.025,\,11} = 2.201\). A 95% C.I. is
\[ 34.58 \pm 2.201\,\frac{7.3045}{\sqrt{12}} = 34.58 \pm 4.6411 = (29.9389,\ 39.2211). \]

Example 8.6 Large superstores use scanners to calculate a customer's bill. Scanners should be as accurate as possible. A state agency regularly monitors stores by randomly selecting a large number of items and comparing the shelf price with the checkout scanner price. Are the overcharges balanced by the undercharges, or is the mean overcharge of all different items in the store positive? During one check by the agency, 16 items were found to be incorrectly scanned. The amounts of overcharges were

2.00, -.99, 1.00, -.50, .40, -.60, .20, .30, .50, 3.00, -1.20, 1.00, .50, .30, -.70, .40

(a) Find a 95% C.I. for the mean overcharge.

\(n = 16\), \(\bar x = .351\), \(s = 1.083\), and \(t_{.025,\,15} = 2.131\). A 95% C.I. is
\[ .351 \pm 2.131\,\frac{1.083}{\sqrt{16}} = .351 \pm .577 = (-.226,\ .928). \]
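Examples 8.5 and 8.6 can be checked from the raw data. The sketch below uses only the Python standard library, which has no t quantile function, so the critical values 2.201 and 2.131 are passed in as the table values quoted in the examples; the function name `t_interval` is an illustrative choice.

```python
import math
from statistics import mean, stdev


def t_interval(xs, t_crit):
    """t confidence interval for mu: x-bar +/- t_crit * s / sqrt(n).

    t_crit is the table value t_{alpha/2, n-1}, supplied by the caller.
    """
    n = len(xs)
    xbar = mean(xs)
    margin = t_crit * stdev(xs) / math.sqrt(n)  # stdev uses the n - 1 divisor
    return xbar - margin, xbar + margin


# Example 8.5: ages at the time of discovery (n = 12, 11 degrees of freedom).
ages = [40, 34, 23, 40, 31, 33, 49, 33, 33, 43, 26, 30]
ages_ci = t_interval(ages, t_crit=2.201)

# Example 8.6: scanner overcharges (n = 16, 15 degrees of freedom).
overcharges = [2.00, -0.99, 1.00, -0.50, 0.40, -0.60, 0.20, 0.30,
               0.50, 3.00, -1.20, 1.00, 0.50, 0.30, -0.70, 0.40]
over_ci = t_interval(overcharges, t_crit=2.131)

print(f"ages: s = {stdev(ages):.4f}, 95% CI = ({ages_ci[0]:.2f}, {ages_ci[1]:.2f})")
print(f"overcharges: s = {stdev(overcharges):.3f}, "
      f"95% CI = ({over_ci[0]:.3f}, {over_ci[1]:.3f})")
```

The overcharge interval contains 0, so these data do not rule out a mean overcharge of zero at the 95% level.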

8.5 Point and interval estimation for a population proportion

In many problems we must estimate proportions, probabilities, percentages or rates. Examples:
(a) What is the current unemployment rate in Hong Kong?
(b) What is President Bush's approval rating?
(c) What proportion of students in Math 144 will receive Grade A?

Point estimation of proportions

Let p be a population proportion of successes (or the probability of success), and let X be the number of successes in a random sample of size n. Define the sample proportion \(\hat p\) by
\[ \hat p = \frac{X}{n}. \]
Then
\[ E(\hat p) = p, \qquad \mathrm{Var}(\hat p) = \frac{p(1-p)}{n}. \]
Therefore, the sample proportion is an unbiased estimator of p.

Interval estimation of p

Recall that
\[ \frac{X - np}{\sqrt{np(1-p)}} \to N(0, 1) \quad \text{as } n \to \infty. \]
Hence, for large n, approximately
\[ \frac{\hat p - p}{\sqrt{p(1-p)/n}} \sim N(0, 1) \quad \text{and} \quad \frac{\hat p - p}{\sqrt{\hat p(1-\hat p)/n}} \sim N(0, 1). \]

Confidence interval for p: An approximate \(100(1-\alpha)\%\) confidence interval for p is
\[ \hat p \pm z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}}, \]
or \(\hat p \pm\) margin of error, where
\[ \text{margin of error} = z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}}. \]

Example 8.7 Do you approve or disapprove of the way Bill Clinton is handling his job as president? A Gallup poll conducted May 7-9, 1999 found that 60% of 1,015 adults interviewed approved of Clinton's job performance. The Gallup poll claims that "For results based on the total sample of adults nationwide, one can say with 95% confidence that the margin of sampling error is no greater than +/- 3 percentage points." Explain.

\(\hat p = .60\), \(n = 1015\). A 95% C.I. for p is
\[ .60 \pm 1.96\left(\frac{.60 \times .40}{1015}\right)^{.5} = .60 \pm 1.96(.015377) = .60 \pm .03 = (.57,\ .63). \]

Example 8.8 In a random sample of n = 500 families owning television sets in the city of Hamilton, Canada, it is found that x = 340 subscribed to HBO. Find a 95% confidence interval for the actual proportion of families in this city who subscribe to HBO.

\(\hat p = 340/500 = 0.68\). A 95% C.I. is
\[ 0.68 \pm 1.96\sqrt{\frac{0.68(0.32)}{500}} = 0.68 \pm 0.04 = [0.64,\ 0.72]. \]

Choosing the sample size

Recall that the margin of error is given by
\[ m = z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}}. \]

Sample size for desired margin of error: The \(100(1-\alpha)\%\) confidence interval for p will have a margin of error approximately equal to a specified value m when the sample size is
\[ n = \left(\frac{z_{\alpha/2}}{m}\right)^2 p^*(1-p^*), \]
where \(p^*\) is a guessed value for the sample proportion.

Conservative sample size:
\[ n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{m}\right)^2. \]
Because \(p(1-p) \le 1/4\) for every p, the margin of error will then be less than or equal to m.
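The interval for p and the sample-size formulas above can be sketched together. The code below is a minimal illustration using only the Python standard library; the function names are illustrative choices, the count 609 is 60% of 1,015 from Example 8.7, and the 5% target margin in the last lines is a hypothetical value chosen for illustration.

```python
import math
from statistics import NormalDist


def z_crit(conf):
    """Two-sided critical value z_{alpha/2} for confidence level conf."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def proportion_interval(x, n, conf=0.95):
    """Approximate (large-sample) confidence interval for a proportion."""
    p_hat = x / n
    margin = z_crit(conf) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size(m, conf=0.95, p_guess=0.5):
    """Smallest integer n with approximate margin of error at most m.

    p_guess = 0.5 gives the conservative size (1/4) * (z / m)^2.
    """
    return math.ceil(p_guess * (1 - p_guess) * (z_crit(conf) / m) ** 2)


gallup = proportion_interval(609, 1015)  # Example 8.7
hbo = proportion_interval(340, 500)      # Example 8.8
print(f"Example 8.7: ({gallup[0]:.3f}, {gallup[1]:.3f})")
print(f"Example 8.8: ({hbo[0]:.3f}, {hbo[1]:.3f})")

# Hypothetical target: margin of error of 5 percentage points at 95% confidence.
print("conservative n for m = 0.05:", sample_size(0.05))
```

Rounding up with `math.ceil` reflects that a sample size must be a whole number and that rounding down would give a margin slightly larger than m.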

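Example 8.9 Find the sample size needed if the margin of error of the 95% confidence interval is to be
(a) m = 1%
(b) m = 2%
(c) m = 3%
(d) m = 3%, \(p^* = .3\)

(a) \(n = .25(1.96/.01)^2 = 9604\)
(b) \(n = .25(1.96/.02)^2 = 2401\)
(c) \(n = .25(1.96/.03)^2 = 1067.1\)
(d) \(n = (.3)(.7)(1.96/.03)^2 = 896.4\)

In practice the values in (c) and (d) are rounded up to 1068 and 897.

8.6 Estimation of differences between means

Variances known

Let \(\bar X_1\) and \(\bar X_2\) be the sample means of independent random samples of sizes \(n_1\) and \(n_2\) from normal populations with means \(\mu_1\) and \(\mu_2\) and with the known variances \(\sigma_1^2\) and \(\sigma_2^2\). Then
\[ Z = \frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1). \]

Confidence interval for \(\mu_1 - \mu_2\): If \(\bar x_1\) and \(\bar x_2\) are the values of the sample means of independent random samples of sizes \(n_1\) and \(n_2\) from normal populations with means \(\mu_1\) and \(\mu_2\) and with the known variances \(\sigma_1^2\) and \(\sigma_2^2\), then a \((1-\alpha)100\%\) confidence interval for \(\mu_1 - \mu_2\) is
\[ \bar x_1 - \bar x_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}. \]
The margin of error is
\[ z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}. \]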

Example 8.10 A survey of credit card holders revealed that Americans carried an average credit card balance of $3900 in 1995 and $3300 in 1994 (U.S. News & World Report, January 1, 1996). Suppose that these averages are based on random samples of 400 credit card holders in 1995 and 450 credit card holders in 1994 and that the population standard deviations of the balances were $880 in 1995 and $810 in 1994. Construct a 95% confidence interval for the difference between the mean credit card balances for all credit card holders in 1995 and 1994.

For 1995: \(n_1 = 400\), \(\bar x_1 = \$3900\), \(\sigma_1 = \$880\)
For 1994: \(n_2 = 450\), \(\bar x_2 = \$3300\), \(\sigma_2 = \$810\)

\[ \sigma_{\bar x_1 - \bar x_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{880^2}{400} + \frac{810^2}{450}} = 58.258 \]

A 95% confidence interval for \(\mu_1 - \mu_2\) is
\[ (\bar x_1 - \bar x_2) \pm z_{0.025}\,\sigma_{\bar x_1 - \bar x_2} = (3900 - 3300) \pm 1.96(58.258) = 600 \pm 114.19 = (485.81,\ 714.19). \]

Variance unknown

Case I: \(\sigma_1^2 = \sigma_2^2 = \sigma^2\)

Pooled estimator of \(\sigma^2\):
\[ S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} \]

\(S_p^2\) is an unbiased estimator of \(\sigma^2\);

\(\dfrac{(n_1 + n_2 - 2)S_p^2}{\sigma^2} \sim \chi^2(n_1 + n_2 - 2)\);

\(\bar X_1 - \bar X_2\) and \(S_p^2\) are independent.

Therefore
\[ T = \frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2). \]

Confidence interval for \(\mu_1 - \mu_2\) when \(\sigma_1 = \sigma_2\) is unknown: If \(\bar x_1\) and \(\bar x_2\) are the values of the sample means of independent random samples of sizes \(n_1\) and \(n_2\) from normal populations with means \(\mu_1\) and \(\mu_2\) and with the unknown common variance \(\sigma_1^2 = \sigma_2^2 = \sigma^2\), then a \((1-\alpha)100\%\) confidence interval for \(\mu_1 - \mu_2\) is
\[ \bar x_1 - \bar x_2 \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}. \]

Example 8.11 A company claims that its medicine, Brand A, provides faster relief from pain than another company's medicine, Brand B. A researcher tested both brands of medicine on two groups of randomly selected patients. The results of the test are given in the following table. The mean and standard deviation of relief times are in minutes.

| Brand | Sample size | Mean | S.d. |
|---|---|---|---|
| A | 25 | 44 | 13 |
| B | 22 | 49 | 11 |

Construct a 95% C.I. for the difference between the mean relief times for the two brands of medicine. Assume that \(\sigma_1 = \sigma_2\).

\[ s_p^2 = \frac{24(13)^2 + 21(11)^2}{45} = 146.6, \qquad s_p = 12.1078 \]
A 95% C.I. for \(\mu_1 - \mu_2\) is
\[ 44 - 49 \pm 2.0141(12.1078)\sqrt{1/25 + 1/22} = -5 \pm 7.13 = (-12.13,\ 2.13). \]

Example 8.12 An instructor in Math 144 believes that students will not do well if they skip too many classes. To test the claim, the instructor divides students into a high-attendance group and a lower-attendance group.

| Group | Sample size | Mean | S.d. |
|---|---|---|---|
| 1 | 69 | 84.5 | 12 |
| 2 | 51 | 77 | 14 |

Find a 95% confidence interval for \(\mu_1 - \mu_2\).

\(s_p^2 = 166.034\), \(s_p = 12.8854\). A 95% confidence interval is
\[ 84.5 - 77 \pm 1.98(12.8854)\sqrt{1/69 + 1/51} = 7.5 \pm 4.71 = (2.79,\ 12.21). \]

Case II: \(\sigma_1^2 \ne \sigma_2^2\)

For this case, we have approximately
\[ T = \frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim t_v, \]

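where
\[ v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}. \]
Thus, a \((1-\alpha)100\%\) confidence interval is approximately
\[ \bar x_1 - \bar x_2 \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}. \]

Example 8.13 Does increasing the amount of calcium in our diet reduce blood pressure? Examination of a large sample of people revealed a relationship between calcium intake and blood pressure, but such observational studies do not establish causation. A randomized comparative experiment gave one group of 10 men a calcium supplement for 12 weeks. The control group of 11 men received a placebo. Below are the data for seated systolic blood pressure.

| Calcium: Begin | Calcium: End | Calcium: Decrease | Placebo: Begin | Placebo: End | Placebo: Decrease |
|---|---|---|---|---|---|
| 107 | 100 | 7 | 123 | 124 | -1 |
| 110 | 114 | -4 | 109 | 97 | 12 |
| 123 | 105 | 18 | 112 | 113 | -1 |
| 129 | 112 | 17 | 102 | 105 | -3 |
| 112 | 115 | -5 | 98 | 95 | 3 |
| 111 | 116 | -5 | 114 | 119 | -5 |
| 107 | 106 | 1 | 112 | 114 | -2 |
| 112 | 102 | 10 | 110 | 121 | -11 |
| 136 | 125 | 11 | 117 | 118 | -1 |
| 102 | 125 | 11 | 130 | 133 | -3 |
| | | | 119 | 114 | 5 |

Find a 95% confidence interval for \(\mu_1 - \mu_2\), the difference in mean decrease between the calcium and placebo groups.

\(n_1 = 10\), \(\bar x_1 = 6.1\), \(s_1 = 8.81\); \(n_2 = 11\), \(\bar x_2 = -.64\), \(s_2 = 5.87\).

Assuming equal variances,
\[ s_p^2 = \frac{9(8.81)^2 + 10(5.87)^2}{19} = 54.90, \qquad s_p = 7.41. \]
A 95% confidence interval for \(\mu_1 - \mu_2\) is
\[ 6.1 + .64 \pm 2.093(7.41)\sqrt{\frac{1}{10} + \frac{1}{11}} = 6.74 \pm 6.78 = (-0.04,\ 13.52). \]

Without assuming equal variances (Case II, computed with the software command twosample 95 c1 c2), a 95% C.I. for \(\mu_1 - \mu_2\) is \((-0.30,\ 13.77)\), with df = 15.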
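The three independent-sample intervals of this section can be sketched side by side. The code below uses only the Python standard library, so the critical values (1.96, 2.0141, 2.131) are passed in as the values quoted in the examples; the function names are illustrative choices. It checks the known-variance case against Example 8.10, the pooled case against Example 8.11, and the unequal-variance case against the Decrease columns of Example 8.13.

```python
import math
from statistics import mean, variance


def known_variance_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, z_crit):
    """Interval for mu1 - mu2 when both population variances are known."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z_crit * se, diff + z_crit * se


def pooled_interval(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """Case I: unknown common variance, t with n1 + n2 - 2 degrees of freedom."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * se, diff + t_crit * se


def approx_df(var1, n1, var2, n2):
    """Case II: approximate degrees of freedom v from the sample variances."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def unequal_variance_interval(xs, ys, t_crit):
    """Case II interval; t_crit should be the table value for floor(v) df."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    diff = mean(xs) - mean(ys)
    return diff - t_crit * se, diff + t_crit * se


# Example 8.10: credit card balances, known standard deviations.
credit = known_variance_interval(3900, 3300, 880, 810, 400, 450, z_crit=1.96)

# Example 8.11: relief times, pooled variance, 45 degrees of freedom.
relief = pooled_interval(44, 49, 13, 11, 25, 22, t_crit=2.0141)

# Example 8.13: Decrease columns as printed in the table.
calcium = [7, -4, 18, 17, -5, -5, 1, 10, 11, 11]
placebo = [-1, 12, -1, -3, 3, -5, -2, -11, -1, -3, 5]
df = approx_df(variance(calcium), len(calcium), variance(placebo), len(placebo))
bp = unequal_variance_interval(calcium, placebo, t_crit=2.131)  # 15 df

print(f"Example 8.10: ({credit[0]:.2f}, {credit[1]:.2f})")
print(f"Example 8.11: ({relief[0]:.2f}, {relief[1]:.2f})")
print(f"Example 8.13: df = {math.floor(df)}, CI = ({bp[0]:.2f}, {bp[1]:.2f})")
```

Rounding v down to a whole number, as done here with `math.floor`, is the cautious convention when reading critical values from a table.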

Matched pairs intervals

Consider the problem of comparing two means for samples that are not independent. This situation arises quite naturally when observations occur in pairs.

Given paired data \((X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\), let \(d_i = X_i - Y_i\). Then a \(100(1-\alpha)\%\) confidence interval for \(\mu_X - \mu_Y\) is
\[ \bar d \pm t_{\alpha/2,\,n-1}\, s_d/\sqrt n, \]
where \(\bar d\) and \(s_d\) are the sample mean and the sample standard deviation of \(\{d_i\}\).

Example 8.14 It is claimed that a new diet will reduce a person's weight by 4.5 kg on the average in a period of 4 weeks. The weights of 9 women who followed this diet were recorded before and after a 4-week period:

| Weight before | Weight after | Difference |
|---|---|---|
| 58.5 | 60.0 | -1.5 |
| 60.3 | 54.9 | 5.4 |
| 61.7 | 58.1 | 3.6 |
| 69.0 | 62.1 | 6.9 |
| 64.0 | 58.5 | 5.5 |
| 62.6 | 59.9 | 2.7 |
| 56.7 | 54.4 | 2.3 |
| 70.4 | 68.5 | 1.9 |
| 73.2 | 70.5 | 2.7 |

Find a 90% C.I. for the mean weight loss in a 4-week period.

\(\bar d = 3.278\), \(s_d = 2.475\), and \(t_{.05,\,8} = 1.86\). A 90% C.I. is
\[ 3.278 \pm 1.86(2.475/3) = 3.278 \pm 1.535 = (1.743,\ 4.813). \]

8.7 Estimation of differences between proportions

Suppose we have two independent samples.

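| Population | Population proportion | Sample size | Sample proportion |
|---|---|---|---|
| 1 | \(\theta_1\) | \(n_1\) | \(\hat\Theta_1\) |
| 2 | \(\theta_2\) | \(n_2\) | \(\hat\Theta_2\) |

The sampling distribution of \(\hat\Theta_1 - \hat\Theta_2\):
\[ E(\hat\Theta_1 - \hat\Theta_2) = \theta_1 - \theta_2 \]
and
\[ \mathrm{Var}(\hat\Theta_1 - \hat\Theta_2) = \frac{\theta_1(1-\theta_1)}{n_1} + \frac{\theta_2(1-\theta_2)}{n_2}. \]
By the central limit theorem, approximately
\[ \frac{\hat\Theta_1 - \hat\Theta_2 - (\theta_1 - \theta_2)}{\mathrm{S.E.}} \sim N(0, 1), \]
where
\[ \mathrm{S.E.} = \sqrt{\frac{\hat\Theta_1(1-\hat\Theta_1)}{n_1} + \frac{\hat\Theta_2(1-\hat\Theta_2)}{n_2}}. \]

Confidence interval for \(\theta_1 - \theta_2\): If \(X_1\) is a binomial random variable with distribution \(B(n_1, \theta_1)\), \(X_2\) is a binomial random variable with distribution \(B(n_2, \theta_2)\), and \(X_1\) and \(X_2\) are independent, then an approximate \(100(1-\alpha)\%\) confidence interval for \(\theta_1 - \theta_2\) is
\[ (\hat\theta_1 - \hat\theta_2) \pm z_{\alpha/2}\ \mathrm{s.e.}, \]
where \(\hat\theta_1 = x_1/n_1\), \(\hat\theta_2 = x_2/n_2\) and
\[ \mathrm{s.e.} = \sqrt{\frac{\hat\theta_1(1-\hat\theta_1)}{n_1} + \frac{\hat\theta_2(1-\hat\theta_2)}{n_2}}. \]

Example 8.15 A study is made to determine if a cold climate results in more students being absent from school during a semester than a warmer climate. Two groups of students are selected at random, one group from Vermont and the other group from Georgia. Of the 300 students from Vermont, 64 were absent at least 1 day during the semester, and of the 400 students from Georgia, 51 were absent 1 or more days. Find a 90% confidence interval for the difference between the fractions of students who are absent in the two states.

\(n_1 = 300\), \(\hat\theta_1 = 64/300 = .2133\)
\(n_2 = 400\), \(\hat\theta_2 = 51/400 = .1275\)
\[ \mathrm{SE} = \sqrt{\frac{.2133 \times .7867}{300} + \frac{.1275 \times .8725}{400}} = .02894 \]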

An approximate 90% CI for \(\theta_1 - \theta_2\) is
\[ .2133 - .1275 \pm 1.645(.02894) = .0858 \pm .0476 = (.0382,\ .1334). \]

Example 8.16 A clinical trial is conducted to determine if a certain type of inoculation has an effect on the incidence of a certain disease. A sample of 1000 rats was kept in a controlled environment for a period of 1 year and 500 of the rats were given the inoculation. Of the group not given the drug, there were 120 incidences of the disease, while 98 of the inoculated group contracted it. If we call \(p_1\) the probability of incidence of the disease in uninoculated rats and \(p_2\) the probability of incidence after receiving the drug, compute a 90% confidence interval for \(p_1 - p_2\).

\[ 0.0011 < p_1 - p_2 < 0.0869 \]
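Both examples on the difference of two proportions can be reproduced with one small function. This is a minimal sketch using only the Python standard library; the function name is an illustrative choice, and the critical value comes from `statistics.NormalDist` (about 1.64485) rather than the rounded table value 1.645.

```python
import math
from statistics import NormalDist


def diff_proportion_interval(x1, n1, x2, n2, conf=0.90):
    """Approximate interval for theta1 - theta2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Example 8.15: 64 of 300 absent in Vermont, 51 of 400 absent in Georgia.
absences = diff_proportion_interval(64, 300, 51, 400)

# Example 8.16: 120 of 500 uninoculated rats and 98 of 500 inoculated rats.
rats = diff_proportion_interval(120, 500, 98, 500)

print(f"Example 8.15: ({absences[0]:.4f}, {absences[1]:.4f})")
print(f"Example 8.16: ({rats[0]:.4f}, {rats[1]:.4f})")
```

In Example 8.16 the lower limit is only just above zero, so the conclusion that the two incidence rates differ rests on a narrow margin at the 90% level.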