Summary of Chapters 7-9

Chapter 7. Interval Estimation

7.2. Confidence Intervals for the Difference of Two Means

Let X_1, …, X_n and Y_1, …, Y_m be two independent random samples of sizes n and m from the normal distributions N(µ_X, σ_X²) and N(µ_Y, σ_Y²), respectively.

Case 1: σ_X² and σ_Y² are known. A 100(1 − α)% confidence interval for µ_X − µ_Y is

  [ x̄ − ȳ − z_{α/2} σ_W , x̄ − ȳ + z_{α/2} σ_W ],

where σ_W = √(σ_X²/n + σ_Y²/m) is the standard deviation of the point estimator X̄ − Ȳ.

Remark: If the sample sizes n and m are large (at least 30) and σ_X and σ_Y are unknown, we can replace σ_X² and σ_Y² with s_X² and s_Y², respectively, to obtain an approximate 100(1 − α)% confidence interval:

  x̄ − ȳ ± z_{α/2} √(s_X²/n + s_Y²/m).

Case 2: σ_X² and σ_Y² are unknown (but assumed equal) and the sample sizes are small. A 100(1 − α)% confidence interval for µ_X − µ_Y is

  [ x̄ − ȳ − t_0 s_p √(1/n + 1/m) , x̄ − ȳ + t_0 s_p √(1/n + 1/m) ],

where x̄, ȳ and s_p are the observed values of X̄, Ȳ and S_p, with

  s_p = √( [ (n − 1)s_X² + (m − 1)s_Y² ] / (n + m − 2) )

and t_0 = t_{α/2}(n + m − 2).

For a paired random sample (X_1, Y_1), (X_2, Y_2), …, (X_n, Y_n), let D_i = X_i − Y_i, i = 1, 2, …, n. Then we may assume that D_1, D_2, …, D_n is a random sample from N(µ_D, σ_D²), where µ_D and σ_D are the mean and standard deviation of each difference. A 100(1 − α)% confidence interval for µ_D = µ_X − µ_Y is

  [ d̄ − t_{α/2}(n − 1) s_d/√n , d̄ + t_{α/2}(n − 1) s_d/√n ],

where d̄ and s_d are the observed mean and standard deviation of the sample d_1, d_2, …, d_n.
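The Case 1 interval above can be sketched in Python. The function name and the example values are illustrative (not from the notes); `statistics.NormalDist` supplies the quantile z_{α/2}:

```python
from math import sqrt
from statistics import NormalDist

def ci_diff_means_known_var(xbar, ybar, var_x, var_y, n, m, alpha=0.05):
    """100(1 - alpha)% CI for mu_X - mu_Y when sigma_X^2 and sigma_Y^2 are known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    sigma_w = sqrt(var_x / n + var_y / m)     # sd of Xbar - Ybar
    d = xbar - ybar
    return d - z * sigma_w, d + z * sigma_w

# Example (made-up data): xbar=10, ybar=8, sigma_X^2=4, sigma_Y^2=9, n=25, m=36
lo, hi = ci_diff_means_known_var(10, 8, 4, 9, 25, 36)
```

By the remark above, the same function gives the large-sample approximate interval if s_X² and s_Y² are passed in place of the known variances.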
7.3. Confidence Intervals for Proportions

Let Y ~ b(n, p) and let y be the observed value of Y. Then an approximate 100(1 − α)% confidence interval for p is

  [ y/n − z_{α/2} √((y/n)(1 − y/n)/n) , y/n + z_{α/2} √((y/n)(1 − y/n)/n) ],

where y/n is a point estimate of p.

Remark: one-sided 100(1 − α)% confidence intervals for p with
(i) upper bound: [0, y/n + z_α √((y/n)(1 − y/n)/n)];
(ii) lower bound: [y/n − z_α √((y/n)(1 − y/n)/n), 1].

Let Y_1 ~ b(n_1, p_1) and Y_2 ~ b(n_2, p_2), and let y_1 and y_2 be the observed values of Y_1 and Y_2, respectively. Then an approximate 100(1 − α)% confidence interval for p_1 − p_2 is

  (y_1/n_1 − y_2/n_2) ± z_{α/2} √( (y_1/n_1)(1 − y_1/n_1)/n_1 + (y_2/n_2)(1 − y_2/n_2)/n_2 ).

7.4. Sample Size

(1) Sample Size for Estimating µ

Let ε = z_{α/2}(σ/√n) be the maximum error of the estimate for µ. Then the required sample size for a given maximum error ε is

  n = z_{α/2}² σ² / ε²,

rounded up to the next integer.

(2) Sample Size for Estimating a Proportion p

Let ε = z_{α/2} √(p̂(1 − p̂)/n) be the maximum error of the point estimate p̂. Assume that we can obtain an estimate p̂ from available historical data. Then the required sample size for the given error bound ε is

  n = z_{α/2}² p̂(1 − p̂) / ε²,

rounded up to the next integer.
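The sample-size formulas of Section 7.4 can be sketched as follows (function names are illustrative; the third function uses the conservative bound p(1 − p) ≤ 1/4 for the case where no estimate of p is available):

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, eps, alpha=0.05):
    """n = z_{alpha/2}^2 sigma^2 / eps^2, rounded up to the next integer."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z**2 * sigma**2 / eps**2)

def sample_size_proportion(p_hat, eps, alpha=0.05):
    """n = z_{alpha/2}^2 p_hat (1 - p_hat) / eps^2, rounded up."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z**2 * p_hat * (1 - p_hat) / eps**2)

def sample_size_no_prior(eps, alpha=0.05):
    """Conservative n = z_{alpha/2}^2 / (4 eps^2), since p(1 - p) <= 1/4."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z**2 / (4 * eps**2))
```

For example, at α = 0.05 with p̂ = 0.2 and ε = 0.03, `sample_size_proportion(0.2, 0.03)` gives a larger n than the interval-width heuristics might suggest, and dropping the historical estimate (`sample_size_no_prior(0.03)`) is more conservative still.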
If p̂ is not available (i.e., there are no historical data), we can use the following formula to compute the required sample size:

  n = z_{α/2}² / (4ε²),

rounded up to the next integer.

Chapter 8. Tests of Statistical Hypotheses

8.1. Tests about One Mean

Definition. A statistical hypothesis is a statement about the parameters of one or more populations.

Null hypothesis H_0: specifies the distribution of the population (or states that the parameters equal some specified values).
Alternative hypothesis H_1: a competing statement against H_0.
A two-sided alternative hypothesis H_1 contains the ≠ sign; a one-sided alternative hypothesis contains either the > sign or the < sign.

Definition. (i) Type I error: rejecting H_0 when H_0 is true; (ii) Type II error: failing to reject H_0 when H_0 is false.

Definition. α = P(Type I error) is called the significance level of the test. β = P(Type II error), and 1 − β is called the power of the test.

Definition. The p-value is the smallest level of significance that would lead to rejection of H_0 with the given data.

Criterion: we reject H_0 if p-value ≤ α. Otherwise (i.e., p-value > α), we fail to reject H_0.

(1) Test about One Mean

Case 1: σ is known. Assume that X ~ N(µ, σ²) and that X_1, …, X_n is a random sample from the distribution N(µ, σ²). Null hypothesis H_0: µ = µ_0. The sample mean is X̄ = (1/n) Σ_{i=1}^n X_i and the sample variance is S² = (1/(n − 1)) Σ_{i=1}^n (X_i − X̄)².
Test statistic:

  Z = (X̄ − µ_0) / (σ/√n).

Table: Tests of Hypotheses about One Mean, Variance Known

  H_0        H_1        Critical Region
  µ = µ_0    µ > µ_0    z ≥ z_α
  µ = µ_0    µ < µ_0    z ≤ −z_α
  µ = µ_0    µ ≠ µ_0    |z| ≥ z_{α/2}

where z = (x̄ − µ_0)/(σ/√n) is the observed value of the test statistic Z. We can compute the p-value via the formula

  p-value = 2P(Z ≥ |z|),            if H_1: µ ≠ µ_0;
  p-value = 1 − P(Z < z) = P(Z ≥ z), if H_1: µ > µ_0;
  p-value = P(Z ≤ z),               if H_1: µ < µ_0.

Case 2: σ is unknown. Test statistic:

  T = (X̄ − µ_0) / (S/√n).

T has a t distribution with r = n − 1 degrees of freedom.

Table: Tests of Hypotheses about One Mean, Variance Unknown

  H_0        H_1        Critical Region
  µ = µ_0    µ > µ_0    t ≥ t_α(n − 1)
  µ = µ_0    µ < µ_0    t ≤ −t_α(n − 1)
  µ = µ_0    µ ≠ µ_0    |t| ≥ t_{α/2}(n − 1)

where t = (x̄ − µ_0)/(s/√n) is the observed value of the test statistic T. We can compute the p-value via the formula

  p-value = 2P(T ≥ |t|),            if H_1: µ ≠ µ_0;
  p-value = 1 − P(T < t) = P(T ≥ t), if H_1: µ > µ_0;
  p-value = P(T ≤ t),               if H_1: µ < µ_0.
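The Case 1 (σ known) test can be sketched in Python, returning the observed z and its p-value for each alternative. The function name is illustrative; Case 2 is analogous but needs t-distribution probabilities, which the standard library does not provide:

```python
from math import sqrt
from statistics import NormalDist

def z_test_one_mean(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p-value) for H0: mu = mu0 when sigma is known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    phi = NormalDist().cdf
    if alternative == "two-sided":     # H1: mu != mu0
        p = 2 * (1 - phi(abs(z)))
    elif alternative == "greater":     # H1: mu > mu0
        p = 1 - phi(z)
    else:                              # H1: mu < mu0
        p = phi(z)
    return z, p
```

For example, with x̄ = 52, µ_0 = 50, σ = 8, n = 16, the observed z is 1.0 and the two-sided p-value exceeds any usual α, so H_0 is not rejected.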
(2) Comparison of Two Means (Paired t-test)

Assume that X and Y are dependent. Let W = X − Y. From the original data {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, one can form the difference data {w_1, w_2, …, w_n}. The null hypothesis H_0: µ_X = µ_Y is equivalent to H_0: µ_W = 0. Test statistic:

  T = (W̄ − 0) / (S_W/√n).

The test procedures are the same as in the previous case.

8.2. Tests of the Equality of Two Means

Let X ~ N(µ_X, σ_X²) and Y ~ N(µ_Y, σ_Y²). Assume that X and Y are independent and that σ_X² = σ_Y². We have two samples: {X_1, X_2, …, X_n} and {Y_1, Y_2, …, Y_m}. Null hypothesis H_0: µ_X − µ_Y = 0. Test statistic:

  T = (X̄ − Ȳ) / ( S_P √(1/n + 1/m) ),

where

  S_P = √( [ (n − 1)S_X² + (m − 1)S_Y² ] / (n + m − 2) ).

T has a t distribution with r = n + m − 2 degrees of freedom.

Table: Tests of Hypotheses for the Equality of Two Means when σ_X² = σ_Y²

  H_0          H_1          Critical Region
  µ_X = µ_Y    µ_X > µ_Y    t ≥ t_α(n + m − 2)
  µ_X = µ_Y    µ_X < µ_Y    t ≤ −t_α(n + m − 2)
  µ_X = µ_Y    µ_X ≠ µ_Y    |t| ≥ t_{α/2}(n + m − 2)

where t = (x̄ − ȳ)/(s_P √(1/n + 1/m)) is the observed value of the test statistic T. The p-value can be computed using the formulas mentioned before.
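The pooled two-sample statistic of Section 8.2 can be sketched as follows (illustrative helper; `statistics.stdev` uses the n − 1 denominator, matching S_X and S_Y, and the observed t is then compared with t_α(n + m − 2) from a t table):

```python
from math import sqrt
from statistics import mean, stdev

def pooled_t(x, y):
    """Pooled t statistic and degrees of freedom for H0: mu_X = mu_Y
    (independent samples, equal variances assumed)."""
    n, m = len(x), len(y)
    sp = sqrt(((n - 1) * stdev(x) ** 2 + (m - 1) * stdev(y) ** 2) / (n + m - 2))
    t = (mean(x) - mean(y)) / (sp * sqrt(1 / n + 1 / m))
    return t, n + m - 2
```

The paired t-test of (2) reduces to the one-sample Case 2 statistic applied to the differences w_i, so it needs no separate helper.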
8.3. Tests about Proportions

Let Y be the number of successes in n independent trials with probability of success p, and let y be the observed value of Y.

Table: Tests of Hypotheses for One Proportion

  H_0       H_1       Critical Region (z = (y/n − p_0)/√(p_0(1 − p_0)/n))
  p = p_0   p > p_0   z ≥ z_α
  p = p_0   p < p_0   z ≤ −z_α
  p = p_0   p ≠ p_0   |z| ≥ z_{α/2}

Let Y_1 and Y_2 represent, respectively, the numbers of observed successes in n_1 and n_2 independent trials with probabilities of success p_1 and p_2.

Table: Tests of Hypotheses for Two Proportions

  H_0         H_1         Critical Region (z = (y_1/n_1 − y_2/n_2)/√(p̂(1 − p̂)(1/n_1 + 1/n_2)))
  p_1 = p_2   p_1 > p_2   z ≥ z_α
  p_1 = p_2   p_1 < p_2   z ≤ −z_α
  p_1 = p_2   p_1 ≠ p_2   |z| ≥ z_{α/2}

where p̂ = (y_1 + y_2)/(n_1 + n_2).

8.4. The Wilcoxon Tests

Let m be the unknown median of a continuous-type random variable X, and let X_1, X_2, …, X_n denote the observations of a random sample from the distribution of X. We would like to test H_0: m = m_0 against H_1: m > m_0. We rank the absolute values |X_1 − m_0|, |X_2 − m_0|, …, |X_n − m_0| in non-decreasing order of magnitude. Let R_k denote the rank of |X_k − m_0| among |X_1 − m_0|, |X_2 − m_0|, …, |X_n − m_0|. With each R_k we associate the sign of the difference X_k − m_0: if X_k − m_0 > 0, we use R_k, but if X_k − m_0 < 0, we use −R_k. If the absolute values of the differences from m_0 of two or more observations are equal, each observation is assigned the average of the corresponding ranks. The Wilcoxon statistic W is the sum of these n signed ranks. For an approximate significance level α, the critical region is

  z = w/√(n(n + 1)(2n + 1)/6) ≥ z_α, or equivalently w ≥ z_α √(n(n + 1)(2n + 1)/6).

The p-value is computed by

  p-value = P(W ≥ w) ≈ P( Z ≥ (w − 1)/√(n(n + 1)(2n + 1)/6) ) = 1 − P( Z < (w − 1)/√(n(n + 1)(2n + 1)/6) ).

8.6. Best Critical Regions
(Neyman-Pearson Lemma) Let X_1, X_2, …, X_n be a random sample of size n from a distribution with pdf or pmf f(x; θ), where θ_0 and θ_1 are two possible values of θ. Denote the joint pdf or pmf of X_1, X_2, …, X_n by the likelihood function

  L(θ) = f(x_1; θ) f(x_2; θ) ⋯ f(x_n; θ).

If there exist a positive constant k and a subset C of the sample space such that

(a) P[(X_1, X_2, …, X_n) ∈ C; θ_0] = α,
(b) L(θ_0)/L(θ_1) ≤ k for (x_1, …, x_n) ∈ C, and
(c) L(θ_0)/L(θ_1) ≥ k for (x_1, …, x_n) ∉ C,

then C is a best critical region of size α for testing the simple null hypothesis H_0: θ = θ_0 against the simple alternative hypothesis H_1: θ = θ_1.

A test defined by a critical region C of size α is a uniformly most powerful test if it is a most powerful test against each simple alternative in H_1. The critical region C is then called a uniformly most powerful critical region of size α.

9.1. Chi-square Goodness-of-Fit Tests

Let an experiment have k mutually exclusive and exhaustive outcomes A_1, …, A_k. Denote p_i = P(A_i), i = 1, …, k. We would like to test the hypothesis H_0: p_i = p_{i0}, i = 1, …, k, against all other alternative hypotheses H_1.

Case 1: discrete distributions. Let the experiment be repeated n independent times, and let Y_i be the observed number of times (frequency) that A_i occurs. Then the expected frequency of A_i is n p_{i0} (which should be at least 5). When H_0 is true, the test statistic

  Q_{k−1} = Σ_{i=1}^k (Y_i − n p_{i0})² / (n p_{i0})

has an approximate χ²(k − 1) distribution. The critical region is q_{k−1} ≥ χ²_α(k − 1), where α is the significance level. If there are d unknown parameters in the hypothesized distribution that need to be estimated from the given sample data, then we must calculate the p_{i0} using the estimates of the parameters; the test statistic Q_{k−1} is then approximately χ²(k − 1 − d), and the critical region becomes q_{k−1} ≥ χ²_α(k − 1 − d).

Case 2: continuous distributions. Let W be a continuous random variable with distribution function F(w).
We would like to test H_0: F(w) = F_0(w) against all other alternatives H_1, where F_0(w) is a known continuous distribution function. We partition the space of W into k class intervals: A_1 = (−∞, a_1], A_2 = (a_1, a_2], …, A_k = (a_{k−1}, ∞). Let p_i = P(W ∈ A_i), and let Y_i be the number of times the observed values of W fall in A_i, i = 1, …, k, in n independent repetitions of the experiment. Then Y_1, …, Y_k have a multinomial distribution with parameters n, p_1, …, p_{k−1}. Let p_{i0} = P(W ∈ A_i) when the distribution function of W is F_0(w). Then H_0 is modified to H_0: p_i = p_{i0}, i = 1, …, k.
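The class probabilities p_{i0} = F_0(a_i) − F_0(a_{i−1}) and the goodness-of-fit statistic can be sketched as follows. The helper names are illustrative, and the example takes F_0 to be the standard normal cdf purely as an assumed choice:

```python
from statistics import NormalDist

def class_probs(cutpoints, F0):
    """p_i0 = F0(a_i) - F0(a_{i-1}) for classes (-inf, a_1], ..., (a_{k-1}, inf)."""
    pts = [float("-inf")] + list(cutpoints) + [float("inf")]
    def cdf(a):
        if a == float("-inf"):
            return 0.0
        if a == float("inf"):
            return 1.0
        return F0(a)
    return [cdf(b) - cdf(a) for a, b in zip(pts, pts[1:])]

def chi2_stat(observed, p0):
    """q_{k-1} = sum over i of (y_i - n p_i0)^2 / (n p_i0)."""
    n = sum(observed)
    return sum((y - n * p) ** 2 / (n * p) for y, p in zip(observed, p0))

# Example: k = 4 classes cut at -1, 0, 1 under a standard normal F_0
p0 = class_probs([-1.0, 0.0, 1.0], NormalDist().cdf)
```

The observed q_{k−1} is then compared with χ²_α(k − 1 − d) from a chi-square table; `chi2_stat` applies equally to the discrete Case 1 with the hypothesized p_{i0} supplied directly.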
H_0 is rejected if

  q_{k−1} = Σ_{i=1}^k (y_i − n p_{i0})² / (n p_{i0}) ≥ χ²_α(k − 1 − d),

where d is the number of unknown parameters in F_0(w).

9.2. Contingency Tables

Suppose that each of h independent experiments can result in one of the k mutually exclusive and exhaustive events A_1, A_2, …, A_k. Let p_{ij} = P(A_i) in the jth experiment, i = 1, 2, …, k, j = 1, 2, …, h. We want to test

  H_0: p_{i1} = p_{i2} = ⋯ = p_{ih} = p_i, i = 1, 2, …, k.

We repeat the jth experiment n_j independent times and let Y_{1j}, Y_{2j}, …, Y_{kj} denote the frequencies of the respective events A_1, A_2, …, A_k. Under H_0, we estimate the probabilities by

  p̂_i = ( Σ_{j=1}^h Y_{ij} ) / ( Σ_{j=1}^h n_j ), i = 1, 2, …, k.

The chi-square test statistic is

  Q = Σ_{j=1}^h Σ_{i=1}^k (Y_{ij} − n_j p̂_i)² / (n_j p̂_i).

If the observed value q ≥ χ²_α((h − 1)(k − 1)), then we reject H_0. Otherwise, we do not reject H_0.

Test for Independence of Attributes of Classification: Suppose that a random experiment results in an outcome that can be classified by two different attributes. Assume that the first attribute is assigned to one and only one of k mutually exclusive and exhaustive events, say A_1, A_2, …, A_k, and the second attribute is assigned to one and only one of h mutually exclusive and exhaustive events, say B_1, B_2, …, B_h. Let the probability of A_i ∩ B_j be defined by p_{ij} = P(A_i ∩ B_j), i = 1, 2, …, k, j = 1, 2, …, h. The random experiment is to be repeated n independent times, and Y_{ij} will denote the frequency of the event A_i ∩ B_j. Let p_{i·} = P(A_i), i = 1, …, k, and p_{·j} = P(B_j), j = 1, …, h. We wish to test the independence of the A and B attributes, namely

  H_0: p_{ij} = p_{i·} p_{·j}, i = 1, …, k, j = 1, …, h.

Let Y_{i·} = Σ_{j=1}^h Y_{ij}, i = 1, …, k (the frequency of A_i), and Y_{·j} = Σ_{i=1}^k Y_{ij}, j = 1, …, h (the frequency of B_j). The chi-square test statistic is

  Q = Σ_{j=1}^h Σ_{i=1}^k [ Y_{ij} − n(Y_{i·}/n)(Y_{·j}/n) ]² / [ n(Y_{i·}/n)(Y_{·j}/n) ].

If the computed value q ≥ χ²_α((k − 1)(h − 1)), then we reject H_0 at the significance level α. Otherwise, we do not reject H_0.
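The independence statistic, with expected counts n(Y_{i·}/n)(Y_{·j}/n) = Y_{i·} Y_{·j}/n, can be sketched as follows (illustrative helper; the observed q is compared with χ²_α((k − 1)(h − 1)) from a chi-square table):

```python
def chi2_independence(table):
    """Q = sum over cells of (Y_ij - E_ij)^2 / E_ij,
    where E_ij = (row total)(column total)/n under independence."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    q = 0.0
    for i, row in enumerate(table):
        for j, y in enumerate(row):
            e = row_tot[i] * col_tot[j] / n   # expected count E_ij
            q += (y - e) ** 2 / e
    return q
```

For a 2 × 2 table the statistic is compared with χ²_α(1), since (k − 1)(h − 1) = 1.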