Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection


Biometrical Journal 42 (2000) 1, 59-69

Kung-Jong Lui
Department of Mathematical Sciences, College of Sciences, San Diego State University, USA

Summary

This paper discusses interval estimation of the simple difference (SD) between the proportions of the primary infection and the secondary infection, given the primary infection, by developing three asymptotic interval estimators using Wald's test statistic, the likelihood ratio test, and the basic principle of Fieller's theorem. It further evaluates and compares the performance of these interval estimators with respect to the coverage probability and the expected length of the resulting confidence intervals. The asymptotic confidence interval using the likelihood ratio test consistently performs well in all situations considered here. When the underlying SD is within 0.10 and the total number of subjects is not large (say, 50), the interval estimator using Fieller's theorem is preferable to the estimator using Wald's test statistic if the primary infection probability is moderate (say, 0.30), but the latter is preferable to the former if this probability is large (say, 0.80). When the total number of subjects is large (say, 200), all three interval estimators perform well in almost all situations considered in this paper. In these cases, for simplicity, we may apply either of the two interval estimators using Wald's test statistic or Fieller's theorem without losing much accuracy and efficiency as compared with the interval estimator using the asymptotic likelihood ratio test.

Key words: Interval estimation; Coverage probability; Likelihood ratio test; Fieller's theorem.

1. Introduction

In establishing the characteristics of a given disease, one interesting problem is to assess the effect of the primary infection on the likelihood of developing the secondary infection. For example, consider the data (Agresti, 1990, pages 45-46) on a sample of calves. Calves are first classified by whether they get a primary pneumonia infection. After recovering from the primary infection, calves are then reclassified by whether they develop a secondary infection within a defined time period. In this situation, observations are taken from the same group of calves and hence are likely to be dependent. Therefore, when estimating the simple difference (SD) between the probability of the primary infection and the conditional probability of the secondary infection, given the primary infection, we cannot apply any of the interval estimators of the SD developed under two independent samples (Thomas and Gart, 1977; Anbar, 1983, 1984; Beal, 1987; Mee, 1984; Hauck and Anderson, 1986; Miettinen and Nurminen, 1985; Santner and Snell, 1980; Wallenstein, 1997). Note that a completely randomized trial, in which calves are randomly allocated to control and experimental groups, is certainly not ethical or adequate for use here.

In this paper, we concentrate on interval estimation of the SD between the probability of the primary infection and the conditional probability of the secondary infection, given the primary infection. We develop three asymptotic interval estimators using Wald's test statistic, the likelihood ratio test, and the basic principle of Fieller's theorem. To evaluate and compare the performance of these interval estimators, we calculate the coverage probability and the expected length of the resulting confidence intervals on the basis of the exact distribution in a variety of situations. We find that the interval estimator using the asymptotic likelihood ratio test, which involves a sophisticated numerical procedure, consistently performs well in all the situations considered here. When the underlying SD is within 0.10 and the total number of subjects is not large (say, 50), the interval estimator using Fieller's theorem is preferable to the estimator using Wald's test statistic if the underlying primary infection probability is moderate (say, 0.30). On the other hand, the latter is preferable to the former if the underlying primary infection probability is high (say, 0.80). When the total number of subjects is large (say, 200), all three estimators perform reasonably well in almost all situations considered in this paper.
Therefore, for simplicity, we may apply either of the two asymptotic interval estimators using Wald's test statistic or Fieller's theorem in these situations without losing much accuracy and efficiency as compared with the asymptotic confidence interval using the likelihood ratio test.

Note that Agresti (1990) discusses a procedure for testing whether the primary infection has an effect on the probability of developing the secondary infection, and Lui (1998) discusses interval estimation of the risk ratio between the two successive infections. However, neither of these two papers considers interval estimation of the SD, which is the focus here.

2. Interval Estimators

Consider a study in which the data can be summarized by the following 2 × 2 table of cell probabilities:

                            Secondary infection
                            Yes             No
  Primary infection   Yes   $\pi_{11}$      $\pi_{12}$      $\pi_{1\cdot}$
                      No    -               $\pi_{22}$      $\pi_{22}$

where $0 < \pi_{ij} < 1$ (for $i = 1, 2$ and $j = 1, 2$) denotes the probability of the corresponding cell, $\pi_{1\cdot} = \pi_{11} + \pi_{12}$, and $\pi_{1\cdot} + \pi_{22} = 1$. As also noted elsewhere (Agresti, 1990), by definition, no subject can have the secondary infection without first having the primary infection (i.e., $\pi_{21} = 0$). In this paper, we focus on interval estimation of the SD between the probability of the primary infection and the conditional probability of the secondary infection, given the primary infection. In terms of the $\pi_{ij}$, the SD, denoted by $\delta$, is defined as $\delta = \pi_{1\cdot} - \pi_{11}/\pi_{1\cdot}$. Hence, for given $\pi_{1\cdot}$ and $\delta$, we have $\pi_{11} = \pi_{1\cdot}(\pi_{1\cdot} - \delta)$, $\pi_{12} = \pi_{1\cdot}(1 - \pi_{1\cdot} + \delta)$, and $\pi_{22} = 1 - \pi_{1\cdot}$. Note that the range of $\delta$, by definition, is $-1 < \delta < 1$.

Suppose that we take a random sample of $n$ subjects. Let $n_{ij}$ denote the number of subjects who fall in the cell with probability $\pi_{ij}$. The log-likelihood for a given $(n_{11}, n_{12}, n_{22})$ is then

$$\log L = C + n_{11}\{\log \pi_{1\cdot} + \log(\pi_{1\cdot} - \delta)\} + n_{12}\{\log \pi_{1\cdot} + \log(1 - \pi_{1\cdot} + \delta)\} + n_{22}\log(1 - \pi_{1\cdot}), \tag{1}$$

where $C$ is a constant that does not depend on the parameters $\delta$ and $\pi_{1\cdot}$. On the basis of (1), we can easily show that the maximum likelihood estimates (MLEs) of $\pi_{1\cdot}$ and $\delta$ are $\hat{\pi}_{1\cdot} = (n_{11} + n_{12})/n$ and $\hat{\delta} = \hat{\pi}_{1\cdot} - \hat{\pi}_{11}/\hat{\pi}_{1\cdot}$, respectively, where $\hat{\pi}_{11} = n_{11}/n$. Furthermore, using the inverse of the observed information matrix, we obtain the estimate of the asymptotic variance of $\hat{\delta}$ to be

$$\widehat{\mathrm{var}}(\hat{\delta}) = \{\hat{\pi}_{11}\hat{\pi}_{12}/\hat{\pi}_{1\cdot}^{3} + \hat{\pi}_{1\cdot}(1 - \hat{\pi}_{1\cdot})\}/n$$

(Appendix). Therefore, the asymptotic $100(1 - \alpha)\%$ confidence interval for $\delta$ is

$$[m_l, m_u], \tag{2}$$

where $m_l = \max\{-1, \hat{\delta} - Z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat{\delta})}\}$, $m_u = \min\{1, \hat{\delta} + Z_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat{\delta})}\}$, and $Z_{\alpha}$ is the upper $100\alpha$th percentile of the standard normal distribution.
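As a minimal sketch of how the Wald-type interval (2) might be computed, the Python function below follows the MLE and variance formulas given above. The function name `wald_interval` and the hard-coded normal quantile are illustrative assumptions and are not taken from the paper, which reports using SAS for its own calculations.

```python
import math

Z_975 = 1.959963984540054  # upper 2.5% point of the standard normal (95% interval)


def wald_interval(n11, n12, n22, z=Z_975):
    """Wald-type interval (2) for the simple difference.

    Hypothetical helper: requires n11 + n12 > 0, otherwise the MLE of the
    simple difference is undefined.
    """
    n = n11 + n12 + n22
    p1 = (n11 + n12) / n            # MLE of the primary infection probability
    p11, p12 = n11 / n, n12 / n     # MLEs of the two "primary = yes" cells
    d_hat = p1 - p11 / p1           # MLE of the simple difference
    var = (p11 * p12 / p1 ** 3 + p1 * (1.0 - p1)) / n
    half = z * math.sqrt(var)
    # Truncate to the parameter range [-1, 1] as in (2).
    return d_hat, max(-1.0, d_hat - half), min(1.0, d_hat + half)


# Calf data used in the paper's example: n11 = 30, n12 = 63, n22 = 63.
d_hat, lower, upper = wald_interval(30, 63, 63)
print(round(d_hat, 3), round(lower, 3), round(upper, 3))
```

With the calf data this should reproduce, up to rounding, the estimate and the Wald-type interval [0.151, 0.396] reported in the paper's example.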
For testing $H_0: \delta = \delta_0$ versus $H_a: \delta \neq \delta_0$, it is easy to see that the acceptance region using the asymptotic likelihood ratio test consists of all sample vectors $(n_{11}, n_{12}, n_{22})$ such that

$$2\Big[n_{11}\log\hat{\pi}_{11} + n_{12}\log\hat{\pi}_{12} + n_{22}\log\hat{\pi}_{22} - n_{11}\big[\log\{\hat{\pi}_{1\cdot}(\delta_0)\} + \log\{\hat{\pi}_{1\cdot}(\delta_0) - \delta_0\}\big] - n_{12}\big[\log\{\hat{\pi}_{1\cdot}(\delta_0)\} + \log\{1 - \hat{\pi}_{1\cdot}(\delta_0) + \delta_0\}\big] - n_{22}\log\{1 - \hat{\pi}_{1\cdot}(\delta_0)\}\Big] \le \chi^2_{\alpha}, \tag{3}$$

where $\hat{\pi}_{ij} = n_{ij}/n$ is the MLE of $\pi_{ij}$; $\hat{\pi}_{1\cdot}(\delta_0)$ denotes the conditional MLE of $\pi_{1\cdot}$ for a given fixed $\delta_0$ (Appendix); and $\chi^2_{\alpha}$ is the upper $100\alpha$th percentile of the central $\chi^2$-distribution with one degree of freedom. Therefore, we can obtain the asymptotic likelihood ratio test based confidence interval by inverting the acceptance region (Casella and Berger, 1990):

$$[r_l, r_u], \tag{4}$$

where $-1 < r_l < r_u < 1$ are the smaller and the larger roots in $\delta_0$ of the equation obtained by setting the left-hand side of (3) equal to $\chi^2_{\alpha}$.

Recall that, by definition, the $\delta$ defined here can be rewritten as the ratio $(\pi_{1\cdot}^2 - \pi_{11})/\pi_{1\cdot}$. Following Fieller's theorem (Casella and Berger, 1990), we define

$$Z = \frac{n\hat{\pi}_{1\cdot}^2 - \hat{\pi}_{1\cdot}}{n - 1} - \hat{\pi}_{11} - \delta\hat{\pi}_{1\cdot}.$$

Note that the expectation $E\{(n\hat{\pi}_{1\cdot}^2 - \hat{\pi}_{1\cdot})/(n - 1)\} = \pi_{1\cdot}^2$ and $E(\hat{\pi}_{11}) = \pi_{11}$. Thus, $E(Z) = 0$. By use of the delta method and the multivariate Central Limit Theorem (Anderson, 1958), we can easily show that $\sqrt{n}Z$ asymptotically follows the normal distribution with mean 0 and asymptotic variance

$$\mathrm{Var}(\sqrt{n}Z) = \pi_{11}(1 - \pi_{11}) + \Big[\frac{2n\pi_{1\cdot} - 1}{n - 1} - \delta\Big]^2 \pi_{1\cdot}(1 - \pi_{1\cdot}) - 2\Big[\frac{2n\pi_{1\cdot} - 1}{n - 1} - \delta\Big]\pi_{11}\pi_{22}.$$

Thus, $P\{Z^2/(\mathrm{Var}(\sqrt{n}Z)/n) \le Z^2_{\alpha/2}\} \doteq 1 - \alpha$ if $n$ is large. This leads us to consider the following working quadratic inequality in $\delta$:

$$\hat{A}\delta^2 - \hat{B}\delta + \hat{C} \le 0, \tag{5}$$

where

$$\hat{A} = \hat{\pi}_{1\cdot}^2 - Z^2_{\alpha/2}\,\hat{\pi}_{1\cdot}(1 - \hat{\pi}_{1\cdot})/n,$$

$$\hat{B} = 2\Big[\Big\{\frac{n\hat{\pi}_{1\cdot}^2 - \hat{\pi}_{1\cdot}}{n - 1} - \hat{\pi}_{11}\Big\}\hat{\pi}_{1\cdot} - Z^2_{\alpha/2}\Big\{\frac{(2n\hat{\pi}_{1\cdot} - 1)\hat{\pi}_{1\cdot}(1 - \hat{\pi}_{1\cdot})}{(n - 1)n} - \frac{\hat{\pi}_{11}\hat{\pi}_{22}}{n}\Big\}\Big],$$

$$\hat{C} = \Big\{\frac{n\hat{\pi}_{1\cdot}^2 - \hat{\pi}_{1\cdot}}{n - 1} - \hat{\pi}_{11}\Big\}^2 - Z^2_{\alpha/2}\Big[\frac{\hat{\pi}_{11}(1 - \hat{\pi}_{11})}{n} + \frac{(2n\hat{\pi}_{1\cdot} - 1)^2\hat{\pi}_{1\cdot}(1 - \hat{\pi}_{1\cdot})}{(n - 1)^2 n} - \frac{2(2n\hat{\pi}_{1\cdot} - 1)\hat{\pi}_{11}\hat{\pi}_{22}}{(n - 1)n}\Big].$$

If both $\hat{A} > 0$ and $\hat{B}^2 - 4\hat{A}\hat{C} > 0$, then the asymptotic $100(1 - \alpha)\%$ confidence interval of the SD for large $n$ is given by

$$[q_l, q_u], \tag{6}$$

where $q_l = \max\{-1, (\hat{B} - \sqrt{\hat{B}^2 - 4\hat{A}\hat{C}})/(2\hat{A})\}$ and $q_u = \min\{1, (\hat{B} + \sqrt{\hat{B}^2 - 4\hat{A}\hat{C}})/(2\hat{A})\}$.

3. Coverage Probability and Expected Length

To evaluate the finite-sample performance of interval estimators (2, 4, and 6) for the SD, we calculate the coverage probability and the expected length of the resulting 95% confidence interval on the basis of the exact trinomial distribution.
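For reference when reading the evaluation, the closed-form estimator (6) from Section 2 can be sketched in Python as below. The function name `fieller_interval` and the short local names (`t`, `c0`, `cov`) are illustrative assumptions; the coefficients are the same as those of the working quadratic (5), written with the common factor $1/n$ pulled out. The sketch returns `None` in the cases where (6) does not apply.

```python
import math

Z_975 = 1.959963984540054  # upper 2.5% point of the standard normal (95% interval)


def fieller_interval(n11, n12, n22, z=Z_975):
    """Fieller-type interval (6); hypothetical helper, needs n >= 2.

    Returns None when the leading coefficient or the discriminant of the
    quadratic (5) is not positive, i.e. when (6) cannot be applied.
    """
    n = n11 + n12 + n22
    p1 = (n11 + n12) / n
    p11, p22 = n11 / n, n22 / n
    z2 = z * z
    t = (n * p1 ** 2 - p1) / (n - 1) - p11   # unbiased estimate of p1^2 - p11
    c0 = (2 * n * p1 - 1) / (n - 1)          # derivative term from the delta method
    v1 = p1 * (1 - p1)                       # n * Var of the primary proportion
    cov = p11 * p22                          # n * Cov of the two sample proportions
    a = p1 ** 2 - z2 * v1 / n
    b = 2 * (t * p1 - z2 * (c0 * v1 - cov) / n)
    c = t ** 2 - z2 * (p11 * (1 - p11) + c0 ** 2 * v1 - 2 * c0 * cov) / n
    disc = b * b - 4 * a * c
    if a <= 0 or disc <= 0:
        return None
    root = math.sqrt(disc)
    return max(-1.0, (b - root) / (2 * a)), min(1.0, (b + root) / (2 * a))


# Calf data used in the paper's example: n11 = 30, n12 = 63, n22 = 63.
print(fieller_interval(30, 63, 63))
```

For the calf data of Section 5 this should agree, up to rounding, with the interval [0.137, 0.385] reported there for estimator (6).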
By definition, the coverage probability is simply equal to

$$\sum 1(\delta \in [c_l, c_u])\, f(n_{11}, n_{12}, n_{22}),$$

where $[c_l, c_u]$ is the confidence interval obtained by use of (2, 4, and 6) and is a function of $(n_{11}, n_{12}, n_{22})$; the indicator function $1(\delta \in [c_l, c_u])$ equals 1 if $\delta \in [c_l, c_u]$ is true and equals 0 otherwise; and $f(n_{11}, n_{12}, n_{22})$ is the trinomial distribution with the underlying cell probabilities $\pi_{11}$, $\pi_{12}$, and $\pi_{22}$. Similarly, the expected length of the resulting confidence interval is given by $\sum (c_u - c_l)\, f(n_{11}, n_{12}, n_{22})$.

Note that when $\hat{\pi}_{1\cdot} = 0$, $\hat{\delta}$ is not well-defined and interval estimator (2) is inapplicable. Similarly, in this case, the coefficient of the quadratic term $\delta^2$ in (5) is 0, and hence we cannot apply (6) to obtain the confidence interval of $\delta$ either. Furthermore, if either $\hat{A} < 0$ or $\hat{B}^2 - 4\hat{A}\hat{C} < 0$, then (6) cannot be applied as well. Note also that the logarithmic function $\log X$ is defined only for $0 < X < \infty$. Therefore, if any cell frequency $n_{ij}$ in a random vector $(n_{11}, n_{12}, n_{22})$ is 0, we are not able to apply interval estimator (4). When evaluating the performance of (2, 4, and 6), we calculate the coverage probability and the expected length conditional upon those samples in which the confidence limits of the respective interval estimator exist. For completeness, we also calculate the probability that we fail to produce confidence limits for each of interval estimators (2, 4, and 6).

For given values of $\pi_{1\cdot}$ and $\delta$, as noted before, all parameter values $\pi_{11} = \pi_{1\cdot}(\pi_{1\cdot} - \delta)$, $\pi_{12} = \pi_{1\cdot}(1 - \pi_{1\cdot} + \delta)$, and $\pi_{22} = 1 - \pi_{1\cdot}$ are uniquely determined. We consider the situations in which $\pi_{1\cdot} = 0.30$, 0.50, and 0.80; $\delta = -0.30, -0.20, -0.10, \ldots, 0.30$, subject to the restriction that the corresponding cell probabilities $\pi_{11}$, $\pi_{12}$, and $\pi_{22}$ are all $> 0$; and $n = 50$, 100, and 200. We write programs in SAS (1990) to enumerate the exact probability $f(n_{11}, n_{12}, n_{22})$ of the desired trinomial distribution.

4. Results

Table 1 summarizes the results about the coverage probability and the expected length of the resulting 95% confidence intervals, conditional upon those samples in which the confidence limits of the respective interval estimator exist, in a variety of situations.
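The exact enumeration described in Section 3 can be sketched as follows. This is an illustrative Python version, not the author's SAS program: the function names (`trinomial_pmf`, `evaluate`) are assumptions, and only the Wald-type estimator (2) is plugged in to keep the block short. Any function that returns either `(lower, upper)` or `None` (limits do not exist) could be passed in its place.

```python
import math

Z_975 = 1.959963984540054  # upper 2.5% point of the standard normal


def wald_interval(n11, n12, n22, z=Z_975):
    """Interval (2); returns None when n11 + n12 = 0 (estimate undefined)."""
    n = n11 + n12 + n22
    if n11 + n12 == 0:
        return None
    p1 = (n11 + n12) / n
    p11, p12 = n11 / n, n12 / n
    d_hat = p1 - p11 / p1
    half = z * math.sqrt((p11 * p12 / p1 ** 3 + p1 * (1 - p1)) / n)
    return max(-1.0, d_hat - half), min(1.0, d_hat + half)


def trinomial_pmf(n11, n12, n22, p11, p12, p22):
    """Exact trinomial probability of the sample vector (n11, n12, n22)."""
    n = n11 + n12 + n22
    coef = math.comb(n, n11) * math.comb(n - n11, n12)
    return coef * p11 ** n11 * p12 ** n12 * p22 ** n22


def evaluate(interval, n, p1, delta):
    """Coverage and expected length, conditional on the limits existing,
    plus the probability of failing to produce limits."""
    p11 = p1 * (p1 - delta)
    p12 = p1 * (1 - p1 + delta)
    p22 = 1 - p1
    cover = length = fail = 0.0
    for n11 in range(n + 1):
        for n12 in range(n - n11 + 1):
            n22 = n - n11 - n12
            prob = trinomial_pmf(n11, n12, n22, p11, p12, p22)
            ci = interval(n11, n12, n22)
            if ci is None:
                fail += prob
                continue
            lo, hi = ci
            cover += prob * (lo <= delta <= hi)
            length += prob * (hi - lo)
    exists = 1.0 - fail
    return cover / exists, length / exists, fail


coverage, expected_length, p_fail = evaluate(wald_interval, 50, 0.30, 0.0)
print(coverage, expected_length, p_fail)
```

The double loop visits every sample vector with $n_{11} + n_{12} + n_{22} = n$, so the cost grows roughly as $n^2/2$ interval evaluations, which is negligible for the sample sizes considered in the paper.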
As seen from Table 1, when $n \ge 200$, all estimators perform reasonably well in almost all situations considered here. When both $n$ and $\pi_{1\cdot}$ are not large (i.e., $n = 50$ and $\pi_{1\cdot} = 0.30$) and $\delta$ is within 0.10, estimators (4) and (6) outperform estimator (2), whose coverage probability is likely to be less than the desired confidence level. On the other hand, in these cases but with $\pi_{1\cdot}$ large ($= 0.80$), estimators (2) and (4) are preferable to estimator (6). We also find that the probability of failing to produce a 95% confidence interval by use of either estimator (2) or (6) is negligible ($< 0.001$) in all situations considered in Table 1, but this probability for use of (4) can be of practical significance when $n$ is not large ($= 50$).

Table 1. The coverage probability and the expected length (presented in parentheses) of the resulting 95% confidence interval for the underlying risk difference between the primary infection and the secondary infection given the primary infection, $\delta = -0.30, -0.20, \ldots, 0.30$, subject to the restriction that $\pi_{11}$, $\pi_{12}$, and $\pi_{22}$ are all $> 0$, for use of estimators (2, 4, 6) in the situations in which the probability of primary infection $\pi_{1\cdot} = 0.30$, 0.50, and 0.80, and the total number of subjects $n = 50$, 100, and 200. Only the expected lengths survive in this transcription; the coverage probabilities are missing.

                       n = 50                     n = 100                    n = 200
  π1.    δ        (2)     (4)     (6)       (2)     (4)     (6)       (2)     (4)     (6)
  0.30  -0.30   (0.548) (0.528) (0.628)   (0.391) (0.384) (0.416)   (0.278) (0.275) (0.286)
        -0.20   (0.557) (0.537) (0.640)   (0.398) (0.390) (0.423)   (0.282) (0.279) (0.291)
        -0.10   (0.548) (0.530) (0.631)   (0.391) (0.384) (0.416)   (0.278) (0.275) (0.286)
         0.00   (0.518) (0.510) (0.599)   (0.371) (0.366) (0.395)   (0.263) (0.262) (0.271)
         0.10   (0.465) (0.475) (0.542)   (0.334) (0.335) (0.357)   (0.238) (0.238) (0.245)
         0.20   (0.380) (0.431) (0.453)   (0.275) (0.288) (0.296)   (0.196) (0.200) (0.203)
  0.50  -0.30   (0.412) (0.412) (0.443)   (0.294) (0.293) (0.304)   (0.209) (0.208) (0.212)
        -0.20   (0.448) (0.443) (0.479)   (0.319) (0.317) (0.329)   (0.226) (0.225) (0.230)
        -0.10   (0.468) (0.460) (0.499)   (0.333) (0.330) (0.343)   (0.236) (0.235) (0.240)
         0.00   (0.474) (0.466) (0.506)   (0.338) (0.334) (0.348)   (0.239) (0.238) (0.243)
         0.10   (0.468) (0.460) (0.499)   (0.333) (0.330) (0.343)   (0.236) (0.235) (0.240)
         0.20   (0.448) (0.443) (0.479)   (0.319) (0.317) (0.329)   (0.226) (0.225) (0.230)
         0.30   (0.412) (0.412) (0.443)   (0.294) (0.293) (0.304)   (0.209) (0.208) (0.212)
  0.80  -0.10   (0.284) (0.293) (0.293)   (0.203) (0.206) (0.206)   (0.144) (0.145) (0.145)
         0.00   (0.328) (0.332) (0.336)   (0.233) (0.235) (0.236)   (0.166) (0.166) (0.167)
         0.10   (0.356) (0.356) (0.364)   (0.253) (0.253) (0.256)   (0.180) (0.180) (0.181)
         0.20   (0.371) (0.369) (0.379)   (0.264) (0.264) (0.267)   (0.187) (0.187) (0.188)
         0.30   (0.377) (0.373) (0.384)   (0.268) (0.267) (0.271)   (0.190) (0.190) (0.191)

5. An Example

To illustrate the practical usefulness of (2, 4, and 6), we consider the example (Agresti, 1990, pages 45-46) about 156 calves born in Florida. Calves are first classified according to whether they are infected with pneumonia within 60 days after birth. They are then classified again by whether they develop a secondary infection within two weeks after clearing up the first infection. As shown in Table 3.2 on page 46 of Agresti (1990), we have $n_{11} = 30$, $n_{12} = 63$, and $n_{22} = 63$. Given these data, the estimate $\hat{\delta}$ is 0.274. Applying interval estimators (2, 4, and 6), we obtain the 95% confidence intervals of $\delta$ to be [0.151, 0.396], [0.148, 0.392], and [0.137, 0.385], respectively. Because the lower limits of these confidence intervals are all larger than 0, applying any of these interval estimators suggests that the primary infection of pneumonia stimulates a natural immunity that reduces the likelihood of a secondary infection. Although this inference is the same as that reached elsewhere with a hypothesis test procedure (Agresti, 1990, page 47), to draw the above conclusion we do need to assume implicitly that the immunity level of calves to pneumonia does not vary much within the first 3 months after birth and that the follow-up period of 14 days is sufficiently long to calculate the proportion of the secondary infection. When applying the study design discussed here to study natural immunity, it is certainly important to decide how to choose an appropriate length of the follow-up period. However, this decision depends essentially on subjective knowledge of the characteristics of the underlying disease and is beyond the scope of this paper.

6. Discussion

The coverage probability of interval estimator (4) using the asymptotic likelihood ratio test consistently agrees reasonably well with the desired confidence level of 95% in all situations considered in Table 1, while those of estimators (2) and (6) can be less than 95% when $n$ is not large.
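Because estimator (4) has no closed form, a sketch of one way it might be computed may be useful at this point. The Python code below is an assumption-laden illustration, not the paper's numerical procedure: it uses plain bisection twice, first to find the conditional MLE of the primary infection probability as the root of the score equation (A.1) for a fixed $\delta_0$, and then to find the two values of $\delta_0$ at which the likelihood ratio statistic in (3) equals the chi-square critical value. The function names, the bracket `edge = 0.999`, and the iteration counts are all hypothetical choices.

```python
import math

CHI2_1_UPPER_05 = 3.841458820694124  # upper 5% point of chi-square with 1 df


def bisect(f, lo, hi, iters=100):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def loglik(p1, d, n11, n12, n22):
    """Log-likelihood (1) without the constant C."""
    return (n11 * (math.log(p1) + math.log(p1 - d))
            + n12 * (math.log(p1) + math.log(1 - p1 + d))
            + n22 * math.log(1 - p1))


def conditional_mle(d0, n11, n12, n22):
    """Root in p1 of the score equation (A.1) with the SD fixed at d0."""
    def score(p1):
        return (n11 * (1 / p1 + 1 / (p1 - d0))
                + n12 * (1 / p1 - 1 / (1 - p1 + d0))
                - n22 / (1 - p1))
    lo, hi = max(0.0, d0), min(1.0, 1.0 + d0)
    eps = 1e-9 * (hi - lo)  # stay strictly inside the admissible range
    return bisect(score, lo + eps, hi - eps)


def lr_statistic(d0, n11, n12, n22):
    """Left-hand side of (3)."""
    n = n11 + n12 + n22
    p1_hat = (n11 + n12) / n
    d_hat = p1_hat - n11 / (n11 + n12)
    p1_0 = conditional_mle(d0, n11, n12, n22)
    return 2 * (loglik(p1_hat, d_hat, n11, n12, n22)
                - loglik(p1_0, d0, n11, n12, n22))


def lr_interval(n11, n12, n22, crit=CHI2_1_UPPER_05, edge=0.999):
    """Interval (4); None if any cell is 0. Assumes the statistic exceeds
    crit at -edge and +edge, which holds for the calf data below."""
    if min(n11, n12, n22) == 0:
        return None
    n = n11 + n12 + n22
    d_hat = (n11 + n12) / n - n11 / (n11 + n12)
    g = lambda d0: lr_statistic(d0, n11, n12, n22) - crit
    return bisect(g, -edge, d_hat), bisect(g, d_hat, edge)


print(lr_interval(30, 63, 63))
```

For the calf data this should agree, up to rounding, with the interval [0.148, 0.392] reported in Section 5. Bisection is chosen here only because it is easy to verify; any bracketing root finder would serve the same purpose.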
Furthermore, the expected length for use of (4) may often be the shortest among these three estimators when the coverage probability is in the near neighborhood of 95% (Table 1). Therefore, in situations in which the probability of failing to produce an interval estimate by use of (4) is negligible, estimator (4) might be generally recommended if $n$ is not large ($= 50$). On the other hand, use of (4) requires a sophisticated numerical procedure to calculate the confidence limits, while the other two estimators (2) and (6) are simple to implement. Thus, when $n$ is large ($\ge 200$) and all three estimators are essentially equivalent, we may wish to apply estimators (2) and (6) for simplicity.

In the above example, the MLEs of $\pi_{1\cdot}$ and $\delta$ are $\hat{\pi}_{1\cdot} \doteq 0.60$ and $\hat{\delta} \doteq 0.274$, respectively. The total number of subjects $n$ is 156. According to the results presented in Table 1, all three interval estimators (2, 4, and 6) are appropriate for use in this case. This is consistent with the finding that all the resulting 95% confidence intervals are similar to one another.

Note that the probability of failing to produce confidence limits for use of (2) and (6), as shown in Table 2, is negligible for all situations considered here. Therefore, the resulting coverage probability and expected length for these two estimators, calculated conditional upon the samples in which the confidence limits exist, are essentially equivalent to those normally calculated over all samples. However, the probability of failing to apply (4), which occurs when any cell frequency $n_{11}$, $n_{12}$, or $n_{22}$ equals 0, can be non-negligible. For example, when $n = 50$, $\pi_{1\cdot} = 0.30$, and $\delta = 0.20$, this probability is approximately [value missing] (Table 2). To avoid this limitation in application of (4), we can apply the commonly used adjustment for sparse data of adding 0.50 to each cell frequency whenever this occurs. With use of this ad hoc adjustment in the above case considered in Table 2, we find that the coverage probability and the expected length change from [values missing] to [value missing] and 0.412, respectively. The magnitudes of these changes are certainly of no practical importance. In fact, we have recalculated all the coverage probabilities and expected lengths with use of this ad hoc adjustment to eliminate the probability of failing to produce confidence limits for using (4) in all situations considered in Table 1. Because the differences between the results of using (4) presented in Table 1 and those with this adjustment are generally quite small, we decide not to present them for brevity.

Table 2. The probability of failing to produce a 95% confidence interval in application of interval estimators (2, 4, and 6) for the underlying risk difference $\delta = -0.30, -0.20, -0.10, \ldots, 0.30$, subject to the restriction that $\pi_{11}$, $\pi_{12}$, and $\pi_{22}$ are all $> 0$, in the situations in which the probability of primary infection $\pi_{1\cdot} = 0.30$, 0.50, and 0.80, and the total number of subjects $n = 50$, 100, and 200. [Table entries are missing from this transcription.]

Finally, note that although the logarithmic transformation has been successfully applied to derive the confidence interval for other epidemiologic indices such as the risk ratio or odds ratio (Katz et al., 1978; Lui, 1995, 1996, and 1998), we do not recommend use of this transformation to derive the confidence interval of the SD as focused here. This is not only because the sampling distribution of $\log\hat{\delta}$ can be even more skewed than that of $\hat{\delta}$ when the underlying $\delta$ is small, but also because $\log\hat{\delta}$ is undefined when $\hat{\delta} < 0$.

In summary, this paper proposes three asymptotic confidence intervals for the SD between successive infections. It demonstrates that the interval estimator using the asymptotic likelihood ratio test can consistently perform well in a variety of situations. However, application of this procedure involves iterative numerical calculation. When the probability of the underlying primary infection is moderate ($= 0.30$) and the SD is within 0.10, we may use the interval estimator using Fieller's theorem. On the other hand, when the probability of the underlying primary infection is high ($= 0.80$), we may apply the interval estimator using Wald's test statistic.

Acknowledgements

The author wishes to thank the referee for many helpful and valuable comments to improve the clarity of this paper. This work was supported in part by the grant from the Agency for Health Care Policy and Research #R01-HS

Appendix

For a given sample vector $(n_{11}, n_{12}, n_{22})$, the log-likelihood is

$$\log L = C + n_{11}\{\log \pi_{1\cdot} + \log(\pi_{1\cdot} - \delta)\} + n_{12}\{\log \pi_{1\cdot} + \log(1 - \pi_{1\cdot} + \delta)\} + n_{22}\log(1 - \pi_{1\cdot}).$$

Then the MLEs of $\pi_{1\cdot}$ and $\delta$ are simply the roots for $\pi_{1\cdot}$ and $\delta$ of the following two equations:

$$\frac{\partial \log L}{\partial \pi_{1\cdot}} = n_{11}\{1/\pi_{1\cdot} + 1/(\pi_{1\cdot} - \delta)\} + n_{12}\{1/\pi_{1\cdot} - 1/(1 - \pi_{1\cdot} + \delta)\} - n_{22}/(1 - \pi_{1\cdot}) = 0, \tag{A.1}$$

$$\frac{\partial \log L}{\partial \delta} = -n_{11}/(\pi_{1\cdot} - \delta) + n_{12}/(1 - \pi_{1\cdot} + \delta) = 0. \tag{A.2}$$

We can easily show that the MLEs are $\hat{\pi}_{1\cdot} = (n_{11} + n_{12})/n$ and $\hat{\delta} = \hat{\pi}_{1\cdot} - \hat{\pi}_{11}/\hat{\pi}_{1\cdot}$. The second derivatives of $\log L$ are

$$\frac{\partial^2 \log L}{\partial \pi_{1\cdot}^2} = -n_{11}\{1/\pi_{1\cdot}^2 + 1/(\pi_{1\cdot} - \delta)^2\} - n_{12}\{1/\pi_{1\cdot}^2 + 1/(1 - \pi_{1\cdot} + \delta)^2\} - n_{22}/(1 - \pi_{1\cdot})^2, \tag{A.3}$$

$$\frac{\partial^2 \log L}{\partial \delta^2} = -n_{11}/(\pi_{1\cdot} - \delta)^2 - n_{12}/(1 - \pi_{1\cdot} + \delta)^2, \tag{A.4}$$

$$\frac{\partial^2 \log L}{\partial \pi_{1\cdot}\,\partial \delta} = n_{11}/(\pi_{1\cdot} - \delta)^2 + n_{12}/(1 - \pi_{1\cdot} + \delta)^2. \tag{A.5}$$

When substituting the MLEs $\hat{\pi}_{1\cdot}$ and $\hat{\delta}$ for the corresponding parameters in (A.3)-(A.5), we obtain the estimate of the asymptotic variance of the MLE $\hat{\delta}$ to be $\{\hat{\pi}_{11}\hat{\pi}_{12}/\hat{\pi}_{1\cdot}^3 + \hat{\pi}_{1\cdot}(1 - \hat{\pi}_{1\cdot})\}/n$ through use of the inverse of the observed information matrix.

Note that for a given fixed $\delta_0$ such that $-1 < \delta_0 < 1$, as $\pi_{1\cdot}$ increases from $\max\{0, \delta_0\}$ to $\min\{1, 1 + \delta_0\}$, the value of $\partial \log L/\partial \pi_{1\cdot}$ on the left-hand side of equation (A.1) decreases from $\infty$ to $-\infty$. Furthermore, the left-hand side of (A.1) is a continuous function over $\max\{0, \delta_0\} < \pi_{1\cdot} < \min\{1, 1 + \delta_0\}$. These suggest that, for a given fixed $\delta_0$, where $-1 < \delta_0 < 1$, the conditional MLE $\hat{\pi}_{1\cdot}(\delta_0)$ of $\pi_{1\cdot}$ is simply the unique root for $\pi_{1\cdot}$ (falling in the range $\max\{0, \delta_0\} < \pi_{1\cdot} < \min\{1, 1 + \delta_0\}$) of equation (A.1) with $\delta$ replaced by $\delta_0$.

References

Agresti, A., 1990: Categorical Data Analysis. Wiley, New York.
Anbar, D., 1983: On estimating the difference between two probabilities, with special reference to clinical trials. Biometrics 39, 257-262.
Anbar, D., 1984: Confidence bounds for the difference between two probabilities. Biometrics (reply to letter) 40,
Anderson, T. W., 1958: An Introduction to Multivariate Statistical Analysis. Wiley, New York.
Beal, S. L., 1987: Asymptotic confidence intervals for the difference between two binomial parameters for use with small samples. Biometrics 43, 941-950.
Casella, G. and Berger, R. L., 1990: Statistical Inference. Duxbury, Belmont, California.
Hauck, W. W. and Anderson, S., 1986: A comparison of large sample confidence interval methods for the difference of two binomial probabilities. The American Statistician 40, 318-322.
Katz, D., Baptista, J., Azen, S. P., and Pike, M. C., 1978: Obtaining confidence intervals for the risk ratio in cohort studies. Biometrics 34, 469-474.
Lui, K.-J., 1995: Confidence intervals for the risk ratio in cohort studies under inverse sampling. Biometrical Journal 37, 965-971.
Lui, K.-J., 1996: Notes on confidence limits for the odds ratio in case-control studies under inverse sampling. Biometrical Journal 38, 221-229.
Lui, K.-J., 1998: Interval estimation of risk ratio between the secondary infection given the primary infection and the primary infection. Biometrics 54, 706-711.
Mee, R. W., 1984: Confidence bounds for the difference between two probabilities. Biometrics 40, 1175-1176.
Miettinen, O. and Nurminen, M., 1985: Comparative analysis of two rates. Statistics in Medicine 4, 213-226.
Santner, T. J. and Snell, M. K., 1980: Small-sample confidence intervals for p1 - p2 and p1/p2 in 2 × 2 contingency tables. Journal of the American Statistical Association 75, 386-394.
SAS Institute, Inc., 1990: SAS Language, Version 6, 1st edition. Cary, North Carolina.
Thomas, D. G. and Gart, J. J., 1977: A table of exact confidence limits for differences and ratios of two proportions and their odds ratios. Journal of the American Statistical Association 72, 73-76.
Wallenstein, S., 1997: A non-iterative accurate asymptotic confidence interval for the difference between two proportions. Statistics in Medicine 16, 1329-1336.

Received, November 1997
Revised, August 1999
Accepted, August 1999

Kung-Jong Lui
Department of Mathematical Sciences
College of Sciences
San Diego State University
5500 Campanile Drive
San Diego, CA
USA
kjl@rohan.sdsu.edu


STAT 705: Analysis of Contingency Tables

STAT 705: Analysis of Contingency Tables STAT 705: Analysis of Contingency Tables Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Analysis of Contingency Tables 1 / 45 Outline of Part I: models and parameters Basic

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

MAT 2379, Introduction to Biostatistics, Sample Calculator Questions 1. MAT 2379, Introduction to Biostatistics

MAT 2379, Introduction to Biostatistics, Sample Calculator Questions 1. MAT 2379, Introduction to Biostatistics MAT 2379, Introduction to Biostatistics, Sample Calculator Questions 1 MAT 2379, Introduction to Biostatistics Sample Calculator Problems for the Final Exam Note: The exam will also contain some problems

More information

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests Chapter 59 Two Correlated Proportions on- Inferiority, Superiority, and Equivalence Tests Introduction This chapter documents three closely related procedures: non-inferiority tests, superiority (by a

More information

Measures of Association and Variance Estimation

Measures of Association and Variance Estimation Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35

More information

And the Bayesians and the frequentists shall lie down together...

And the Bayesians and the frequentists shall lie down together... And the Bayesians and the frequentists shall lie down together... Keith Winstein MIT CSAIL October 16, 2013 Axioms of Probability (1933) S: a finite set (the sample space) A: any subset of S (an event)

More information

Lecture 24. Ingo Ruczinski. November 24, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University

Lecture 24. Ingo Ruczinski. November 24, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University November 24, 2015 1 2 3 4 5 1 Odds ratios for retrospective studies 2 Odds ratios approximating the

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

Loglikelihood and Confidence Intervals

Loglikelihood and Confidence Intervals Stat 504, Lecture 2 1 Loglikelihood and Confidence Intervals The loglikelihood function is defined to be the natural logarithm of the likelihood function, l(θ ; x) = log L(θ ; x). For a variety of reasons,

More information

Pseudo-score confidence intervals for parameters in discrete statistical models

Pseudo-score confidence intervals for parameters in discrete statistical models Biometrika Advance Access published January 14, 2010 Biometrika (2009), pp. 1 8 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asp074 Pseudo-score confidence intervals for parameters

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 12/15/2008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Welcome! Webinar Biostatistics: sample size & power. Thursday, April 26, 12:30 1:30 pm (NDT)

Welcome! Webinar Biostatistics: sample size & power. Thursday, April 26, 12:30 1:30 pm (NDT) . Welcome! Webinar Biostatistics: sample size & power Thursday, April 26, 12:30 1:30 pm (NDT) Get started now: Please check if your speakers are working and mute your audio. Please use the chat box to

More information

AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC

AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC Journal of Applied Statistical Science ISSN 1067-5817 Volume 14, Number 3/4, pp. 225-235 2005 Nova Science Publishers, Inc. AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC FOR TWO-FACTOR ANALYSIS OF VARIANCE

More information

Power Comparison of Exact Unconditional Tests for Comparing Two Binomial Proportions

Power Comparison of Exact Unconditional Tests for Comparing Two Binomial Proportions Power Comparison of Exact Unconditional Tests for Comparing Two Binomial Proportions Roger L. Berger Department of Statistics North Carolina State University Raleigh, NC 27695-8203 June 29, 1994 Institute

More information

And the Bayesians and the frequentists shall lie down together...

And the Bayesians and the frequentists shall lie down together... And the Bayesians and the frequentists shall lie down together... Keith Winstein MIT CSAIL February 12, 2014 Axioms of Probability (1933) S: a finite set (the sample space) A: any subset of S (an event)

More information

Categorical Data Analysis Chapter 3

Categorical Data Analysis Chapter 3 Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

Unobservable Parameter. Observed Random Sample. Calculate Posterior. Choosing Prior. Conjugate prior. population proportion, p prior:

Unobservable Parameter. Observed Random Sample. Calculate Posterior. Choosing Prior. Conjugate prior. population proportion, p prior: Pi Priors Unobservable Parameter population proportion, p prior: π ( p) Conjugate prior π ( p) ~ Beta( a, b) same PDF family exponential family only Posterior π ( p y) ~ Beta( a + y, b + n y) Observed

More information

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories

More information

Estimation and sample size calculations for correlated binary error rates of biometric identification devices

Estimation and sample size calculations for correlated binary error rates of biometric identification devices Estimation and sample size calculations for correlated binary error rates of biometric identification devices Michael E. Schuckers,11 Valentine Hall, Department of Mathematics Saint Lawrence University,

More information

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j. Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That

More information

Inferences for the Ratio: Fieller s Interval, Log Ratio, and Large Sample Based Confidence Intervals

Inferences for the Ratio: Fieller s Interval, Log Ratio, and Large Sample Based Confidence Intervals Inferences for the Ratio: Fieller s Interval, Log Ratio, and Large Sample Based Confidence Intervals Michael Sherman Department of Statistics, 3143 TAMU, Texas A&M University, College Station, Texas 77843,

More information

Means or "expected" counts: j = 1 j = 2 i = 1 m11 m12 i = 2 m21 m22 True proportions: The odds that a sampled unit is in category 1 for variable 1 giv

Means or expected counts: j = 1 j = 2 i = 1 m11 m12 i = 2 m21 m22 True proportions: The odds that a sampled unit is in category 1 for variable 1 giv Measures of Association References: ffl ffl ffl Summarize strength of associations Quantify relative risk Types of measures odds ratio correlation Pearson statistic ediction concordance/discordance Goodman,

More information

Statistical Inference for the Risk Ratio in 2x2 Binomial Trials with Stuctural Zero

Statistical Inference for the Risk Ratio in 2x2 Binomial Trials with Stuctural Zero The University of Maine DigitalCommons@UMaine Electronic Theses and Dissertations Fogler Library 2004 Statistical Inference for the Risk Ratio in 2x2 Binomial Trials with Stuctural Zero Suzhong Tian Follow

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

STA6938-Logistic Regression Model

STA6938-Logistic Regression Model Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of

More information

Basic Concepts of Inference

Basic Concepts of Inference Basic Concepts of Inference Corresponds to Chapter 6 of Tamhane and Dunlop Slides prepared by Elizabeth Newton (MIT) with some slides by Jacqueline Telford (Johns Hopkins University) and Roy Welsch (MIT).

More information

Comparison of Estimators in GLM with Binary Data

Comparison of Estimators in GLM with Binary Data Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 10 11-2014 Comparison of Estimators in GLM with Binary Data D. M. Sakate Shivaji University, Kolhapur, India, dms.stats@gmail.com

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

Sample size calculations for logistic and Poisson regression models

Sample size calculations for logistic and Poisson regression models Biometrika (2), 88, 4, pp. 93 99 2 Biometrika Trust Printed in Great Britain Sample size calculations for logistic and Poisson regression models BY GWOWEN SHIEH Department of Management Science, National

More information

Exact unconditional tests for a 2 2 matched-pairs design

Exact unconditional tests for a 2 2 matched-pairs design Statistical Methods in Medical Research 2003; 12: 91^108 Exact unconditional tests for a 2 2 matched-pairs design RL Berger Statistics Department, North Carolina State University, Raleigh, NC, USA and

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

Comparing p s Dr. Don Edwards notes (slightly edited and augmented) The Odds for Success

Comparing p s Dr. Don Edwards notes (slightly edited and augmented) The Odds for Success Comparing p s Dr. Don Edwards notes (slightly edited and augmented) The Odds for Success When the experiment consists of a series of n independent trials, and each trial may end in either success or failure,

More information

Generalized confidence intervals for the ratio or difference of two means for lognormal populations with zeros

Generalized confidence intervals for the ratio or difference of two means for lognormal populations with zeros UW Biostatistics Working Paper Series 9-7-2006 Generalized confidence intervals for the ratio or difference of two means for lognormal populations with zeros Yea-Hung Chen University of Washington, yeahung@u.washington.edu

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 1/15/008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

7 Estimation. 7.1 Population and Sample (P.91-92)

7 Estimation. 7.1 Population and Sample (P.91-92) 7 Estimation MATH1015 Biostatistics Week 7 7.1 Population and Sample (P.91-92) Suppose that we wish to study a particular health problem in Australia, for example, the average serum cholesterol level for

More information

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments /4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population

More information

A Likelihood Ratio Test

A Likelihood Ratio Test A Likelihood Ratio Test David Allen University of Kentucky February 23, 2012 1 Introduction Earlier presentations gave a procedure for finding an estimate and its standard error of a single linear combination

More information

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction ReCap. Parts I IV. The General Linear Model Part V. The Generalized Linear Model 16 Introduction 16.1 Analysis

More information

BIOS 625 Fall 2015 Homework Set 3 Solutions

BIOS 625 Fall 2015 Homework Set 3 Solutions BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution Communications for Statistical Applications and Methods 2016, Vol. 23, No. 6, 517 529 http://dx.doi.org/10.5351/csam.2016.23.6.517 Print ISSN 2287-7843 / Online ISSN 2383-4757 A comparison of inverse transform

More information

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time

More information

PROD. TYPE: COM. Simple improved condence intervals for comparing matched proportions. Alan Agresti ; and Yongyi Min UNCORRECTED PROOF

PROD. TYPE: COM. Simple improved condence intervals for comparing matched proportions. Alan Agresti ; and Yongyi Min UNCORRECTED PROOF pp: --2 (col.fig.: Nil) STATISTICS IN MEDICINE Statist. Med. 2004; 2:000 000 (DOI: 0.002/sim.8) PROD. TYPE: COM ED: Chandra PAGN: Vidya -- SCAN: Nil Simple improved condence intervals for comparing matched

More information

The Components of a Statistical Hypothesis Testing Problem

The Components of a Statistical Hypothesis Testing Problem Statistical Inference: Recall from chapter 5 that statistical inference is the use of a subset of a population (the sample) to draw conclusions about the entire population. In chapter 5 we studied one

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Test Volume 11, Number 1. June 2002

Test Volume 11, Number 1. June 2002 Sociedad Española de Estadística e Investigación Operativa Test Volume 11, Number 1. June 2002 Optimal confidence sets for testing average bioequivalence Yu-Ling Tseng Department of Applied Math Dong Hwa

More information

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples Bayesian inference for sample surveys Roderick Little Module : Bayesian models for simple random samples Superpopulation Modeling: Estimating parameters Various principles: least squares, method of moments,

More information

Bayesian Confidence Intervals for the Ratio of Means of Lognormal Data with Zeros

Bayesian Confidence Intervals for the Ratio of Means of Lognormal Data with Zeros Bayesian Confidence Intervals for the Ratio of Means of Lognormal Data with Zeros J. Harvey a,b & A.J. van der Merwe b a Centre for Statistical Consultation Department of Statistics and Actuarial Science

More information

A SAS/AF Application For Sample Size And Power Determination

A SAS/AF Application For Sample Size And Power Determination A SAS/AF Application For Sample Size And Power Determination Fiona Portwood, Software Product Services Ltd. Abstract When planning a study, such as a clinical trial or toxicology experiment, the choice

More information

CHAPTER 9, 10. Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities:

CHAPTER 9, 10. Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities: CHAPTER 9, 10 Hypothesis Testing Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities: The person is guilty. The person is innocent. To

More information

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,

More information

A Classroom Approach to Illustrate Transformation and Bootstrap Confidence Interval Techniques Using the Poisson Distribution

A Classroom Approach to Illustrate Transformation and Bootstrap Confidence Interval Techniques Using the Poisson Distribution International Journal of Statistics and Probability; Vol. 6, No. 2; March 2017 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education A Classroom Approach to Illustrate Transformation

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

Sampling Distributions: Central Limit Theorem

Sampling Distributions: Central Limit Theorem Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)

More information

Standard Error of Technical Cost Incorporating Parameter Uncertainty

Standard Error of Technical Cost Incorporating Parameter Uncertainty Standard Error of Technical Cost Incorporating Parameter Uncertainty Christopher Morton Insurance Australia Group This presentation has been prepared for the Actuaries Institute 2012 General Insurance

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Sampling: A Brief Review. Workshop on Respondent-driven Sampling Analyst Software

Sampling: A Brief Review. Workshop on Respondent-driven Sampling Analyst Software Sampling: A Brief Review Workshop on Respondent-driven Sampling Analyst Software 201 1 Purpose To review some of the influences on estimates in design-based inference in classic survey sampling methods

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

Multiple Sample Categorical Data

Multiple Sample Categorical Data Multiple Sample Categorical Data paired and unpaired data, goodness-of-fit testing, testing for independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

SPRING 2007 EXAM C SOLUTIONS

SPRING 2007 EXAM C SOLUTIONS SPRING 007 EXAM C SOLUTIONS Question #1 The data are already shifted (have had the policy limit and the deductible of 50 applied). The two 350 payments are censored. Thus the likelihood function is L =

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

n y π y (1 π) n y +ylogπ +(n y)log(1 π).

n y π y (1 π) n y +ylogπ +(n y)log(1 π). Tests for a binomial probability π Let Y bin(n,π). The likelihood is L(π) = n y π y (1 π) n y and the log-likelihood is L(π) = log n y +ylogπ +(n y)log(1 π). So L (π) = y π n y 1 π. 1 Solving for π gives

More information

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING Vishwanath Mantha Department for Electrical and Computer Engineering Mississippi State University, Mississippi State, MS 39762 mantha@isip.msstate.edu ABSTRACT

More information

The Calibrated Bayes Factor for Model Comparison

The Calibrated Bayes Factor for Model Comparison The Calibrated Bayes Factor for Model Comparison Steve MacEachern The Ohio State University Joint work with Xinyi Xu, Pingbo Lu and Ruoxi Xu Supported by the NSF and NSA Bayesian Nonparametrics Workshop

More information

Reconstruction of individual patient data for meta analysis via Bayesian approach

Reconstruction of individual patient data for meta analysis via Bayesian approach Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi

More information

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto. Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,

More information

This paper has been submitted for consideration for publication in Biometrics

This paper has been submitted for consideration for publication in Biometrics BIOMETRICS, 1 10 Supplementary material for Control with Pseudo-Gatekeeping Based on a Possibly Data Driven er of the Hypotheses A. Farcomeni Department of Public Health and Infectious Diseases Sapienza

More information

Accepted Manuscript. Comparing different ways of calculating sample size for two independent means: A worked example

Accepted Manuscript. Comparing different ways of calculating sample size for two independent means: A worked example Accepted Manuscript Comparing different ways of calculating sample size for two independent means: A worked example Lei Clifton, Jacqueline Birks, David A. Clifton PII: S2451-8654(18)30128-5 DOI: https://doi.org/10.1016/j.conctc.2018.100309

More information

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information