E509A: Principle of Biostatistics. GY Zou


1 E509A: Principle of Biostatistics (Effect measures) GY Zou gzou@robarts.ca

2 We have discussed inference procedures for 2 × 2 tables in the context of comparing two groups.

          Yes    No     Total
Group 1   a      b      n_1
Group 2   c      d      n_2
Total     m_1    m_2    n

For hypothesis testing, we use the Pearson chi-square test. For interval estimation, we use methods for $p_1 - p_2$ (and, of course, NNT).

3 However, the Pearson chi-square test works well only if the expected count in every cell is greater than 5. For data with small cells, we can use Fisher's exact test. The idea of this test is to fix the row and column totals at their observed values, and to compute the probability of observing tables that depart from the null hypothesis as much as, or more than, the observed table (recall the definition of the P-value).

Fisher (1935, The logic of inductive inference, JRSS A 98: 39-54) presented his test at the annual Christmas meeting of the Royal Statistical Society. The title is very good, because I've heard that the most important contribution of statistics to science is not the formula, but the logic. Still, right after his talk, a speaker compared Fisher's talk to the braying of the Golden Ass.

4 The probability of observing the table, given the margins, is

$$\Pr(a, b, c, d \mid n_1, n_2, m_1, m_2) = \frac{n_1!\, n_2!\, m_1!\, m_2!}{n!\, a!\, b!\, c!\, d!} \qquad (1)$$

Fisher's procedure requires the probability of every more extreme table to be computed, using Eq. (1) repeatedly. The p-value of the test is obtained by definition: the sum of the probabilities of the observed table and of all more extreme tables. Thus, Fisher's exact test is essentially one-sided. If a two-sided test is called for, the simplest approach is to double the p-value. This is exactly what SAS proc freq gives you when $n_1 = n_2$.
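As a rough illustration of Eq. (1), the sketch below evaluates the table probability with binomial coefficients, which is algebraically the same as the factorial form, and sums the upper tail. The function names are hypothetical, the "more extreme" direction is assumed to be a larger first cell, and the cell counts 8, 16, 1, 23 are the ones appearing in the factorials of the example that follows.

```python
from math import comb

def table_prob(a, b, c, d):
    """Hypergeometric probability of a 2x2 table with all margins fixed (Eq. 1)."""
    n1, n2, m1 = a + b, c + d, a + c
    # C(n1, a) * C(n2, c) / C(n, m1) equals n1! n2! m1! m2! / (n! a! b! c! d!)
    return comb(n1, a) * comb(n2, c) / comb(n1 + n2, m1)

def fisher_one_sided(a, b, c, d):
    """Upper-tail p-value: observed table plus every table with a larger first cell."""
    n1, n2, m1 = a + b, c + d, a + c
    largest = min(n1, m1)
    return sum(
        table_prob(i, n1 - i, m1 - i, n2 - m1 + i)
        for i in range(a, largest + 1)
    )

p_obs = table_prob(8, 16, 1, 23)        # observed table
p_one = fisher_one_sided(8, 16, 1, 23)  # observed + more extreme table
print(f"observed: {p_obs:.4f}, one-sided: {p_one:.4f}, doubled: {2 * p_one:.4f}")
```

Doubling the one-sided value is only the simple rule mentioned above; it coincides with the probability-based two-sided value when the two groups have equal size.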

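5 Example of Fisher's exact test (p. 375).

          Yes    No     Total
Group 1   8      16     24
Group 2   1      23     24
Total     9      39     48

$$\Pr(a, b, c, d \mid 48, 9, 24) = \frac{24!\, 24!\, 9!\, 39!}{48!\, 8!\, 16!\, 1!\, 23!} \approx 0.0105$$

A more extreme table is:

          Yes    No     Total
Group 1   9      15     24
Group 2   0      24     24

$$\Pr(a, b, c, d \mid 48, 9, 24) = \frac{24!\, 24!\, 9!\, 39!}{48!\, 9!\, 15!\, 0!\, 24!} \approx 0.0008$$

The p-value is then 0.0105 + 0.0008 = 0.0113.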

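6 SAS function for hypergeometric probability

    * pdf('HYPER', a, n, n1, m1);
    data;
      bb = pdf('HYPER', 8, 48, 24, 9);   * probability of the observed table;
      cc = pdf('HYPER', 9, 48, 24, 9);   * probability of the more extreme table;
      dd = bb + cc;                      * one-sided p-value;
    proc print;
    run;

The printed columns are Obs, bb, cc and dd, with bb ≈ 0.0105, cc ≈ 0.0008 and dd ≈ 0.0113. The two-sided p-value is then 2 × 0.0113 = 0.0226.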

7
          Yes    No
Group 1
Group 2

    data fisher;
      do i = 0 to 9;
        bb = pdf('HYPER', i, 44, 9, 24);
        output;
      end;
    proc print;
    run;

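8 The SAS System output lists Obs, i and bb, i.e. the probability of each possible table for i = 0, ..., 9.

One-sided p-value: the sum of the probabilities of the observed table and of all more extreme tables.

Another approach is the mid-p-value (Lancaster 1961 JASA 56): the one-sided mid-p-value is 1/2 × Pr(observed table) plus the probabilities of the more extreme tables. The two-sided p-value is then 2 × mid-p.

Alternatively, the two-sided p-value is given by the sum of the probabilities of all tables that are no more likely than the observed table. This is how SAS obtains the two-sided p-value.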
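The three flavours of p-value described above can be sketched as follows. This is a minimal illustration, not SAS's implementation: the function names are hypothetical, the one-sided direction is assumed to be the upper tail, and the demonstration reuses the table with cells 8, 16, 1, 23 (n = 48, 9 events, 24 per group) from the earlier example.

```python
from math import comb

def hyper_pmf(i, n, m1, n1):
    """P(first cell = i) when n1 of n subjects are in Group 1 and m1 of n have the outcome."""
    return comb(n1, i) * comb(n - n1, m1 - i) / comb(n, m1)

def fisher_pvalues(a, n, m1, n1):
    """One-sided, mid-p and probability-based two-sided p-values for first cell a."""
    lowest, largest = max(0, m1 - (n - n1)), min(m1, n1)
    probs = {i: hyper_pmf(i, n, m1, n1) for i in range(lowest, largest + 1)}
    p_obs = probs[a]
    one_sided = sum(p for i, p in probs.items() if i >= a)
    mid_p = one_sided - 0.5 * p_obs  # count only half of the observed table
    # Sum every table no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    two_sided_prob = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))
    return {
        "one_sided": one_sided,
        "mid_p": mid_p,
        "two_sided_mid_p": 2 * mid_p,
        "two_sided_prob": two_sided_prob,
    }

res = fisher_pvalues(a=8, n=48, m1=9, n1=24)
for name, value in res.items():
    print(f"{name}: {value:.4f}")
```

Because the two groups have equal size here, the probability-based two-sided value equals twice the one-sided value; with unequal groups the two generally differ.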

9 The way to present the results is: Rate in Group I was ?, in Group II was ?; difference ? (95% confidence interval ? to ?), P = ? (Fisher's exact test, two-sided mid-P).

10 McKinney et al (1989, The inexact use of Fisher's exact test in six major medical journals, JAMA 261). Half of the 70 articles reviewed either had used a one-tailed test when a two-tailed test was called for, or the authors simply had not bothered to state which test they had used.

11 If there were only hypothesis testing, an epidemiologist's life would be too easy. Effect estimation makes it hard, and also interesting.

12 Besides randomized studies, there are more ways of generating a 2 × 2 table:
- Cross-sectional (naturalistic, multinomial) sampling: select a total of N subjects, then determine for each subject the presence or absence of characteristics A and B.
- Retrospective sampling: predetermine n_1 subjects who possess A and n_2 who do not possess A, then determine B in each group, where A is usually a disease of interest and B is a risk factor. This is the case-control study.
- Prospective sampling: similar to case-control sampling, except that the roles of A and B are switched. This is the cohort study.

13 Cross-sectional sample to estimate the risk ratio (relative risk, RR)

                 Outcome (D)
Exposure (E)     Yes (+)   No (−)   Total
1 (Yes, +)       a         b        n_1
2 (No, −)        c         d        n_2
Total            m_1       m_2      n

The risk ratio is defined by

$$RR = \frac{\Pr(D^+ \mid E^+)}{\Pr(D^+ \mid E^-)}$$

The estimated RR is

$$\widehat{RR} = \frac{a/n_1}{c/n_2}$$

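14 The variance of $\ln \widehat{RR}$ is estimated by

$$\widehat{\mathrm{var}}[\ln(\widehat{RR})] = \frac{1}{a} - \frac{1}{n_1} + \frac{1}{c} - \frac{1}{n_2}$$

The 95% CI for RR is obtained from the CI for ln(RR), because the sampling distribution of $\ln \widehat{RR}$ is closer to Normal than that of $\widehat{RR}$:

$$l, u = \ln(\widehat{RR}) \pm 1.96 \sqrt{\widehat{\mathrm{var}}(\ln \widehat{RR})}$$

The CI for RR is then given by $(\exp(l), \exp(u))$.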

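15 Example. Data for 200 mothers and the birthweight of their babies are as follows.

                 Outcome (birthweight, g)
Maternal age     ≤ 2500       > 2500       Total
< 20             a = 10       b = 40       n_1 = 50
≥ 20             c = 15       d = 135      n_2 = 150
Total            m_1 = 25     m_2 = 175    n = 200

$$\widehat{RR} = \frac{10/50}{15/150} = 2$$

with variance estimate for $\ln \widehat{RR}$ given by

$$\widehat{\mathrm{var}}[\ln(\widehat{RR})] = \frac{1}{a} - \frac{1}{n_1} + \frac{1}{c} - \frac{1}{n_2} = \frac{1}{10} - \frac{1}{50} + \frac{1}{15} - \frac{1}{150} = 0.14$$

The 95% CI for RR is $\exp[\ln(2) \pm 1.96\sqrt{0.14}] = (0.96, 4.16)$.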
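The calculation in this example can be reproduced with a few lines of code. The sketch below is only an illustration of the log-transformation method described above; the function name is hypothetical and z = 1.96 is hard-coded for a 95% interval.

```python
from math import exp, log, sqrt

def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with a CI built on the log scale and transformed back."""
    n1, n2 = a + b, c + d
    rr = (a / n1) / (c / n2)
    var_log = 1 / a - 1 / n1 + 1 / c - 1 / n2  # estimated variance of ln(RR)
    half_width = z * sqrt(var_log)
    return rr, var_log, exp(log(rr) - half_width), exp(log(rr) + half_width)

# Maternal age (< 20 vs >= 20) by low birthweight, from the example above
rr, var_log, lower, upper = risk_ratio_ci(10, 40, 15, 135)
print(f"RR = {rr:.2f}, var[ln RR] = {var_log:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```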

16 Levin's attributable risk fraction: how much risk would be removed if the exposure were eliminated? E.g., force all the smokers in London to leave town.

Since people with disease include two mutually exclusive types, those who were exposed and those who were not exposed, we have

$$\Pr(D^+) = \Pr(D^+ \cap E^+) + \Pr(D^+ \cap E^-) = \Pr(D^+ \mid E^+)\Pr(E^+) + \Pr(D^+ \mid E^-)\Pr(E^-)$$

If $E^+$ could not cause disease, we would expect people with exposure ($E^+$) to have the same disease rate as those who were not exposed, i.e., $\Pr(D^+ \mid E^-)$. Thus, the proportion of people who are exposed and would have the disease, if the exposure could not cause disease, is given by

$$\Pr(D^+ \mid E^-)\Pr(E^+)$$

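17 Levin (1953, Acta Unio Int contra Cancrum 19) defined the Attributable Fraction as

$$R_A = \frac{\text{actual} - \text{counterfactual}}{\text{actual}} = \frac{\Pr(D^+ \mid E^+)\Pr(E^+) - \Pr(D^+ \mid E^-)\Pr(E^+)}{\Pr(D^+)}$$

$$= \frac{\Pr(E^+)\left[\Pr(D^+ \mid E^+) - \Pr(D^+ \mid E^-)\right]}{\Pr(D^+ \mid E^+)\Pr(E^+) + \Pr(D^+ \mid E^-)\Pr(E^-)}$$

$$= \frac{\Pr(E^+)\left[\Pr(D^+ \mid E^+)/\Pr(D^+ \mid E^-) - 1\right]}{\Pr(E^+)\Pr(D^+ \mid E^+)/\Pr(D^+ \mid E^-) + \Pr(E^-)}$$

$$= \frac{\Pr(E^+)[RR - 1]}{\Pr(E^+)\,RR + 1 - \Pr(E^+)}$$

$$R_A = \frac{\Pr(E^+)[RR - 1]}{1 + \Pr(E^+)(RR - 1)}$$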

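18
$$R_A = \frac{\Pr(E^+)(RR - 1)}{1 + \Pr(E^+)(RR - 1)}$$

Estimated by

$$\hat{R}_A = \frac{\widehat{\Pr}(E^+)\,[\widehat{RR} - 1]}{1 + \widehat{\Pr}(E^+)(\widehat{RR} - 1)}$$

with estimated variance for $\ln(1 - \hat{R}_A)$ given by

$$\widehat{\mathrm{var}}\left[\ln(1 - \hat{R}_A)\right] = \frac{b + \hat{R}_A (a + d)}{nc}$$

see Fleiss (1979 Am J Epidemiol 110).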
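A small sketch of these formulas may help. The function name is hypothetical, and because the cell counts of the New York City example are not reproduced here, the demonstration reuses the maternal age by birthweight table (a = 10, b = 40, c = 15, d = 135) purely as illustrative input.

```python
from math import exp, log, sqrt

def attributable_fraction(a, b, c, d, z=1.96):
    """Levin's attributable fraction with a CI built on ln(1 - R_A)."""
    n = a + b + c + d
    p_exposed = (a + b) / n
    rr = (a / (a + b)) / (c / (c + d))
    r_a = p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1))
    var_log = (b + r_a * (a + d)) / (n * c)  # variance of ln(1 - R_A)
    half_width = z * sqrt(var_log)
    # Limits for ln(1 - R_A) map back to R_A in reverse order
    lower = 1 - exp(log(1 - r_a) + half_width)
    upper = 1 - exp(log(1 - r_a) - half_width)
    return r_a, lower, upper

r_a, lower, upper = attributable_fraction(10, 40, 15, 135)
print(f"R_A = {r_a:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Note that the lower limit can be negative when the interval for RR includes 1, as it does for these illustrative counts.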

19 Example: Infant mortality by birthweight for the n live births in New York City in 1974. What percentage of deaths could have been prevented if low birthweight had been eliminated?

             Outcome at 1 yr
BW           Dead     Alive    Total
≤ 2500 g     a/n      b/n      (a + b)/n
> 2500 g     c/n      d/n      (c + d)/n
Total                          n/n = 1

$$\widehat{RR} = \frac{a/(a + b)}{c/(c + d)}, \qquad \widehat{\Pr}(E^+) = \frac{a + b}{n} = 0.0717$$

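20
$$\hat{R}_A = \frac{\widehat{\Pr}(E^+)\,[\widehat{RR} - 1]}{1 + \widehat{\Pr}(E^+)(\widehat{RR} - 1)} = \frac{0.0717\,(\widehat{RR} - 1)}{1 + 0.0717\,(\widehat{RR} - 1)}$$

The variance for $\ln(1 - \hat{R}_A)$ is $\widehat{\mathrm{var}}[\ln(1 - \hat{R}_A)]$, i.e.,

$$\widehat{\mathrm{s.e.}}\left[\ln(1 - \hat{R}_A)\right] = \sqrt{\widehat{\mathrm{var}}\left[\ln(1 - \hat{R}_A)\right]}$$

The 95% CI for $\ln(1 - R_A)$ is

$$\ln(1 - \hat{R}_A) \pm 1.96\, \widehat{\mathrm{s.e.}}[\ln(1 - \hat{R}_A)] = (-0.900, -0.755)$$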

21 The CI for $R_A$ is then given by

$$[1 - \exp(-0.755),\ 1 - \exp(-0.900)] = (0.530, 0.593)$$

With 95% confidence, between 53% and 59% of all infant deaths in New York City in 1974 could have been prevented if low birthweight had been eliminated.

22 Attributable risk among the exposed

$$R_E = 1 - \frac{1}{RR}$$

which is widely used in the law to describe the excess risk as a fraction of the risk among those exposed to the antecedent factor. The estimator is

$$\hat{R}_E = 1 - \frac{1}{\widehat{RR}}$$

Estimation can be conducted through $\widehat{RR}$.


24 A cohort study may be used to estimate all of the above effect measures.

25 Case-control study and odds ratio

Rothman, Modern Epidemiology, 1986, p. 62: "The sophisticated use and understanding of case-control studies is the most outstanding methodological development of modern epidemiology."

My understanding of the case-control study is from Breslow, Statistics in epidemiology: The case-control study. J Am Stat Assoc 91: 14-28.

26 Recall that a case-control design involves selecting $n_1$ subjects who have the disease ($D^+$) and $n_2$ who do not ($D^-$), followed by the determination of exposure $X^+$ or $X^-$ in each group. A case-control design in general can only provide the ratio of the exposure odds of the case group to that of the control group, i.e.

$$OR_e = \frac{\text{exposure odds}_{\text{case}}}{\text{exposure odds}_{\text{control}}}$$

Since odds is defined as $P/(1 - P)$,

$$OR_e = \frac{\Pr(X^+ \mid D^+)\,/\,[1 - \Pr(X^+ \mid D^+)]}{\Pr(X^+ \mid D^-)\,/\,[1 - \Pr(X^+ \mid D^-)]} = \frac{\Pr(X^+ \mid D^+)\Pr(X^- \mid D^-)}{\Pr(X^+ \mid D^-)\Pr(X^- \mid D^+)}$$

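27 For $OR_e$ to be useful, it must have some relationship with the risk ratio $\Pr(D^+ \mid X^+)/\Pr(D^+ \mid X^-)$. Enter Cornfield (1951 J Natl Cancer Inst 11), who showed that

1)
$$OR_e = \frac{\Pr(X^+ \mid D^+)\Pr(X^- \mid D^-)}{\Pr(X^+ \mid D^-)\Pr(X^- \mid D^+)} = \frac{\dfrac{\Pr(D^+ \mid X^+)\Pr(X^+)}{\Pr(D^+)} \cdot \dfrac{\Pr(D^- \mid X^-)\Pr(X^-)}{\Pr(D^-)}}{\dfrac{\Pr(D^- \mid X^+)\Pr(X^+)}{\Pr(D^-)} \cdot \dfrac{\Pr(D^+ \mid X^-)\Pr(X^-)}{\Pr(D^+)}} = \frac{\Pr(D^+ \mid X^+)\Pr(D^- \mid X^-)}{\Pr(D^- \mid X^+)\Pr(D^+ \mid X^-)} = OR_d$$

2)
$$\frac{\Pr(D^- \mid X^-)}{\Pr(D^- \mid X^+)} \approx 1, \quad \text{when } \Pr(D^+) \approx 0,$$

which implies that $OR_d \approx RR_d$. To see 2), observe

$$\Pr(D^- \mid X^-) = \frac{\Pr(X^- \mid D^-)\Pr(D^-)}{\Pr(X^-)} = \frac{\Pr(X^- \mid D^-)\Pr(D^-)}{\Pr(X^- \cap D^-) + \Pr(X^- \cap D^+)} = \frac{\Pr(X^- \mid D^-)\Pr(D^-)}{\Pr(X^- \mid D^-)\Pr(D^-) + \Pr(X^- \mid D^+)\Pr(D^+)} \approx 1, \quad \text{when } \Pr(D^+) \approx 0$$

Similarly $\Pr(D^- \mid X^+) \approx 1$ when $\Pr(D^+) \approx 0$.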
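Cornfield's two points can be checked numerically. The sketch below builds a joint distribution from an assumed exposure prevalence and assumed risks (all input values are hypothetical), derives the exposure odds ratio via Bayes' rule, and compares it with the disease odds ratio and the risk ratio.

```python
def cornfield_check(p_exposed, risk_exposed, risk_unexposed):
    """Return (OR_e, OR_d, RR) for a population described by three probabilities."""
    # Joint probabilities of disease and exposure status
    d_x = risk_exposed * p_exposed                   # D+ and X+
    d_nx = risk_unexposed * (1 - p_exposed)          # D+ and X-
    nd_x = (1 - risk_exposed) * p_exposed            # D- and X+
    p_disease = d_x + d_nx
    # Exposure probabilities among cases and among controls (Bayes' rule)
    px_case = d_x / p_disease
    px_control = nd_x / (1 - p_disease)
    or_e = (px_case / (1 - px_case)) / (px_control / (1 - px_control))
    or_d = (risk_exposed / (1 - risk_exposed)) / (risk_unexposed / (1 - risk_unexposed))
    rr = risk_exposed / risk_unexposed
    return or_e, or_d, rr

common = cornfield_check(0.3, 0.4, 0.2)      # common disease: OR far from RR
rare = cornfield_check(0.3, 0.002, 0.001)    # rare disease: OR close to RR
print("common disease: OR_e = %.3f, OR_d = %.3f, RR = %.3f" % common)
print("rare disease:   OR_e = %.3f, OR_d = %.3f, RR = %.3f" % rare)
```

In both scenarios the exposure and disease odds ratios agree, while only the rare-disease scenario brings them close to the risk ratio of 2.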

28 Thus, case-control studies are indeed useful. In fact, Mantel & Haenszel (1959, J Natl Cancer Inst 22) stated: "Among the desirable attributes of the retrospective study is the ability to yield results from presently collectible data... The retrospective approach is also adapted to the limited resources of an individual investigator... For especially rare disease a retrospective study may be the only feasible approach... In the absence of important biases in the study setting, the retrospective method could be regarded, according to sound statistical theory, as the study method of choice" (p. 720).

This was almost 50 years ago. The recent epidemiologic literature has seen more and more prospective studies, except in genetic epidemiology, where the case-control design is almost universal. A more detailed discussion of the statistical issues may be found in Zou (2006 Annals of Human Genetics 70).

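29
             Status
Exposure     Case    Control   Total
Yes          a       b         n_1
No           c       d         n_2

The odds ratio in a case-control study is estimated by

$$\widehat{OR} = \frac{ad}{bc}$$

with the variance of $\ln(\widehat{OR})$ estimated by

$$\widehat{\mathrm{var}}[\ln(\widehat{OR})] = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}$$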

30 Thus, the $(1 - \alpha)100\%$ CI for OR is given by

$$\exp\left[\ln(\widehat{OR}) \pm Z_{1-\alpha/2} \sqrt{\widehat{\mathrm{var}}[\ln(\widehat{OR})]}\right]$$
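The interval above is straightforward to compute. In the sketch below the function name and the cell counts (20, 10, 15, 30) are hypothetical, chosen only so that the arithmetic is easy to follow: the odds ratio is 4 and the standard error of its logarithm is 0.5.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def odds_ratio_ci(a, b, c, d, alpha=0.05):
    """Odds ratio ad/(bc) with a (1 - alpha) CI based on ln(OR)."""
    or_hat = a * d / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for alpha = 0.05
    return or_hat, exp(log(or_hat) - z * se_log), exp(log(or_hat) + z * se_log)

or_hat, lower, upper = odds_ratio_ci(20, 10, 15, 30)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The formula breaks down when any cell is zero, since both the estimate and its variance then involve division by zero.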

31 Example. Sun protection during childhood by case-control status for cutaneous melanoma in Belgium, France and Germany.

                    Status
Sun protection      Case    Control
Yes
No

$$\widehat{OR} = \frac{ad}{bc} = 0.72$$

The 95% CI for OR is

$$\exp\left[\ln(0.72) \pm 1.96\sqrt{\widehat{\mathrm{var}}[\ln(\widehat{OR})]}\right] = (0.53, 0.98)$$

32
             Status
Exposure     Case    Control   Total
Yes          a       b         n_1
No           c       d         n_2

The odds ratio in a case-control study is estimated by $\widehat{OR} = ad/(bc)$.

             Status
trt          Yes       No        Total
1            a = 0     b = 14    n_1 = 14
2            c = 0     d = 11    n_2 = 11

With a = c = 0, ad/(bc) = 0/0 and the usual estimate does not exist. This is a data set discussed by Parzen M, Lipsitz S, Ibrahim J, Klar N. An estimate of the odds ratio that always exists. J Comput Graph Stat 11.

33 Exact confidence interval for OR

It is available in SAS proc freq, which computes exact confidence limits for the odds ratio with an algorithm by Thomas (1971, Applied Statistics 20), i.e., the limits L and U are iterative solutions of the following two equations:

$$\frac{\sum_{i=a}^{m_1} \binom{n_1}{i}\binom{n_2}{m_1 - i} L^i}{\sum_{i=0}^{m_1} \binom{n_1}{i}\binom{n_2}{m_1 - i} L^i} = \alpha/2$$

$$\frac{\sum_{i=0}^{a} \binom{n_1}{i}\binom{n_2}{m_1 - i} U^i}{\sum_{i=0}^{m_1} \binom{n_1}{i}\binom{n_2}{m_1 - i} U^i} = \alpha/2$$

34 Exact may not be the best in categorical data analysis, because the results may be too conservative.

Agresti A. Dealing with discreteness: making exact confidence intervals for proportions, differences of proportions, and odds ratios more exact. Stat Meth Med Res 12(1):

The inversion of the asymptotic score test seems to be a good choice. This tends to have actual level fluctuating around the nominal level. If one prefers that level to be a bit more conservative, mid-p adaptations of exact methods work well. For situations that require a bound on the error, it appears that basing conservative intervals on inverting the exact score test has reasonable performance. For teaching, the Wald-type interval of point estimate plus and minus a normal-score multiple of a standard error is simplest. Unfortunately, this can perform poorly, but simple adjustments sometimes provide much improved performance.

35 Odds ratio versus risk ratio. Most traditional statistical methods in epidemiology were developed in the context of the case-control design. Specifically, OR was the effect measure of choice. Unfortunately, when it comes to prospective or cross-sectional designs, the rare disease assumption may not be satisfied. In such cases, OR becomes very difficult to interpret, and is sometimes misleading (see NEJM 1999;341:279-83). For RR = 2, i.e. $p_1 = 2p_2$, the value of

$$OR = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)}$$

depends on the baseline risk $p_2$.
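To see how far OR can drift from a fixed RR of 2, the following sketch tabulates OR over a range of baseline risks. The grid of p2 values is an arbitrary choice for illustration and is not taken from the original table.

```python
def odds_ratio_from_risks(p1, p2):
    """Odds ratio comparing risk p1 with baseline risk p2."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

RR = 2.0
rows = []
for p2 in (0.01, 0.05, 0.10, 0.20, 0.30, 0.40):
    p1 = RR * p2
    rows.append((p1, p2, odds_ratio_from_risks(p1, p2)))

print("   p1     p2     OR")
for p1, p2, or_value in rows:
    print(f"{p1:5.2f}  {p2:5.2f}  {or_value:5.2f}")
```

With a baseline risk of 1% the odds ratio is about 2.02, but at a baseline risk of 40% the same risk ratio of 2 corresponds to an odds ratio of 6.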

36 My view is that we must always remember why Cornfield (1951) proposed OR. Excellent discussion on the choice of effect measures may be found in Greenland (Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol 1987;125:761 8).

37 Converting OR to RR (Zhang & Yu, 1998 JAMA 280):

$$RR = \frac{OR}{1 - p_2 + p_2\, OR}, \qquad \widehat{RR} = \frac{\widehat{OR}}{1 - p_2 + p_2\, \widehat{OR}} \qquad (2)$$

- Eq. (2) results in a correct point estimate only if there are no confounders;
- Substituting the confidence limits for OR to obtain a CI for RR yields an invalid interval for RR.

More discussion can be found in McNutt et al (2003 Am J Epidemiol 157:940-3) and Zou (2004 Am J Epidemiol 159:702-6).
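Eq. (2) is a one-line function. The sketch below (hypothetical function names, made-up risks) confirms that it recovers the risk ratio exactly when the true baseline risk p2 is plugged in, which is all the formula claims in the absence of confounding.

```python
def or_to_rr(odds_ratio, p2):
    """Zhang & Yu conversion: RR = OR / (1 - p2 + p2 * OR), p2 = risk in the unexposed."""
    return odds_ratio / (1 - p2 + p2 * odds_ratio)

def odds_ratio_from_risks(p1, p2):
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

p1, p2 = 0.8, 0.4                      # illustrative risks, true RR = 2
or_value = odds_ratio_from_risks(p1, p2)
print(f"OR = {or_value:.2f} converts to RR = {or_to_rr(or_value, p2):.2f}")
```

The function deliberately converts only the point estimate; as noted above, applying it to the confidence limits of OR does not give a valid interval for RR.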

38 A little trick to check calculations when a confidence interval is constructed through log-transformation (Lee PN, Stat Med 18): such an interval should satisfy the condition that the square of the point estimate equals the product of the lower and upper limits.

$$l = \exp\left[\ln(\text{point}) - Z\sqrt{\mathrm{var}(\ln \text{point})}\right]$$
$$u = \exp\left[\ln(\text{point}) + Z\sqrt{\mathrm{var}(\ln \text{point})}\right]$$

thus

$$l \times u = \exp(2 \ln \text{point}) = (\text{point})^2$$
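The check is easy to automate. The sketch below applies it to the three intervals reported in these notes; the function name and the 1% relative tolerance (to allow for two-decimal rounding of the published limits) are assumptions.

```python
def log_ci_consistent(point, lower, upper, rel_tol=0.01):
    """True if lower * upper matches point**2 up to a relative rounding tolerance."""
    return abs(lower * upper - point ** 2) <= rel_tol * point ** 2

reported = {
    "RR, maternal age and birthweight": (2.0, 0.96, 4.16),
    "OR, sun protection and melanoma": (0.72, 0.53, 0.98),
    "RR_MH, clinical trial": (2.0, 1.44, 2.77),
}

checks = {label: log_ci_consistent(*values) for label, values in reported.items()}
for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'check the arithmetic'}")
```

A failed check signals an arithmetic or transcription slip; a passed check does not by itself prove that the variance was computed correctly.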

39 Sample size estimation with SAS proc power

    proc power;
      twosamplefreq test=pchi
        relativerisk = 1.5
        refproportion = 0.2
        power = 0.8
        ntotal = .;
    run;

40 The POWER Procedure
Pearson Chi-square Test for Two Proportions

Fixed Scenario Elements
    Distribution                       Asymptotic normal
    Method                             Normal approximation
    Reference (Group 1) Proportion     0.2
    Relative Risk                      1.5
    Nominal Power                      0.8
    Number of Sides                    2
    Null Relative Risk                 1
    Alpha                              0.05
    Group 1 Weight                     1
    Group 2 Weight                     1

Computed N Total
    Actual Power    N Total
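For a rough cross-check outside SAS, a common textbook normal-approximation formula for comparing two proportions with equal group sizes can be coded directly. This is a sketch under that assumption only; it is not claimed to be the exact computation PROC POWER performs, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided test of two proportions (equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    )
    return ceil(numerator ** 2 / (p1 - p2) ** 2)

# Reference proportion 0.2 and relative risk 1.5 imply p2 = 0.3
n_group = n_per_group(0.2, 0.2 * 1.5)
print(f"about {n_group} per group, {2 * n_group} in total")
```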

41
    proc power;
      twosamplefreq test=pchi
        oddsratio = 2.5
        refproportion = 0.3
        groupweights = (1 2)
        ntotal = .
        power = 0.8;
    run;

42 The POWER Procedure
Pearson Chi-square Test for Two Proportions

Fixed Scenario Elements
    Distribution                       Asymptotic normal
    Method                             Normal approximation
    Reference (Group 1) Proportion     0.3
    Odds Ratio                         2.5
    Group 1 Weight                     1
    Group 2 Weight                     2
    Nominal Power                      0.8
    Number of Sides                    2
    Null Odds Ratio                    1
    Alpha                              0.05

Computed N Total
    Actual Power    N Total

43
    proc power;
      twosamplefreq test=fisher
        groupproportions = (.35 .15)
        power = 0.80
        npergroup = .;
    run;

44 The POWER Procedure
Fisher's Exact Conditional Test for Two Proportions

Fixed Scenario Elements
    Distribution                       Exact conditional
    Method                             Walters normal approximation
    Group 1 Proportion                 0.35
    Group 2 Proportion                 0.15
    Nominal Power                      0.8
    Number of Sides                    2
    Alpha                              0.05

Computed N Per Group
    Actual Power    N Per Group


46 Combine information from multiple 2 × 2 tables (Mantel-Haenszel methods)

47
             Outcome
Exposure     Yes       No        Total
Yes          a_k       b_k       n_1k
No           c_k       d_k       n_2k
Total        m_1k      m_2k      n_k

MH test for no association between exposure and outcome (Mantel & Haenszel, 1959 J Natl Cancer Inst 22):

$$\chi^2_{MH} = \frac{\left[\sum_k \left(a_k - \dfrac{m_{1k} n_{1k}}{n_k}\right)\right]^2}{\sum_k \dfrac{m_{1k} m_{2k} n_{1k} n_{2k}}{n_k^2 (n_k - 1)}}$$

which is distributed as chi-square with one degree of freedom under $H_0$.

Five years earlier, Cochran (1954 Biometrics 10) proposed a test that is virtually identical to $\chi^2_{MH}$ (difference?).

When there is a single table (k = 1), $\chi^2_{MH}$ reduces to $(n - 1)/n$ times the Pearson chi-square statistic for a 2 × 2 table, i.e. essentially the Pearson test.
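The statistic can be computed stratum by stratum, as in the sketch below. The function names are hypothetical; the single table used for the demonstration (32, 8, 24, 36) is the Age 65+ stratum of the clinical-trial example later in these notes, and the comparison with the Pearson statistic illustrates the single-table relationship.

```python
def mh_chi_square(tables):
    """Mantel-Haenszel chi-square; tables is a list of (a, b, c, d), one per stratum."""
    total_diff = 0.0
    total_var = 0.0
    for a, b, c, d in tables:
        n1, n2, m1, m2 = a + b, c + d, a + c, b + d
        n = n1 + n2
        total_diff += a - m1 * n1 / n                          # observed minus expected
        total_var += m1 * m2 * n1 * n2 / (n ** 2 * (n - 1))    # hypergeometric variance
    return total_diff ** 2 / total_var

def pearson_chi_square(a, b, c, d):
    """Uncorrected Pearson chi-square for one 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

stratum = (32, 8, 24, 36)
mh = mh_chi_square([stratum])
pearson = pearson_chi_square(*stratum)
print(f"MH = {mh:.3f}, Pearson = {pearson:.3f}, ratio = {mh / pearson:.3f}")
```

For this table the ratio is 99/100, reflecting the factor $n_k - 1$ in place of $n_k$ in the variance.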

48 Mantel-Haenszel odds ratio estimator (1959)

$$\widehat{OR}_{MH} = \frac{\sum_k a_k d_k / n_k}{\sum_k b_k c_k / n_k}$$

where $a_k, b_k, c_k, d_k$ and $n_k$ are the cell counts and total of the k-th table, as before.

For 20 years, nobody knew what the standard error of $\widehat{OR}_{MH}$ was. Hauck (1979, Biometrics 35) provided a formula that is valid when each table is large.

49 The popular variance formula for $\widehat{OR}_{MH}$ is the one derived by Robins, Breslow, & Greenland (1986, Biometrics 42). Recall

$$\widehat{OR}_{MH} = \frac{\sum_k a_k d_k / n_k}{\sum_k b_k c_k / n_k} = \frac{\sum_k R_k}{\sum_k S_k}$$

Define two more terms, $P_k = (a_k + d_k)/n_k$ and $Q_k = (b_k + c_k)/n_k$. Then

$$\widehat{\mathrm{var}}[\ln(\widehat{OR}_{MH})] = \frac{\sum_k P_k R_k}{2\left(\sum_k R_k\right)^2} + \frac{\sum_k (P_k S_k + Q_k R_k)}{2\left(\sum_k R_k\right)\left(\sum_k S_k\right)} + \frac{\sum_k Q_k S_k}{2\left(\sum_k S_k\right)^2}$$

50 The $(1 - \alpha)100\%$ CI for OR is

$$\exp\left[\ln\left(\widehat{OR}_{MH}\right) \pm Z_{1-\alpha/2} \sqrt{\widehat{\mathrm{var}}[\ln(\widehat{OR}_{MH})]}\right]$$
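A compact sketch of the Mantel-Haenszel odds ratio with the Robins-Breslow-Greenland variance follows. The function name is hypothetical. The demonstration uses the power-line data from the example in these notes: Study 1 counts are as printed, while the Study 2 counts are an assumption obtained by subtracting Study 1 from the unstratified table.

```python
from math import exp, log, sqrt

def mh_odds_ratio(tables, z=1.96):
    """Mantel-Haenszel OR with a Robins-Breslow-Greenland CI; tables = [(a, b, c, d), ...]."""
    sum_r = sum_s = sum_pr = sum_ps_qr = sum_qs = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r, s = a * d / n, b * c / n
        p, q = (a + d) / n, (b + c) / n
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_ps_qr += p * s + q * r
        sum_qs += q * s
    or_mh = sum_r / sum_s
    var_log = (
        sum_pr / (2 * sum_r ** 2)
        + sum_ps_qr / (2 * sum_r * sum_s)
        + sum_qs / (2 * sum_s ** 2)
    )
    half_width = z * sqrt(var_log)
    return or_mh, exp(log(or_mh) - half_width), exp(log(or_mh) + half_width)

strata = [
    (18, 25, 162, 252),   # Study 1, as printed
    (12, 123, 26, 431),   # Study 2, derived by subtraction (assumption)
]
or_mh, lower, upper = mh_odds_ratio(strata)
crude = (18 + 12) * (252 + 431) / ((25 + 123) * (162 + 26))
print(f"crude OR = {crude:.2f}, OR_MH = {or_mh:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With a single stratum the variance collapses to 1/a + 1/b + 1/c + 1/d, so the function then reproduces the ordinary log-based interval for one 2 × 2 table.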

51 Example (case-control study): $\widehat{OR}_{MH}$. Case-control studies on the role of high voltage power lines in the etiology of leukemia in children (Hanley & Thériault, 2000 Epidemiology 11(5): 613)

           Study 1                                    Study 2
           Case          Control       Total          Case         Control       Total
< 100m     a_k = 18      b_k = 25      n_1k = 43      a_k = 12     b_k = 123     n_1k = 135
> 100m     c_k = 162     d_k = 252     n_2k = 414     c_k = 26     d_k = 431     n_2k = 457
Total      m_1k = 180    m_2k = 277    n_k = 457      m_1k = 38    m_2k = 554    n_k = 592

$\widehat{OR}_1 = 1.12$, $\widehat{OR}_2 = 1.62$

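52 If we do not stratify, we have

           Status
           Case        Control
< 100m     a = 30      b = 148
> 100m     c = 188     d = 683

$$\widehat{OR} = \frac{30 \times 683}{148 \times 188} = 0.74,$$

which would suggest that living closer to power lines protects children from leukemia. However,

$$\widehat{OR}_{MH} = \frac{18 \times 252/457 + 12 \times 431/592}{25 \times 162/457 + 123 \times 26/592} = \frac{18.66}{14.26} = 1.31$$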

53 The Mantel-Haenszel technique has also been used to derive an RR estimator (Tarone, 1981 J Chronic Dis 34):

$$\widehat{RR}_{MH} = \frac{\sum_k a_k n_{2k} / n_k}{\sum_k c_k n_{1k} / n_k}$$

with variance given by

$$\widehat{\mathrm{var}}[\ln(\widehat{RR}_{MH})] = \frac{\sum_k \left[n_{1k} n_{2k} m_{1k} - a_k c_k n_k\right] / n_k^2}{\left(\sum_k a_k n_{2k} / n_k\right)\left(\sum_k c_k n_{1k} / n_k\right)}$$

where the cell counts and margins of the k-th table are denoted as before.

54 The $(1 - \alpha)100\%$ CI for RR is

$$\exp\left[\ln\left(\widehat{RR}_{MH}\right) \pm Z_{1-\alpha/2} \sqrt{\widehat{\mathrm{var}}[\ln(\widehat{RR}_{MH})]}\right]$$
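The risk-ratio version can be sketched in the same way. The function name is hypothetical, and the demonstration uses only one stratum (a = 32, b = 8, c = 24, d = 36, the Age 65+ table from the example that follows), because the counts for the other stratum are not available here. With a single stratum the variance reduces to 1/a − 1/n_1 + 1/c − 1/n_2.

```python
from math import exp, log, sqrt

def mh_risk_ratio(tables, z=1.96):
    """Mantel-Haenszel RR with a log-scale CI; tables = [(a, b, c, d), ...]."""
    numerator = denominator = var_top = 0.0
    for a, b, c, d in tables:
        n1, n2 = a + b, c + d
        n = n1 + n2
        m1 = a + c
        numerator += a * n2 / n
        denominator += c * n1 / n
        var_top += (n1 * n2 * m1 - a * c * n) / n ** 2
    rr_mh = numerator / denominator
    var_log = var_top / (numerator * denominator)
    half_width = z * sqrt(var_log)
    return rr_mh, var_log, exp(log(rr_mh) - half_width), exp(log(rr_mh) + half_width)

rr_mh, var_log, lower, upper = mh_risk_ratio([(32, 8, 24, 36)])
print(f"RR = {rr_mh:.2f}, var[ln RR] = {var_log:.5f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Passing several strata as a list gives the pooled estimate; the interval printed here is for the single stratum only and is therefore not the interval reported for the full example.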

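55 Ex 8.5. Example (clinical trial): $\widehat{RR}_{MH}$

         Age 65+                                   Age 65−
Drug     Yes          No           Total           Yes    No    Total
B        a_k = 32     b_k = 8      n_1k = 40
A        c_k = 24     d_k = 36     n_2k = 60
Total    m_1k = 56    m_2k = 44    n_k = 100

$$\chi^2_{MH} = \frac{\left[\sum_k \left(a_k - \dfrac{n_{1k} m_{1k}}{n_k}\right)\right]^2}{\sum_k \dfrac{n_{1k} n_{2k} m_{1k} m_{2k}}{n_k^2 (n_k - 1)}} = 18.435$$

where $n_k - 1 = 100 - 1$ in each of the two strata.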

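56
$$\widehat{RR}_{MH} = \frac{\sum_k a_k n_{2k} / n_k}{\sum_k c_k n_{1k} / n_k} = 2.0$$

$$\widehat{\mathrm{var}}[\ln(\widehat{RR}_{MH})] = \frac{\sum_k (n_{1k} n_{2k} m_{1k} - a_k c_k n_k) / n_k^2}{\left(\sum_k a_k n_{2k} / n_k\right)\left(\sum_k c_k n_{1k} / n_k\right)}$$

The 95% CI is then

$$\exp\left(\ln 2 \pm 1.96\sqrt{\widehat{\mathrm{var}}[\ln(\widehat{RR}_{MH})]}\right) = (1.44, 2.77)$$

Check: $1.44 \times 2.77 \approx 2^2$ (checked).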

57 Summary: Application of Mantel-Haenszel methods
- Adjust for confounding (the original purpose): combat Simpson's paradox;
- Meta-analysis: considering each study as a stratum. This method of meta-analysis is commonly referred to as the fixed-effect model, with the intention of summarizing the available evidence, but not of predicting future study results. For that, a random-effects model must be adopted.

58 A note: "Like fire, the chi-square test is an excellent servant and a bad master" (Hill, 1965 Proc R Soc Med 58).

             Study 1                   Study 2
Exposure     D+     D−     Risk        D+     D−     Risk
E+
E−
RR
OR
p-value


More information

Lecture 8: Summary Measures

Lecture 8: Summary Measures Lecture 8: Summary Measures Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 8:

More information
