Epidemiology Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures. John Koval


1 Epidemiology 9509 Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures John Koval Department of Epidemiology and Biostatistics University of Western Ontario

2 What is being covered
1. Matching, Pairing
2. Inference - tests
3. Inference - estimation

3 Proportions in subjects paired or matched
- twins, eyes, sibs, etc.
- like the t-test, except that the outcome is binary
- want to look at the relationship between factor a and factor b

                      factor a absent: factor b
factor a present:
factor b              Present    Absent    Total
Present               a          b         $r_1$
Absent                c          d         $r_2$
Total                 $c_1$      $c_2$     n

4 Inference - test of hypothesis
Test for association:
$$S_{Mc} = \frac{(|b - c| - 1)^2}{b + c}$$
- called McNemar's test
- based on discordant pairs
- the 1 is a continuity correction
- under $H_o$, $S_{Mc}$ is distributed as $\chi^2_1$
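A minimal Python sketch of the statistic on this slide, using only the standard library. The function name `mcnemar_corrected` is illustrative, not from the slides; the counts passed in the usage line (b = 19, c = 5) are the discordant pairs of the chapter's smoking example. The p-value uses the fact that a $\chi^2_1$ variable is a squared standard normal.

```python
import math
from statistics import NormalDist


def mcnemar_corrected(b, c):
    """Continuity-corrected McNemar statistic and its two-sided p-value.

    b, c are the two discordant-pair counts of the paired 2x2 table.
    """
    if b + c == 0:
        raise ValueError("McNemar's test needs at least one discordant pair")
    s_mc = (abs(b - c) - 1) ** 2 / (b + c)
    # Pr(chi-square_1 > s) equals 2 * Pr(Z > sqrt(s)) for a standard normal Z.
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(s_mc)))
    return s_mc, p_value


s_mc, p_value = mcnemar_corrected(19, 5)
print(f"S_Mc = {s_mc:.3f}, p = {p_value:.4f}")
```

Only the discordant counts enter the statistic; the concordant cells a and d play no role in the test.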

5 Dependent sample - example
- smokers/nonsmokers get respiratory disease
- paired on gender and age
- 100 pairs
- under $H_o$, $S_{Mc}$ is distributed as $\chi^2_1$

                        Non-smoker: respiratory disease
Smoker:
respiratory disease     Present    Absent    Total
Present                 15         19        34
Absent                  5          61        66
Total                   20         80        100

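6 Dependent sample - example (continued)
$$S_{Mc} = \frac{(|19 - 5| - 1)^2}{19 + 5} = \frac{13^2}{24} = \frac{169}{24} = 7.04$$
Under $H_o$, $S_{Mc} \sim \chi^2_1$ or, taking the square root,
p-value $= 2\Pr(Z_N > 2.65) = 2(1 - \Phi(2.65)) = 2(0.0040) = 0.0080$
Hence at $\alpha = 0.05$ conclude that smoking is related to respiratory disease.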

7 - derivation of test
How does this work?
1. think of factor a as defining two populations: factor a present, factor a absent
2. we have a sample of size n from each population
3. when factor a is present, $r_1$ of them have factor b present
4. when factor a is absent, $c_1$ of them have factor b present
5. if $\pi_1$ is the probability of factor b when a is present, and $\pi_2$ is the probability of factor b when a is absent, we are interested in $\pi_1 - \pi_2$

8 - derivation of test II
1. $\pi_1 - \pi_2$ is estimated by $\frac{r_1}{n} - \frac{c_1}{n} = \frac{(a+b) - (a+c)}{n} = \frac{b - c}{n}$
2. test $H_o: \pi_1 - \pi_2 = 0$
3. under $H_o$, any of the b results has the same probability as any of the c results, hence binomial$(b+c, 0.5)$
   - this is the sign test
   - this is an exact test

9 - derivation of test III
1. normal approximation $N\left((b+c)0.5,\ (b+c)(0.5)^2\right)$
   p-value $= 2\Pr\left(Z_N > \frac{|b - (b+c)0.5|}{\sqrt{(b+c)(0.5)^2}}\right) = 2\Pr\left(Z_N > \frac{|b - c|}{\sqrt{b+c}}\right)$
2. need continuity correction: $\frac{|b - c| - 1}{\sqrt{b+c}}$
3. more familiar form: square to get $S_{Mc} = \frac{(|b - c| - 1)^2}{b+c}$
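The simplification in step 1 and the squaring in step 3 can be written out in full. The block below uses only the slide's own symbols: it subtracts the null mean $(b+c)/2$ from $b$, divides by the null standard deviation $\sqrt{b+c}/2$, and then squares the continuity-corrected ratio.

```latex
\[
\frac{b - \tfrac{1}{2}(b+c)}{\sqrt{(b+c)\left(\tfrac{1}{2}\right)^{2}}}
  = \frac{\tfrac{1}{2}(b-c)}{\tfrac{1}{2}\sqrt{b+c}}
  = \frac{b-c}{\sqrt{b+c}},
\qquad
\left(\frac{|b-c|-1}{\sqrt{b+c}}\right)^{2}
  = \frac{(|b-c|-1)^{2}}{b+c}
  = S_{Mc}.
\]
```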

10 - example (again)
Exact test using binomial$(b+c, 0.5)$, here Bin$(24, 0.5)$:
$$p = 2\Pr(X_B \ge 19) = 2\Pr(X_B \le 5) = 2\sum_{i=0}^{5}\binom{24}{i}(0.5)^{24} = 2(0.0032) = 0.0064$$
(tail probability from Appendix B)
The normal approximation (p-value = 0.0080) was conservative.
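The exact tail sum can also be evaluated directly rather than read from a table. The sketch below is one way to do it with Python's standard library; `exact_mcnemar_p` is an illustrative name, not from the slides. Direct evaluation with b = 19 and c = 5 gives roughly 0.0066, which can differ in the fourth decimal place from a value built from rounded table entries.

```python
from math import comb


def exact_mcnemar_p(b, c):
    """Two-sided exact (sign test) p-value based on Binomial(b + c, 0.5)."""
    n = b + c
    if n == 0:
        raise ValueError("need at least one discordant pair")
    k = min(b, c)
    # Lower tail Pr(X <= k); by symmetry of Binomial(n, 0.5) it equals
    # the upper tail Pr(X >= n - k), so the two-sided p-value doubles it.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(f"exact p = {exact_mcnemar_p(19, 5):.4f}")
```

The `min(1.0, ...)` guard matters only when b and c are equal or nearly so, where doubling the tail would exceed 1.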

11 - estimation
1. Relative odds, $\psi$
2. Risk difference, $\delta_\pi$
3. Relative Risk (????)

12 - estimating Relative odds
Derivation of estimator
- $b + c$ discordant pairs
- binomial: $\Pr(b \mid b+c;\ \pi) = \binom{b+c}{b}\pi^b(1-\pi)^c$
- $\hat\pi = \frac{b}{b+c}$
- $\pi$ can be written $\pi = \frac{\pi_1(1-\pi_2)}{\pi_1(1-\pi_2) + \pi_2(1-\pi_1)}$
- similarly, $\psi$ can be written $\psi = \frac{\pi_1(1-\pi_2)}{\pi_2(1-\pi_1)}$

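13 - estimating Relative odds
Derivation of estimator (continued)
- solving for $\pi$ in terms of $\psi$:
  $$\pi = \frac{1}{1 + \frac{\pi_2(1-\pi_1)}{\pi_1(1-\pi_2)}} = \frac{1}{1 + \psi^{-1}} = \frac{\psi}{1+\psi}$$
- solving for $\psi$ in terms of $\pi$: $(1+\psi)\pi = \psi$, so that
  $$\psi = \frac{\pi}{1-\pi}$$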

14 - estimating Relative odds
Derivation of estimator (continued again)
$\hat\pi = \frac{b}{b+c}$, so that
$$\hat\psi = \frac{b/(b+c)}{1 - [b/(b+c)]} = \frac{b}{c}$$

15 estimating relative odds - confidence interval
- estimated standard error of $\hat\psi$: $\widehat{OR}\sqrt{\frac{1}{b} + \frac{1}{c}}$
- approximate $100(1-\alpha)\%$ confidence interval:
  $$\frac{b}{c} \pm z_{\alpha/2}\,\frac{b}{c}\sqrt{\frac{b+c}{bc}}$$
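As a rough illustration, the interval on this slide can be computed as follows. The function name `paired_or_wald_ci` and the default `alpha` are assumptions made for the sketch; the formula is the one above, with $\sqrt{1/b + 1/c} = \sqrt{(b+c)/(bc)}$.

```python
import math
from statistics import NormalDist


def paired_or_wald_ci(b, c, alpha=0.05):
    """Matched-pairs odds ratio b/c with the simple symmetric interval."""
    if b <= 0 or c <= 0:
        raise ValueError("both discordant counts must be positive")
    psi_hat = b / c
    se = psi_hat * math.sqrt(1 / b + 1 / c)  # same as (b/c) * sqrt((b+c)/(b*c))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return psi_hat, (psi_hat - z * se, psi_hat + z * se)


psi_hat, (lower, upper) = paired_or_wald_ci(19, 5)
print(f"OR = {psi_hat:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric around $\hat\psi$ on the odds-ratio scale, nothing prevents the lower limit from falling below zero when the standard error is large.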

16 estimating relative odds - confidence interval (continued)
Better approximation
- $p = b/(b+c)$ is the proportion of discordant pairs in which the person with the risk factor has the outcome present
- p is the observed proportion of successes in binomial$(b+c, \pi)$
- $\mathrm{Var}(P) = \frac{\pi(1-\pi)}{b+c}$
- estimate $\pi$ with p, that is, $\frac{b}{b+c}$:
  $$\widehat{\mathrm{Var}}(P) = \frac{\frac{b}{b+c}\cdot\frac{c}{b+c}}{b+c} = \frac{bc}{(b+c)^3}$$

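17 relative odds - confidence interval (again)
$$P_L = p - \frac{1}{2(b+c)} - z_{\alpha/2}\sqrt{\frac{bc}{(b+c)^3}}$$
$$P_U = p + \frac{1}{2(b+c)} + z_{\alpha/2}\sqrt{\frac{bc}{(b+c)^3}}$$
Transform back:
$$OR_L = \frac{P_L}{1 - P_L} \qquad OR_U = \frac{P_U}{1 - P_U}$$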
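The two-step recipe (limits for the proportion first, then the transform to the odds scale) can be sketched in Python as below. The function name `paired_or_ci_via_proportion` and the clamping of the limits to the interval [0, 1] are choices made for this sketch, not something stated on the slides.

```python
import math
from statistics import NormalDist


def paired_or_ci_via_proportion(b, c, alpha=0.05):
    """Odds-ratio limits obtained by transforming limits for p = b/(b+c)."""
    m = b + c
    if m == 0:
        raise ValueError("need at least one discordant pair")
    p = b / m
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Continuity correction 1/(2m) plus z times the estimated sd of p.
    half_width = 1 / (2 * m) + z * math.sqrt(b * c / m ** 3)
    p_low = max(p - half_width, 0.0)   # assumption: clamp to a valid proportion
    p_high = min(p + half_width, 1.0)
    or_low = p_low / (1 - p_low)
    or_high = math.inf if p_high >= 1.0 else p_high / (1 - p_high)
    return (p_low, p_high), (or_low, or_high)


(p_low, p_high), (or_low, or_high) = paired_or_ci_via_proportion(19, 5)
print(f"P: ({p_low:.4f}, {p_high:.4f})  OR: ({or_low:.2f}, {or_high:.1f})")
```

Since $p/(1-p)$ is increasing in p, the transformed limits stay non-negative, and the upper limit grows very quickly as $P_U$ approaches 1.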

18 estimating risk difference
- null hypothesis rejected, so the previous formula for the se no longer holds
- $d_p = \frac{b - c}{n}$
- standard error:
  $$se(d_p) = \frac{\sqrt{(a+d)(b+c) + 4bc}}{n\sqrt{n}}$$
- approximate $100(1-\alpha)\%$ confidence interval:
  $$d_p \pm \left(\frac{1}{n} + z_{\alpha/2}\,se\right)$$
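A short Python sketch of these formulas follows. The function name `paired_risk_difference` is illustrative; the usage line takes the cell counts of the chapter's smoking example (a = 15, b = 19, c = 5, d = 61), for which the standard-error formula evaluates to about 0.047.

```python
import math
from statistics import NormalDist


def paired_risk_difference(a, b, c, d, alpha=0.05):
    """Risk difference (b - c)/n for paired data, its se and approximate CI."""
    n = a + b + c + d
    d_p = (b - c) / n
    se = math.sqrt((a + d) * (b + c) + 4 * b * c) / (n * math.sqrt(n))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = 1 / n + z * se  # 1/n is the continuity correction
    return d_p, se, (d_p - half_width, d_p + half_width)


d_p, se, (lower, upper) = paired_risk_difference(15, 19, 5, 61)
print(f"d_p = {d_p:.2f}, se = {se:.4f}, CI = ({lower:.4f}, {upper:.4f})")
```

The numerator is the expansion of $n(b+c) - (b-c)^2$, which is why all four cells appear in the standard error even though only b and c enter the point estimate.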

19 Paired samples - example (again)
- the odds ratio: $b/c = 19/5 = 3.80$
- 95% confidence interval:
  $$3.80 \pm 1.96(3.80)\sqrt{\frac{24}{(19)(5)}} = 3.80 \pm 7.45(0.50) = (0.06, 7.54)$$
- very approximate
- disagrees with the significance test
- can produce confidence intervals with negative values

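20 Paired samples - example (continued)
$$P_L = \frac{19}{24} - \frac{1}{2(24)} - 1.96\sqrt{\frac{(19)(5)}{24^3}} = 0.7917 - 0.0208 - 1.96(0.0829) = 0.7917 - 0.0208 - 0.1625 = 0.6084$$
$$P_U = \frac{19}{24} + \frac{1}{2(24)} + 1.96\sqrt{\frac{(19)(5)}{24^3}} = 0.7917 + 0.0208 + 0.1625 = 0.9750$$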

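21 Paired samples - example (continued again)
$$OR_L = \frac{0.6084}{1 - 0.6084} = \frac{0.6084}{0.3916} = 1.55$$
$$OR_U = \frac{0.9750}{1 - 0.9750} = \frac{0.9750}{0.0250} = 39.0$$
- approximate interval is (1.55, 39.0)
- 39.0 is very large; the approximation may not be very good for this boundary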

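22 Paired samples - example (ongoing)
- risk difference: $\frac{b - c}{n} = \frac{19 - 5}{100} = 0.14$
- standard error:
  $$se = \frac{\sqrt{(15+61)(19+5) + 4(19)(5)}}{100\sqrt{100}} = \frac{\sqrt{2204}}{1000} = 0.04695$$
- approximate 95% confidence interval:
  $$0.14 \pm \left(\frac{1}{100} + 1.96(0.04695)\right) = 0.14 \pm 0.1020$$
  that is, (0.0380, 0.2420).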

23 Paired samples - example (wrong analysis)
Unmatched analysis for a matched study

            Disease
Smoker      Present    Absent    Total
Yes         34         66        100
No          20         80        100
Total       54         146       200

24 Paired samples - example (unmatched analysis II)
- odds ratio $\frac{34(80)}{66(20)} = 2.06$, much smaller than 3.80 from the correct table
- $S_Y = \frac{200\left(|34(80) - 66(20)| - 200/2\right)^2}{100(100)(54)(146)} = 4.287$, smaller than the McNemar test $S_{Mc} = 7.04$
- p-value $= 2\Pr(Z_N > \sqrt{4.287}) = 2\Pr(Z_N > 2.071) = 0.0384$, larger than calculated from the correct table
- the unmatched analysis provides weaker results; it is not as powerful
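For comparison, the unmatched calculation can be sketched in Python as below. The function name `unmatched_2x2` and its argument order (exposed with disease, exposed without, unexposed with, unexposed without) are assumptions of this sketch; the usage line takes the marginal counts quoted on this slide (34, 66, 20, 80).

```python
import math
from statistics import NormalDist


def unmatched_2x2(n11, n12, n21, n22):
    """Odds ratio, Yates-corrected chi-square and p-value for a 2x2 table."""
    n = n11 + n12 + n21 + n22
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    odds_ratio = (n11 * n22) / (n12 * n21)
    s_y = n * (abs(n11 * n22 - n12 * n21) - n / 2) ** 2 / (row1 * row2 * col1 * col2)
    # One degree of freedom, so the p-value is 2 * Pr(Z > sqrt(S_Y)).
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(s_y)))
    return odds_ratio, s_y, p_value


odds_ratio, s_y, p_value = unmatched_2x2(34, 66, 20, 80)
print(f"OR = {odds_ratio:.2f}, S_Y = {s_y:.3f}, p = {p_value:.4f}")
```

Treating the 100 pairs as 200 independent subjects throws away the pairing, which is why both the odds ratio and the test statistic come out weaker than in the matched analysis.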

25 SAS program
    title1 'Matched pairs';
    title2 'McNemar test';
    /* 0/1 codes are displayed as yes/no */
    proc format;
      value disease 0 = 'yes' 1 = 'no';
    run;
    data mary;
      label ns = 'non smoker' sm = 'smoker';
      input ns sm freq;
      format ns disease.;
      format sm disease.;

26 SAS program II
      /* counts reconstructed from the worked example: a=15, c=5, b=19, d=61 */
      datalines;
    0 0 15
    0 1 5
    1 0 19
    1 1 61
    ;
    proc freq;
      tables sm*ns / agree norow nocol nopercent;
      exact mcnem;
      weight freq;
    run;

27 SAS program (continued)
    title2 'unmatched analysis';
    data george;
      label disease = 'disease or not' smoker = 'smoker or not';
      input disease smoker freq;
      format disease disease.;
      format smoker disease.;
      /* counts reconstructed from the unmatched table: 34, 20, 66, 80 */
      datalines;
    0 0 34
    0 1 20
    1 0 66
    1 1 80
    ;
    proc freq;
      tables smoker*disease / chisq norow nocol nopercent;
      weight freq;
    run;

28 SAS program output
Matched pairs
Table of sm by ns

sm(smoker)    ns(non smoker)
Frequency     yes    no     Total
yes           15     19     34
no            5      61     66
Total         20     80     100

McNemar's Test
Statistic (S)
DF            1
Asymptotic
Exact
Sample Size = 100

29 SAS program output II
Matched pairs
Unmatched analysis
The FREQ Procedure
Table of smoker by disease

smoker(smoker or not)    disease(disease or not)
Frequency                yes    no     Total
yes                      34     66     100
no                       20     80     100
Total                    54     146    200

30 SAS program output III
Statistic                       DF    Value    Prob
Chi-Square
Likelihood Ratio Chi-Square
Continuity Adj. Chi-Square

Fisher's Exact Test
Left-sided Pr <= F
Right-sided Pr >= F
Table Probability (P)
Two-sided Pr <= P

Sample Size = 200
