Epidemiology Wonders of Biostatistics Chapter 13 - Effect Measures. John Koval

Epidemiology 9509: Wonders of Biostatistics
Chapter 13 - Effect Measures
John Koval
Department of Epidemiology and Biostatistics, University of Western Ontario

What is being covered
1. risk factors
2. risk differences
3. relative odds - odds ratio
4. relative risk - risk ratios

Risk factors
a risk factor is a factor which can lead to a (bad) outcome
since risk factor and outcome are both binary, we can think of risk as the probability that presence of the risk factor leads to the bad outcome
hence smoking is a risk factor for the outcome respiratory disease
think of risk at the two levels of smoking: smokers, $\pi_1$, and non-smokers, $\pi_2$

Risk differences
differences in risk for two levels of the risk factor: $\delta_\pi = \pi_1 - \pi_2$
have already considered this
1. test of hypothesis (two-sided alternative)
   1.1 Fisher exact test
   1.2 test of association/independence with continuity correction, $S_Y$
2. test of hypothesis (one-sided alternative)
   2.1 Fisher exact test
   2.2 test of association/independence, $S_Y$
3. estimation
   3.1 Wilson/Adjusted Wald estimators for $\pi_1$, $\pi_2$
   3.2 then Newcombe combination of these two into an estimator for $\pi_1 - \pi_2$
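The estimation steps in item 3 can be sketched in Python. This is a minimal sketch, assuming the plain Wilson score interval for each proportion (not the adjusted Wald variant) and Newcombe's hybrid combination; the function names are hypothetical. Applied to the 2x2 table used for the examples later in this chapter (15 of 23 versus 10 of 22), it should land close to the interval quoted in the summary of estimates, (-0.088, 0.442); small differences in the third decimal are possible depending on which single-proportion interval is used.

```python
import math


def wilson_interval(x, n, z=1.96):
    """Wilson score interval for a single proportion x/n."""
    p = x / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def newcombe_risk_difference(x1, n1, x2, n2, z=1.96):
    """Risk difference p1 - p2 with Newcombe's hybrid score interval."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1 - p2
    # Combine the distances from each estimate to its own Wilson limits.
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


diff, lower, upper = newcombe_risk_difference(15, 23, 10, 22)
print(f"risk difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```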

Relative Odds
odds: $\omega_i = \pi_i / (1 - \pi_i)$
eg $\pi_1 = 0.6$, so that $(1 - \pi_1) = 0.4$ and odds $\omega_1 = 1.5$, often quoted as 3:2
relative odds: $\phi = \omega_1 / \omega_2$, the relative odds for group 1 compared to group 2
eg $\pi_2 = 0.5$, so $(1 - \pi_2) = 0.5$ and odds $\omega_2 = 1$ (1:1)
relative odds $\phi = (1.5)/(1) = 1.5$

Odds ratio - estimating the Relative Odds
odds: $o_i = p_i / (1 - p_i)$
eg $p_1 = 0.6$, so that $(1 - p_1) = 0.4$ and odds $o_1 = 1.5$, often quoted as 3:2
odds ratio: $OR = o_1 / o_2$, the odds ratio for group 1 compared to group 2
eg $p_2 = 0.5$, so $(1 - p_2) = 0.5$ and odds $o_2 = 1$ (1:1)
odds ratio $OR = (1.5)/(1) = 1.5$
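The arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch; the function names `odds` and `odds_ratio` are hypothetical helpers, not part of the course material.

```python
def odds(p):
    """Odds corresponding to a proportion p, for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_ratio(p1, p2):
    """Odds ratio for group 1 compared to group 2."""
    return odds(p1) / odds(p2)


# Slide example: p1 = 0.6 and p2 = 0.5; both results are approximately 1.5.
print(round(odds(0.6), 4))
print(round(odds_ratio(0.6, 0.5), 4))
```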

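shortcut computation of Odds Ratio
if the entries in the 2x2 contingency table are a, b, c, d, then $p_1 = a/(a+b)$ and $p_2 = c/(c+d)$
so that $o_1 = \frac{a}{a+b} / \frac{b}{a+b} = \frac{a}{b}$ and $o_2 = \frac{c}{c+d} / \frac{d}{c+d} = \frac{c}{d}$
then $OR = \frac{o_1}{o_2} = \frac{a}{b} / \frac{c}{d} = \frac{ad}{bc}$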
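As a quick check of the shortcut, the sketch below computes the odds ratio both ways, from the cross-product ad/bc and from the two proportions, and confirms they agree. The function names are hypothetical, and the cell counts are the ones used in the chapter's worked example (a=15, b=8, c=10, d=12).

```python
def odds_ratio_cross_product(a, b, c, d):
    """Shortcut: OR = ad / bc for a 2x2 table with rows (a, b) and (c, d)."""
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero")
    return (a * d) / (b * c)


def odds_ratio_via_proportions(a, b, c, d):
    """Long way: odds from p1 = a/(a+b) and p2 = c/(c+d), then their ratio.

    Assumes b and d are non-zero so that neither proportion equals 1.
    """
    p1 = a / (a + b)
    p2 = c / (c + d)
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


print(odds_ratio_cross_product(15, 8, 10, 12))
print(round(odds_ratio_via_proportions(15, 8, 10, 12), 4))
```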

Inference - test of hypothesis
test of $\phi = 1$, ie of $\omega_1 = \omega_2$, ie of $\pi_1 = \pi_2$
1. Fisher exact test
2. test of association, $S_Y$

inference - confidence interval
can use the odds ratio, OR, to estimate $\phi$, the relative odds
need its standard error: $se(OR) = OR \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$
useful only for very large samples
example: a=15, b=8, c=10, d=12
$OR = \frac{ad}{bc} = \frac{15(12)}{8(10)} = 2.25$
$se(OR) = 2.25 \sqrt{\frac{1}{15} + \frac{1}{8} + \frac{1}{10} + \frac{1}{12}} = 2.25 \sqrt{0.375} = 2.25(0.6124) = 1.3778$

confidence interval (continued)
95% confidence interval: $2.25 \pm 1.96(1.3778) = 2.25 \pm 2.70 = (-0.45, 4.95)$
a very strange interval: the lower limit is negative, although an odds ratio cannot be negative
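A short sketch shows why the untransformed interval is strange: for the example table the lower limit falls below zero, which is impossible for an odds ratio. The function name `wald_ci_odds_ratio` is a hypothetical helper, and z = 1.96 is the usual normal quantile for a 95% interval.

```python
import math


def wald_ci_odds_ratio(a, b, c, d, z=1.96):
    """Naive interval OR +/- z * se(OR), computed on the odds-ratio scale."""
    or_hat = (a * d) / (b * c)
    se = or_hat * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_hat, se, (or_hat - z * se, or_hat + z * se)


or_hat, se, (lower, upper) = wald_ci_odds_ratio(15, 8, 10, 12)
print(f"OR = {or_hat:.2f}, se = {se:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("lower limit is negative:", lower < 0)
```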

confidence interval (better)
use $l = \log(OR)$ (natural log) and its standard error
$l = \log(OR) = \log(2.25) = 0.811$
$se(l) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{\frac{1}{15} + \frac{1}{8} + \frac{1}{10} + \frac{1}{12}} = 0.6124$
95% CI: $0.811 \pm 1.96(0.6124) = 0.811 \pm 1.200 = (-0.389, 2.011)$
transform back (exponentiate): (0.68, 7.47)
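The log-scale calculation can be sketched as follows. The function name `log_ci_odds_ratio` is hypothetical; the code follows the steps on this slide (take the log, add and subtract 1.96 standard errors, exponentiate) and, for the example table, should give limits of about 0.68 and 7.47.

```python
import math


def log_ci_odds_ratio(a, b, c, d, z=1.96):
    """Confidence interval for the odds ratio, built on the log scale."""
    or_hat = (a * d) / (b * c)
    log_or = math.log(or_hat)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se_log)
    upper = math.exp(log_or + z * se_log)
    return or_hat, (lower, upper)


or_hat, (lower, upper) = log_ci_odds_ratio(15, 8, 10, 12)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is exponentiated, both limits are guaranteed to be positive, and the interval is asymmetric around the point estimate.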

Relative Risk
if $\pi_1$ and $\pi_2$ are risks, the Relative Risk is $\pi_1 / \pi_2$
if $p_1$ and $p_2$ are observed proportions, the Risk Ratio $RR = p_1 / p_2$ is the point estimator of the Relative Risk
for example, for a, b, c, d: $p_1 = \frac{a}{a+b}$, $p_2 = \frac{c}{c+d}$, so $RR = \frac{a}{a+b} / \frac{c}{c+d}$
example: $RR = \frac{15}{23} / \frac{10}{22} = 1.4348$

Relative Risk: test of hypothesis
test of $H_0$: Relative Risk = 1, ie $H_0$: $\pi_1 / \pi_2 = 1$
can be rewritten as $H_0$: $\pi_1 = \pi_2$
same hypothesis as for the Risk Difference
hence use the same tests:
1. Fisher's Exact Test
2. $S_Y$, the Yates continuity-corrected version of the Pearson test

confidence interval for Relative Risk
again, using $RR \pm 1.96\,se(RR)$ produces a strange interval for small samples
use $l_{RR} = \log(RR)$ and its standard error
$se(l_{RR}) = \sqrt{\frac{1 - p_1}{n_1 p_1} + \frac{1 - p_2}{n_2 p_2}}$
for contingency table entries a, b, c, d
$se(l_{RR}) = \sqrt{\frac{b}{a(a+b)} + \frac{d}{c(c+d)}}$

example of RR estimation
$RR = 1.4348$
$l_{RR} = \log(1.4348) = 0.3610$
$se(l_{RR}) = \sqrt{\frac{8}{15(23)} + \frac{12}{10(22)}} = \sqrt{0.023188 + 0.054545} = \sqrt{0.077733} = 0.27881$
95% confidence interval: $0.3610 \pm 1.96(0.27881) = 0.3610 \pm 0.54646 = (-0.18545, 0.90742)$
exponentiate to get the 95% CI for the relative risk: (0.831, 2.478)
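The same steps can be sketched in Python for the risk ratio. The function name `log_ci_risk_ratio` is hypothetical; the standard error uses the contingency-table form $\sqrt{b/(a(a+b)) + d/(c(c+d))}$, and for the example table the result should be close to (0.831, 2.478).

```python
import math


def log_ci_risk_ratio(a, b, c, d, z=1.96):
    """Risk ratio p1/p2 with a confidence interval built on the log scale."""
    p1 = a / (a + b)
    p2 = c / (c + d)
    rr = p1 / p2
    se_log = math.sqrt(b / (a * (a + b)) + d / (c * (c + d)))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, se_log, (lower, upper)


rr, se_log, (lower, upper) = log_ci_risk_ratio(15, 8, 10, 12)
print(f"RR = {rr:.4f}, se(log RR) = {se_log:.4f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
```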

summary of estimates

Parameter         point estimate   interval estimate
Risk difference   0.198            (-0.088, 0.442)
Relative odds     2.250            (0.68, 7.47)
Relative risk     1.435            (0.831, 2.478)

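SAS for effects

title 'advanced contingency table';
* r = row (risk group), o = outcome column, freq = cell count;
DATA marj;
  INPUT r o freq;
  DATALINES;
0 0 15
0 1 8
1 0 10
1 1 12
;
* WEIGHT tells PROC FREQ that each data line is a cell count;
PROC FREQ DATA=marj;
  WEIGHT freq;
  TABLES r*o / CHISQ RISKDIFF RELRISK NOROW NOCOL NOPERCENT;
RUN;

add RELRISK to get estimates of Relative odds AND Relative Risk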

Output of SAS effects program

The FREQ Procedure
Table of r by o

r            o
Frequency        0      1   Total
0               15      8      23
1               10     12      22
Total           25     20      45

Statistics for Table of r by o

Statistic                      DF     Value     Prob
-----------------------------------------------------
Chi-Square                      1    1.7787   0.1823
Likelihood Ratio Chi-Square     1    1.7900   0.1809
Continuity Adj. Chi-Square      1    1.0683   0.3013
Mantel-Haenszel Chi-Square      1    1.7391   0.1872
Phi Coefficient                      0.1988
Contingency Coefficient              0.1950
Cramer's V                           0.1988
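The Chi-Square and Continuity Adj. Chi-Square lines of this output can be reproduced by hand. The sketch below is an illustration only, using the standard closed-form 2x2 expressions for the Pearson statistic and its Yates-corrected version; the function name is hypothetical, and the p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square (optionally Yates-corrected) for a 2x2 table.

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    num = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by n/2, but not below zero.
        num = max(0.0, num - n / 2.0)
    stat = n * num ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


for label, use_yates in (("Pearson", False), ("Yates", True)):
    stat, p = chi_square_2x2(15, 8, 10, 12, yates=use_yates)
    print(f"{label}: chi-square = {stat:.4f}, p = {p:.4f}")
```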

Output of SAS effects program II

Fisher's Exact Test
-----------------------------------
Cell (1,1) Frequency (F)         15
Left-sided Pr <= F           0.9493
Right-sided Pr >= F          0.1507

Table Probability (P)        0.1000
Two-sided Pr <= P            0.2362
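The Fisher's Exact Test figures can be recomputed from the hypergeometric distribution of the (1,1) cell with the table margins held fixed. The following is a minimal sketch with a hypothetical function name; it assumes the two-sided p-value sums the probabilities of all tables no more likely than the observed one, which is what the label "Two-sided Pr <= P" suggests.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher exact probabilities for a 2x2 table with rows (a, b) and (c, d)."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = comb(n, col1)
    # Possible values of the (1,1) cell given the fixed margins.
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / total
        for x in range(lo, hi + 1)
    }
    p_obs = probs[a]
    return {
        "table": p_obs,
        "left": sum(p for x, p in probs.items() if x <= a),
        "right": sum(p for x, p in probs.items() if x >= a),
        # Small relative tolerance guards against floating-point ties.
        "two_sided": sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)),
    }


result = fisher_exact_2x2(15, 8, 10, 12)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```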

Output of SAS effects program III

Column 1 Risk Estimates
                                (Asymptotic) 95%      (Exact) 95%
              Risk      ASE     Confidence Limits     Confidence Limits
------------------------------------------------------------------------
Row 1         0.6522    0.0993   0.4575    0.8468     0.4273    0.8362
Row 2         0.4545    0.1062   0.2465    0.6626     0.2439    0.6779
Total         0.5556    0.0741   0.4104    0.7007     0.4000    0.7036
Difference    0.1976    0.1454  -0.0873    0.4825

Column 2 Risk Estimates
                                (Asymptotic) 95%      (Exact) 95%
              Risk      ASE     Confidence Limits     Confidence Limits
------------------------------------------------------------------------
Row 1         0.3478    0.0993   0.1532    0.5425     0.1638    0.5727
Row 2         0.5455    0.1062   0.3374    0.7535     0.3221    0.7561
Total         0.4444    0.0741   0.2993    0.5896     0.2964    0.6000
Difference   -0.1976    0.1454  -0.4825    0.0873

Output of SAS effects program IV

Estimates of the Relative Risk (Row1/Row2)

Type of Study                  Value    95% Confidence Limits
--------------------------------------------------------------
Case-Control (Odds Ratio)     2.2500    0.6775    7.4720
Cohort (Col1 Risk)            1.4348    0.8307    2.4780
Cohort (Col2 Risk)            0.6377    0.3239    1.2553

Sample Size = 45