Clustered Measurement in Cluster Randomized Trials

1 Clustered Measurement in Cluster Randomized Trials Robert W. Department of Epidemiology, Biostatistics, and Occupational Health Department of Pediatrics University June 2009

2 Outline Background: Cluster RCTs; Intra-cluster correlation; Measurement in cluster RCTs. Case Study: PROBIT: design and analysis; Results of 6.5 yr follow-up. Model to adjust for clustered measurement: Model. Conclusions and future work: Conclusions.

3 Cluster RCTs Cluster RCTs Clusters (hospitals, schools, communities) randomized to intervention/control Very useful for group-level interventions Useful for individual-level interventions when contamination is a problem Contact between subjects randomized to different groups can create overlap and bias towards the null

4 Intra-cluster correlation Clustering Disadvantage: individuals within the same group are correlated (similar populations, similar intervention delivery, exposure to similar co-interventions). The intra-cluster correlation (ICC) measures the degree of correlation between cluster members: $\rho = \sigma_b^2 / (\sigma^2 + \sigma_b^2)$, where $\sigma^2$ is the within-cluster variance and $\sigma_b^2$ the between-cluster variance.

5 Intra-cluster correlation ICC $\rho = \sigma_b^2 / (\sigma^2 + \sigma_b^2)$. $\rho = 0$ (i.e., $\sigma_b^2 = 0$) means no correlation (independent observations). $\rho = 1$ (i.e., $\sigma^2 = 0$) means complete correlation (within-cluster observations provide no additional information). $\rho$ measures the degree to which individual (within-cluster) observations represent independent observations.
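
The two boundary cases on this slide can be checked with a few lines of Python. This is only a sketch of the ICC formula as defined above; the function name `icc` and the numeric inputs are illustrative, not taken from the trial.

```python
def icc(sigma2_within: float, sigma2_between: float) -> float:
    """Intra-cluster correlation: rho = sigma_b^2 / (sigma^2 + sigma_b^2)."""
    if sigma2_within < 0 or sigma2_between < 0:
        raise ValueError("variances must be non-negative")
    total = sigma2_within + sigma2_between
    if total == 0:
        raise ValueError("total variance must be positive")
    return sigma2_between / total


# Illustrative inputs only (not estimates from any study).
print(icc(10.0, 0.0))    # sigma_b^2 = 0: no correlation, rho = 0
print(icc(0.0, 10.0))    # sigma^2 = 0: complete correlation, rho = 1
print(icc(200.0, 50.0))  # intermediate case, rho = 50 / 250
```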

6 Measurement in cluster RCTs Measurement Issues In most studies, measurement is clustered. Observer: e.g., IQ measures. Instrument: e.g., blood pressure measured with poorly calibrated sphygmomanometers. Inter-observer variability results in correlation between observations measured by the same observer (i.e., $\sigma_b^2 > 0$ and $\rho > 0$).

7 Measurement in cluster RCTs Double Jeopardy Cluster trials: often a single observer per cluster Often prohibitive (location, time) to have single observer for all clusters Problem: observer-driven clustering and inherent clustering are indistinguishable Additive effect of clustering can cause substantial problems

8 PROBIT: design and analysis Design Randomized cluster trial of breastfeeding promotion intervention 31 hospitals in Belarus randomized to promotion or standard care, 17,046 subjects. Original outcomes: infection, growth in first year of life. Significant effect on GI infection; trend for respiratory infection Growth similar in intervention vs. control group

9 PROBIT: design and analysis Analyses Originally matched by centre Analysis to be based on pair-matched centres Data problems, dropouts (two centres) broke matching Revised analysis plan: linear/generalized linear mixed models accounting for clustering by hospital And repeated measures by subject (for some outcomes) Adjusted for cluster-level covariates

10 PROBIT: design and analysis PROBIT - Audit data Audit done for QC/validation purposes Aside: audit detected significant data problems at one site; site excluded from further analyses Design Small number of subjects per site (5-10) Single observer across ALL sites

11 PROBIT: design and analysis Long-term follow up Children followed to age 6.5 (± 0.5) Outcomes: IQ, school performance, behaviour, anthropometry, allergy, dental health Significant effect on verbal IQ; null on most other outcomes Some ICCs MUCH higher than expected ( vs ). Substantial impact on statistical power
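
One way to see why a high ICC costs power is the usual design effect for equal cluster sizes, 1 + (m - 1)ρ. This is a standard approximation and is not stated on the slides. The sketch below assumes equal cluster sizes, uses the original enrolment figures (17,046 subjects, 31 hospitals) rather than the follow-up sample, and uses illustrative ICC values.

```python
def design_effect(mean_cluster_size: float, rho: float) -> float:
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (mean_cluster_size - 1.0) * rho


def effective_sample_size(n_total: float, n_clusters: int, rho: float) -> float:
    """Number of independent observations carrying the same information."""
    m = n_total / n_clusters
    return n_total / design_effect(m, rho)


# Enrolment figures from the design slide; the ICC values are illustrative.
for rho in (0.0, 0.01, 0.31):
    n_eff = effective_sample_size(17046, 31, rho)
    print(f"rho = {rho:.2f}: effective sample size ~ {n_eff:.0f}")
```

Under these assumptions, an ICC of 0.31 leaves an effective sample size of roughly 100, compared with a few thousand at an ICC of 0.01.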

12 PROBIT: design and analysis Future follow up Pre-puberty (9-11 yrs old) follow up underway Anthropometry, blood draws, metabolic syndrome.

13 PROBIT: design and analysis IQ measurement Single observer (trained pediatrician) assessed all children at each site. Pediatricians were trained and calibrated by a central team. Didn't work...

14 Results of 6.5 yr follow-up BMI - almost no clustering?

15 Results of 6.5 yr follow-up Triceps skinfold - some clustering

16 Results of 6.5 yr follow-up IQ measurement - evidence of clustering

17 Results of 6.5 yr follow-up Selected results at 6.5 yrs, ITT analysis. Columns: Outcome (Units), Mean Exp, Mean Ctl, ICC, Adj (95% CI).
BMI (kg/m2): 95% CI (-0.2, 0.3)
Tr skinfold (mm): 95% CI (-1.8, 1.0)
Verbal IQ: 95% CI (0.8, 14.3)
Note: ICC varies substantially. An ICC of 0.31 is implausible based on expected socioeconomic/cultural/genetic clustering. ICC appears to be correlated with the degree of observer influence.

18 Model Random effects models for clustered data: $Y_{ij} = \beta_0 + \beta_1 T + b_i + \epsilon_{ij}$ (1), where $T = 1$ for the intervention group and 0 for the control group, and $b_i$ and $\epsilon_{ij}$ are error terms at the cluster and individual level. Assume $b_i$ and $\epsilon_{ij}$ are normally distributed and independent of each other, with variances $\sigma_b^2$ and $\sigma^2$. Then $\rho = \sigma_b^2 / (\sigma^2 + \sigma_b^2)$ (2).
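
To make model (1) concrete, the sketch below simulates it and recovers the ICC with a one-way ANOVA (method-of-moments) estimator. This is not the mixed-model software used in the trial. All parameter values, the equal cluster sizes, and the alternating allocation of arms are assumptions chosen for illustration.

```python
import random
import statistics


def simulate_trial(k, m, beta0, beta1, sigma2_b, sigma2, seed=0):
    """Simulate Y_ij = beta0 + beta1*T + b_i + e_ij for k clusters of size m."""
    rng = random.Random(seed)
    clusters = []
    for i in range(k):
        t = i % 2  # alternate arms; a stand-in for randomization
        b_i = rng.gauss(0.0, sigma2_b ** 0.5)  # cluster-level error
        ys = [beta0 + beta1 * t + b_i + rng.gauss(0.0, sigma2 ** 0.5)
              for _ in range(m)]
        clusters.append((t, ys))
    return clusters


def anova_icc(clusters):
    """Moment estimate of rho, pooling between-cluster variation within arms."""
    m = len(clusters[0][1])  # equal cluster sizes assumed
    msw = statistics.fmean(statistics.variance(ys) for _, ys in clusters)
    ss_between = 0.0
    for arm in (0, 1):
        means = [statistics.fmean(ys) for t, ys in clusters if t == arm]
        arm_mean = statistics.fmean(means)
        ss_between += sum((x - arm_mean) ** 2 for x in means)
    msb = m * ss_between / (len(clusters) - 2)  # E[msb] = sigma^2 + m*sigma_b^2
    sigma2_b_hat = max(0.0, (msb - msw) / m)
    return sigma2_b_hat / (sigma2_b_hat + msw)


# Hypothetical values: true rho = 50 / (200 + 50) = 0.2.
data = simulate_trial(k=300, m=50, beta0=100.0, beta1=5.0,
                      sigma2_b=50.0, sigma2=200.0, seed=1)
rho_hat = anova_icc(data)
print(f"estimated ICC: {rho_hat:.3f} (true value 0.2)")
```

Centering cluster means within each arm keeps the treatment effect $\beta_1$ from being counted as between-cluster variance.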

19 Model Clustered measurement: $Y_{ij} = \beta_0 + \beta_1 T + b_i + d_i + \epsilon_{ij}$ (3), where $d_i$ is a random effect due to measurement (variance $\sigma_d^2$). $b_i$ and $d_i$ are not directly identifiable at the individual level. Assume $b_i$, $d_i$ and $\epsilon_{ij}$ are normally distributed and mutually independent, with variances $\sigma_b^2$, $\sigma_d^2$ and $\sigma^2$. Then $\rho_c = (\sigma_b^2 + \sigma_d^2) / (\sigma^2 + \sigma_b^2 + \sigma_d^2)$ (4).
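
The identifiability problem can be seen directly from equation (4): only the sum of the two cluster-level variances enters the observed ICC. The sketch below uses hypothetical variance values (not the trial's estimates) to show that two different splits between $\sigma_b^2$ and $\sigma_d^2$ give the same $\rho_c$, and that both exceed the ICC without observer variance.

```python
def icc_clustered(sigma2: float, sigma2_b: float, sigma2_d: float) -> float:
    """rho_c = (sigma_b^2 + sigma_d^2) / (sigma^2 + sigma_b^2 + sigma_d^2)."""
    between = sigma2_b + sigma2_d
    return between / (sigma2 + between)


sigma2 = 200.0  # hypothetical within-cluster variance

# Two different splits of the same cluster-level total (90).
rho_split_a = icc_clustered(sigma2, sigma2_b=50.0, sigma2_d=40.0)
rho_split_b = icc_clustered(sigma2, sigma2_b=70.0, sigma2_d=20.0)

# Same inherent clustering as split A, but no observer variance.
rho_no_observer = icc_clustered(sigma2, sigma2_b=50.0, sigma2_d=0.0)

print(rho_split_a == rho_split_b)          # the splits are indistinguishable
print(rho_no_observer < rho_split_a)       # observer variance inflates the ICC
```

With one observer per cluster, the data alone therefore cannot say how much of the cluster-level variance is measurement.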

20 Model Problem... The observed data are generated by equation 3, but what if $n_i$ subjects per cluster, $n_i < N_i$, are measured by a single auditor for all sites, and we assume these audited measures come from equation 1 (i.e., $\sigma_d^2 = 0$)? $Y_{ija} = \beta_0 + \beta_1 T + b_i + \epsilon_{ij}$ (5). This was done in PROBIT!

21 Model Solution: Bayesian inference. The problem is akin to missing data; Bayesian methods (using WinBUGS) are easily implemented for these problems. Set up: $n_i$ subjects per cluster have both $Y_{ij}$ and $Y_{ija}$; $N_i - n_i$ have $Y_{ij}$ only. For the latter, $Y_{ija}$ is a missing value. Inference details: diffuse priors on all parameters; the model is relatively straightforward to fit.
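
The missing-data set-up can be sketched as a data layout in which every subject has the site observer's measurement and only the audited subset has the auditor's. This shows only how the data might be arranged before fitting, not the WinBUGS model itself. The function name, record layout, and toy values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (cluster i, subject j, treatment T, Y_ij, Y_ija or None if not audited)
Record = Tuple[int, int, int, float, Optional[float]]


def build_records(y: List[List[float]],
                  treatment: List[int],
                  y_audit: Dict[Tuple[int, int], float]) -> List[Record]:
    """Pair each Y_ij with its audited value, leaving None where unaudited."""
    records: List[Record] = []
    for i, ys in enumerate(y):
        for j, y_ij in enumerate(ys):
            records.append((i, j, treatment[i], y_ij, y_audit.get((i, j))))
    return records


# Toy data: two clusters, with one audited subject in each.
y = [[101.0, 95.0, 110.0], [88.0, 92.0]]
treatment = [1, 0]
y_audit = {(0, 1): 97.0, (1, 0): 90.0}

records = build_records(y, treatment, y_audit)
n_missing = sum(1 for r in records if r[4] is None)
print(f"{len(records)} subjects, {n_missing} with missing audit value")
```

A Bayesian fit would then treat the `None` entries as unknowns to be sampled along with the parameters.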

22 Model IQ Results - Summary. Columns: Parameter, Mean, SD, Credible Int.
$\beta_0$: (96.88, )
$\beta_1$: (1.121, 11.19)
$\sigma^2$: (197.3, 206.7)
$\sigma_b^2$: (20.88, 98.89)
$\sigma_d^2$: (16.08, 86.46)
$\hat\rho_c = 0.31$, $\hat\rho = 0.20$. About half of the ICC is due to clustered measurement! Main effect: the adjusted result is shifted towards the null, but the 95% CI is narrower.
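
The "about half" statement can be roughly checked from the two reported ICCs alone. Inverting each ICC gives the ratio of cluster-level to within-cluster variance, and the difference between the two ratios is the part due to measurement. The sketch below plugs in the point estimates 0.31 and 0.20, which is an approximation and not a posterior summary from the fitted model.

```python
def between_to_within_ratio(rho: float) -> float:
    """Invert rho = v / (1 + v), where v = cluster-level variance / sigma^2."""
    return rho / (1.0 - rho)


rho_c_hat = 0.31  # ICC including clustered measurement (from the slide)
rho_hat = 0.20    # ICC after adjusting for clustered measurement (from the slide)

total_ratio = between_to_within_ratio(rho_c_hat)  # (sigma_b^2 + sigma_d^2) / sigma^2
true_ratio = between_to_within_ratio(rho_hat)     # sigma_b^2 / sigma^2

# Share of the cluster-level variance attributable to measurement.
measurement_share = (total_ratio - true_ratio) / total_ratio
print(round(measurement_share, 2))  # prints 0.44
```

On this plug-in calculation, measurement accounts for roughly 44% of the cluster-level variance, in line with the slide's "about half".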

23 Conclusions Conclusions In cluster RCTs, clustered measurement matters. Calibration of observers across sites is critical. Multiple observers per site, or a single observer for all sites, can help. Having some observers test subjects at multiple sites can help: it allows identification of $\sigma_b^2$ and $\sigma_d^2$. Additional data can help.

24 Conclusions Future work How bad can it be, and how much can extra data help? Test other outcomes. Theoretical results? Assume $n_i$, $\sigma_b^2$, $\sigma_d^2$. Can we compute efficiency? Simulations? Design considerations: how to design the next cluster trial? Efficient use of observers for a fixed cost, or minimize cost for a fixed variance.

25 Conclusions Thanks Michael Kramer and the PROBIT study investigators and funders Richard Martin Jonathan Sterne Stan Shapiro
