Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Progress, Updates, Problems William Jen Hoe Koh May 9, 2013
Overview Marginal vs Conditional What is TMLE? Key Estimation procedure Some test simulations Hidden assumptions Hidden Statistical Properties
Brief Review: Time-to-event outcomes S(t) = Pr(T > t) = 1 F (t) Pr[t T < t + t T t] λ(t) = lim t 0 t Event of Interest: Infection/AE at clinic visit Right censored data: Non-ignorable missing data.
Discrete Failure Time (Zhang and Gilbert, 2010) T T 0 Start 1 5 6 T 10 End 12.5 T λ(t j ) = Pr(T = t j T > t j 1 ) T : True failure time (Unobserved due to discrete follow-up) T = t j if T [t j 1, t j ) with t j for j = 1,, 10. T = min(t, C): where C is our censoring time. = I (T C) : Indicator of subject not being censored
Objectives (Moore and van der Laan, 2009b) Estimate marginal treatment specific survival at a fixed end point Use covariates to gain efficiency Provide a consistent estimator in the presence of informative censoring
Question we want to address Approximate Scientific question: Pr(T 1 > t k ) Pr(T 0 > t k ) Event of Interest: First record of adverse event reported Simplest analysis: Kaplan Meier Survival curves Assumptions: Random censoring Adjusted analysis: Cox-PH or Logistic regression (More assumptions)
RCT: Marginal or Conditional Scientific question: Pr(T > t k A = 1) Pr(T > t k A = 0) Marginal P(T > t A = a) = S 0 (t A = a) Estimated probability of survival past time t for treatment a for the entire population. Conditional P(T > t A = a, W ) = S 0 (t A = a, W ) Estimated probability of survival past time t for treatment a while holding W fixed.
What does TMLE estimate Let Ψ : M R is pathwise differentiable at any density p 0 M Marginal Conditional {}}{{}}{ Ψ 1 (p 0 )(t k ) = Pr(T 1 > t k ) = E W [ S 0 (t k A = 1, W)] Ψ 0 (p 0 )(t k ) = Pr(T 0 > t k ) = E W [S 0 (t k A = 0, W]) Ψ AD (p 0 )(t k ) = Ψ 1 (p 0 )(t k ) Ψ 0 (p 0 )(t k ) Consider the treatment group A = 1, Pr(T 1 > t k W) = S 0 (t k A = 1, W): Probability of surviving beyond t k when treatment is 1 given the covariates W. Pr(T 1 > t k ) = E[S 0 (t k A = 1, W)]: Probability of surviving beyond t k when treatment is 1 averaging over the covariates W.
Example: Random treatment assignment of 0.5 W U(0.2, 1.2) λ(t A, W ) = expit( 3 A + W 2 )I (t < 10) + I (t = 10)
Idea behind TMLE (Van Der Laan, 2011) -.)%#/%'('"&"( #"0',+( /"#*".1%)!"#$%&(2"#"+%&%#( +"2 O1,...,On Ψ()!"#$%& 50*&*"1(%)&*+"&,#(,6( &7%(2#,.".*1*&4( '*)&#*.3&*,0(,6(&7%( '"&" P 0 n!"#$%&%'(%)&*+"&,#(,6(&7%(2#,.".*1*&4( '*)&#*.3&*,0(,6(&7%( '"&"!#3%(2#,.".*1*&4( '*)&#*.3&*,0 P0 P n &%'%!&%!(')* +,-.) :%&(,6(2,))*.1%( 2#,.".*1*&4( '*)&#*.3&*,0)(,6( &7%('"&" 50*&*"1(%)&*+"&,# Ψ(P 0 n) Ψ(P0) Ψ(P n)!"#$%&%'(%)&*+"&,#!#3%(/"13%( 8%)&*+"0'9(,6( &"#$%&( 2"#"+%&%# /')$.&*,0* %'12.%* #'1'+.%.1 ;"13%)(+"22%'( &,(&7%(#%"1(1*0%( <*&7(.%&&%#( %)&*+"&%)(=1,)%#( &,(&7%(&7 Fig. 5.1 TMLE flow chart.
Likelihood representation for 1 observation W: Covariates A: Treatment ( T, ): Time and event indicator Ḡ(t A, W) = Pr(C t A, W)(Moore and van der Laan, 2009a) S(t k A, W ) = t t k [1 λ(t A, W )] P 0 (O) = Pr(W, A, T, ) = Pr( T, W, A)Pr(W)Pr(A W) Q 20 {}}{ [ ] t k 1 δ [ tk ] 1 δ = λ(t k A, W) (1 λ(t k A, W)) (1 λ(t k A, W)) t=1 t=1 Ḡ(t A, W) δ Pr(C = t k A, W) 1 δ Pr(W) Pr(A W) }{{}}{{}}{{} g 20 Q 10 g 10
Key Results for TMLE: Doubly robust Q 0 Q 10 : Distribution of the baseline covariates Q 20 : Conditional distribution of the hazard given treatment and baseline covariates g 0 g 10 : Treatment mechanism g 20 : Conditional distribution of the censoring distribution given treatment and baseline covariates Either Q 0 or g 0 is correct, then TMLE is consistent
TMLE as Plug-in estimators 1. Estimate the ĝ 0 (A = 1 W ) = 1 n n i=1 A i 2. Estimate the conditional probability of censoring Ḡ 0 (t A, W ) 3. Estimate the ˆλ 0 (t A, W ) via logistic regression logit[ˆλ 0 (t A, W )] = K α i I (t = i) + β A A + β W W i=1 4. Update the above model using ˆλ 0 (t A, W ) by including the penalty ɛ = {ɛ 0, ɛ 1 } and the clever covariate ĥ 0 (t, A, W ) = {ĥ 0, ĥ 1 } logit[ˆλ 1 (t A, W )] = logit[ˆλ 0 (t A, W )] + ɛ T ĥ 0 (t, A, W ) I (A = i)i (t t k ) Ŝ(t k A, W ) ĥ i (t, A, W ) = ĝ 0 (A = i W )Ḡ 0 (t A, W ) Ŝ(t A, W )
TMLE as Plug-in estimators continued 5. Use current estimate of λ 1 (t A, W ) to update ĥ 1 (t, A, W ) Ŝ(t k A, W ) = [1 λ 1 (t A, W )] t t k 6. Iterate 4 & 5 until ɛ 0 7. Plug-in final estimate of Ŝ i (t k A = i, W ) for i = 0, 1 Ψ 1 (p 0 )(t k ) = 1 n Ψ 0 (p 0 )(t k ) = 1 n n Ŝ (t k A = 1, W i ) i=1 n Ŝ (t k A = 0, W i ) i=1
Test simulations with no covariates Ta Da!!! Ta Da!!! Probability of Survival 0.0 0.2 0.4 0.6 0.8 1.0 Group 0 Group 1 Probability of Survival 0.0 0.2 0.4 0.6 0.8 1.0 Group 0 Group 1 Est 0 Est 1 0 2 4 6 8 10 0 2 4 6 8 10 Time of record of Adverse Event Time of record of Adverse Event
Test Simulations for weak covariates Elixir of life Probability of Survival 0.5 0.6 0.7 0.8 0.9 1.0 Group 0 Group 1 Est 0 Est 1 0 2 4 6 8 10 Time of record of Adverse Event Simulations W U(0.2, 1.2); A Bin(0, 1 2 ) λ(t A, W ) = expit( 3 A + W 2 )I (t < 10) + I (t = 10)
Problem Survival function never increases over time. Acknowledgements: Taken from bartsblackboard.com
Next steps Compute the variance/bootstrap Rerun simulations Fill in the gaps in the paper Extensions
References Moore, K. and van der Laan, M. (2009a). Increasing power in randomized trials with right censored outcomes through covariate adjustment. Journal of biopharmaceutical statistics, 19(6):1099 1131. Moore, K. L. and van der Laan, M. J. (2009b). Application of time-to-event methods in the assessment of safety in clinical trials (chapter 20). In Peace, K. E., editor, Design and Analysis of Clinical Trials with Time-to-Event Endpoints, pages 455 482. Chapman and Hall/CRC; Biostatistics Series. Stitelman, O. and van der Laan, M. (2010). Collaborative targeted maximum likelihood for time to event data. The international journal of biostatistics, 6(1):Article 21. Van Der Laan, M. (2011). Targeted Learning: Causal Inference for Observational and Experimental Data. Springer. Zhang, M. and Gilbert, P. B. (2010). Increasing the efficiency of prevention trials by incorporating baseline covariates. Statistical communications in infectious diseases, 2(1).
Thinking in counterfactuals You take the red pill - you stay in Wonderland and I show you how deep the rabbit hole goes. You take the blue pill - the story ends, you wake up in your bed and believe whatever you want to believe. Morpheus, The Matrix
Hidden Assumptions: Thinking in counterfactuals For an individual in the placebo arm... IC 0t k (p O ) = t t k [I ( T = t, = 1) I ( T t)λ(t A = 0, W )]h 0 (t, A, W ) + S 0 (t k A = 0, W ) Ψ 0 (p O )(t k ) I (A = i) h i (t, A, W ) = g(1)ḡ(t A, W ) S(t k A, W ) S(t A, W ) I (t t k)
Key assumptions 1: Coarsening at random Definition of CAR (Stitelman and van der Laan, 2010) Coarsened data structures are data structures where the full data is not observed. Coarsening mechanism is only a function of the full data, i.e. the data in which you would have seen all counterfactuals, through the observed data Implications dp 0 (O) = Q 0 (O)g 0 (O X ) Q 0 is the density associated with full data g 0 contains the censoring and treatment mechanism.
Statistical definitions M is the set of possible probability distribution of O with probability distribution P 0 M where M is dominated by common measure µ. Hence, density p = dp 0 dµ O 1,, O n are n i.i.d realizations of O. O 1,, O n can be represented by the empirical probability distribution P n placing mass 1/n on each of the n observations.
Statistical properties TMLE uses solves the efficient Influence curve. IDEA: If we can linearize our estimator Linearity of estimator An estimator ψ n is an asymptotically linear estimator of a parameter ψ if ψ n ψ = 1 n n i=1 IC P (O i ) + o P ( 1 n ) where the influence curve IC P (O) has expectation 0 and finite variance i.e. IC P (O) L 2 0 (P) and moreover this should hold for all P M.
Efficient Influence curve Asymptotic Distribution The asymptotic distribution (via linearity of the influence curve) n ( ˆψ (t k ) Ψ(p 0 )(t k )) N(0, σ 2 ) Estimate σ 2 by empirical variance ˆσ 2 = 1 n n IC P (O i ) 2 i=1 Implications Compute Wald-like confidence intervals ˆψ (t k ) ± 1.96 ˆσ2 n