A Sampling of IMPACT Research:

Similar documents
Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Implementing Precision Medicine: Optimal Treatment Regimes and SMARTs. Anastasios (Butch) Tsiatis and Marie Davidian

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes

Modification and Improvement of Empirical Likelihood for Missing Response Problem

A Robust Method for Estimating Optimal Treatment Regimes

Reader Reaction to A Robust Method for Estimating Optimal Treatment Regimes by Zhang et al. (2012)

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Covariate Balancing Propensity Score for General Treatment Regimes

Extending causal inferences from a randomized trial to a target population

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

5 Methods Based on Inverse Probability Weighting Under MAR

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Improved Doubly Robust Estimation when Data are Monotonely Coarsened, with Application to Longitudinal Studies with Dropout

7 Sensitivity Analysis

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

Causal Inference Basics

arxiv: v1 [stat.me] 15 May 2011

This is the submitted version of the following book chapter: stat08068: Double robustness, which will be

High Dimensional Propensity Score Estimation via Covariate Balancing

Combining multiple observational data sources to estimate causal eects

2 Naïve Methods. 2.1 Complete or available case analysis

Deductive Derivation and Computerization of Semiparametric Efficient Estimation

Targeted Maximum Likelihood Estimation in Safety Analysis

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

Methods for Interpretable Treatment Regimes

e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls

Supplement to Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes

Using Estimating Equations for Spatially Correlated A

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Harvard University. Harvard University Biostatistics Working Paper Series

6 Pattern Mixture Models

Lecture 2: Constant Treatment Strategies. Donglin Zeng, Department of Biostatistics, University of North Carolina

Robustness to Parametric Assumptions in Missing Data Models

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Bootstrapping Sensitivity Analysis

Some methods for handling missing values in outcome variables. Roderick J. Little

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials

Propensity Score Weighting with Multilevel Data

Integrated approaches for analysis of cluster randomised trials

Missing Data: Theory & Methods. Introduction

Biomarkers for Disease Progression in Rheumatology: A Review and Empirical Study of Two-Phase Designs

ST 790, Homework 1 Spring 2017

A weighted simulation-based estimator for incomplete longitudinal data models

arxiv: v2 [stat.me] 17 Jan 2017

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Comment: Understanding OR, PS and DR

Estimating the Marginal Odds Ratio in Observational Studies

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes

Statistical Analysis of Binary Composite Endpoints with Missing Data in Components

PSC 504: Dynamic Causal Inference

SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN

Q learning. A data analysis method for constructing adaptive interventions

Simulation-Extrapolation for Estimating Means and Causal Effects with Mismeasured Covariates

Survival Analysis for Case-Cohort Studies

arxiv: v1 [math.st] 29 Jul 2014

Longitudinal analysis of ordinal data

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Sample size considerations for precision medicine

Discussion of Papers on the Extensions of Propensity Score

Package Rsurrogate. October 20, 2016

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

Adaptive Trial Designs

University of Michigan School of Public Health

Group Sequential Designs: Theory, Computation and Optimisation

Sample Size and Power Considerations for Longitudinal Studies

Models for binary data

Lecture 9: Learning Optimal Dynamic Treatment Regimes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Causal modelling in Medical Research

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Gov 2002: 13. Dynamic Causal Inference

Causal inference in epidemiological practice

Estimating direct effects in cohort and case-control studies

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Global Sensitivity Analysis for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach

Empirical likelihood methods in missing response problems and causal interference

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to mtm: An R Package for Marginalized Transition Models

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Classification. Chapter Introduction. 6.2 The Bayes classifier

Introduction An approximated EM algorithm Simulation studies Discussion

Longitudinal + Reliability = Joint Modeling

The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

MISSING or INCOMPLETE DATA

Robustifying Trial-Derived Treatment Rules to a Target Population

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

Comparing Group Means When Nonresponse Rates Differ

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Power and Sample Size Calculations with the Additive Hazards Model

University of California, Berkeley

Estimating Optimal Dynamic Treatment Regimes from Clustered Data

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

Targeted Group Sequential Adaptive Designs

RANDOMIZATIONN METHODS THAT

Transcription:

A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/ davidian Sampling of IMPACT Research 1

Outline Overview: Projects 2 and 5 Methods for primary and longitudinal analyses in the presence of dropout Identifying optimal treatment regimes from a restricted, feasible set Computational Resource and Dissemination Core Sampling of IMPACT Research 2

Overview IMPACT: P01 Program Project grant from NCI Five research projects Three cores Focus here: Research being carried out in two of the projects Project 2 : Methods for Missing and Auxiliary Covariates in Clinical Trials Project 5 : Methods for Discovery and Analysis of Dynamic Treatment Regimes Of necessity : Simplest cases Sampling of IMPACT Research 3

Project 2 Specific aims: 1. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates 2. Methods for primary and longitudinal analyses in the presence of drop-out 3. Diagnostic measures for longitudinal and joint models in the presence of missing data 4. Inference for sensitivity analyses of missing data Sampling of IMPACT Research 4

Doubly robust methods in the presence of dropout Motivation: Subject drop-out is commonplace in clinical trials Particularly problematic in studies of longitudinal markers, e.g., QOL measures, biomarkers Monotone pattern of missingness Missing at random (MAR): Probability of drop-out depends only on information observed prior to drop-out Likelihood methods: Do not require specification of drop-out mechanism but do require correct full data model Inverse weighted methods : Do not require full data model but do require correct drop-out model Doubly robust methods: Require both, but only one need be correct Sampling of IMPACT Research 5

Doubly robust methods in the presence of dropout Doubly robust methods: Obvious appeal But usual doubly robust methods can exhibit disastrous performance under slight model misspecification (Kang and Schafer, 2007) Goal: Can alternative doubly robust methods be developed that do not suffer this shortcoming? Sampling of IMPACT Research 6

The simplest setting Clinical trial: Outcome Y, interested in µ = E(Y ) Full data: (Y i, X i ), i = 1,...,n, iid, X i = baseline covariates for subject i But Y i is missing for some i (e.g., due to drop-out) Observed data: (R i, R i Y i, X i ), i = 1,...,n, iid, R i = I(Y i observed) MAR assumption: R i Y i X i, implies µ = E(Y ) = E{E(Y X)} = E{E(Y R = 1, X)} (1) Sampling of IMPACT Research 7

Estimators for µ Outcome regression estimator: MAR (1) suggests positing a model m(x, β) for E(Y X) n µ OR = n 1 m(x i, β) i=1 for some β By MAR (1), can use complete cases with R i = 1; e.g. least squares n i=1 R i {Y i m(x i, β)} m β (X i, β) = 0, m β (X, β) = m(x i, β) β µ OR consistent for µ if m(x, β) is correct Sampling of IMPACT Research 8

Estimators for µ Inverse propensity score weighted estimator: Propensity score P(R = 1 X) If π(x) is the true propensity score, by MAR n 1 n i=1 R i Y i π(x i ) p µ Posit a model π(x, γ), estimate γ by ML on (R i, X i ), i = 1,...,n µ IPW = n 1 R i Y i π(x i, γ) µ IPW consistent for µ if π(x, γ) is correct Sampling of IMPACT Research 9

Semiparametric theory Robins et al. (1994): If the propensity model is correct, with no additional assumptions on the distribution of the data All consistent and asymptotically normal estimators are asymptotically equivalent to estimators of the form n { 1 R i Y i π(x i, γ) + R } i π(x i, γ) h(x i ) for some function h(x) π(x i, γ) Optimal h(x) leading to smallest variance (asymptotically ) is h(x) = E(Y X) Suggests modeling E(Y X) by m(x, β), estimating β, and estimating µ by n { 1 R i Y i π(x i, γ) R } i π(x i, γ) m(x i, π(x i, γ) β) (2) Sampling of IMPACT Research 10

New perspective Double robustness: DR Such estimators are consistent for µ if either model is correct Kang and Schafer (2007): Simulation scenario where the usual DR estimator of form (2) with β estimated by least squares is severely biased and inefficient when m(x, β) and π(x, γ) are only slightly misspecified or some π(x i, γ) are close to 0 µ OR performed much better, even under misspecification of m(x, β) Key finding: With DR estimators, the method for estimating β matters The method that is best for estimating β is not best for estimating µ Instead : Find an estimator for β that minimizes the (large sample) variance of DR estimators of form (2)... Sampling of IMPACT Research 11

New perspective Idea: Assume π(x) fixed (no unknown γ) and consider estimators n { 1 R i Y i π(x i ) R } i π(x i ) m(x i, β) indexed by β (3) π(x i ) If π(x) is correct but m(x, β) may not be, all estimators of form (3) are consistent with asymptotic variance [{ } ] 1 π0 (X) var(y ) + E {Y m(x, β)} 2 (4) π 0 (X) Minimize (4) in β = β opt satisfies [{ } ] 1 π0 (X) E {Y m(x, β opt )}m β (X, β opt ) π 0 (X) = 0 (5) Find an estimator β p β opt under these conditions and β p true β 0 if m(x, β) is correct even if π(x) is not Sampling of IMPACT Research 12

New perspective Result: Instead of estimating β by least squares solving n R i {Y i m(x i, β)}m β (X i, β) = 0, i=1 estimate β by a form of weighted least squares solving n { } 1 π(xi ) R i π 2 {Y i m(x i, β)}m β (X i, β) = 0 (6) (X i ) i=1 Estimating equation (6) has expectation (5) when π(x) is correct The resulting β satisfies the required conditions Can be generalized to case of π(x, γ) with γ (ML) All this extends to more general µ (e.g., treatment effect) Sampling of IMPACT Research 13

New perspective Details: My website and Cao, W., Tsiatis, A.A. and Davidian, M. (2009). Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika 96, 723 734. The DR estimator using this β greatly improved on the usual DR estimator and exhibited superior performance (to µ OR ) in the Kang and Schafer and other scenarios Sampling of IMPACT Research 14

Longitudinal study Extension: Longitudinal study with drop-out Ideally : Collect data L j at time t j, j = 1,...,M + 1 Full data: L = L M+1 = (L 1,...,L M+1 ) Dropout: If subject is last seen at time t j, dropout indicator D = j, observe only L j = (L 1,...,L j ) Observed data: iid (D i, L Di ), i = 1,...,n Interest : Parameter µ in a semiparametric model for the full data Full data estimator for µ : Solve n ϕ(l i, µ) = 0, E{ϕ(L, µ)} = 0 i=1 MAR: pr(d = j L) depends only on L j, j = 1,...,M + 1 Drop-out model : pr(d = j L) = π(j, L j ), π(m + 1, L) = π(l) Sampling of IMPACT Research 15

Longitudinal study If drop-out model correct: All consistent and asymptotically normal estimators for µ solve n I(D i = M + 1)ϕ(L i, µ) M dm ji (L ji ) + π(l i ) K ji (L ji ) L j(l ji ) = 0 i=1 j=1 dm ji (L ji ), K ji (L ji ) are functions of π(j, L j ) These estimators are DR Optimal L j (L j ) = E{ϕ(L, µ) L j }; model by L j (L j, β), j = 1,...,M Result: Can derive optimal estimator for β by analogy to the previous Tsiatis, A.A., Davidian, M. and Cao, W. (2011). Improved doubly robust estimation when the data are monotonely coarsened, with application to longitudinal studies with dropout. Biometrics 67, 536 545. Sampling of IMPACT Research 16

Project 5 Specific aims: 1. Learning methods for optimal dynamic treatment regimes 2. Identifying optimal dynamic treatment regimes from a restricted, feasible set 3. Inferential methods for dynamic treatment regimes 4. Design of sequentially randomized trials for dynamic treatment regimes Sampling of IMPACT Research 17

Optimal treatment regimes from a feasible set Motivation: Individualized (personalized ) treatment Premise : Different subgroups of patients may respond differently to treatments Treatment decisions tailored to individual patients based on their characteristics, disease status, medical history, etc Ideally : Use all relevant information in decision rules Realistically : Use a key subset of information feasibly collected in clinical practice, simple-to-implement, interpretable decision rules Goal: Methods for estimating such feasible dynamic treatment regimes from data from clinical trials or observational databases Sampling of IMPACT Research 18

The simplest setting A single decision: Two treatment options Observed data: (Y i, X i, A i ), i = 1,...,n, iid Y i outcome, X i baseline covariates, A i = 0, 1 Treatment regime: A function g : X {0, 1} Simple example : g(x) = I(X 50) g G, the class of all such regimes Optimal regime : If followed by all patients in the population, would lead to best average outcome among all regimes in G Sampling of IMPACT Research 19

Potential outcomes Formalize: Y (1) = outcome if patient were to receive 1; similarly Y (0) Thus, E{Y (1)} is the average outcome if all patients in the population received 1; similarly E{Y (0)} Assume we observe Y = Y (1)A + Y (0)(1 A) Assume Y (0), Y (1) A X (no unmeasured confounders); automatic in a randomized trial = E{Y (1)} = E{ E(Y A = 1, X) }; similarly E{Y (0)} For any g G, define Y (g) = Y (1)g(X) + Y (0){1 g(x)} (1) Optimal regime : Leads to largest E{Y (g)} among all g G; i.e., g opt (X) = arg max g G E{Y (g)} Sampling of IMPACT Research 20

Optimal regime [ ] (1): E{Y (g)} = E E(Y A = 1, X)g(X) + E(Y A = 0, X){1 g(x)} g opt (X) = I{E(Y A = 1, X) E(Y A = 0, X) 0} Thus: If E(Y A, X) is known can find g opt Posit a model µ(a, X, β) for E(Y A, X) and estimate β based on observed data = β Estimate g opt by ĝ opt (X) = I{µ(1, X, β) µ(0, X, β) 0} Regression estimator But: µ(a, X, β) may be misspecified, so ĝ opt could be far from g opt Alternative perspective: µ(a, X, β) defines a class of regimes, indexed by β, that may or may not contain g opt But may be feasible and interpretable Sampling of IMPACT Research 21

Optimal restricted regime For example: Suppose in truth E(Y A, X) = exp{1 + X 1 + 2X 2 + 3X 1 X 2 + A(1 2X 1 + X 2 )} = g opt (X) = I(X 2 2X 1 1) Posit µ(a, X, β) = β 0 + β 1 X 1 + β 2 X 2 + A(β 3 + β 4 X 1 + β 5 X 2 ) Defines class G η with elements I(X 2 η 1 X 1 +η 0 ) I(X 2 η 1 X 1 +η 0 ), η 0 = β 3 /β 5, η 1 = β 4 /β 5 Thus, in general: Consider class G η = {g(x, η)} indexed by η Write g η (X) = g(x, η) Optimal restricted regime g opt η (X) = g(x, η opt ), η opt = arg max η E{Y (g η )} Sampling of IMPACT Research 22

Optimal restricted regime Approach: Estimate η opt by maximizing a good (DR ) estimator for E{Y (g η )} Missing data analogy: Full data are {Y (g η ), X}; observed data are (C η, C η Y, X), where C η = Ag(X, η) + (1 A){1 g(x, η)} π(x) = pr(a = 1 X); known in a randomized trial; otherwise model and estimate π(x, γ) π c (X) = pr(c η = 1 X) = π(x)g(x, η) + {1 π(x)}{1 g(x, η)} Sampling of IMPACT Research 23

Optimal restricted regime Estimator for E{Y (g η )}: n { n 1 Cη,i Y i π c (X i, γ) C } η,i π c (X i, γ) m(x i, π c (X i, γ) β, η) i=1 (2) m(x, β, η) = µ(1, X, β)g(x, η) + µ(0, X, β){1 g(x, η)} Consistent if either π(x, γ) or µ(a, X, β) is correct Maximize (2) in η to obtain η opt Current work: Approaches to maximizing (2) Simulations : Almost equals performance of correct regression estimator and is superior with misspecified µ(a, X, β) Extension to multiple decision points Zhang, Tsiatis, Davidian (2011), in preparation Sampling of IMPACT Research 24

Computational Resource and Dissemination Core Goals: Efficient, robust, reliable code implementing project methodology Creation and dissemination of public-use software (to be made available on the IMPACT website) E.g., R packages, SAS macros, specialized implementations (e.g., FORTRAN, c) Programmers: Scientific programmers at UNC-CH and NCSU dedicated to these activities Sampling of IMPACT Research 25