Some New Methods for Latent Variable Models and Survival Analysis. Latent-Model Robustness in Structural Measurement Error Models.


Some New Methods for Latent Variable Models and Survival Analysis
Marie Davidian
Department of Statistics, North Carolina State University
(Joint work with X. Huang, L. Stefanski, K. Doehler, L. Tang, M. Zhang)
Greenberg Lecture IV: Latent Variable/Survival

Outline
1. Introduction
3. Empirically checking latent-model robustness
4. ... with arbitrarily censored data
5. Smooth semiparametric regression analysis with arbitrarily censored data

1. Introduction

Two mainstays of biostatistical methodology and practice:
Latent-variable models, e.g., measurement error models, models with random effects
Survival analysis

Two mini-talks: research by my PhD students
Tools for checking whether inference in latent variable models is robust to assumptions on the latent variable distribution (with Xianzheng Huang and Len Stefanski)
Methods for survival analysis based on mild smoothness assumptions (with Kirsten Doehler, Lihua Tang, and Min Zhang)

Latent-Model Robustness in Structural Measurement Error Models
Xianzheng Huang, Len Stefanski, and Marie Davidian
Department of Statistics, North Carolina State University

Particular latent variable model: structural measurement error model
Y = observed response
X = true predictor ($q \times 1$), with true density $f_X(x)$
W = observed predictor ($q \times 1$)

Usual assumptions (take q = 1 for simplicity):
Conditional density of Y given X is $f_{Y|X}(y \mid x; \theta)$, true value $\theta$
$W = X + U$, $U \sim N(0, \sigma_U^2)$, $\sigma_U^2$ known, so the conditional density of W given X is $f_{W|X}(w \mid x; \sigma_U^2)$ (normal)
$f_{Y,W|X}(y, w \mid x; \theta) = f_{Y|X}(y \mid x; \theta)\, f_{W|X}(w \mid x; \sigma_U^2)$ (surrogacy)
Interested in inference on $\theta$
Observed data: $(Y_j, W_j)$, $j = 1, \ldots, n$, iid
X is a latent variable: assumptions on X?

One approach to inference on $\theta$: make a parametric assumption about the true density of X (i.e., the latent variable model)
Assumed parametric latent variable model: $f_X^{(a)}(x; \tau^{(a)})$, depending on a parameter vector $\tau^{(a)}$
Likelihood inference: estimate $\theta$, $\tau^{(a)}$ by $\hat\theta$, $\hat\tau^{(a)}$ maximizing
$L(\theta, \tau^{(a)}) = \prod_{j=1}^{n} f_{Y,W}(Y_j, W_j; \theta, \tau^{(a)}) = \prod_{j=1}^{n} \int f_{Y|X}(Y_j \mid x; \theta)\, f_{W|X}(W_j \mid x; \sigma_U^2)\, f_X^{(a)}(x; \tau^{(a)})\, dx$
If $f_X^{(a)}(x; \tau^{(a)})$ is correctly specified, $\hat\theta$ is consistent and asymptotically efficient
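The likelihood above involves one integral over the latent X per observation. A minimal sketch of how it might be evaluated numerically is given here, assuming the assumed latent model is normal, $X \sim N(\mu_x, s_x^2)$, and using Gauss-Hermite quadrature. The function names (`normal_pdf`, `marginal_density`, `log_likelihood`) and the choice of quadrature are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def normal_pdf(v, mean, sd):
    """Density of N(mean, sd^2) evaluated at v (vectorized)."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def marginal_density(y, w, f_y_given_x, sigma_u, mu_x, sd_x, n_nodes=40):
    """Approximate f_{Y,W}(y, w) = int f_{Y|X}(y|x) f_{W|X}(w|x) f_X(x) dx
    for an ASSUMED normal latent model X ~ N(mu_x, sd_x^2).
    f_y_given_x(y, x) is a user-supplied conditional density (hypothetical hook)."""
    t, wt = np.polynomial.hermite.hermgauss(n_nodes)  # nodes/weights for weight exp(-t^2)
    x = mu_x + np.sqrt(2.0) * sd_x * t                # change of variables to N(mu_x, sd_x^2)
    integrand = f_y_given_x(y, x) * normal_pdf(w, x, sigma_u)
    return np.sum(wt * integrand) / np.sqrt(np.pi)

def log_likelihood(ys, ws, f_y_given_x, sigma_u, mu_x, sd_x, n_nodes=40):
    """Log of the product over j of the marginal densities f_{Y,W}(Y_j, W_j)."""
    return sum(
        np.log(marginal_density(y, w, f_y_given_x, sigma_u, mu_x, sd_x, n_nodes))
        for y, w in zip(ys, ws)
    )
```

When $\sigma_U$ is small relative to $s_x$ the integrand is sharply peaked, so more nodes (or an adaptive rule) may be needed; the node count here is only a starting point.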

What if $f_X^{(a)}(x; \tau^{(a)})$ is incorrectly specified? $\hat\theta$ can be inconsistent (and hence asymptotically biased)

Our definition of latent-model robustness: the estimator $\hat\theta$, and more generally the model, are said to be robust if this doesn't happen! I.e., latent-model robustness means lack of asymptotic bias
The estimator under a correct model is trivially robust
Asymptotic bias is only possible if both $f_X^{(a)}(x; \tau^{(a)})$ is misspecified and $\sigma_U^2 > 0$
So we are interested in whether there is an interaction between these factors leading to nonrobustness

Definition: Full latent-model robustness
Score for assumed model: $\psi(y, w, \theta, \tau^{(a)}) = \partial/\partial(\theta, \tau^{(a)}) \{\log f_{Y,W}(y, w; \theta, \tau^{(a)})\}$
$\theta(\sigma_U)$, $\tau^{(a)}(\sigma_U)$ satisfy $E[\psi\{Y, W, \theta(\sigma_U), \tau^{(a)}(\sigma_U)\}] = 0$ (with respect to the true distribution)
Under conditions, $\hat\theta \to_p \theta(\sigma_U)$
In general, if $f_X^{(a)}(x; \tau^{(a)})$ is incorrect and $\sigma_U > 0$, $\theta(\sigma_U) \neq \theta$
The MLE for $\theta$ under $f_X^{(a)}(x; \tau^{(a)})$ is robust if $\theta(\sigma_U) = \theta$ for all $\sigma_U \geq 0$

Remarks:
As we noted already, if $f_X^{(a)}(x; \tau^{(a)})$ is correctly specified, then this condition will hold... but it can also hold when $f_X^{(a)}(x; \tau^{(a)})$ is incorrectly specified!
E.g., if $f_X^{(a)}(x; \tau^{(a)})$ is incorrect but is sufficiently flexible to capture moments of the true model on which $\theta(\sigma_U)$ depends
Full model robustness: only verifiable in simple models; not very practically useful

A little easier: First-order latent-model robustness
$\theta(\sigma_U) = \theta + \sigma_U^2\, \theta'(0) + o(\sigma_U^2)$
Can get $\theta'(0)$ by implicit differentiation of $E\{\psi(\cdot)\}$ as in Stefanski (1985, Biometrika)
Implies a necessary, first-order condition for robustness is $\theta'(0) = 0$
Example where this holds (and can be shown analytically): $Y \mid X \sim N(\beta_0 + \beta_1 X, \sigma_e^2)$, $f_X^{(a)}(x; \tau^{(a)}) = \tau_2^{(a)} h(\tau_1^{(a)} + \tau_2^{(a)} x)$, $h(\cdot)$ an arbitrary density (see Huang et al., 2006, Biometrika)

Realistically: first-order robustness is still too hard to be practically useful for fancier models arising in real applications
Need an accessible way to assess robustness to the choice of the model $f_X^{(a)}(x; \tau^{(a)})$ that can be used in data analysis

Idea: exploit these concepts of theoretical robustness
If $\hat\theta$ is robust, then a plot of $\theta(\sigma_U)$ vs. $\sigma_U$ should be flat!
If not robust, $\theta(\sigma_U)$ will change with $\sigma_U$
Construct an empirical plot in this spirit based on data by exploiting the simulation step of simulation-extrapolation (SIMEX)...

Remeasured data: add additional increments of measurement error
Actual observed data $(Y, W)$, $\mathrm{var}(W \mid X) = \sigma_U^2$
Remeasured data $\{Y, W(\lambda)\}$, $\mathrm{var}\{W(\lambda) \mid X\} = (1 + \lambda)\sigma_U^2$
$W(\lambda) = W + \lambda^{1/2} \sigma_U Z$, $Z \sim N(0, 1)$, $\lambda > 0$
Key: if the assumed model $\int f_{Y|X}(y \mid x; \theta)\, f_{W|X}(w \mid x; \sigma_U^2)\, f_X^{(a)}(x; \tau^{(a)})\, dx$ is correct for $(Y, W)$, then $\int f_{Y|X}(y \mid x; \theta)\, f_{W|X}\{w \mid x; (1 + \lambda)\sigma_U^2\}\, f_X^{(a)}(x; \tau^{(a)})\, dx$ is correct for $\{Y, W(\lambda)\}$
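The variance claim for the remeasured data can be checked in one line. The short derivation below is a sketch that writes the extra-noise multiplier as $\lambda$ (the usual SIMEX symbol) and assumes $Z$ is generated independently of $(Y, X, U)$.

```latex
\begin{aligned}
W(\lambda) &= W + \lambda^{1/2}\sigma_U Z = X + U + \lambda^{1/2}\sigma_U Z, \\
\operatorname{var}\{W(\lambda)\mid X\}
  &= \operatorname{var}(U) + \lambda\,\sigma_U^2\,\operatorname{var}(Z)
   = \sigma_U^2 + \lambda\,\sigma_U^2
   = (1+\lambda)\,\sigma_U^2 .
\end{aligned}
```

Because $U + \lambda^{1/2}\sigma_U Z$ is again normal with mean zero, $W(\lambda)$ given $X$ is normal with the inflated variance, which is why the same assumed model applies with $\sigma_U^2$ replaced by $(1+\lambda)\sigma_U^2$.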

Result: if the assumed model $f_X^{(a)}(x; \tau^{(a)})$ is correct or yields robust inferences, an estimator based on remeasured data should be approximately unbiased regardless of the size of $\lambda$
Thus, estimators based on remeasured data for different $\lambda$ should show no dependence on $\lambda$
Inspect such estimators for a range of $\lambda$ in a plot
Write $\hat\theta(\lambda)$ for an estimator based on $\lambda$-remeasured data
Observed data: $(Y_j, W_j)$, $j = 1, \ldots, n$, $\lambda = 0$; MLE $\hat\theta(0)$ (estimates $\theta$ when measurement error variance $= \sigma_U^2$)

Remeasurement method: for each $\lambda$ on a grid of $[0, \lambda_{\max}]$, $1 \leq \lambda_{\max} \leq 3$
Construct B remeasured data sets, where the bth remeasured data set is $\{Y_j, W_{b,j}(\lambda)\}$, $j = 1, \ldots, n$, with
$W_{b,j}(\lambda) = W_j + \lambda^{1/2} \sigma_U Z_{b,j}$, $Z_{b,j}$ iid $N(0, 1)$, $j = 1, \ldots, n$, $b = 1, \ldots, B$ (B = 50 or 100 suffices)
For each b, compute the MLE $\hat\theta_b(\lambda)$ using $\{Y_j, W_{b,j}(\lambda)\}$, $j = 1, \ldots, n$
Compute $\hat\theta_B(\lambda) = B^{-1} \sum_{b=1}^{B} \hat\theta_b(\lambda)$ (estimates $\theta$ when measurement error variance $= (1 + \lambda)\sigma_U^2$)

Proposed plot for checking robustness: plot $\hat\theta_B(\lambda)$ vs. $\lambda$
If $f_X^{(a)}(x; \tau^{(a)})$ is correct or robust, the plot should be approximately flat across the range $[0, \lambda_{\max}]$
If $f_X^{(a)}(x; \tau^{(a)})$ is nonrobust, the plot will exhibit change with $\lambda$
In fact: can apply the remeasurement method to any estimation technique for measurement error models (not just likelihood estimators)

Example: Y binary, $P(Y = 1 \mid X = x) = \Phi(\beta_0 + \beta_1 x)$, $\theta = (\beta_0, \beta_1)^T$
True density of X is a bimodal mixture of two normals
Three estimators for $\theta$: take $f_X^{(a)}(x; \tau^{(a)})$ to be
a normal density (n)
the flexible SNP density (s)
a normal mixture density (m), which is the correct specification

Plots: for $\beta_1$ ($\beta_0$ plots similar)
Theoretical: $\beta_1^{(\cdot)}(\sigma_U) - \beta_1^{(m)}(\sigma_U)$ vs. $\sigma_U$
Remeasurement method, B = 100, $\sigma_U = 0.4$: $\hat\beta_{1,B}^{(\cdot)}(\lambda) - \hat\beta_{1,B}^{(m)}(\lambda)$ vs. $\lambda$

[Figure: theoretical plot of $\beta_1^{(\cdot)}(\sigma_U) - \beta_1^{(m)}(\sigma_U)$ vs. $\sigma_U$. Solid = normal (n), dashed = SNP (s).]
[Figure: empirical plot of $\hat\beta_{1,B}^{(\cdot)}(\lambda) - \hat\beta_{1,B}^{(m)}(\lambda)$ vs. $\lambda$; $\lambda = 0$ corresponds to $\sigma_U = 0.4$. Solid = normal (n), dashed = SNP (s).]
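The remeasurement method described above can be sketched in a few lines. This is only an illustration under stated assumptions: the extra-noise multiplier is written `lam`, the function names are hypothetical, and `corrected_slope` is a simple moment-corrected linear-regression slope standing in for the MLE (the slides note the method applies to any measurement error estimator).

```python
import numpy as np

def corrected_slope(y, w, error_var):
    """Illustrative stand-in estimator (NOT the MLE from the slides):
    slope of Y on X in a linear model, corrected for measurement error
    variance error_var in W."""
    s_wy = np.cov(w, y)[0, 1]
    s_ww = np.var(w, ddof=1)
    return s_wy / (s_ww - error_var)

def remeasurement_curve(y, w, sigma_u, estimator, lambdas, B=50, seed=0):
    """For each lam, average the estimator over B remeasured data sets
    W_b(lam) = W + sqrt(lam) * sigma_u * Z_b, with Z_b iid N(0, 1).
    estimator(y, w_lam, error_var) is told the inflated error variance
    (1 + lam) * sigma_u^2."""
    rng = np.random.default_rng(seed)
    curve = []
    for lam in lambdas:
        estimates = []
        for _ in range(B):
            z = rng.standard_normal(len(w))
            w_lam = w + np.sqrt(lam) * sigma_u * z
            estimates.append(estimator(y, w_lam, (1.0 + lam) * sigma_u**2))
        curve.append(np.mean(estimates, axis=0))
    return np.array(curve)
```

Plotting the returned values against `lambdas` gives the empirical robustness plot: a roughly flat curve is consistent with a correct or robust latent model, while a trend in `lam` suggests nonrobustness.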


Test statistic: in addition to visual assessment
$t(\lambda) = \{\hat\theta_B(0) - \hat\theta_B(\lambda)\} / \widehat{\mathrm{SE}}\{\hat\theta_B(0) - \hat\theta_B(\lambda)\}$, $\lambda > 0$
Choose $\lambda$ in accordance with B; we have used $\lambda = 1$ or 3 with little difference
Large $t(\lambda)$ indicates lack of robustness
Reasonable operating characteristics in simulations

Summary: the remeasurement method for empirically checking robustness can be applied to any measurement error model, e.g., multiplicative error, additional error-free covariates, etc.
Currently: extension to more complicated joint models for longitudinal data and a primary/survival endpoint
Example/details: Huang, X., Stefanski, L., and Davidian, M. (2006) Latent-model robustness in structural measurement error models. Biometrika 93.

Smooth Inference for Arbitrarily Censored Time-to-Event Data
Kirsten Doehler, Min Zhang, Lihua Tang and Marie Davidian
Department of Statistics, North Carolina State University

Survival analysis: tradition
Parametric models too restrictive, hence nonparametric or semiparametric models and methods
Advantage: minimal assumptions, hence robustness
Disadvantage: includes implausible models as possibilities, computational/inferential difficulties

Perspective: impose mild smoothness assumptions
Disadvantage: more restrictive (but not too much)
Advantage: computational ease, unified handling of arbitrary censoring, possible efficiency gains

Assume:
Time-to-event random variable T, values in $(0, \infty)$
Survival function $S(t) = P(T > t)$, $0 < t < \infty$
Density $f(t)$, $f \in \mathcal{H}$, where $\mathcal{H}$ is a class of smooth densities
Objective: estimate $S(t)$ under these assumptions

Class $\mathcal{H}$: Gallant and Nychka (1987)
Sufficiently differentiable
No unusual behavior, e.g., oscillations, jumps, other weirdness
May be multimodal, skewed, fat- or thin-tailed
q-dimensional; we consider q = 1 for now

Representation of $h \in \mathcal{H}$: $h(z) = P_\infty^2(z)\psi(z)$ + lower bound
Infinite Hermite series + lower bound governing tails
$P_\infty(\cdot)$ is an infinite-dimensional polynomial
$\psi(\cdot)$ is a density with moment generating function; the base density
Almost always in published applications: $\psi(\cdot)$ is the standard normal density $\varphi(\cdot)$ (but doesn't have to be...)
SemiNonParametric (SNP)

Practical use: truncate
Standardized version $h_K(z) = P_K^2(z)\psi(z)$
E.g., K = 2, $P_K(z) = a_0 + a_1 z + a_2 z^2$
Flexible: K = 1, 2 often suffices to approximate almost any shape
Approximation has same support as the base density
$\int h_K(z)\, dz = 1$ ensured automatically by a reparameterization of polynomial coefficients via a spherical transformation (Zhang and Davidian, 2001) in terms of K angles $\phi$
K selected via information criteria
Approximate any $f \in \mathcal{H}$ by shifting/scaling of Z with this density

Representation of survival density: consider two base density representations
$\log(T) = \mu + \sigma Z$, Z has density $h \in \mathcal{H}$; approximate h by $h_K(z; \phi)$ with $\psi(z) = \varphi(z) = (2\pi)^{-1/2} e^{-z^2/2}$, the standard normal density
$T = \mu Z^{\sigma}$, Z has density $h \in \mathcal{H}$; approximate h by $h_K(z; \phi)$ with $\psi(z) = \mathcal{E}(z) = e^{-z}$, the standard exponential density
Alternatively, on the log scale with extreme value base density
In either case approximate $f(t)$ by $f_K(t; \theta)$, $\theta = (\mu, \sigma, \phi)$
Evidence: virtually any plausible survival density can be approximated with small K and one of these base densities

Survival function approximation: $S_K(t; \theta) = \int_t^\infty f_K(u; \theta)\, du$
E.g., normal base density: $S_K(t; \theta) = \int_{(\log t - \mu)/\sigma}^\infty P_K^2(z)\varphi(z)\, dz$
Linear combination of easy integrals $I(k, r) = \int_r^\infty z^k \varphi(z)\, dz$:
$I(0, r) = 1 - \Phi(r)$, $I(1, r) = \varphi(r)$, $I(k, r) = r^{k-1}\varphi(r) + (k - 1) I(k - 2, r)$, $k \geq 2$
Similar recursion for exponential base representation
Result: straightforward approximation to $S(t)$; trivial computation

Straightforward likelihood: any censoring/truncation pattern
Right-censored data: observe iid $(V_i, \Delta_i)$, $i = 1, \ldots, n$, $V_i = \min(T_i, C_i)$, $T_i \perp C_i$, $\Delta_i = I(T_i \leq C_i)$
Likelihood for $\theta$ based on observed data for fixed K and base:
$l_K(\theta) = \sum_{i=1}^{n} [\Delta_i \log\{f_K(V_i; \theta)\} + (1 - \Delta_i) \log\{S_K(V_i; \theta)\}]$
Interval-censored data: $T_i$ known to lie in $[L_i, R_i]$:
$l_K(\theta) = \sum_{i=1}^{n} \log\{S_K(L_i; \theta) - S_K(R_i^+; \theta)\}$
Etc.

Choosing K-base: standard information criteria
Fit K = 0, 1, ..., $K_{\max}$ for each base density ($K_{\max} = 3$ generally suffices); i.e., estimate $\theta$
Choose the K-base combination optimizing a given information criterion, e.g., AIC, BIC, HQ $= -2 l_K(\hat\theta) + 2 \dim(\theta) \log\log n$
Starting values over a grid to ensure global maximum

Details/remarks:
For chosen K-base, standard errors for $S_K(t; \hat\theta)$ via the delta method, treating as a standard parametric problem, work well
Computation: standard optimization routines (e.g., SAS IML nlpqn), very fast
Test statistic for comparing two groups: integrated weighted difference
$T = \int_0^{t_0} w(u)\{S_{1,K_1}(u; \hat\theta_1) - S_{2,K_2}(u; \hat\theta_2)\}\, du \,/\, \widehat{\mathrm{SE}}\left[\int_0^{t_0} w(u)\{S_{1,K_1}(u; \hat\theta_1) - S_{2,K_2}(u; \hat\theta_2)\}\, du\right]$
Compare to standard normal critical values
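To make the normal-base construction concrete, here is a minimal sketch of the truncated SNP density and its survival function. It assumes one common convention for the spherical reparameterization (angles mapped to a unit vector c, then a solved from a Cholesky factor of the normal moment matrix so that the density integrates to 1); the function names are hypothetical and the exact parameterization in Zhang and Davidian (2001) may differ in detail.

```python
import math
import numpy as np

def normal_moment(m):
    """E[Z^m] for Z ~ N(0, 1): 0 for odd m, (m - 1)!! for even m."""
    if m % 2 == 1:
        return 0.0
    out = 1.0
    for j in range(m - 1, 0, -2):
        out *= j
    return out

def snp_coefficients(angles):
    """Map K angles to polynomial coefficients a_0..a_K with a' A a = 1,
    where A[i, j] = E[Z^(i+j)], so that P_K(z)^2 * phi(z) integrates to 1."""
    K = len(angles)
    c = np.empty(K + 1)
    prod = 1.0
    for k, ang in enumerate(angles):      # point on the unit sphere
        c[k] = prod * math.sin(ang)
        prod *= math.cos(ang)
    c[K] = prod
    A = np.array([[normal_moment(i + j) for j in range(K + 1)] for i in range(K + 1)])
    L = np.linalg.cholesky(A)             # A = L L'
    return np.linalg.solve(L.T, c)        # |L' a|^2 = |c|^2 = 1

def snp_density(z, a):
    """h_K(z) = P_K(z)^2 * phi(z) with standard normal base density."""
    z = np.asarray(z, dtype=float)
    poly = np.polynomial.polynomial.polyval(z, a)
    return poly**2 * np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def upper_moment_integrals(max_k, r):
    """I(k, r) = int_r^inf z^k phi(z) dz for k = 0..max_k via the recursion."""
    phi_r = math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi)
    I = [0.5 * math.erfc(r / math.sqrt(2.0)), phi_r]
    for k in range(2, max_k + 1):
        I.append(r ** (k - 1) * phi_r + (k - 1) * I[k - 2])
    return I[: max_k + 1]

def snp_survival(t, mu, sigma, a):
    """S_K(t) under log(T) = mu + sigma * Z, Z with SNP density h_K."""
    r = (math.log(t) - mu) / sigma
    d = np.convolve(a, a)                 # coefficients of P_K^2, ascending powers
    I = upper_moment_integrals(len(d) - 1, r)
    return float(np.dot(d, I))
```

With K = 0 (no angles) the coefficients reduce to a_0 = 1, so the density is the standard normal and `snp_survival` is the lognormal survival function, which gives a quick sanity check.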

Representative Monte Carlo simulations: $S(t)$ is Weibull, n = 200, 1000 data sets, estimation at $S(t_0) = 0.9, 0.8, \ldots, 0.1$
[Table: for each $S(t_0)$, relative efficiency vs. KM and coverage probability under right censoring (30% right censored); relative efficiency vs. NPML and coverage probability under interval censoring (75% interval censored, 25% right censored).]
SNP bias < 1.5%

ACTG 175: time to AIDS or death. ZDV monotherapy ($n_1 = 617$, 68% right censored) vs. combination therapy ($n_2 = 1847$, 80% right censored)
[Figure: ACTG 175 Data; survival probability vs. time (days).]
$T^2 = 39.9$, p-value < ... (logrank test = 47.2, p-value < ...)

Breast cosmesis data (Finkelstein and Wolfe, 1985): time to cosmetic deterioration. Radiation ($n_1 = 46$, 25 RC, 21 IC) vs. radiation+chemo ($n_2 = 48$, 13 RC, 35 IC)
[Figure: Cosmetic Deterioration Data; survival probability vs. time (months).]
$T^2 = 7.84$, p-value = ... (FW test = 6.83, p-value < 0.01)

5. Smooth semiparametric regression

Regression analysis: consider right-censoring
$T_i$, $C_i$ as before, $V_i = \min(T_i, C_i)$, $\Delta_i = I(T_i \leq C_i)$
Observed data: $(V_i, \Delta_i, X_i)$, $X_i$ a $(p \times 1)$ vector of covariates
Usual assumption: $T_i \perp C_i \mid X_i$
Interested in a model that describes the association between $T_i$ and $X_i$

Popular models: with meaningful interpretation
Accelerated failure time (AFT) model: $\log T_i = X_i^T \beta + e_i$, $e_i$ has density $f_0(t)$; represent $f_0(t)$ by SNP
Proportional hazards (PH) model: unspecified baseline survival function $S_0(t)$, $S(t \mid X_i) = S_0(t)^{\exp(X_i^T \beta)}$; represent density $f_0(t)$ of $S_0(t)$ by SNP
Proportional odds (PO) model: unspecified baseline log odds $a_0(t)$, $\mathrm{logit}\{S(t \mid X_i)\} = a_0(t) + X_i^T \beta$, $a_0(t) = \log[S_0(t)/\{1 - S_0(t)\}]$; represent density $f_0(t)$ of $S_0(t)$ by SNP
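The three regression models differ only in how the linear predictor $X_i^T \beta$ modifies a baseline survival function. The sketch below makes this explicit, following the sign conventions written above; the function names and the `error_survival` callback are hypothetical, and in the SNP approach the baseline would come from the fitted SNP density rather than being supplied by hand.

```python
import numpy as np

def aft_survival(t, xb, error_survival):
    """AFT: log T = xb + e, so S(t | X) = P(e > log t - xb).
    error_survival(u) is the survival function of the error e (user supplied)."""
    return error_survival(np.log(t) - xb)

def ph_survival(s0, xb):
    """PH: S(t | X) = S0(t) ** exp(xb), with s0 = S0(t)."""
    return s0 ** np.exp(xb)

def po_survival(s0, xb):
    """PO: logit S(t | X) = logit S0(t) + xb, with s0 = S0(t)."""
    logit = np.log(s0 / (1.0 - s0)) + xb
    return 1.0 / (1.0 + np.exp(-logit))
```

For instance, under PH a linear predictor of log 2 squares the baseline survival probability, while under PO a linear predictor of log 3 triples the baseline survival odds.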

Remarks:
Arbitrary censoring straightforward
All models in a common framework, hence model selection via information criteria
Standard errors, confidence intervals, etc. straightforward
Easy computation

Extensions:
Heteroscedastic AFT model
Subject-specific AFT model for clustered data: $\log T_{ij} = X_{ij}^T \beta + b_i + e_{ij}$, $b_i \sim N(0, \sigma_b^2)$, $e_{ij}$ iid $f_0(t)$
Bivariate survival data: $T_1$, $T_2$ have smooth density $f(t_1, t_2)$; represent by bivariate (q = 2) SNP
Joint longitudinal-survival models
Etc.
