Some New Methods for Latent Variable Models and Survival Analysis. Latent-Model Robustness in Structural Measurement Error Models.
Some New Methods for Latent Variable Models and Survival Analysis
Marie Davidian
Department of Statistics, North Carolina State University
(Joint work with X. Huang, L. Stefanski, K. Doehler, L. Tang, M. Zhang)
Greenberg Lecture IV: Latent Variable/Survival

Outline
1. Introduction
3. Empirically checking latent-model robustness
4. ... with arbitrarily censored data
5. Smooth semiparametric regression analysis with arbitrarily censored data

1. Introduction
Two mainstays of biostatistical methodology and practice:
- Latent-variable models, e.g., measurement error models, models with random effects
- Survival analysis
Two mini-talks: Research by my PhD students
- Tools for checking whether inference in latent variable models is robust to assumptions on the latent variable distribution (with Xianzheng Huang and Len Stefanski)
- Methods for survival analysis based on mild smoothness assumptions (with Kirsten Doehler, Lihua Tang, and Min Zhang)

Latent-Model Robustness in Structural Measurement Error Models
Xianzheng Huang, Len Stefanski, and Marie Davidian
Department of Statistics, North Carolina State University

Particular latent variable model: Structural measurement error model
- $Y$ = observed response
- $X$ = true predictor ($q \times 1$), with true density $f_X(x)$
- $W$ = observed predictor ($q \times 1$)
Usual assumptions: Take $q = 1$ for simplicity
- Conditional density of $Y \mid X$ is $f_{Y|X}(y \mid x; \theta)$, true value $\theta$
- $W = X + U$, $U \sim N(0, \sigma_U^2)$, $\sigma_U^2$ known $\Rightarrow$ conditional density of $W \mid X$ is $f_{W|X}(w \mid x; \sigma_U^2)$ (normal)
- $f_{Y,W|X}(y, w \mid x; \theta) = f_{Y|X}(y \mid x; \theta)\, f_{W|X}(w \mid x; \sigma_U^2)$ (surrogacy)
Interested in inference on $\theta$
Observed data: $(Y_j, W_j)$, $j = 1, \ldots, n$, iid
$X$ is a latent variable: Assumptions on $X$?
One approach to inference on $\theta$: Make a parametric assumption about the true density of $X$ (i.e., the latent variable model)
Assumed parametric latent variable model: $f_X^{(a)}(x; \tau^{(a)})$, depending on a parameter vector $\tau^{(a)}$
Likelihood inference: Estimate $\theta, \tau^{(a)}$ by $\hat\theta, \hat\tau^{(a)}$ maximizing
$$L(\theta, \tau^{(a)}) = \prod_{j=1}^{n} f_{Y,W}(Y_j, W_j; \theta, \tau^{(a)}) = \prod_{j=1}^{n} \int f_{Y|X}(Y_j \mid x; \theta)\, f_{W|X}(W_j \mid x; \sigma_U^2)\, f_X^{(a)}(x; \tau^{(a)})\, dx$$
If $f_X^{(a)}(x; \tau^{(a)})$ is correctly specified $\Rightarrow$ $\hat\theta$ is consistent and asymptotically efficient
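The integral over the latent $X$ in this likelihood rarely has a closed form, so it is usually evaluated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a linear-normal response model $Y \mid X \sim N(\beta_0 + \beta_1 X, \sigma_e^2)$ and a normal assumed latent model $X \sim N(\mu_x, \sigma_x^2)$, and uses Gauss-Hermite quadrature. The function names, toy data, and parameter values are hypothetical. In this special case $(Y, W)$ is bivariate normal, which gives a closed form to check the quadrature against.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal


def loglik_quadrature(y, w, beta0, beta1, sigma_e, sigma_u, mu_x, sigma_x, n_nodes=60):
    """Log-likelihood with the latent X integrated out by Gauss-Hermite quadrature.

    Assumes Y | X ~ N(beta0 + beta1 X, sigma_e^2), W | X ~ N(X, sigma_u^2) and an
    assumed latent model X ~ N(mu_x, sigma_x^2); all three are illustrative choices.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Nodes/weights for integrals of the form  int exp(-t^2) g(t) dt.
    t, wts = np.polynomial.hermite.hermgauss(n_nodes)
    # The change of variables x = mu_x + sqrt(2) sigma_x t absorbs the latent density.
    x = mu_x + np.sqrt(2.0) * sigma_x * t
    f_y = norm.pdf(y[:, None], loc=beta0 + beta1 * x[None, :], scale=sigma_e)
    f_w = norm.pdf(w[:, None], loc=x[None, :], scale=sigma_u)
    marginal = (f_y * f_w) @ wts / np.sqrt(np.pi)  # f_{Y,W}(Y_j, W_j) for each j
    return float(np.sum(np.log(marginal)))


def loglik_closed_form(y, w, beta0, beta1, sigma_e, sigma_u, mu_x, sigma_x):
    """Same log-likelihood via the bivariate normal marginal of (Y, W)."""
    mean = [beta0 + beta1 * mu_x, mu_x]
    cov = [[beta1**2 * sigma_x**2 + sigma_e**2, beta1 * sigma_x**2],
           [beta1 * sigma_x**2, sigma_x**2 + sigma_u**2]]
    data = np.column_stack([y, w])
    return float(np.sum(multivariate_normal(mean, cov).logpdf(data)))


# Hypothetical toy data and parameter values, chosen only for illustration.
y_demo = np.array([0.5, 1.2, -0.3])
w_demo = np.array([0.1, 0.8, -0.5])
params = dict(beta0=0.0, beta1=1.0, sigma_e=1.0, sigma_u=1.0, mu_x=0.0, sigma_x=1.0)
ll_quad = loglik_quadrature(y_demo, w_demo, **params)
ll_exact = loglik_closed_form(y_demo, w_demo, **params)
```

For a non-normal response model or a more flexible latent density, only the integrand changes; the quadrature nodes may then need to be recentered or increased in number.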
What if $f_X^{(a)}(x; \tau^{(a)})$ is incorrectly specified?
- $\hat\theta$ can be inconsistent (and hence asymptotically biased)
Our definition of latent-model robustness: The estimator $\hat\theta$, and more generally the model, are said to be robust if this doesn't happen!
- I.e., latent-model robustness means lack of asymptotic bias
- The estimator under a correct model is trivially robust
- Asymptotic bias is only possible if both $f_X^{(a)}(x; \tau^{(a)})$ is misspecified and $\sigma_U^2 > 0$
- So we are interested in whether there is an interaction between these factors $\Rightarrow$ nonrobustness

Definition: Full latent-model robustness
- Score for assumed model: $\psi(y, w, \theta, \tau^{(a)}) = \partial/\partial(\theta, \tau^{(a)})\, \{\log f_{Y,W}(y, w; \theta, \tau^{(a)})\}$
- $\tilde\theta(\sigma_U), \tilde\tau^{(a)}(\sigma_U)$ satisfy $E[\psi\{Y, W, \tilde\theta(\sigma_U), \tilde\tau^{(a)}(\sigma_U)\}] = 0$ (expectation with respect to the true distribution)
- Under conditions, $\hat\theta \to_p \tilde\theta(\sigma_U)$
- In general, if $f_X^{(a)}(x; \tau^{(a)})$ is incorrect and $\sigma_U > 0$, $\tilde\theta(\sigma_U) \neq \theta$
- The MLE for $\theta$ under $f_X^{(a)}(x; \tau^{(a)})$ is robust if $\tilde\theta(\sigma_U) = \theta$ for all $\sigma_U \geq 0$

Remarks:
- As we noted already, if $f_X^{(a)}(x; \tau^{(a)})$ is correctly specified, then this condition will hold...
- ...but it can also hold when $f_X^{(a)}(x; \tau^{(a)})$ is incorrectly specified! E.g., if $f_X^{(a)}(x; \tau^{(a)})$ is incorrect but is sufficiently flexible to capture moments of the true model on which $\tilde\theta(\sigma_U)$ depends
Full model robustness: Only verifiable in simple models; not very practically useful

A little easier: First-order latent-model robustness
$$\tilde\theta(\sigma_U) = \theta + \sigma_U^2\, \tilde\theta'(0) + o(\sigma_U^2),$$
where $\tilde\theta'(0)$ is the derivative of $\tilde\theta$ with respect to $\sigma_U^2$ at $\sigma_U = 0$
- Can get by implicit differentiation of $E\{\psi(\cdot)\} = 0$ as in Stefanski (1985, Biometrika)
- Implies a necessary, first-order condition for robustness is $\tilde\theta'(0) = 0$
Example where this holds (and can be shown analytically):
$Y \mid X \sim N(\beta_0 + \beta_1 X, \sigma_e^2)$, $f_X^{(a)}(x; \tau^{(a)}) = \tau_2^{(a)}\, h(\tau_1^{(a)} + \tau_2^{(a)} x)$, $h(\cdot)$ an arbitrary density (see Huang et al., 2006, Biometrika)

Realistically: First-order robustness is still too hard to be practically useful for fancier models arising in real applications
- Need an accessible way to assess robustness to the choice of the model $f_X^{(a)}(x; \tau^{(a)})$ that can be used in data analysis
Idea: Exploit these concepts of theoretical robustness
- If $\hat\theta$ is robust, then a plot of $\tilde\theta(\sigma_U)$ vs. $\sigma_U$ should be flat!
- If not robust, $\tilde\theta(\sigma_U)$ will change with $\sigma_U$
- Construct an empirical plot in this spirit based on data by exploiting the simulation step of simulation-extrapolation (SIMEX)...

Remeasured data: Add additional increments of measurement error
- Actual observed data $(Y, W)$, $\mathrm{var}(W \mid X) = \sigma_U^2$
- Remeasured data $\{Y, W(\lambda)\}$, $\mathrm{var}\{W(\lambda) \mid X\} = (1 + \lambda)\sigma_U^2$
$$W(\lambda) = W + \lambda^{1/2} \sigma_U Z, \quad Z \sim N(0, 1), \quad \lambda > 0$$
Key: If the assumed model
$$\int f_{Y|X}(y \mid x; \theta)\, f_{W|X}(w \mid x; \sigma_U^2)\, f_X^{(a)}(x; \tau^{(a)})\, dx$$
is correct for $(Y, W)$, then
$$\int f_{Y|X}(y \mid x; \theta)\, f_{W|X}\{w \mid x; (1 + \lambda)\sigma_U^2\}\, f_X^{(a)}(x; \tau^{(a)})\, dx$$
is correct for $\{Y, W(\lambda)\}$
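The inflated variance of the remeasured predictor follows in one line. The short derivation below assumes, as is standard for this construction, that the added noise $Z$ is generated independently of $(Y, X, U)$ and that $U$ is independent of $X$.

```latex
\begin{aligned}
W(\lambda) &= X + U + \lambda^{1/2}\sigma_U Z, \\
\operatorname{var}\{W(\lambda) \mid X\}
  &= \operatorname{var}(U) + \lambda\,\sigma_U^2 \operatorname{var}(Z)
   = \sigma_U^2 + \lambda\,\sigma_U^2
   = (1+\lambda)\,\sigma_U^2, \\
W(\lambda) \mid X &\sim N\{X,\ (1+\lambda)\,\sigma_U^2\}.
\end{aligned}
```

Because $W(\lambda) \mid X$ stays normal with a known variance, the same likelihood code can be reused on remeasured data by replacing $\sigma_U^2$ with $(1+\lambda)\sigma_U^2$.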
Result: If the assumed model $f_X^{(a)}(x; \tau^{(a)})$ is correct or yields robust inferences, an estimator based on remeasured data should be approximately unbiased regardless of the size of $\lambda$
- Thus, estimators based on remeasured data for different $\lambda$ should show no dependence on $\lambda$
- Inspect such estimators for a range of $\lambda$ in a plot
Write $\hat\theta(\lambda)$ for an estimator based on $\lambda$-remeasured data
- Observed data: $(Y_j, W_j)$, $j = 1, \ldots, n$, $\lambda = 0$ $\Rightarrow$ MLE $\hat\theta(0)$ (estimates $\theta$ when the measurement error variance is $\sigma_U^2$)

Remeasurement method: For each $\lambda$ on a grid of $[0, \lambda_{\max}]$, $1 \leq \lambda_{\max} \leq 3$
- Construct $B$ remeasured data sets, where the $b$th remeasured data set is $\{Y_j, W_{b,j}(\lambda)\}$, $j = 1, \ldots, n$:
  $W_{b,j}(\lambda) = W_j + \lambda^{1/2} \sigma_U Z_{b,j}$, $Z_{b,j}$ iid $N(0, 1)$, $j = 1, \ldots, n$, $b = 1, \ldots, B$ ($B = 50$ or $100$ suffices)
- For each $b$, compute the MLE $\hat\theta_b(\lambda)$ using $\{Y_j, W_{b,j}(\lambda)\}$, $j = 1, \ldots, n$
- Compute $\hat\theta_B(\lambda) = B^{-1} \sum_{b=1}^{B} \hat\theta_b(\lambda)$ (estimates $\theta$ when the measurement error variance is $(1 + \lambda)\sigma_U^2$)

Proposed plot for checking robustness: Plot $\hat\theta_B(\lambda)$ vs. $\lambda$
- If $f_X^{(a)}(x; \tau^{(a)})$ is correct or robust $\Rightarrow$ the plot should be approximately flat across the range $[0, \lambda_{\max}]$
- If $f_X^{(a)}(x; \tau^{(a)})$ is nonrobust $\Rightarrow$ the plot will exhibit change with $\lambda$
In fact: Can apply the remeasurement method to any estimation technique for measurement error models (not just likelihood estimators)

Example: $Y$ binary, $P(Y = 1 \mid X = x) = \Phi(\beta_0 + \beta_1 x)$, $\theta = (\beta_0, \beta_1)^T$
- True density of $X$ is a bimodal mixture of two normals
- Three estimators for $\theta$: Take $f_X^{(a)}(x; \tau^{(a)})$ to be
  - a normal density (n)
  - the flexible SNP density (s)
  - a normal mixture density (m), which is the correct specification
Plots: For $\beta_1$ ($\beta_0$ plots similar)
- Theoretical: $\tilde\beta_1^{(\cdot)}(\sigma_U) - \tilde\beta_1^{(m)}(\sigma_U)$ vs. $\sigma_U$
- Remeasurement method, $B = 100$, $\sigma_U = 0.4$: $\hat\beta_{1,B}^{(\cdot)}(\lambda) - \hat\beta_{1,B}^{(m)}(\lambda)$ vs. $\lambda$

[Figure, theoretical plot: solid = normal (n), dashed = SNP (s), plotted against $\sigma_U$]
[Figure, empirical plot: solid = normal (n), dashed = SNP (s), plotted against $\lambda$; $\lambda = 0$ corresponds to $\sigma_U = 0.4$]
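The remeasurement steps can be sketched in a few lines. This is an illustration under stated assumptions rather than the analysis on the slides: it uses a linear response model instead of the probit example, and, relying on the remark that any estimation technique may be used, it swaps the MLEs for two simple moment estimators of the slope (a naive one that ignores measurement error and a corrected one that subtracts the known error variance). All function names, the simulated data, and the seed are hypothetical.

```python
import numpy as np


def naive_slope(y, w, total_error_var):
    """OLS slope of y on w; ignores measurement error (total_error_var unused)."""
    c = np.cov(y, w)
    return c[0, 1] / c[1, 1]


def corrected_slope(y, w, total_error_var):
    """Moment-corrected slope: removes the known error variance from var(W)."""
    c = np.cov(y, w)
    return c[0, 1] / (c[1, 1] - total_error_var)


def remeasurement_curve(y, w, sigma_u, estimator, lambdas, B=50, seed=0):
    """Average the estimator over B remeasured data sets for each lambda."""
    rng = np.random.default_rng(seed)
    curve = []
    for lam in lambdas:
        total_var = (1.0 + lam) * sigma_u**2      # error variance of W(lambda)
        estimates = []
        for _ in range(B):
            z = rng.standard_normal(len(w))
            w_lam = w + np.sqrt(lam) * sigma_u * z  # W_b(lambda)
            estimates.append(estimator(y, w_lam, total_var))
        curve.append(np.mean(estimates))
    return np.array(curve)


# Simulated data: bimodal latent X, linear response, sigma_U = 0.5 (all assumed).
rng = np.random.default_rng(2006)
n, sigma_u = 2000, 0.5
component = rng.integers(0, 2, size=n)
x = np.where(component == 0, -1.0, 1.0) + 0.5 * rng.standard_normal(n)
y = 0.0 + 1.0 * x + rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)

lambdas = np.linspace(0.0, 2.0, 5)
naive_curve = remeasurement_curve(y, w, sigma_u, naive_slope, lambdas)
corrected_curve = remeasurement_curve(y, w, sigma_u, corrected_slope, lambdas)
```

Plotting `naive_curve` and `corrected_curve` against `lambdas` gives the diagnostic: the naive slope drifts toward zero as more error is added, while the corrected slope, which does not depend on the latent distribution, should stay roughly flat.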
Test statistic: In addition to visual assessment
$$t(\lambda) = \frac{\hat\theta_B(0) - \hat\theta_B(\lambda)}{\widehat{SE}\{\hat\theta_B(0) - \hat\theta_B(\lambda)\}}, \quad \lambda > 0$$
- Choose $\lambda$ in accordance with $B$; we have used $\lambda = 1$ or $3$ with little difference
- Large $|t(\lambda)|$ indicates lack of robustness
- Reasonable operating characteristics in simulations

Summary: The remeasurement method for empirically checking robustness can be applied to any measurement error model, e.g., multiplicative error, additional error-free covariates, etc.
Currently: Extension to more complicated joint models for longitudinal data and a primary/survival endpoint
Example/details: Huang, X., Stefanski, L., and Davidian, M. (2006) Latent-model robustness in structural measurement error models. Biometrika 93.

Smooth Inference for Arbitrarily Censored Time-to-Event Data
Kirsten Doehler, Min Zhang, Lihua Tang and Marie Davidian
Department of Statistics, North Carolina State University

Survival analysis: Tradition
- Parametric models too restrictive $\Rightarrow$ nonparametric or semiparametric models and methods
- Advantage: Minimal assumptions $\Rightarrow$ robustness
- Disadvantage: Includes implausible models as possibilities, computational/inferential difficulties
Perspective: Impose mild smoothness assumptions
- Disadvantage: More restrictive (but not too much)
- Advantage: Computational ease, unified handling of arbitrary censoring, possible efficiency gains

Assume:
- Time-to-event random variable $T$, values in $(0, \infty)$
- Survival function $S(t) = P(T > t)$, $0 < t < \infty$
- Density $f(t)$, $f \in \mathcal{H}$, where $\mathcal{H}$ is a class of smooth densities
- Objective: Estimate $S(t)$ under these assumptions
Class $\mathcal{H}$: Gallant and Nychka (1987)
- Sufficiently differentiable
- No unusual behavior, e.g., oscillations, jumps, other weirdness
- May be multimodal, skewed, fat- or thin-tailed
- $q$-dimensional; we consider $q = 1$ for now

Representation of $h \in \mathcal{H}$:
$$h(z) = P_\infty^2(z)\, \psi(z) + \text{lower bound}$$
- Infinite Hermite series + lower bound governing tails
- $P_\infty(\cdot)$ infinite-dimensional polynomial
- $\psi(\cdot)$ is a density with a moment generating function; the base density
- Almost always in published applications: $\psi(\cdot)$ is the standard normal density $\varphi(\cdot)$ (but doesn't have to be...)
- SemiNonParametric (SNP)
Practical use: Truncate $\Rightarrow$ standardized version
$$h_K(z) = P_K^2(z)\, \psi(z)$$
- E.g., $K = 2$, $P_K(z) = a_0 + a_1 z + a_2 z^2$
- Flexible: $K = 1, 2$ often suffices to approximate almost any shape
- Approximation has the same support as the base density
- $\int h_K(z)\, dz = 1$ ensured automatically by a reparameterization of the polynomial coefficients via a spherical transformation (Zhang and Davidian, 2001) in terms of $K$ angles $\phi$
- $K$ selected via information criteria...
Approximate any $f \in \mathcal{H}$ by shifting/scaling of $Z$ with this density

Representation of survival density: Consider two base density representations:
- $\log(T) = \mu + \sigma Z$, $Z$ has density $h \in \mathcal{H}$. Approximate $h$ by $h_K(z; \phi)$ with $\psi(z) = \varphi(z) = (2\pi)^{-1/2} e^{-z^2/2}$, the standard normal density
- $T = \mu Z^{\sigma}$, $Z$ has density $h \in \mathcal{H}$. Approximate $h$ by $h_K(z; \phi)$ with $\psi(z) = \mathcal{E}(z) = e^{-z}$, the standard exponential density. Alternatively, on the log scale with extreme value base density
In either case approximate $f(t)$ by $f_K(t; \theta)$, $\theta = (\mu, \sigma, \phi)$
Evidence: Virtually any plausible survival density can be approximated with small $K$ and one of these base densities

Survival function approximation:
$$S_K(t; \theta) = \int_t^\infty f_K(u; \theta)\, du$$
E.g., normal base density:
$$S_K(t; \theta) = \int_{(\log t - \mu)/\sigma}^{\infty} P_K^2(z)\, \varphi(z)\, dz$$
Linear combination of easy integrals
$$I(k, r) = \int_r^\infty z^k \varphi(z)\, dz,$$
$$I(0, r) = 1 - \Phi(r), \quad I(1, r) = \varphi(r), \quad I(k, r) = r^{k-1} \varphi(r) + (k - 1)\, I(k - 2, r), \quad k \geq 2$$
Similar recursion for the exponential base representation
Result: Straightforward approximation to $S(t)$
- Trivial computation
- Straightforward likelihood: Any censoring/truncation pattern

Right-censored data: Observe iid $(V_i, \Delta_i)$, $i = 1, \ldots, n$, $V_i = \min(T_i, C_i)$, $T_i \perp C_i$, $\Delta_i = I(T_i \leq C_i)$
Log-likelihood for $\theta$ based on observed data for fixed $K$ and base:
$$l_K(\theta) = \sum_{i=1}^{n} \left[ \Delta_i \log\{f_K(V_i; \theta)\} + (1 - \Delta_i) \log\{S_K(V_i; \theta)\} \right]$$
Interval-censored data: $T_i$ known to lie in $[L_i, R_i]$
$$l_K(\theta) = \sum_{i=1}^{n} \log\{S_K(L_i; \theta) - S_K(R_i^+; \theta)\}$$
Etc.

Choosing $K$-base: Standard information criteria
- Fit $K = 0, 1, \ldots, K_{\max}$ for each base density, $K_{\max} = 3$ generally suffices; i.e., estimate $\theta$ for each
- Choose the $K$-base combination optimizing a given information criterion, e.g., AIC, BIC, or $HQ = -2\, l_K(\hat\theta) + 2 \dim(\theta) \log \log n$
- Starting values over a grid to ensure a global maximum

Details/remarks:
- For the chosen $K$-base, standard errors for $S_K(t; \hat\theta)$ via the delta method, treating this as a standard parametric problem, work well
- Computation: standard optimization routines (e.g., SAS IML nlpqn), very fast
Test statistic for comparing two groups: integrated weighted difference
$$T = \frac{\int_0^{t_0} w(u)\, \{S_{1,K_1}(u; \hat\theta_1) - S_{2,K_2}(u; \hat\theta_2)\}\, du}{\widehat{SE}\left[\int_0^{t_0} w(u)\, \{S_{1,K_1}(u; \hat\theta_1) - S_{2,K_2}(u; \hat\theta_2)\}\, du\right]}$$
Compare to standard normal critical values
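The normal-base pieces described above (spherical reparameterization, the recursion for $I(k, r)$, and the right-censored log-likelihood) fit in a short sketch. This is a hedged illustration, not the authors' SAS IML code: the particular spherical convention, the use of a Cholesky factor of the moment matrix, and all function names are assumptions made here. Any convention that maps $K$ angles onto the unit sphere would serve.

```python
import numpy as np
from scipy.stats import norm


def normal_moment(m):
    """E[Z^m] for Z ~ N(0,1): 0 for odd m, (m-1)!! for even m."""
    if m % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(m - 1, 0, -2)))


def coef_from_angles(phi):
    """Map K angles to K+1 polynomial coefficients a with E[P_K(Z)^2] = 1."""
    phi = np.atleast_1d(np.asarray(phi, dtype=float))
    K = phi.size
    c = np.ones(K + 1)                       # point on the unit sphere in R^(K+1)
    for k in range(K):
        c[k] *= np.sin(phi[k])
        c[k + 1:] *= np.cos(phi[k])
    A = np.array([[normal_moment(i + j) for j in range(K + 1)]
                  for i in range(K + 1)])    # A[i, j] = E[Z^(i+j)]
    L = np.linalg.cholesky(A)                # A = L L^T, so a'Aa = |L^T a|^2
    return np.linalg.solve(L.T, c)           # a with L^T a = c, hence a'Aa = 1


def upper_moment(k, r):
    """I(k, r) = int_r^inf z^k phi(z) dz, computed by the recursion."""
    if k == 0:
        return norm.sf(r)
    if k == 1:
        return norm.pdf(r)
    return r ** (k - 1) * norm.pdf(r) + (k - 1) * upper_moment(k - 2, r)


def snp_survival(t, mu, sigma, phi):
    """S_K(t) for log T = mu + sigma Z with a normal-base SNP density for Z."""
    a = coef_from_angles(phi)
    d = np.convolve(a, a)                    # coefficients of P_K(z)^2, ascending
    r = (np.log(t) - mu) / sigma
    return sum(d_k * upper_moment(k, r) for k, d_k in enumerate(d))


def snp_density(t, mu, sigma, phi):
    """f_K(t) = h_K(r) / (sigma t), with r = (log t - mu) / sigma."""
    a = coef_from_angles(phi)
    r = (np.log(t) - mu) / sigma
    p = np.polynomial.polynomial.polyval(r, a)
    return p ** 2 * norm.pdf(r) / (sigma * t)


def loglik_right_censored(v, delta, mu, sigma, phi):
    """Sum of delta*log f_K(v) + (1 - delta)*log S_K(v) over observations."""
    v = np.asarray(v, dtype=float)
    delta = np.asarray(delta, dtype=float)
    return float(np.sum(delta * np.log(snp_density(v, mu, sigma, phi))
                        + (1.0 - delta) * np.log(snp_survival(v, mu, sigma, phi))))
```

With `phi` empty ($K = 0$) the functions reduce to the lognormal model. Passing `loglik_right_censored` (negated) to a general-purpose optimizer over $(\mu, \sigma, \phi)$ for each $K$, then comparing information criteria, mirrors the fitting strategy outlined above.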
Representative Monte Carlo simulations: $S(t)$ is Weibull, $n = 200$, 1000 data sets, estimation at $S(t_0) = 0.9, 0.8, \ldots, 0.1$
- Right-censored scenario: 30% right censored; reported for each $S(t_0)$: relative efficiency versus Kaplan-Meier (KM) and coverage probability
- Interval-censored scenario: 75% interval censored, 25% right censored; reported for each $S(t_0)$: relative efficiency versus nonparametric maximum likelihood (NPML) and coverage probability
- SNP bias < 1.5%
[Figures: probability versus time]

ACTG 175: Time to AIDS or death. ZDV monotherapy ($n_1 = 617$, 68% right censored) vs. combination therapy ($n_2 = 1847$, 80% right censored)
[Figure: ACTG 175 Data, survival probability versus time (days)]
$T^2 = 39.9$, p-value < ... (logrank test = 47.2, p-value < ...)

Breast Cosmesis data (Finkelstein and Wolfe, 1985): Time to cosmetic deterioration. Radiation ($n_1 = 46$, 25 RC, 21 IC) vs. Radiation+chemo ($n_2 = 48$, 13 RC, 35 IC)
[Figure: Cosmetic Deterioration Data, survival probability versus time (months)]
$T^2 = 7.84$, p-value = ... (FW test = 6.83, p-value < 0.01)

5. Smooth semiparametric regression
Regression analysis: Consider right-censoring
- $T_i$, $C_i$ as before, $V_i = \min(T_i, C_i)$, $\Delta_i = I(T_i \leq C_i)$
- Observed data: $(V_i, \Delta_i, X_i)$, $X_i$ a $(p \times 1)$ vector of covariates
- Usual assumption: $T_i \perp C_i \mid X_i$
- Interested in a model that describes the association between $T_i$ and $X_i$

Popular models: With meaningful interpretation
- Accelerated failure time (AFT) model: $\log T_i = X_i^T \beta + e_i$, $e_i$ has density $f_0(t)$. Represent $f_0(t)$ by SNP
- Proportional hazards (PH) model: unspecified baseline survival function $S_0(t)$, $S(t \mid X_i) = S_0(t)^{\exp(X_i^T \beta)}$. Represent the density $f_0(t)$ of $S_0(t)$ by SNP
- Proportional odds (PO) model: unspecified baseline log odds $a_0(t)$, $\mathrm{logit}\{S(t \mid X_i)\} = a_0(t) + X_i^T \beta$, $a_0(t) = \log[S_0(t)/\{1 - S_0(t)\}] = \mathrm{logit}\{S_0(t)\}$. Represent the density $f_0(t)$ of $S_0(t)$ by SNP
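All three models turn a baseline survival function $S_0(t)$ and a linear predictor $X_i^T \beta$ into a conditional survival function, which is what makes a common likelihood framework possible. The sketch below shows the three mappings side by side. It is illustrative only: the baseline is a hypothetical lognormal stand-in for a fitted SNP baseline, the AFT version assumes $S_0$ is the survival function of $T_0 = e^{e_i}$, and the function names are invented here.

```python
import numpy as np
from scipy.special import expit, logit
from scipy.stats import norm


def baseline_survival(t, mu=0.0, sigma=1.0):
    """Hypothetical baseline S_0(t): lognormal, standing in for an SNP fit."""
    return norm.sf((np.log(t) - mu) / sigma)


def survival_aft(t, linear_predictor, S0=baseline_survival):
    """AFT: log T = x'beta + e, T_0 = exp(e), so S(t|x) = S_0(t exp(-x'beta))."""
    return S0(t * np.exp(-linear_predictor))


def survival_ph(t, linear_predictor, S0=baseline_survival):
    """PH: S(t|x) = S_0(t) ** exp(x'beta)."""
    return S0(t) ** np.exp(linear_predictor)


def survival_po(t, linear_predictor, S0=baseline_survival):
    """PO: logit S(t|x) = logit S_0(t) + x'beta."""
    return expit(logit(S0(t)) + linear_predictor)


# Example: one covariate value with x'beta = 0.7 on a small time grid (assumed values).
t_grid = np.array([0.5, 1.0, 2.0])
curves = {
    "AFT": survival_aft(t_grid, 0.7),
    "PH": survival_ph(t_grid, 0.7),
    "PO": survival_po(t_grid, 0.7),
}
```

Since each model yields $S(t \mid X_i)$ and hence a density by differentiation, the right-censored log-likelihood has the same form in all three cases, and information criteria can compare the models directly.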
Remarks:
- Arbitrary censoring straightforward
- All models in a common framework $\Rightarrow$ model selection via information criteria
- Standard errors, confidence intervals, etc. straightforward
- Easy computation

Extensions:
- Heteroscedastic AFT model
- Subject-specific AFT model for clustered data: $\log T_{ij} = X_{ij}^T \beta + b_i + e_{ij}$, $b_i \sim N(0, \sigma_b^2)$, $e_{ij}$ iid $f_0(t)$
- Bivariate survival data: $T_1, T_2$ have smooth density $f(t_1, t_2)$ $\Rightarrow$ represent by bivariate ($q = 2$) SNP
- Joint longitudinal-survival models
- Etc.
More informationModelling Survival Events with Longitudinal Data Measured with Error
Modelling Survival Events with Longitudinal Data Measured with Error Hongsheng Dai, Jianxin Pan & Yanchun Bao First version: 14 December 29 Research Report No. 16, 29, Probability and Statistics Group
More informationA general framework for regression analysis of time-to-event data subject to arbitrary patterns of censoring is proposed. The approach is relevant
ABSTRACT ZHANG, MIN. Semiparametric Methods for Analysis of Randomized Clinical Trials and Arbitrarily Censored Time-to-event Data. (Under the direction of Dr. Marie Davidian and Dr. Anastasios A. Tsiatis.)
More informationMultivariate Survival Analysis
Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in
More informationPrerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3
University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.
More informationCox s proportional hazards model and Cox s partial likelihood
Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.
More informationComputationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models
Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationSTAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where
STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht
More informationModelling geoadditive survival data
Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model
More informationA COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky
A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),
More informationParametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1
Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson
More informationJoint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited
Biometrics 62, 1037 1043 December 2006 DOI: 10.1111/j.1541-0420.2006.00570.x Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited Fushing Hsieh, 1 Yi-Kuan Tseng, 2 and Jane-Ling
More informationStat 642, Lecture notes for 04/12/05 96
Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal
More informationEstimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk
Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:
More informationStandard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j
Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationSurvival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University
Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation
More informationPart III Measures of Classification Accuracy for the Prediction of Survival Times
Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples
More informationIntegrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University
Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y
More informationMarginal Specifications and a Gaussian Copula Estimation
Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required
More informationLecture 3 September 1
STAT 383C: Statistical Modeling I Fall 2016 Lecture 3 September 1 Lecturer: Purnamrita Sarkar Scribe: Giorgio Paulon, Carlos Zanini Disclaimer: These scribe notes have been slightly proofread and may have
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationand Comparison with NPMLE
NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA and Comparison with NPMLE Mai Zhou Department of Statistics, University of Kentucky, Lexington, KY 40506 USA http://ms.uky.edu/
More informationGroup Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology
Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,
More informationLecture 22 Survival Analysis: An Introduction
University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which
More informationMeasurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007
Measurement Error and Linear Regression of Astronomical Data Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Classical Regression Model Collect n data points, denote i th pair as (η
More informationIntroduction to Algorithmic Trading Strategies Lecture 10
Introduction to Algorithmic Trading Strategies Lecture 10 Risk Management Haksun Li haksun.li@numericalmethod.com www.numericalmethod.com Outline Value at Risk (VaR) Extreme Value Theory (EVT) References
More informationMAS3301 / MAS8311 Biostatistics Part II: Survival
MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationImpact of serial correlation structures on random effect misspecification with the linear mixed model.
Impact of serial correlation structures on random effect misspecification with the linear mixed model. Brandon LeBeau University of Iowa file:///c:/users/bleb/onedrive%20 %20University%20of%20Iowa%201/JournalArticlesInProgress/Diss/Study2/Pres/pres.html#(2)
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationSemiparametric Estimation with Mismeasured Dependent Variables: An Application to Duration Models for Unemployment Spells
Semiparametric Estimation with Mismeasured Dependent Variables: An Application to Duration Models for Unemployment Spells Jason Abrevaya University of Chicago Graduate School of Business Chicago, IL 60637,
More informationMotivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University
Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined
More informationEquivalence of random-effects and conditional likelihoods for matched case-control studies
Equivalence of random-effects and conditional likelihoods for matched case-control studies Ken Rice MRC Biostatistics Unit, Cambridge, UK January 8 th 4 Motivation Study of genetic c-erbb- exposure and
More informationECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering
ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering Lecturer: Nikolay Atanasov: natanasov@ucsd.edu Teaching Assistants: Siwei Guo: s9guo@eng.ucsd.edu Anwesan Pal:
More informationβ j = coefficient of x j in the model; β = ( β1, β2,
Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)
More informationMachine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall
Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationApplied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid
Applied Economics Regression with a Binary Dependent Variable Department of Economics Universidad Carlos III de Madrid See Stock and Watson (chapter 11) 1 / 28 Binary Dependent Variables: What is Different?
More informationCalibration Estimation of Semiparametric Copula Models with Data Missing at Random
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Econometrics Workshop UNC
More informationMultivariate Assays With Values Below the Lower Limit of Quantitation: Parametric Estimation By Imputation and Maximum Likelihood
Multivariate Assays With Values Below the Lower Limit of Quantitation: Parametric Estimation By Imputation and Maximum Likelihood Robert E. Johnson and Heather J. Hoffman 2* Department of Biostatistics,
More information