On Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data
Yehua Li, Department of Statistics, University of Georgia
Yongtao Guan and Rajita Sinha, Yale University
Outline
- Method of moment estimator
- Subsampling extrapolation (SUBEX)
- Interval censored failure time
Cocaine dependence
Cocaine dependence remains a serious public health problem in the US, with more than one million individuals meeting criteria for current dependence (SAMHSA, 2004).
Cocaine abuse is a serious concern to society because of its association with criminal activity and the high cost of treating the health problems related to it.
Despite the availability of efficacious behavioral treatments, relapse rates remain high (Sinha, 2001, 2007).
Cocaine relapse and baseline usage behavior
Previous studies have shown that a person's baseline cocaine use behavior is predictive of cocaine relapse, along with many other risk factors such as age, gender, cocaine withdrawal severity, stress and negative mood.
To describe baseline cocaine use behavior, daily cocaine use trajectory data are collected over a short period prior to treatment.
Summary statistics derived from these trajectories are then used as predictors in a subsequent analysis to explain cocaine craving and cocaine relapse.
The data
The research was conducted at the Clinical Neuroscience Research Unit (CNRU) of the Connecticut Mental Health Center.
The study was conducted in two separate periods, with 59 subjects enrolling in the first period and 83 enrolling in the second.
At baseline, demographic variables (e.g. age, gender, race, years of cocaine use, anxiety level) and daily cocaine use history in the 90 days prior to admission were collected.
The 59 subjects in the first period were interviewed 90 days after treatment.
The 83 subjects in the second period were interviewed 14, 30, 90 and 180 days after treatment. Urine screens were conducted at these interviews to verify the first relapse time.
Response variables of interest
Post-treatment cocaine craving score: desire to use cocaine at this moment, measured by the Tiffany Cocaine Craving Questionnaire-Brief (CCQ-Brief) (Tiffany et al., 1993).
First relapse time: the first time that the subject uses cocaine after treatment.
- At each post-treatment interview, the subject reports whether and when he or she used cocaine.
- In the second period, urine tests were used to check the validity of the self-reported relapse time.
- The relapse times are interval censored.
Assessing baseline cocaine use pattern
Upon treatment entry, all subjects were interviewed by well-trained psychologists to collect baseline daily cocaine use history in the 90 days prior to admission, which was documented using a 90-day time-line follow-back (TLFB) Substance Use Calendar.
It has been shown that the TLFB can provide reliable daily cocaine use data that have high 1) retest reliability, 2) correlation with other cocaine use measures and 3) agreement with collateral informants' reports of patients' cocaine use as well as with results obtained from urine assays.
All research assistants had been trained by PhD-level psychologists, had over three years of experience in administering similar assessments, and were closely supervised when conducting these interviews.
Assessing baseline cocaine use pattern (cont.)
All subjects had been informed upfront that all data are coded and confidential and cannot be subpoenaed in court.
They were also informed that they would be removed from the study if they were found to be untruthful about their drug use.
Baseline usage pattern
Summary statistics and measurement error
The true covariates are characteristics of the patient's long-term cocaine use pattern.
The surrogates are summary statistics from the 90-day baseline usage trajectory, e.g. average daily use amount and use frequency.
These trajectories are assumed to be stationary, and we consider two kinds of trajectories:
- marked point pattern: use time and amount;
- point pattern: use time only.
The point pattern is more reliable because a) people tend to under-report the use amount, but there is less incentive to deny a use; b) subjects take cocaine in different ways, so it is hard to convert use amounts into equivalent grams.
Difference with the classical measurement error problems
The baseline trajectories are subject-specific stochastic processes.
The estimation error in the summary statistics is heteroscedastic, non-Gaussian, and dependent on the true covariate.
We only have one realization of the random process.
In the joint modeling literature, there is work on using the random slope and intercept of longitudinal processes in a second-level regression model, but a strong parametric assumption on the measurement error needs to be made.
In functional data analysis, characteristics of random curves are used in second-level regression (Crainiceanu et al. 2009).
Notation and setup
Let $N_i$ be a stationary stochastic process that has generated the $i$th cocaine use trajectory over the baseline period $(0, \tau]$. The summary statistic is
$$W_i = \frac{1}{\tau}\int_0^{\tau} \tilde N_i(dt), \tag{1}$$
where $\tilde N_i$ is either $N_i$ or a different stationary process that is derived from $N_i$.
The distribution of $N_i$ depends on a set of parameters $\Lambda_i$, and $X_i = E(W_i \mid \Lambda_i)$ is the true predictor.
$W_i = X_i + U_i$, where $\sigma_{u_i}^2 = \mathrm{var}(U_i)$ might depend on $X_i$ and is different across the subjects.
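For the point-pattern case, (1) reduces to the number of use events in the baseline period divided by its length. The snippet below is a minimal sketch of that special case; the function name `use_frequency` and the 90-day calendar in `use_days` are made up for illustration and are not taken from the study data.

```python
def use_frequency(event_times, tau):
    """W_i = N_i((0, tau]) / tau: events per unit time over the baseline period."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Count the events that fall in the half-open window (0, tau].
    count = sum(1 for s in event_times if 0 < s <= tau)
    return count / tau


# Hypothetical calendar: days (out of 90) on which cocaine use was reported.
use_days = [3, 4, 10, 17, 18, 31, 45, 46, 47, 60, 75, 88]
w_i = use_frequency(use_days, tau=90)
print(f"W_i = {w_i:.4f} uses per day")
```

A marked point pattern would replace the event count by the sum of the use amounts, which is the other choice of $\tilde N_i$ mentioned above.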
Notation and setup (Cont.)
Let $S(t, p) = (t, t + p\tau]$ be a subinterval, where $0 < p \le 1$ and $0 \le t \le (1 - p)\tau$; $W_i(t, p)$ is the summary statistic defined on $S(t, p)$, and $U_i(t, p) = W_i(t, p) - X_i$.
Define $\sigma_i^2 = \lim_{\tau \to \infty} (\tau \sigma_{u_i}^2)$.
Weak Dependency Assumption: We assume that
$$(p\tau)\,\mathrm{var}[U_i(t, p) \mid \Lambda_i] = \sigma_i^2 - \frac{1}{p\tau}\,\alpha_i + o\!\left(\frac{1}{p\tau}\right), \tag{2}$$
where $\alpha_i$ is a constant that is typically nonnegative.
By the weak dependence assumption,
$$\mathrm{Var}[U_i(t, p) \mid X_i] \approx \frac{\sigma_{u_i}^2}{p} - \frac{1 - p}{p^2\tau^2}\,\alpha_i, \tag{3}$$
if $p\tau$ is sufficiently large.
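The step from (2) to (3) can be sketched as follows, assuming the $o(\cdot)$ remainders are negligible and reading $\sigma_{u_i}^2$ as the $p = 1$ case of the same conditional variance. This is a reconstruction of the reasoning, not a derivation taken from the slides.

```latex
\begin{align*}
\mathrm{var}[U_i(t,p)\mid\Lambda_i]
  &\approx \frac{\sigma_i^2}{p\tau} - \frac{\alpha_i}{p^2\tau^2}
  && \text{(divide (2) by } p\tau\text{)} \\
\sigma_{u_i}^2
  &\approx \frac{\sigma_i^2}{\tau} - \frac{\alpha_i}{\tau^2}
  \;\Longrightarrow\;
  \frac{\sigma_i^2}{\tau} \approx \sigma_{u_i}^2 + \frac{\alpha_i}{\tau^2}
  && \text{(set } p = 1\text{)} \\
\mathrm{var}[U_i(t,p)\mid\Lambda_i]
  &\approx \frac{1}{p}\Bigl(\sigma_{u_i}^2 + \frac{\alpha_i}{\tau^2}\Bigr)
     - \frac{\alpha_i}{p^2\tau^2}
   = \frac{\sigma_{u_i}^2}{p} - \frac{1-p}{p^2\tau^2}\,\alpha_i .
\end{align*}
```

With $\alpha_i = 0$ the variance on a subinterval is simply $\sigma_{u_i}^2 / p$; the second term is the adjustment for dependence within a trajectory.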
Method of moment bias correction in linear model
We consider $Y_i = X_i\beta + Z_i^T\eta + \epsilon_i$, $\theta = (\beta, \eta^T)^T$. The naive estimator using the surrogate $W$ is
$$\hat\theta_{naive} = \begin{bmatrix} W^TW & W^TZ \\ Z^TW & Z^TZ \end{bmatrix}^{-1} \left[ \begin{pmatrix} W^T \\ Z^T \end{pmatrix} Y \right] \equiv A^{-1}B.$$
$$E(A \mid X, Z) = \begin{pmatrix} X^TX + \sigma^2 & X^TZ \\ Z^TX & Z^TZ \end{pmatrix}, \qquad E(B \mid X, Z) = \begin{pmatrix} X^TX & X^TZ \\ Z^TX & Z^TZ \end{pmatrix}\theta,$$
where $\sigma^2 = E(U^TU) = \sum_{i=1}^n \sigma_{u_i}^2$.
The method of moment estimator $\hat\theta_{mom}$ is obtained by subtracting $\sigma^2$ from the first entry of $A$.
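A small simulation can illustrate the correction. The sketch below uses only the Python standard library; the helper names (`solve`, `mom_estimate`), the Gaussian error with variance $X_i/\tau$, the sample size and the seed are all assumptions made for illustration, and $\sigma^2$ is taken as known, which is only possible in a simulation.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mom_estimate(w, z, y, sigma2_total=0.0):
    """Least squares of y on (w, z) with sigma2_total subtracted from W^T W.

    sigma2_total = 0 gives the naive estimator; a positive value gives the
    method-of-moment correction.
    """
    rows = [[wi] + list(zi) for wi, zi in zip(w, z)]
    d = len(rows[0])
    a = [[sum(r[j] * r[k] for r in rows) for k in range(d)] for j in range(d)]
    b = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(d)]
    a[0][0] -= sigma2_total  # correct the first entry of A
    return solve(a, b)


# Illustrative simulation; every setting here is an assumption.
rng = random.Random(1)
n, tau = 4000, 10.0
x = [1.0 + rng.lognormvariate(0.0, 0.5) for _ in range(n)]
z = [[1.0, float(rng.random() < 0.5)] for _ in range(n)]   # intercept + binary covariate
w = [xi + rng.gauss(0.0, (xi / tau) ** 0.5) for xi in x]   # var(U_i) = X_i / tau
y = [xi + zi[0] + zi[1] + rng.gauss(0.0, 0.5) for xi, zi in zip(x, z)]
sigma2 = sum(xi / tau for xi in x)                         # known only in a simulation

beta_naive = mom_estimate(w, z, y)[0]
beta_mom = mom_estimate(w, z, y, sigma2)[0]
print(f"naive beta = {beta_naive:.3f}, MoM beta = {beta_mom:.3f} (truth 1.0)")
```

In practice $\sigma^2$ is unknown and has to be estimated from the trajectories themselves.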
Subsampling estimation of the variance
Let $t$ be an arbitrary time point with $0 \le t < \tau - kp\tau$, where $k = [1/p]$. We then partition the interval $(t, t + kp\tau]$ into $k$ nonoverlapping subintervals, each of length $p\tau$. We require $p \le 1/2$ so that $k \ge 2$.
Let $W_i(t + jp\tau, p)$ be the summary statistic defined on the $(j + 1)$th subinterval for the $i$th trajectory, where $j = 0, 1, \ldots, k - 1$. Define
$$\tilde\sigma_{u_i}^2(t, p) = \frac{1}{k}\sum_{j=0}^{k-1} \left[W_i(t + jp\tau, p) - W_i(t, kp)\right]^2,$$
$$\tilde\sigma_{u_i}^2(p) = \frac{1}{\tau - kp\tau}\int_0^{\tau - kp\tau} \tilde\sigma_{u_i}^2(t, p)\,dt.$$
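The two definitions above can be sketched for a point pattern as follows. The midpoint rule over start points, the handling of the case $kp = 1$ (where only $t = 0$ is available), the function names and the example calendar are illustrative assumptions rather than details given in the slides.

```python
from bisect import bisect_right
from math import floor


def window_stat(times, tau, t, p):
    """W_i(t, p): events per unit time on (t, t + p*tau]; `times` must be sorted."""
    return (bisect_right(times, t + p * tau) - bisect_right(times, t)) / (p * tau)


def subsample_var_at(times, tau, t, p):
    """Spread of the k block statistics around the pooled statistic W_i(t, kp)."""
    k = floor(1.0 / p + 1e-9)
    if k < 2:
        raise ValueError("need p <= 1/2 so that k >= 2")
    blocks = [window_stat(times, tau, t + j * p * tau, p) for j in range(k)]
    pooled = window_stat(times, tau, t, k * p)
    return sum((b - pooled) ** 2 for b in blocks) / k


def subsample_var(times, tau, p, n_grid=50):
    """Average of subsample_var_at over start points t in (0, tau - k*p*tau)."""
    times = sorted(times)
    k = floor(1.0 / p + 1e-9)
    slack = tau - k * p * tau
    if slack <= 1e-12:
        # The k blocks tile (0, tau] exactly, so t = 0 is the only start point.
        return subsample_var_at(times, tau, 0.0, p)
    # Midpoint rule as a stand-in for the integral over t (an assumption).
    starts = [(g + 0.5) * slack / n_grid for g in range(n_grid)]
    return sum(subsample_var_at(times, tau, t, p) for t in starts) / n_grid


# Hypothetical 90-day calendar of use days.
use_days = [3, 4, 10, 17, 18, 31, 45, 46, 47, 60, 75, 88]
v_half = subsample_var(use_days, tau=90.0, p=0.5)
v_04 = subsample_var(use_days, tau=90.0, p=0.4)
print(v_half, v_04)
```

For $p = 1/2$ the estimate equals one quarter of the squared difference between the two half-period statistics.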
Subsampling estimation of the variance (cont.)
By the weak dependence assumption (2), and defining $\alpha = \sum_{i=1}^n \alpha_i$,
$$E\left[\sum_{i=1}^n \tilde\sigma_{u_i}^2(p)\right] \approx \frac{1}{p}\left(1 - \frac{1}{k}\right)\sigma^2 - \frac{1}{p^2\tau^2}\left(1 - p - \frac{1 - kp}{k^2}\right)\alpha. \tag{4}$$
Scheme 1: Ignore the correlation, and set $\tilde\sigma^2 = \frac{p}{1 - 1/k}\sum_{i=1}^n \tilde\sigma_{u_i}^2(p)$.
Scheme 2: Take the correlation into account. Let
$$\tilde Y(p) = \frac{p\sum_{i=1}^n \tilde\sigma_{u_i}^2(p)}{1 - 1/k} \equiv \sum_{i=1}^n \tilde Y_i(p) \quad\text{and}\quad \tilde X(p) = \frac{1 - p - (1 - kp)/k^2}{p\tau^2(1 - 1/k)}. \tag{5}$$
Regress $\tilde Y(p)$ on $\tilde X(p)$ at some preselected values $(p_1, \ldots, p_J)$, where $J \ge 2$. The resulting intercept and (minus) slope estimators are $\hat\sigma^2$ and $\hat\alpha$.
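Multiplying (4) by $p/(1 - 1/k)$ gives $E\,\tilde Y(p) \approx \sigma^2 - \alpha \tilde X(p)$, which is why a straight-line fit recovers both quantities. The sketch below assumes the totals $\sum_i \tilde\sigma_{u_i}^2(p)$ have already been computed; the function names and the synthetic check values ($\sigma^2 = 5$, $\alpha = 20$) are hypothetical.

```python
from math import floor


def design_x(p, tau):
    """X~(p) in (5)."""
    k = floor(1.0 / p + 1e-9)
    return (1.0 - p - (1.0 - k * p) / k**2) / (p * tau**2 * (1.0 - 1.0 / k))


def response_y(p, total_subsample_var):
    """Y~(p) in (5); total_subsample_var is the sum over subjects at this p.

    Evaluated at a single p, this is also the Scheme 1 estimate of sigma^2.
    """
    k = floor(1.0 / p + 1e-9)
    return p * total_subsample_var / (1.0 - 1.0 / k)


def scheme2(ps, totals, tau):
    """Least-squares line of Y~(p) on X~(p); returns (sigma2_hat, alpha_hat)."""
    xs = [design_x(p, tau) for p in ps]
    ys = [response_y(p, s) for p, s in zip(ps, totals)]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - slope * xbar, -slope  # intercept and minus slope


def expected_total(p, tau, sigma2, alpha):
    """Right-hand side of (4), used here only to build a synthetic check."""
    k = floor(1.0 / p + 1e-9)
    return ((1.0 - 1.0 / k) * sigma2 / p
            - (1.0 - p - (1.0 - k * p) / k**2) * alpha / (p**2 * tau**2))


tau = 80.0
ps = [k / tau for k in range(30, 41)]          # grid used in the figure slide
totals = [expected_total(p, tau, sigma2=5.0, alpha=20.0) for p in ps]
sigma2_hat, alpha_hat = scheme2(ps, totals, tau)
print(sigma2_hat, alpha_hat)
```

On real data the totals are noisy, so the fitted line only approximates the relation.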
Within trajectory dependence
Figure: Scatter plot of $\tilde X(p)$ and $\tilde Y(p)$ for $p = k/\tau$, where $k = 30, 31, \ldots, 40$ and $\tau = 80$.
Subsampling extrapolation
Like SIMEX, we inflate the error in the surrogate and find out how the estimator changes as the level of error increases.
Not knowing the true distribution of the measurement error, we use subsampling to create surrogates with inflated error variance, instead of adding simulated Gaussian error.
Let $W_i(t, p)$ be the summary statistic defined over the subinterval $S(t, p) = (t, t + p\tau]$. Then
$$\mathrm{Var}[W_i(t, p) \mid \Lambda_i] \approx \frac{\sigma_{u_i}^2}{p} - \frac{1 - p}{p^2\tau^2}\,\alpha_i = \left(\frac{1}{p} - \frac{1 - p}{p^2\tau^2}\,\frac{\alpha_i}{\sigma_{u_i}^2}\right)\sigma_{u_i}^2 \equiv x_i(p)\,\sigma_{u_i}^2. \tag{6}$$
SUBEX (cont.)
Let $\hat\theta(p; t_1, \ldots, t_n)$ be the estimated regression coefficient based on $\{Y_i, W_i(t_i, p), Z_i\}$, where $t_i \in [0, (1 - p)\tau]$, $i = 1, \ldots, n$, without accounting for measurement error. Define
$$\hat\theta(p) = \frac{1}{[(1 - p)\tau]^n}\int_0^{(1 - p)\tau} \cdots \int_0^{(1 - p)\tau} \hat\theta(p; t_1, \ldots, t_n)\,dt_1 \cdots dt_n.$$
The integral is evaluated by Monte Carlo.
The procedure is repeated for a set of preselected $(p_1, \ldots, p_J)$.
We fit a parametric model for $(\hat x(p), \hat\theta(p))$, e.g. $G_Q(p, \Gamma) = \gamma_1 + \hat x(p)\gamma_2 + \hat x(p)^2\gamma_3$, where $\Gamma = (\gamma_1, \gamma_2, \gamma_3)$. Normally, a quadratic or cubic extrapolant function is used.
$$\hat\theta_{subex} = G_Q(p = \infty, \hat\Gamma), \tag{7}$$
that is, the extrapolant evaluated at $\hat x(p) = 0$, where the inflated error variance vanishes.
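The SUBEX steps can be sketched for a scalar coefficient as below. This is an illustration under stated assumptions, not the authors' implementation: `fit` is a hypothetical user-supplied function that maps the list of surrogates to the naive estimate (closing over $Y$ and $Z$), the inflation factor defaults to Scheme 1 ($x(p) = 1/p$), and the extrapolant is the quadratic $G_Q$ evaluated at $x = 0$.

```python
import random
from bisect import bisect_right


def window_stat(times, tau, t, p):
    """W_i(t, p): events per unit time on (t, t + p*tau]; `times` must be sorted."""
    return (bisect_right(times, t + p * tau) - bisect_right(times, t)) / (p * tau)


def theta_hat_p(fit, trajectories, tau, p, n_draws, rng):
    """Monte Carlo estimate of theta_hat(p): naive fits averaged over random starts."""
    total = 0.0
    for _ in range(n_draws):
        w = [window_stat(tr, tau, rng.uniform(0.0, (1.0 - p) * tau), p)
             for tr in trajectories]
        total += fit(w)
    return total / n_draws


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def quadratic_extrapolate(xs, thetas):
    """Least-squares fit of theta = g1 + g2*x + g3*x^2; returns g1 (value at x = 0)."""
    basis = [[1.0, x, x * x] for x in xs]
    a = [[sum(r[j] * r[k] for r in basis) for k in range(3)] for j in range(3)]
    b = [sum(r[j] * th for r, th in zip(basis, thetas)) for j in range(3)]
    return solve(a, b)[0]


def subex(fit, trajectories, tau, ps, x_of_p=lambda p: 1.0 / p, n_draws=100, seed=0):
    """SUBEX estimate of a scalar coefficient; x_of_p defaults to Scheme 1."""
    rng = random.Random(seed)
    trajectories = [sorted(tr) for tr in trajectories]
    thetas = [theta_hat_p(fit, trajectories, tau, p, n_draws, rng) for p in ps]
    return quadratic_extrapolate([x_of_p(p) for p in ps], thetas)


ps = [0.6 + 0.05 * j for j in range(8)]   # 0.6, 0.65, ..., 0.95
```

A Scheme 2 version would pass a different `x_of_p` built from estimates of $\sigma^2$ and $\alpha$; a vector-valued $\theta$ can be handled one component at a time.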
SUBEX (cont.)
The variance inflation factor is
$$x(p) = \frac{1}{p}\left(1 - \frac{1 - p}{p\tau^2}\,\frac{\alpha}{\sigma^2}\right).$$
Scheme 1: Ignore the correlation, $x_i(p) \approx 1/p$. This is valid if $\alpha_i/(\tau^2\sigma_{u_i}^2)$ is small and/or $p\tau$ is large.
Scheme 2: Plug in $\hat\sigma^2$ and $\hat\alpha$ from the regression between $\tilde Y(p)$ and $\tilde X(p)$.
Two naive alternatives that ignore within-subject correlation
Let $W_i(0, 1/2)$ and $W_i(\tau/2, 1/2)$ be the counterparts of $W_i$ based on the data in $(0, \tau/2]$ and $(\tau/2, \tau]$.
Naive MOM: estimate $\sigma^2$ by
$$\hat\sigma_{naive}^2 = \sum_{i=1}^n \tilde\sigma_{u_i}^2(1/2) = \frac{1}{4}\sum_{i=1}^n \left[W_i(0, 1/2) - W_i(\tau/2, 1/2)\right]^2.$$
Empirical SIMEX (Devanarayan and Stefanski, 2002): use the pseudo remeasurements
$$W_i(\zeta) = W_i + \frac{\sqrt{\zeta}}{2}\left[W_i(0, 1/2) - W_i(\tau/2, 1/2)\right], \quad \text{for each } \zeta > 0.$$
Interval censored relapse time
We model the first relapse time $T_i$ through a Cox proportional hazard model (Cox, 1972),
$$\lambda(t \mid X_i, Z_i) = \lambda_0(t)\exp(X_i\beta + Z_i\eta). \tag{8}$$
In our data, about 50.6% of the subjects have an observed relapse time, 31.6% are interval censored and 17.8% are right censored.
Following Ruppert et al. (2003) and Cai and Betensky (2003),
$$\psi(t) = \log\lambda_0(t) = a_0 + a_1 t + \sum_{k=1}^K b_k (t - \kappa_k)_+, \tag{9}$$
where $x_+ \equiv \max(x, 0)$, and the $\kappa_k$'s are the knots.
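The truncated-line basis in (9) is simple to evaluate. The sketch below also integrates $\exp\{\psi(u)\}$ numerically to obtain a cumulative baseline hazard; the midpoint rule, the function names and the coefficient values are illustrative assumptions only.

```python
from math import exp


def log_baseline_hazard(t, a0, a1, b, knots):
    """psi(t) = a0 + a1*t + sum_k b_k * (t - kappa_k)_+ as in (9)."""
    return a0 + a1 * t + sum(bk * max(t - kk, 0.0) for bk, kk in zip(b, knots))


def cumulative_hazard(t, a0, a1, b, knots, n_steps=2000):
    """Integral of exp(psi(u)) over (0, t] by the midpoint rule (an assumed choice)."""
    h = t / n_steps
    return h * sum(
        exp(log_baseline_hazard((s + 0.5) * h, a0, a1, b, knots))
        for s in range(n_steps)
    )


# Hypothetical coefficients and knots.
psi = log_baseline_hazard(0.7, a0=0.1, a1=0.5, b=[1.0, -2.0], knots=[0.3, 0.6])
cum = cumulative_hazard(1.0, a0=0.0, a1=0.5, b=[], knots=[])
print(psi, cum)
```

Because $\psi$ is piecewise linear, the integral could also be computed in closed form between consecutive knots.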
Penalized spline approach of Cai and Betensky (2003)
We observe $n$ independent tuples $(T_i^l, T_i^r, \delta_i)$, where $[T_i^l, T_i^r]$ gives the censoring interval and $\delta_i$ is the indicator for right censoring:
- when $\delta_i = 0$ and $T_i^l = T_i^r$, the event time $T_i$ is right censored at $T_i^r$;
- when $\delta_i = 1$ and $T_i^l < T_i^r$, $T_i$ is interval censored within $[T_i^l, T_i^r]$;
- when $\delta_i = 1$ and $T_i^l = T_i^r$, $T_i$ is observed at $T_i^r$.
In addition, let $\delta_{0i}$ be the indicator for observed data, i.e. $\delta_{0i} = I(\delta_i = 1, T_i^l = T_i^r)$.
Denote $\mathbf{X} = (X^T, Z^T)^T$ and $\Lambda_0(t) = \int_0^t \lambda_0(u)\,du$. The log-likelihood is
$$l(\Theta) = \sum_{i=1}^n \Big[\delta_{0i}\{\log\lambda_0(T_i^r) + \mathbf{X}_i^T\theta\} - \exp(\mathbf{X}_i^T\theta)\Lambda_0(T_i^r) + \delta_i(1 - \delta_{0i})\log\big(\exp[\{\Lambda_0(T_i^r) - \Lambda_0(T_i^l)\}\exp(\mathbf{X}_i^T\theta)] - 1\big)\Big], \tag{10}$$
where $\Theta$ is the collection of all parameters.
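The three cases in (10) can be checked numerically against the survival function $S(t) = \exp\{-\Lambda_0(t)\exp(\mathbf{X}^T\theta)\}$. The sketch below assumes the baseline hazard and cumulative hazard are supplied as plain functions; the name `loglik`, the data layout and the example values are hypothetical.

```python
from math import exp, expm1, log


def loglik(data, theta, lam0, cum0):
    """Log-likelihood (10) for exactly observed, interval and right censored times.

    data: iterable of (t_left, t_right, delta, x) with x the covariate vector;
    lam0(t) and cum0(t) are the baseline hazard and its cumulative version.
    """
    total = 0.0
    for t_l, t_r, delta, x in data:
        lin = sum(xj * bj for xj, bj in zip(x, theta))
        risk = exp(lin)
        total -= risk * cum0(t_r)                 # term shared by all three cases
        if delta == 1 and t_l == t_r:             # delta_0i = 1: observed exactly
            total += log(lam0(t_r)) + lin
        elif delta == 1:                          # interval censored in [t_l, t_r]
            total += log(expm1((cum0(t_r) - cum0(t_l)) * risk))
    return total


# Hypothetical data with baseline hazard lambda_0(t) = t, so Lambda_0(t) = t^2 / 2.
theta = [0.3, -0.2]
data = [
    (0.7, 0.7, 1, [1.0, 0.0]),   # observed at 0.7
    (0.2, 0.5, 1, [2.0, 1.0]),   # interval censored in [0.2, 0.5]
    (1.0, 1.0, 0, [1.5, 1.0]),   # right censored at 1
]
value = loglik(data, theta, lam0=lambda t: t, cum0=lambda t: 0.5 * t * t)
print(value)
```

The interval-censored term plus the shared term equals $\log\{S(T_i^l) - S(T_i^r)\}$, which is the usual contribution of an interval-censored observation.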
Penalized spline approach of Cai and Betensky (Cont.)
The spline coefficients $b$ are modeled as random effects with joint distribution $\mathrm{Normal}(0, \sigma_b^2 I)$. Up to an additive constant, the joint log-likelihood is
$$l_p(\Theta) = l(\Theta) - \frac{K}{2}\log(\sigma_b^2) - \frac{1}{2\sigma_b^2}\,b^Tb. \tag{11}$$
$\Theta = (\theta^T, a^T, b^T)^T$ is estimated by maximizing the penalized likelihood (11).
The tuning parameter $\sigma_b^2$ is chosen by maximizing the marginal likelihood using the Laplace approximation (Breslow and Clayton, 1993).
When $X$ is measured with error, the SUBEX method is applicable.
Simulation 1: Linear model
Let $N_i(\cdot)$ be a stationary point process with intensity $X_i$, where $X_i \sim 1 + \mathrm{LogNormal}(0, 0.5)$; $Z_i$ is an error-free Bernoulli random variable independent of $X_i$, with $P(Z_i = 1) = 0.5$.
$$Y_i = \theta_0 + \theta_1 X_i + \theta_2 Z_i + \epsilon_i, \quad i = 1, 2, \ldots, n,$$
where $\theta = (\theta_0, \theta_1, \theta_2)^T = (1, 1, 1)^T$ and $\epsilon_i \sim \mathrm{Normal}(0, \ )$.
We assume that $X_i$ is unknown but can be estimated by $W_i = N_i((0, \tau])/\tau$, where $N_i((0, \tau])$ denotes the number of events of $N_i$ contained in the time interval $(0, \tau]$.
Simulation 1: Linear model (Cont.)
We consider two scenarios for $N_i$.
Scenario 1: $N_i$ is a homogeneous Poisson process with intensity $X_i$.
Scenario 2: $N_i$ is a homogeneous Poisson cluster process. We generate a homogeneous Poisson process with intensity $\rho_i = X_i/3$ as the parent process; each parent generates $m = 3$ children on average according to a Poisson distribution, and the children's locations are dispersed independently following a normal distribution centered at the parent location with standard deviation $\omega = 2$. The final process contains only the children's event times.
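Both scenarios can be simulated with the standard library alone. In the sketch below the function names are hypothetical, and the buffer of parents generated just outside $(0, \tau]$ (the `margin` argument) is an assumption added to limit edge effects; the slides do not say how edges were handled.

```python
import random


def poisson_process(rate, tau, rng):
    """Event times of a homogeneous Poisson process on (0, tau]."""
    times, t = [], rng.expovariate(rate)
    while t <= tau:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def poisson_count(mean, rng):
    """Poisson(mean) draw: number of unit-rate arrivals in (0, mean]."""
    return len(poisson_process(1.0, mean, rng))


def cluster_process(x, tau, rng, m=3.0, omega=2.0, margin=None):
    """Children of a Poisson cluster process with overall intensity x on (0, tau]."""
    # Assumed edge handling: parents slightly outside the window can contribute.
    margin = 4.0 * omega if margin is None else margin
    parents = [s - margin for s in poisson_process(x / m, tau + 2.0 * margin, rng)]
    children = []
    for s in parents:
        for _ in range(poisson_count(m, rng)):
            c = rng.gauss(s, omega)
            if 0.0 < c <= tau:
                children.append(c)
    return sorted(children)


rng = random.Random(0)
tau, x_i = 10.0, 2.0
w_poisson = len(poisson_process(x_i, tau, rng)) / tau   # Scenario 1 surrogate
w_cluster = len(cluster_process(x_i, tau, rng)) / tau   # Scenario 2 surrogate
print(w_poisson, w_cluster)
```

Both surrogates have mean $X_i$, but clustering makes the count on $(0, \tau]$ more variable, which is the within-trajectory dependence that the second versions of MOM and SUBEX are designed to handle.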
Simulation 1 (Cont.)
We set $n = 200$ and repeat the simulation 200 times in each case.
We apply the naive estimator, the empirical SIMEX method, two versions of the method of moment estimator and two versions of SUBEX (MOM 1 and SUBEX 1 ignore within-trajectory correlation; MOM 2 and SUBEX 2 take the correlation into account).
For SUBEX, we set the $p$ values to range from 0.6 to 0.95 with an increment of 0.05, and use a quadratic function for the extrapolation.
Simulation results for linear model
Scenario 1: Poisson process, τ = 10
Columns: θ_1 (Bias, SE, Rel. Eff.), θ_2 (Bias, SE, Rel. Eff.)
Naive
MOM
MOM
SUBEX   (.1334)  (.1308)  .5084
SUBEX   (.1317)  (.1297)  .5158
SIMEX   (.0896)  (.0913)  .9237
Scenario 2: Poisson cluster process
Columns: θ_1, τ = 10 (Bias, SE, Rel. Eff.), θ_1, τ = 20 (Bias, SE, Rel. Eff.)
Naive
MOM
MOM
SUBEX   (.1227)  (.1463)
SUBEX   (.1556)  (.1692)
SIMEX   (.0802)  (.0913)
Simulation 2: interval-censored failure time data
We simulate $X_i$, $N_i$ and $Z_i$ as in Simulation 1, and simulate the failure time $T_i$ through a Cox model
$$\lambda(t \mid X_i, Z_i) = \lambda_0(t)\exp(X_i\beta + Z_i\eta), \tag{12}$$
where $\lambda_0(t) = t$ and $(\beta, \eta)^T = (1, 1)^T$.
We assume censoring at random and set the censoring times to be (0.2, 0.5, 1). Let the censoring indicator $\delta_i$ be a binary variable independent of $X_i$ and $Z_i$, with $P(\delta_i = 1) = 0.5$. When $\delta_i = 1$, the event time $T_i$ is censored in the interval between the two censoring times closest to $T_i$; if $T_i$ is less than 0.2, it is censored in $[T_i^l = 0, T_i^r = 0.2]$; if an event time is over 1, it is automatically right censored at 1.
Overall, about 12% of the observations are right censored, 43% are interval censored, and the remaining 45% are observed.
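With $\lambda_0(t) = t$ the cumulative baseline hazard is $t^2/2$, so a failure time can be drawn by inverse transform: solve $\exp(X_i\beta + Z_i\eta)\,T^2/2 = E$ for a unit exponential $E$. The sketch below encodes the output in the $(T_i^l, T_i^r, \delta_i)$ convention of the likelihood slide, and it reads the case where the simulation's censoring indicator is 0 as an exactly observed time; that reading, the function names and the seed are assumptions.

```python
import random
from math import exp, sqrt

INSPECTION_TIMES = (0.0, 0.2, 0.5, 1.0)


def draw_failure_time(x, z, rng, beta=1.0, eta=1.0):
    """Inverse-transform draw from the Cox model with lambda_0(t) = t."""
    risk = exp(x * beta + z * eta)
    return sqrt(2.0 * rng.expovariate(1.0) / risk)   # solves risk * T^2 / 2 = E


def censor(t, interval_flag):
    """Return (t_left, t_right, delta) in the notation of the likelihood slide."""
    if t > 1.0:
        return (1.0, 1.0, 0)                         # right censored at 1
    if interval_flag:
        for lo, hi in zip(INSPECTION_TIMES, INSPECTION_TIMES[1:]):
            if lo < t <= hi:
                return (lo, hi, 1)                   # interval censored
    return (t, t, 1)                                 # observed exactly


rng = random.Random(0)
sample = []
for _ in range(200):
    x = 1.0 + rng.lognormvariate(0.0, 0.5)           # true predictor X_i
    z = float(rng.random() < 0.5)                    # error-free covariate Z_i
    t = draw_failure_time(x, z, rng)
    sample.append(censor(t, interval_flag=rng.random() < 0.5))
print(sample[:3])
```

In the measurement error setting, $X_i$ would then be replaced by a surrogate $W_i$ computed from a simulated trajectory before fitting.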
Simulation results in the interval-censored failure time data
Columns in each panel: Bias, SE, Rel. Eff.
           Poisson, τ = 10, n = 200    Poisson cluster, τ = 10, n = 200
No Error   (.1469)                     (.1462)
Naive      (.1129)                     (.0800)
SUBEX      (.2587)                     (.2016)
SUBEX      (.2570)                     (.2590)
SIMEX      (.1732)                     (.1237)
           Poisson cluster, τ = 20, n = 200    Poisson cluster, τ = 20, n = 400
No Error   (.1465)                             (.1015)
Naive      (.0971)                             (.0695)
SUBEX      (.2602)                             (.1843)
SUBEX      (.3052)                             (.2178)
SIMEX      (.1558)                             (.1107)
Results for the analysis of the CCQ-Brief score data
            NAIVE        MOM 1        SIMEX        MOM 2        SUBEX 1      SUBEX 2
frequency   .476 (.176)  .529 (.195)  .537 (.223)  .565 (.211)  .577 (.204)  .662 (.257)
indicator        (.062)       (.062)       (.064)       (.062)       (.064)       (.065)
gender           (.093)       (.093)       (.091)       (.093)       (.092)       (.095)
race        .305 (.099)  .306 (.100)  .311 (.101)  .307 (.100)  .289 (.100)  .271 (.103)
age              (.081)       (.082)       (.081)       (.083)       (.081)       (.083)
cocyrs           (.087)       (.087)       (.085)       (.088)       (.083)       (.086)
curanxs     .139 (.085)  .142 (.085)  .134 (.079)  .145 (.086)  .163 (.084)  .178 (.086)
Results for the analysis of the time to first relapse data
            NAIVE        SIMEX        SUBEX 1      SUBEX 2
cocuse           (.081)       (.096)       (.095)  .017 (.206)
gender           (.299)       (.301)       (.319)       (.347)
race             (.274)       (.277)       (.272)       (.276)
age              (.024)       (.024)       (.024)       (.024)
cocyrs      .110 (.028)  .111 (.028)  .111 (.028)  .109 (.029)
curanxs     .279 (.221)  .275 (.224)  .308 (.218)  .352 (.232)
Summary
In many scientific problems, in particular in substance use research, it is common to use summary statistics derived from stochastic processes as covariates.
The estimation error in these summary statistics causes bias in the estimated regression coefficients, as in classical measurement error problems.
We propose a new method-of-moment approach for linear models and a subsampling extrapolation method that is generally applicable to both linear and nonlinear models.
The proposed methods are based on novel subsampling techniques that take into account the correlation within individual processes, and they have shown good performance in both simulation and real data analysis.
More informationApproximation of Survival Function by Taylor Series for General Partly Interval Censored Data
Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor
More informationChapter 2 Inference on Mean Residual Life-Overview
Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate
More informationBuilding a Prognostic Biomarker
Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,
More informationContinuous Time Survival in Latent Variable Models
Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationFrailty Probit model for multivariate and clustered interval-censor
Frailty Probit model for multivariate and clustered interval-censored failure time data University of South Carolina Department of Statistics June 4, 2013 Outline Introduction Proposed models Simulation
More informationUnivariate shrinkage in the Cox model for high dimensional data
Univariate shrinkage in the Cox model for high dimensional data Robert Tibshirani January 6, 2009 Abstract We propose a method for prediction in Cox s proportional model, when the number of features (regressors)
More informationDynamic Prediction of Disease Progression Using Longitudinal Biomarker Data
Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum
More informationCalibration Estimation of Semiparametric Copula Models with Data Missing at Random
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Econometrics Workshop UNC
More informationA general mixed model approach for spatio-temporal regression data
A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models
Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationSurvival Analysis for Case-Cohort Studies
Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationSurvival Analysis. Stat 526. April 13, 2018
Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined
More informationA simple bivariate count data regression model. Abstract
A simple bivariate count data regression model Shiferaw Gurmu Georgia State University John Elder North Dakota State University Abstract This paper develops a simple bivariate count data regression model
More informationTopic 12 Overview of Estimation
Topic 12 Overview of Estimation Classical Statistics 1 / 9 Outline Introduction Parameter Estimation Classical Statistics Densities and Likelihoods 2 / 9 Introduction In the simplest possible terms, the
More informationPropensity Score Weighting with Multilevel Data
Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative
More informationPower and Sample Size Calculations with the Additive Hazards Model
Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine
More informationQuasi-likelihood Scan Statistics for Detection of
for Quasi-likelihood for Division of Biostatistics and Bioinformatics, National Health Research Institutes & Department of Mathematics, National Chung Cheng University 17 December 2011 1 / 25 Outline for
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationBias Study of the Naive Estimator in a Longitudinal Binary Mixed-effects Model with Measurement Error and Misclassification in Covariates
Bias Study of the Naive Estimator in a Longitudinal Binary Mixed-effects Model with Measurement Error and Misclassification in Covariates by c Ernest Dankwa A thesis submitted to the School of Graduate
More informationForecasting Data Streams: Next Generation Flow Field Forecasting
Forecasting Data Streams: Next Generation Flow Field Forecasting Kyle Caudle South Dakota School of Mines & Technology (SDSMT) kyle.caudle@sdsmt.edu Joint work with Michael Frey (Bucknell University) and
More informationStatistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle
Statistical Analysis of Spatio-temporal Point Process Data Peter J Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University School of Public Health
More informationSome New Methods for Latent Variable Models and Survival Analysis. Latent-Model Robustness in Structural Measurement Error Models.
Some New Methods for Latent Variable Models and Survival Analysis Marie Davidian Department of Statistics North Carolina State University 1. Introduction Outline 3. Empirically checking latent-model robustness
More informationLecture 7 Time-dependent Covariates in Cox Regression
Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the
More informationGENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University
GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR Raymond J. Carroll: Texas A&M University Naisyin Wang: Xihong Lin: Roberto Gutierrez: Texas A&M University University of Michigan Southern Methodist
More informationStatistics in medicine
Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationELEMENTS OF PROBABILITY THEORY
ELEMENTS OF PROBABILITY THEORY Elements of Probability Theory A collection of subsets of a set Ω is called a σ algebra if it contains Ω and is closed under the operations of taking complements and countable
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationBeyond GLM and likelihood
Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence
More informationModeling Real Estate Data using Quantile Regression
Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices
More informationA Joint Model with Marginal Interpretation for Longitudinal Continuous and Time-to-event Outcomes
A Joint Model with Marginal Interpretation for Longitudinal Continuous and Time-to-event Outcomes Achmad Efendi 1, Geert Molenberghs 2,1, Edmund Njagi 1, Paul Dendale 3 1 I-BioStat, Katholieke Universiteit
More informationHypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations
Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate
More informationTime-varying proportional odds model for mega-analysis of clustered event times
Biostatistics (2017) 00, 00, pp. 1 18 doi:10.1093/biostatistics/kxx065 Time-varying proportional odds model for mega-analysis of clustered event times TANYA P. GARCIA Texas A&M University, Department of
More informationModelling Survival Events with Longitudinal Data Measured with Error
Modelling Survival Events with Longitudinal Data Measured with Error Hongsheng Dai, Jianxin Pan & Yanchun Bao First version: 14 December 29 Research Report No. 16, 29, Probability and Statistics Group
More informationLinear Regression. Junhui Qian. October 27, 2014
Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency
More informationA measurement error model approach to small area estimation
A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion
More informationImproving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates
Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/
More informationModeling Longitudinal Count Data with Excess Zeros and Time-Dependent Covariates: Application to Drug Use
Modeling Longitudinal Count Data with Excess Zeros and : Application to Drug Use University of Northern Colorado November 17, 2014 Presentation Outline I and Data Issues II Correlated Count Regression
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationStatistical Inference and Methods
Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and
More informationABHELSINKI UNIVERSITY OF TECHNOLOGY
Cross-Validation, Information Criteria, Expected Utilities and the Effective Number of Parameters Aki Vehtari and Jouko Lampinen Laboratory of Computational Engineering Introduction Expected utility -
More informationCox s proportional hazards model and Cox s partial likelihood
Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationBIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY
BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1
More informationA Bivariate Point Process Model with Application to Social Media User Content Generation
1 / 33 A Bivariate Point Process Model with Application to Social Media User Content Generation Emma Jingfei Zhang ezhang@bus.miami.edu Yongtao Guan yguan@bus.miami.edu Department of Management Science
More informationIntegrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University
Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y
More informationModel Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial
Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial Jin Kyung Park International Vaccine Institute Min Woo Chae Seoul National University R. Leon Ochiai International
More informationLab 8. Matched Case Control Studies
Lab 8 Matched Case Control Studies Control of Confounding Technique for the control of confounding: At the design stage: Matching During the analysis of the results: Post-stratification analysis Advantage
More informationMultistate models and recurrent event models
Multistate models Multistate models and recurrent event models Patrick Breheny December 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Multistate models In this final lecture,
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationAdvanced Statistics II: Non Parametric Tests
Advanced Statistics II: Non Parametric Tests Aurélien Garivier ParisTech February 27, 2011 Outline Fitting a distribution Rank Tests for the comparison of two samples Two unrelated samples: Mann-Whitney
More informationEcon 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines
Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the
More informationRegression Pitfalls. Pitfall Noun: A hidden or unsuspected danger or difficulty. A covered pit used as a trap.
Regression Pitfalls Pitfall Noun: A hidden or unsuspected danger or difficulty. A covered pit used as a trap. Multiple regression is a widely used and powerful tool. It is also one of the most abused statistical
More information,..., θ(2),..., θ(n)
Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.
More informationAnalysis of competing risks data and simulation of data following predened subdistribution hazards
Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More informationChapter 7: Model Assessment and Selection
Chapter 7: Model Assessment and Selection DD3364 April 20, 2012 Introduction Regression: Review of our problem Have target variable Y to estimate from a vector of inputs X. A prediction model ˆf(X) has
More informationSparse Linear Models (10/7/13)
STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine
More informationMissing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology
Missing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology Sheng Luo, PhD Associate Professor Department of Biostatistics & Bioinformatics Duke University Medical Center sheng.luo@duke.edu
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationSurvival Regression Models
Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant
More informationLongitudinal + Reliability = Joint Modeling
Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,
More informationChapter 17. Failure-Time Regression Analysis. William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University
Chapter 17 Failure-Time Regression Analysis William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors
More informationSimple techniques for comparing survival functions with interval-censored data
Simple techniques for comparing survival functions with interval-censored data Jinheum Kim, joint with Chung Mo Nam jinhkim@suwon.ac.kr Department of Applied Statistics University of Suwon Comparing survival
More informationarxiv: v2 [stat.me] 4 Jun 2016
Variable Selection for Additive Partial Linear Quantile Regression with Missing Covariates 1 Variable Selection for Additive Partial Linear Quantile Regression with Missing Covariates Ben Sherwood arxiv:1510.00094v2
More informationCalibration Estimation of Semiparametric Copula Models with Data Missing at Random
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics
More informationMeasurement error modeling. Department of Statistical Sciences Università degli Studi Padova
Measurement error modeling Statistisches Beratungslabor Institut für Statistik Ludwig Maximilians Department of Statistical Sciences Università degli Studi Padova 29.4.2010 Overview 1 and Misclassification
More informationMarginal Screening and Post-Selection Inference
Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2
More informationWeb-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros
Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves
More informationDigital Southern. Georgia Southern University. Varadan Sevilimedu Georgia Southern University. Fall 2017
Georgia Southern University Digital Commons@Georgia Southern Electronic Theses & Dissertations Graduate Studies, Jack N. Averitt College of Fall 2017 Application of the Misclassification Simulation Extrapolation
More informationExercises. (a) Prove that m(t) =
Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for
More informationMultiple imputation to account for measurement error in marginal structural models
Multiple imputation to account for measurement error in marginal structural models Supplementary material A. Standard marginal structural model We estimate the parameters of the marginal structural model
More informationIndividualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models
Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University July 31, 2018
More informationEnsemble estimation and variable selection with semiparametric regression models
Ensemble estimation and variable selection with semiparametric regression models Sunyoung Shin Department of Mathematical Sciences University of Texas at Dallas Joint work with Jason Fine, Yufeng Liu,
More information