Dynamic Disease Screening Peihua Qiu pqiu@ufl.edu Department of Biostatistics University of Florida December 10, 2014, NCTS Lecture, Taiwan p.1/25

Motivating Example
SHARe Framingham Heart Study of the NHLBI. Many residents of Framingham, MA, were involved.
Major risk factors for cardiovascular diseases: blood pressure, total cholesterol level (TCL), smoking, obesity, ...
Goal: identify patients with irregular longitudinal patterns of the disease risk factors as early as possible.
Disease early detection and prevention.
This is a dynamic screening (DS) problem.

DS Problem
The DS problem arises in many applications.
Many products (e.g., airplanes, cars) are checked regularly or occasionally on certain variables related to their quality and/or performance.
If the observed values of a product are significantly worse than the values of a typical well-functioning product of the same age, then some adjustments or interventions should be made to avoid unpleasant consequences.

Possible Statistical Methods
Confidence intervals for the mean response, obtained by longitudinal data analysis.
This method uses the cross-sectional comparison approach.
It does not make use of all the history data of a subject.
It cannot detect a shift sequentially.

Possible Statistical Methods
Statistical process control (SPC) methods.
They monitor each subject sequentially.
They use all the history data of the subject.
They monitor subjects separately and cannot compare different subjects.
The process mean and variance may not be constant even when the subject is in control (IC).

Dynamic Screening System (DySS)
1. Estimate the regular longitudinal pattern from an IC dataset.
2. Standardize the observations of a new subject to be monitored.
3. Monitor the standardized observations with a control chart.

References
Qiu, P., and Xiang, D. (2014), Dynamic screening system: an approach for dynamically identifying irregular individuals, Technometrics, 56, 248-260.
Qiu, P., and Xiang, D. (2014), Surveillance of cardiovascular diseases using a multivariate dynamic screening system, revised for Statistics in Medicine.
Qiu, P., Zi, X., and Zou, C. (2014), Dynamic nonparametric curve monitoring, submitted.
Li, J., and Qiu, P. (2014), Nonparametric dynamic screening system for monitoring correlated longitudinal data, submitted.
Xiang, D., Qiu, P., and Pu, X. (2013), Nonparametric regression analysis of multivariate longitudinal data, Statistica Sinica, 23, 769-789.

MDySS
Qiu, P., and Xiang, D. (2014): multivariate dynamic screening system (MDySS).

Estimate Regular Longitudinal Pattern
IC data: observations of $m$ well-functioning subjects.
For $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, J_i$, and $t_{ij} \in [0,1]$,
$$y(t_{ij}) = \mu(t_{ij}) + \varepsilon(t_{ij}),$$
where $y(t_{ij}) = (y_1(t_{ij}), \ldots, y_q(t_{ij}))$ and $\mu(t_{ij}) = (\mu_1(t_{ij}), \ldots, \mu_q(t_{ij}))$.
Regular pattern: $\mu(t)$ and $\Sigma(s,t) = \mathrm{Cov}(y(s), y(t))$.
Xiang, Qiu, and Pu (2013): estimation of $\mu(t)$ and $\Sigma(s,t)$.
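
As a rough illustration of estimating $\mu(t)$ from pooled IC data, the sketch below uses a local constant (Nadaraya-Watson) kernel smoother for a single risk factor. This is a simplified stand-in, not the procedure of Xiang, Qiu, and Pu (2013): it ignores within-subject correlation, and the function names, the Epanechnikov kernel, the bandwidth, and the data are all assumptions made for the example.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_mean(times, values, t0, bandwidth):
    """Local constant (Nadaraya-Watson) estimate of mu(t0) from pooled IC data."""
    num = 0.0
    den = 0.0
    for t, y in zip(times, values):
        w = epanechnikov((t - t0) / bandwidth)
        num += w * y
        den += w
    if den == 0.0:
        raise ValueError("no observations within one bandwidth of t0")
    return num / den


# Hypothetical pooled IC observations (t_ij, y(t_ij)) from several subjects.
ic_times = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]
ic_values = [1.0, 1.2, 1.1, 1.3, 1.2, 1.4, 1.3]
mu_hat = kernel_mean(ic_times, ic_values, t0=0.25, bandwidth=0.1)
```

The same weights applied to products of residuals would give a crude estimate of the variance at $t_0$; the bandwidth would normally be chosen by a data-driven rule rather than fixed by hand.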

Standardize Observations
A new subject's $y$ values are observed at $t_1, t_2, \ldots$ over $[0,1]$.
When s/he is IC,
$$y(t_j) = \mu(t_j) + \Sigma^{1/2}(t_j,t_j)\,\epsilon(t_j).$$
Standardized observations:
$$\hat{\epsilon}(t_j) = \hat{\Sigma}^{-1/2}(t_j,t_j)\,\{y(t_j) - \hat{\mu}(t_j; \hat{\Sigma})\}.$$
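
The standardization step can be sketched in a few lines. The example below assumes that estimates of the mean vector and covariance matrix at one time point are already available (the values here are hypothetical), and it uses the Cholesky factor of the covariance as the matrix square root. That is one of several valid choices; the symmetric square root is another, and the slides do not say which one is intended.

```python
import math


def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    q = len(sigma)
    low = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for k in range(i + 1):
            s = sum(low[i][l] * low[k][l] for l in range(k))
            if i == k:
                low[i][k] = math.sqrt(sigma[i][i] - s)
            else:
                low[i][k] = (sigma[i][k] - s) / low[k][k]
    return low


def standardize(y, mu_hat, sigma_hat):
    """Solve L z = y - mu_hat by forward substitution, so Cov(z) = I if the estimates are exact."""
    low = cholesky(sigma_hat)
    resid = [a - b for a, b in zip(y, mu_hat)]
    z = []
    for i in range(len(resid)):
        acc = resid[i] - sum(low[i][k] * z[k] for k in range(i))
        z.append(acc / low[i][i])
    return z


# Hypothetical estimates at one time point for q = 2 risk factors.
z = standardize(y=[3.0, 5.0], mu_hat=[1.0, 2.0], sigma_hat=[[4.0, 2.0], [2.0, 3.0]])
```

Whatever square root is used, the squared length of `z` equals the Mahalanobis distance of the observation from the estimated mean, which is a quick sanity check on an implementation.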

A Note
By using the standardized observations of the new subject, we have actually compared its longitudinal pattern cross-sectionally with the estimated regular longitudinal pattern at the time points $t_1, t_2, \ldots$
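
Sequential Monitoring
Zou and Qiu (2009): LASSO-based MEWMA chart.
MEWMA statistic:
$$U_j = \lambda_L \hat{\epsilon}(t_j) + (1 - \lambda_L) U_{j-1}.$$
LASSO-penalized estimation:
$$\min_{\alpha \in R^q} \; (U_j - \alpha)'(U_j - \alpha) + \gamma_k \sum_{l=1}^{q} \frac{|\alpha_l|}{|U_{jl}|}.$$
The chart gives a signal when
$$Q_j = \max_{k=1,\ldots,q} \frac{W_{j,\gamma_k} - E(W_{j,\gamma_k})}{\sqrt{\mathrm{Var}(W_{j,\gamma_k})}} > h_L.$$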
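
The smoothing part of the chart is a plain recursion and can be sketched directly. The code below only computes the MEWMA vectors $U_j$ from a sequence of standardized observations; the LASSO step and the charting statistic $Q_j$ are not included. The starting value $U_0 = 0$ and the function name are assumptions of this sketch.

```python
def mewma_path(eps_hat, lam, u0=None):
    """Return [U_1, U_2, ...] with U_j = lam * eps_hat_j + (1 - lam) * U_{j-1}."""
    q = len(eps_hat[0])
    u_prev = [0.0] * q if u0 is None else list(u0)  # assumed start U_0 = 0
    path = []
    for eps in eps_hat:
        u_prev = [lam * e + (1.0 - lam) * u for e, u in zip(eps, u_prev)]
        path.append(u_prev)
    return path


# Hypothetical standardized observations with a persistent shift in the first component.
eps_hat = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
u_path = mewma_path(eps_hat, lam=0.2)
```

With a constant input of 1 in the first component, that component of $U_j$ climbs as $1 - (1 - \lambda_L)^j$, which shows how a small sustained shift accumulates in the statistic.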

Performance Evaluation
Performance measures:
In-control (IC) average run length, $ARL_0$.
Out-of-control (OC) average run length, $ARL_1$.

Performance Evaluation (Cont'd)
If $\{t_j, j = 1, 2, \ldots\}$ are unequally spaced, $ARL_0$ and $ARL_1$ may not be appropriate.
Basic time unit $\omega$: the largest time unit such that all observation times are integer multiples of $\omega$.
Define $n_j = t_j/\omega$, for $j = 0, 1, 2, \ldots$, where $n_0 = t_0 = 0$. Then $t_j = n_j\omega$ for all $j$.
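
For observation times recorded as exact decimals, the basic time unit is a greatest common divisor. The sketch below computes $\omega$ and the integers $n_j$ using exact rational arithmetic; the function name and the example times are hypothetical, and times are assumed to be nonnegative.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def basic_time_unit(times):
    """Largest omega such that every time in `times` is an integer multiple of omega.

    Times should be ints, Fractions, or decimal strings such as "1.25";
    binary floats like 0.1 are not exact and would give a misleading omega.
    """
    fracs = [Fraction(t) for t in times]
    fracs = [f for f in fracs if f != 0]
    if not fracs:
        raise ValueError("need at least one nonzero observation time")
    # Put all times over a common denominator, then take the gcd of the numerators.
    common_den = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fracs), 1)
    numerators = [int(f * common_den) for f in fracs]
    return Fraction(reduce(gcd, numerators), common_den)


# Hypothetical observation times (in years) for one subject, with t_0 = 0.
obs_times = ["0", "0.5", "1.25", "2"]
omega = basic_time_unit(obs_times)
n = [int(Fraction(t) / omega) for t in obs_times]
```

Here the times 0.5, 1.25, and 2 share the unit 1/4, so the subject is observed at basic-time indices 2, 5, and 8.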

Performance Evaluation (Cont'd)
IC: if a signal is given at the $s$th observation time, then $E(n_s)$ measures the IC average time to signal (ATS), denoted $ATS_0$.
OC: if a shift occurs at the $\tau$th observation time and a signal is given at the $s$th observation time with $s \ge \tau$, then $E(n_s - n_\tau)$ is the OC ATS, denoted $ATS_1$.
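
In a simulation study the two expectations are typically approximated by averages over replicated runs. The sketch below assumes that each run has already been reduced to its signal time $n_s$ (and, for the OC case, its shift time $n_\tau$), both in basic time units; the function name and the numbers are made up for illustration.

```python
def average_time_to_signal(signal_units, shift_units=None):
    """Monte Carlo estimate of ATS from replicated runs.

    signal_units: n_s for each run (signal time in basic time units).
    shift_units:  n_tau for each run; omit for the IC case.
    """
    if shift_units is None:
        delays = list(signal_units)  # ATS_0 averages n_s
    else:
        delays = [ns - nt for ns, nt in zip(signal_units, shift_units)]  # n_s - n_tau
        if any(d < 0 for d in delays):
            raise ValueError("OC ATS only counts signals at or after the shift")
    return sum(delays) / len(delays)


# Hypothetical results from four replicated runs.
ats0 = average_time_to_signal([40, 55, 35, 50])
ats1 = average_time_to_signal([26, 31, 24, 27], shift_units=[20, 20, 20, 20])
```

A chart is usually tuned so that $ATS_0$ matches a target value, and competing charts are then compared by their $ATS_1$ values, with smaller being better.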

SHARe Framingham Heart Study
$m = 945$ non-stroke patients (IC data).
27 stroke patients (new subjects).
Each patient was followed 7 times (i.e., $J = 7$).
Four medical indices: systolic blood pressure (mmHg), diastolic blood pressure (mmHg), total cholesterol level (mg/100 ml), and glucose level (mg/100 ml).

SHARe Framingham Heart Study (Cont'd)
[Figure: charting statistic $Q_j$ plotted against the observation index $j = 1, \ldots, 7$ for each of the 27 stroke patients (Patient 1 through Patient 27).]

SHARe Framingham Heart Study (Cont'd)
DySS approach: 26 out of 27 stroke patients received signals; 131 out of 945 non-stroke patients received signals.
The average signal time is 11.84 years.
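
The two counts translate into proportions as follows. The labels "detection rate" and "false signal rate" are this sketch's own reading of the counts, not terms used on the slide.

```python
# Counts reported on the slide.
stroke_signalled, stroke_total = 26, 27
non_stroke_signalled, non_stroke_total = 131, 945

# Share of stroke patients who received a signal.
detection_rate = stroke_signalled / stroke_total
# Share of non-stroke patients who received a signal.
false_signal_rate = non_stroke_signalled / non_stroke_total

print(f"detection rate:    {detection_rate:.1%}")
print(f"false signal rate: {false_signal_rate:.1%}")
```

So roughly 96% of the stroke patients and roughly 14% of the non-stroke patients were flagged.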
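
Dynamic Curve Monitoring
Qiu, Zi, and Zou (2014).
Model:
$$y(t_i) = \begin{cases} \mu(t_i) + \sigma(t_i)\varepsilon(t_i), & \text{for } t_i \in [0,\tau], \\ \mu(t_i) + \sigma(t_i)g(t_i) + \sigma(t_i)\varepsilon(t_i), & \text{for } t_i \in (\tau,T]. \end{cases}$$
After the transformation $\{y(t_i) - \mu(t_i)\}/\sigma(t_i)$,
$$y(t_i) = \begin{cases} \varepsilon(t_i), & \text{for } t_i \in [0,\tau], \\ g(t_i) + \varepsilon(t_i), & \text{for } t_i \in (\tau,T]. \end{cases}$$
Hypotheses: $H_0: \tau > T$ versus $H_1: \tau \in [0,T]$.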
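
Test and Estimation of $g(t_i)$
Loss function:
$$Q(t_m;\lambda) = \sum_{i=1}^{m} \{y(t_i) - g(t_i)\}^2 (1-\lambda)^{t_m - t_i}.$$
Estimate of $g$ at $t_m$:
$$\hat{g}_\lambda(t_m) = \arg\min_{a \in R} \sum_{i=1}^{m} \{y(t_i) - a\}^2 (1-\lambda)^{t_m - t_i} = \sum_{i=1}^{m} w_i(t_m) y(t_i) \Big/ \sum_{i=1}^{m} w_i(t_m),$$
where $w_i(t_m) = (1-\lambda)^{t_m - t_i}$.
Loss under each hypothesis:
$$Q_{H_1}(t_m;\lambda) = \sum_{i=1}^{m} \{y(t_i) - \hat{g}(t_i)\}^2 w_i(t_m), \qquad Q_{H_0}(t_m;\lambda) = \sum_{i=1}^{m} \{y(t_i)\}^2 w_i(t_m).$$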

Test and Estimation of $g(t_i)$
Weighted GLR test statistic (WGLR):
$$W_\lambda(t_m) = Q_{H_0}(t_m;\lambda) - Q_{H_1}(t_m;\lambda) = \sum_{i=1}^{m} w_i(t_m)\{2y(t_i) - \hat{g}(t_i)\}\hat{g}(t_i).$$
Recursive formulas:
$$W_\lambda(t_m) = w_{m-1}(t_m) W_\lambda(t_{m-1}) + \{2y(t_m) - \hat{g}(t_m)\}\hat{g}(t_m),$$
$$\hat{g}(t_m) = \{w_{m-1}(t_m)\,\alpha_{m-1}\,\hat{g}(t_{m-1}) + y(t_m)\}/\alpha_m,$$
where $\alpha_m = \sum_{i=1}^{m} w_i(t_m) = w_{m-1}(t_m)\alpha_{m-1} + 1$.
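
The recursions can be checked numerically against the defining sums. The sketch below assumes the weights $w_i(t_m) = (1-\lambda)^{t_m - t_i}$ and takes $\hat{g}(t_i)$ to be the weighted average of the first $i$ observations. The update for $\hat{g}$ carries the factor $w_{m-1}(t_m)$ on the previous weighted sum, which is what that weighted-average definition implies. Function names and data are hypothetical.

```python
def wglr_direct(times, ys, lam):
    """W_lambda(t_m) at the last time point, computed from the defining sums."""
    def g_hat(m):  # weighted average of y_1..y_m with weights (1 - lam)^(t_m - t_i)
        w = [(1.0 - lam) ** (times[m] - times[i]) for i in range(m + 1)]
        return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

    last = len(times) - 1
    return sum(
        (1.0 - lam) ** (times[last] - times[i]) * (2.0 * ys[i] - g_hat(i)) * g_hat(i)
        for i in range(last + 1)
    )


def wglr_recursive(times, ys, lam):
    """Same statistic via one-step updates; constant work per new observation."""
    alpha, g, w_stat = 1.0, ys[0], ys[0] * ys[0]  # m = 1: g_hat = y_1, W = y_1^2
    for m in range(1, len(times)):
        decay = (1.0 - lam) ** (times[m] - times[m - 1])  # w_{m-1}(t_m)
        alpha_new = decay * alpha + 1.0
        g = (decay * alpha * g + ys[m]) / alpha_new
        w_stat = decay * w_stat + (2.0 * ys[m] - g) * g
        alpha = alpha_new
    return w_stat


# Hypothetical unequally spaced observation times and transformed observations.
times = [1, 2, 4, 7, 8]
ys = [0.3, -0.1, 0.8, 1.2, 0.9]
direct = wglr_direct(times, ys, lam=0.1)
recursive = wglr_recursive(times, ys, lam=0.1)
```

The two values agree up to rounding, and the recursive form only needs to store $\alpha_{m-1}$, $\hat{g}(t_{m-1})$, and $W_\lambda(t_{m-1})$ between observations.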

Test of $g(t_i)$ (Cont'd)
Dynamic EWMA (DEWMA) chart: a signal is given when
$$W^*_\lambda(t_m) = \{W_\lambda(t_m) - E_\lambda(t_m)\}/\sqrt{V_\lambda(t_m)} > L.$$
When the observation times are equally spaced, DEWMA reduces to the conventional EWMA chart.
Benefits:
It accommodates unequally spaced observation times through the weights $(1-\lambda)^{t_m - t_i}$.
$W^*_\lambda(t_m)$ is robust when the values of $g(t)$ change much over time.
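
One way to see part of the connection with the conventional EWMA is to look at the weighted average built from the weights $(1-\lambda)^{t_m - t_i}$. With unit spacing it equals the usual EWMA recursion started at zero, divided by $1 - (1-\lambda)^m$. The sketch below checks that identity numerically; it illustrates the weights only, not the full chart, and the function names and data are hypothetical.

```python
def weighted_average(times, ys, lam):
    """Weighted average of ys with weights (1 - lam)^(t_m - t_i), t_m the last time."""
    t_m = times[-1]
    w = [(1.0 - lam) ** (t_m - t) for t in times]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def ewma(ys, lam, start=0.0):
    """Conventional EWMA recursion E_m = lam * y_m + (1 - lam) * E_{m-1}."""
    e = start
    for y in ys:
        e = lam * y + (1.0 - lam) * e
    return e


lam = 0.2
ys = [0.4, -0.3, 1.1, 0.7, 0.2]
equal_times = [1, 2, 3, 4, 5]  # unit spacing
m = len(ys)

lhs = weighted_average(equal_times, ys, lam) * (1.0 - (1.0 - lam) ** m)
rhs = ewma(ys, lam)
```

With unequal spacing the exponent $t_m - t_i$ is no longer $m - i$, so observations separated by long gaps are discounted more heavily than the index-based EWMA would discount them.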

Simulation Results
$t \in [0, 1000]$, $d = 1$.
IC model: $\mu(t) = 1 + 0.3t^{1/2}$, $\sigma^2(t) = \mu^2(t)$.
OC models:
(I) Step shift: $g(t) = \delta$, for $t > \tau$.
(II) Quadratic drift: $g(t) = (t - \tau)^2\delta$.
(III) Sine drift: $g(t) = \sin(0.003\pi(t - \tau))\delta$.
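
The simulation setting can be written down as a few small functions. The sketch below reads the IC model as $\mu(t) = 1 + 0.3t^{1/2}$ with $\sigma(t) = \mu(t)$, applies each shift only for $t > \tau$, and generates data as mean plus $\sigma(t)$ times the sum of shift and error. The standard normal errors, the function names, and the parameter values in the example call are assumptions, since the slide does not state them.

```python
import math
import random


def mu(t):
    """IC mean function as read from the slide: 1 + 0.3 * sqrt(t)."""
    return 1.0 + 0.3 * math.sqrt(t)


def sigma(t):
    """IC standard deviation; sigma^2(t) = mu^2(t) and mu(t) > 0, so sigma = mu."""
    return mu(t)


def shift(model, t, tau, delta):
    """Shift function g(t) for OC models I, II, III; zero up to the change point tau."""
    if t <= tau:
        return 0.0
    d = t - tau
    if model == "I":
        return delta                                    # step shift
    if model == "II":
        return d * d * delta                            # quadratic drift
    if model == "III":
        return math.sin(0.003 * math.pi * d) * delta    # sine drift
    raise ValueError("model must be 'I', 'II', or 'III'")


def simulate_observation(model, t, tau, delta, rng):
    """One observation y(t) = mu + sigma * (g + eps); N(0, 1) errors are assumed."""
    eps = rng.gauss(0.0, 1.0)
    return mu(t) + sigma(t) * (shift(model, t, tau, delta) + eps)


rng = random.Random(0)  # fixed seed so the example is reproducible
y = simulate_observation("I", t=150, tau=100, delta=0.5, rng=rng)
```

Setting `delta=0` recovers the IC model for every choice of `model`, which is convenient for calibrating a control limit before the OC runs.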

[Figure: panels (a)-(c) show Y against t for Models (I), (II), and (III), with δ = 0, 0.05, 0.1 in (a) and (c) and δ = 0, 0.5, 1 in (b). Panels (d)-(f) show Y against time, with a legend listing the true function, local linear, and EWMA.]

[Figure: log(ATS) against δ for DEWMA (λ = 0.05), DEWMA (λ = 0.2), EWMA (λ = 0.05), and EWMA (λ = 0.2). Panels (a)-(c): Models I, II, and III with equally spaced observation times. Panels (d)-(f): Models I, II, and III with random observation times.]

Future Research
Autocorrelation.
Nonparametric charts.
Accommodation of covariates.