Wavelet-Based Nonparametric Modeling of Hierarchical Functions in Colon Carcinogenesis Jeffrey S. Morris University of Texas, MD Anderson Cancer Center Joint wor with Marina Vannucci, Philip J. Brown, and Raymond J. Carroll 4-Aug-05 Jeffrey S. Morris 1
Outline Introduction Colon Carcinogenesis Studies Hierarchical Functional Data Wavelets and Wavelet Regression Overview of Method Choice of Smoothing Parameters Application Conclusions 4-Aug-05 Jeffrey S. Morris
Colon Carcinogenesis Studies Colon cancer asymptomatic until advanced Limited treatment options for advanced disease Preventive approaches crucial Better understand carcinogenic mechanisms Identify & understand ris factors Dietary factors ey in colon cancer ris 4-Aug-05 Jeffrey S. Morris 3
Colon Carcinogenesis Studies Carcinogen-induced colon cancer in rats model used to study biological mechanisms in vivo Rats given treatments of interest (diet) over time Exposed to carcinogen Animals euthanized, colon removed and examined. IHC staining used to quantify responses Slides treated with chemical Staining intensity indicates level of response. 4-Aug-05 Jeffrey S. Morris 4
Architecture of Colon Lumen crypts Stem Cells: Mother cells near bottom Depth in crypt ~ age of cells Relative Cell Position (Depth): t (0,1) 4-Aug-05 Jeffrey S. Morris 5
Colon Carcinogenesis Studies Data structure: Multiple treatment groups (diets) Multiple rats for each treatment Multiple crypts sampled for each rat Response quantified over fixed grid of t along side of each selected crypt Hierarchical functional data Multi-level random effects model with functional observations 4-Aug-05 Jeffrey S. Morris 6
Hierarchical Functional Data -level HF model: Y g g abc abc ab = g ( t) + e ( t) = ( t) = abc g g a ab ( t) ( t) + ξ abc + η ab, abc ( t) ( t) where eabc ~ MVN( 0, σ e I ), η abc ( ) and ξ ab ( ): mean 0 with covariance matrices and 1 1 1 4-Aug-05 Jeffrey S. Morris 7 Σ ( t, t ) Σ ( t, t ).
Hierarchical Functional Data Parametric model (linear): g(t) = X(t) β at each level of hierarchy: e.g. line or parabola at each level. i.e. g abc (t) = X(t) β abc, Cov(β abc β ab ) = Σ β1 g ab (t) = X(t) β ab, Cov(β ab β a ) = Σ β g a (t) = X(t) β a Can be fit using standard mixed models software. Efficient and easy IF you can find a parametric form for the g( ) functions. 4-Aug-05 Jeffrey S. Morris 8
Hierarchical Functional Data DNA Adduct Level (Damage) Fish,T 6,R 1,C Fish,T 6,R 1,C 6 Fish,T 6,R 1,C 11 Fish,T 6,R 1,C 15 repair 0 0 40 60 repair 0 0 40 60 repair 0 0 40 60 repair 0 0 40 60 x x x x Corn,T 3,R,C 1 Corn,T 3,R,C 6 Corn,T 3,R,C 9 Corn,T 3,R,C 11 repair 0 0 40 60 repair 0 0 40 60 repair 0 0 40 60 repair 0 0 40 60 x x x x Corn,T 1,R 1,C 4 Corn,T 1,R 1,C 9 Corn,T 1,R 1,C 14 Corn,T 1,R 1,C 16 repair 0 0 40 60 repair 0 0 40 60 repair 0 0 40 60 repair 0 0 40 60 x x x x 4-Aug-05 Jeffrey S. Morris 9
Hierarchical Functional Data MGMT Level (DNA Repair Enzyme) Fish, T 3, R 1, C 6 Fish, T 9, R 3, C 8 Corn, T 9, R 3, C 19 MGMT 0 0 40 60 80 MGMT 0 0 40 60 80 MGMT 0 0 40 60 80 Fish, T 3, R 1, C 3 t Fish, T 9, R 3, C 19 t Corn, T 9, R 3, C 0 t MGMT 0 0 40 60 80 MGMT 0 0 40 60 80 MGMT 0 0 40 60 80 t t t 4-Aug-05 Jeffrey S. Morris 10
Goals Develop method to analyze H.F.D. Est. population-level functions g a (t) (nonparametrically) Perform Bayesian inference Uncover relative variability at each hierarchical level. Wor for hierarchical models with 1 or levels Adust for possibly unequal sample sizes Model spie-lie data at lowest hierarchical level Our method first to do all these Apply to colon carcinogenesis data set Use of wavelets fundamental to our approach 4-Aug-05 Jeffrey S. Morris 11
Wavelets & Wavelet Regression Wavelets: families of orthonormal basis functions. Wavelet series for f (t): Wavelet basis functions: ψ, Mother wavelet: ψ (t), f ( t) = d ψ / ( t) = ψ(, I t ), ( t) Wavelet coefficients: d, = f ( t) ψ, ( t) dt 4-Aug-05 Jeffrey S. Morris 1
Wavelets & Wavelet Regression Wavelet coefficients calculated using Discrete Wavelet Transform (DWT) Efficient -- O(n) n data points y n wavelet coefficients d Indexed by: resolution level (scale), wavelet number (location). Can be written: d = W y (W orthogonal) Inverse Transform (IDWT): y= W t d 4-Aug-05 Jeffrey S. Morris 13
Wavelets & Wavelet Regression Key Properties: Compact support -- model local features in time (spies) Whitening property -- less correlation in wavelet space, allowing parsimonious modeling Sparsity -- most wavelet coefficients 0 for given function Can be used to perform Nonparametric Regression (Donoho & Johnstone, 1994, many others) 4-Aug-05 Jeffrey S. Morris 14
Wavelets & Wavelet Regression Data space model: y = f(t) + e t = equally spaced grid, length n= J, on (0,1) e ~ MVN(0,σ I) In wavelet space: d = Wy = θ + e d = empirical wavelet coefficients θ = true wavelet coefficients By orthogonality, e ~ MVN(0,σ I) 4-Aug-05 Jeffrey S. Morris 15
Wavelets & Wavelet Regression Wavelet Regression procedure: (1) Transform to wavelet space () Threshold/shrin coefficients 0 (3) Transform bac to data space (IDWT) Thresholding/shrinage filters out noise, performs regularization. Shrinage can be done by placing a certain prior structure on the θ. 4-Aug-05 Jeffrey S. Morris 16
Method: Overview 1. Convert data y abc to wavelet space d abc - Involves 1 DWT for each crypt. Fit hierarchical model in wavelet space to obtain Posterior distribution of true wavelet coefficients θ a corresponding to g a (t) Variance component estimates to assess relative variability 3. Use IDWT to obtain posterior distributions of g a (t) for estimation and inference 4-Aug-05 Jeffrey S. Morris 17
Method: Model in Wavelet Space d abc = { d, } = W y abc abc d, abc ~ iid N ( θ, abc, σ e ) θ, abc ~ iid N ( θ, ab, σ 1, ) θ, ab ~ iid N ( θ, a, σ, ) 1, σ e assumed nown, in practice estimated by Var( d abc ) IG priors on σ and σ 1,, when unnown 4-Aug-05 Jeffrey S. Morris 18
Method; Shrinage Prior, Prior on θ thresholding/shrinage a Mixture of point mass at 0 and Normal,, θ ~ N (0, γ τ a a, γ a ~ Bernoulli( p ) ) p and τ : smoothing parameters 4-Aug-05 Jeffrey S. Morris 19
Method: Estimating g a (t) If VC nown closed form posterior ( θ ˆ Θ, a, NS h(.), a σ d, σ µ = = 1,, σ ˆ Θ, a, NS Var = MLE of θ a, ( ) Θ ˆ,, a, NS Normal( µ, σ ( ), τ, p, Θˆ a, NS assuming no shrinage 4-Aug-05 Jeffrey S. Morris 0 h h ( ), τ, p, Θˆ a, NS = shrinage function Using inverse wavelet transform, we get entire posterior distribution of g a (t). ) ~ )
Method: Estimating θ, a VC Unnown -- no closed form posterior Metropolis-within-Gibbs Sampler used on marginalized model. Sampling scheme: (1) Sample from Let and {, Θ, = 1, K, K } 4-Aug-05 Jeffrey S. Morris 1 Ω Θ () Sample from Θ a = { σ = 1, a, σ, ( ), Θa d, Ω for each, ( Ω d, ) for each a }
Method: Estimating Variability Relative variability at different hierarchical levels can help guide design of future experiments Assessed using the trace: { } { } { tr Σ, tr Σ, and tr σ I} Orthogonality Compute trace in wavelet space V 0 V 1 V = = = tr tr tr { } σ I = nσ e { Σ1} = { Σ} = K σ K σ e 1,, 4-Aug-05 Jeffrey S. Morris 1 Relative variability at level i: V /( V V ) i + + 0 1 V e
Choice of Smoothing Parameters Shrinage Function: Pr Shrinage BF a BF + 1 p 1 Z T 1+ T exp 1 p 44314 444 4444 T + 1 3 (,, γ = 1 ) a d =, BF = Bayes Factor ( ) { BF = Bayes Factor Z h( Z, T, p ) Prior Odds ˆ, / Var( ˆ = θ θ, a, NS a, NS ) T = Pr γ a a T 14444 3 1 + 443 Nonlinear Lielihood Ratio {, = 1 d, } 1 Shrinage Linear, a, NS 4-Aug-05 Jeffrey S. Morris 3 T / Var( ˆ = τ θ )
Choice of Smoothing Parameters p : Expected proportion of non-negligible wavelet coefficients at resolution p Belief that many coeffs. are essentially noise. Results in more smoothing of features at scale Generally -- expect p to decrease in Higher frequency behavior mostly noise 4-Aug-05 Jeffrey S. Morris 4
Choice of Smoothing Parameters τ : Variation of nonzero wavelet coeffs. ( ), = τ / Var ˆ a, NS T Θ Must tae care in choosing Too small too much linear shrinage Too large Lindley s paradox Too much nonlinear shrinage for some Z T (10,100) seem to wor quite well. 4-Aug-05 Jeffrey S. Morris 5
Choice of Smoothing Parameters (a) Shrinage functions for p= 0.5 (b) Shrinage functions for p= 0.1 h(z,p,t) T= T= 5 T= 0 T= 100 T= 1000 h(z,p,t) T= T= 5 T= 0 T= 100 T= 1000-4 - 0 4-4 - 0 4 Z Z 4-Aug-05 Jeffrey S. Morris 6
Example: Data MGMT expression (DNA repair enzyme) diets (fish/corn) x 5 times (0h,3h,6h,9h,1h) 3 rats per diet x time combination ~ 5 crypts per rat, function sampled at 56= 8 equally spaced t Daubechies-8 wavelet chosen p={1,.0,.07,.03,.015,.01,0.005}, T = 0 MCMC 5000 iterations; Burn-in=5000 4-Aug-05 Jeffrey S. Morris 7
Example: Results Estimates & 90% posterior bounds by diet/time Fish Oil MGMT level 0 5 10 15 0 5 30 Fish Oil 0 hr Fish Oil 3 hr Fish Oil 6 hr Fish Oil 9 hr Fish Oil 1 hr 0 h 3 h 6 h 9 h 1 h MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 Relative Cell Position Relative Cell Position Relative Cell Position Relative Cell Position Relative Cell Position Corn Oil 0 hr Corn Oil 3 hr Corn Oil 6 hr Corn Oil 9 hr Corn Oil 1 hr Corn Oil MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 MGMT level 0 5 10 15 0 5 30 Relative Cell Position Relative Cell Position Relative Cell Position Relative Cell Position Relative Cell Position 4-Aug-05 Jeffrey S. Morris 8
Conclusion: Biological Results More MGMT at top of crypts Diet difference at 1 h near luminal surface (t ~ 1) Fish oil-fed rats -- MGMT from 9 h to 1 h Corn oil-fed rats -- MGMT from 9h to 1 h Fish MGMT > Corn MGMT Crypt-level dominates variability 90.0% at crypt level, 9.6% at rat level, 0.4% residual Need lots of crypts!! MGMT operates on cell-by-cell basis 4-Aug-05 Jeffrey S. Morris 9
Conclusion: Summary Method to fit hierarchical functional data nonparametrically estimate mean functions w/ bounds evaluate relative variability at hierarchical levels perform Bayesian inference can handle 1 or hierarchical levels & unbalanced designs appropriate when functions at base level irregular Fitting done in wavelet space Whitening property allows parsimonious modeling of Σ i Can represent spatially heterogeneous functions well Motivated by colon carc., but applicable elsewhere Possible extensions of method 4-Aug-05 Jeffrey S. Morris 30