A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression data 2. Examples I: Categorical forest health data 3. Structured additive regression models 4. Mixed model based inference 5. Software 6. Examples II: Leukemia survival times 16.12.2004
Spatio-temporal regression data Spatio-temporal regression data Regression in a general sense: Generalized linear models, multicategorical models, survival models. Usual regression data: Some response with categorical and continuous covariates. Spatio-temporal regression data: in addition spatial and temporal information. Special issues: Spatial and spatio-temporal correlations, Time- or space-varying effects, Non-linear effects, Complex interactions, Unobserved heterogeneity. A general mixed model approach for spatio-temporal regression data 1
Example I: Categorical forest health data Example I: Categorical forest health data Yearly forest health inventories carried out from 1983 to 2001. 83 beeches within a 15 km times 10 km area. Response: defoliation degree of beech i in year t, measured in three ordered categories: y it = 1 no defoliation, y it = 2 defoliation 25% or less, y it = 3 defoliation above 25%. Cumulative probit model: P (y it r) = Φ [θ r f 1 (t) f 2 (age it ) f 3 (t, age it ) f spat (s i ) u itγ] with standard normal cdf Φ and thresholds = θ 0 < θ 1 < θ 2 < θ 3 = A general mixed model approach for spatio-temporal regression data 2
Example I: Categorical forest health data 2.5 Time trend. 0-2.5 1983 1989 1995 2001 calendar time 3 0-3 Age effect. -6 7 39 71 103 135 167 199 231 age in years A general mixed model approach for spatio-temporal regression data 3
-1.5-1 -0.5 0 0.5 1 1.5 Thomas Kneib Example I: Categorical forest health data Spatial effect. -2.5 0 2.5 200 Interaction effect. 150 100 age of the tree 50 1985 1990 1995 2000 calendar time A general mixed model approach for spatio-temporal regression data 4
Structured additive regression Structured additive regression General Idea: Replace usual parametric predictor with a flexible semiparametric predictor containing Nonparametric effects of time scales and continuous covariates (P-splines, random walks), Spatial effects (Markov random fields, stationary Gaussian random fields), Interaction surfaces (2d P-splines), Varying coefficient terms (continuous and spatial effect modifiers), Random intercepts and random slopes. In a Bayesian context, all effects can be cast into one general framework. A general mixed model approach for spatio-temporal regression data 5
Mixed model based inference Mixed model based inference Each term in the predictor is associated with a vector of regression coefficients with improper multivariate Gaussian prior: p(β j τ 2 j ) exp ( 1 2τ 2 j β jk j β j ) Reparametrize the model to a proper mixed model. Obtain empirical Bayes estimates via iterating Penalized maximum likelihood for regression coefficients. Restricted Maximum / Marginal likelihood for variance parameters. A general mixed model approach for spatio-temporal regression data 6
Software Software Implemented in the public domain software package BayesX. Allows for spatio-temporal regression in the context of Univariate responses from exponential families, Multicategorical responses with ordered and unordered categories, Continuous time survival analysis. Available from http://www.stat.uni-muenchen.de/~lang/bayesx A general mixed model approach for spatio-temporal regression data 7
Examples II: Leukemia survival times Examples II: Leukemia survival times Survival time of adults after diagnosis of acute myeloid leukemia. 1,043 cases diagnosed between 1982 and 1998 in Northwest England. Spatial information in different resolutions. Model: λ(t; ) = λ 0 (t) exp[f 1 (age) + f 2 (wbc) + f 3 (tpi) + f spat (s) + γsex] A general mixed model approach for spatio-temporal regression data 8
Examples II: Leukemia survival times 4 log(baseline) Log-baseline hazard. 1-2 -5 0 7 14 time in years.5 effect of townsend deprivation index.25 0 -.25 Effect of deprivation. -.5-6 -2 2 6 10 townsend deprivation index A general mixed model approach for spatio-temporal regression data 9
Examples II: Leukemia survival times -0.44 0 0.3 District-level analysis 0.44 0.3 Individual-level analysis A general mixed model approach for spatio-temporal regression data 10
Discussion Discussion Bayesian treatment of complex regression models for spatio-temporal data without relying on MCMC simulation techniques. Closely related to penalized likelihood in a frequentist setting. Future work: Extended modelling for categorical responses, e.g. with correlated latent utilities. More general censoring schemes for survival times, e.g. interval censoring. Anisotropic spatial effects. 3d extensions of P-splines. A general mixed model approach for spatio-temporal regression data 11
References References Kneib, T. and Fahrmeir, L. (2004): A mixed model approach for structured hazard regression. SFB 386 Discussion Paper 400, University of Munich. Kneib, T. and Fahrmeir, L. (2004): Structured additive regression for categorical space-time data: A mixed model approach. Under revision for Biometrics. Fahrmeir, L., Kneib, T. and Lang, S. (2004): Penalized structured additive regression for space-time data: A Bayesian perspective. Statistica Sinica, 14, 715-745. Available from http://www.stat.uni-muenchen.de/~kneib A general mixed model approach for spatio-temporal regression data 12