Parameter Estimation in the Spatio-Temporal Mixed Effects Model: Analysis of Massive Spatio-Temporal Data Sets

Matthias Katzfuß (OSU Statistics)
Advisor: Dr. Noel Cressie
Department of Statistics, The Ohio State University
September 17, 2010

Outline
1 Introduction: The STME Model
2 Parameter Estimation
  EM Estimation
  Bayesian Estimation
3 Application: Analysis of CO2 Data
4 Conclusions


Notation
- Hidden spatio-temporal process $y_t(s)$ at time $t$ and location $s$
- Measurements: $z_t(s_{i,t}) = y_t(s_{i,t}) + \epsilon_t(s_{i,t})$, $i = 1, \ldots, n_t$, $t = 1, \ldots, T$
- In vector notation: $z_{1:T} := [z_1', \ldots, z_T']'$, where $z_t := [z_t(s_{1,t}), \ldots, z_t(s_{n_t,t})]'$
- Goal: Predict $y_t(s_0)$; $t \in \{1, \ldots, T\}$

Motivating Example: Remote-Sensing Data
Example: Global satellite measurements of CO2
Challenges of global remote-sensing data:
- Massiveness: need dimension reduction
- Sparseness: need to take advantage of spatial and temporal correlations
- Nonstationarity: need a flexible model
[Figure: maps of the CO2 measurements on Day 1, Day 2, and Day 3; color scale from 350 to 400]

Spatio-Temporal Mixed Effects Model (Cressie et al., 2010)
Process model: $y_t(s) = x(s)'\beta_t + b(s)'\eta_t + \gamma_t(s)$
- $x(s)'\beta_t$: large-scale trend
- $b(s) := [b_1(s), \ldots, b_r(s)]'$: vector of known spatial basis functions
- $\eta_t = H\eta_{t-1} + \delta_t$; $t = 1, 2, \ldots$, with $\eta_0 \sim N_r(0, K_0)$ and $\delta_t \sim N_r(0, U)$
- $\gamma_t(s) \sim N(0, \sigma^2_\gamma v_\gamma(s))$: fine-scale variation
Unknown parameters: $\theta := \{\{\beta_t\}, \sigma^2_\gamma, K_0, H, U\}$
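To make the process model concrete, here is a minimal simulation sketch in Python with NumPy. It is an illustration under stated assumptions, not code from the talk: the function name `simulate_stme`, the Gaussian-bump basis, and all numerical values are hypothetical, and the fine-scale term is assumed independent across locations.

```python
import numpy as np

def simulate_stme(X, B, betas, H, K0, U, sigma2_gamma, v_gamma, rng):
    """Simulate y_1, ..., y_T at n locations from the STME process model.

    X: (n, p) trend covariates, B: (n, r) basis-function matrix,
    betas: (T, p) trend coefficients, v_gamma: (n,) known variance weights.
    """
    T = betas.shape[0]
    n, r = B.shape
    eta = rng.multivariate_normal(np.zeros(r), K0)        # eta_0 ~ N_r(0, K0)
    Y = np.empty((T, n))
    for t in range(T):
        delta = rng.multivariate_normal(np.zeros(r), U)   # delta_t ~ N_r(0, U)
        eta = H @ eta + delta                             # eta_t = H eta_{t-1} + delta_t
        # Fine-scale variation; independence across locations is an assumption.
        gamma = rng.normal(0.0, np.sqrt(sigma2_gamma * v_gamma))
        Y[t] = X @ betas[t] + B @ eta + gamma
    return Y

rng = np.random.default_rng(0)
n, r, T = 50, 3, 4
s = np.linspace(-1.0, 1.0, n)                             # one-dimensional locations
X = np.column_stack([np.ones(n), s])                      # intercept plus one covariate
centers = np.array([-0.5, 0.0, 0.5])
B = np.exp(-((s[:, None] - centers[None, :]) / 0.4) ** 2) # illustrative basis functions
betas = np.tile([380.0, 1.0], (T, 1))
Y = simulate_stme(X, B, betas, 0.8 * np.eye(r), np.eye(r),
                  0.2 * np.eye(r), 0.1, np.ones(n), rng)
```

Each row of `Y` is one time point. The temporal dependence enters only through the $r$-dimensional vector $\eta_t$, which is what makes the model a dimension-reduction approach.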

Previous Approaches to Massive S-T Data Sets
- Many ad hoc methods used outside the statistics literature (non-optimal, no measures of uncertainty)
- Other statistical spatio-temporal dimension-reduction models are less general (e.g., Nychka et al., 2002)
- STME model: parameter estimation via binned method of moments (Kang et al., 2010):
  - Many arbitrary choices have to be made
  - Estimates have to be modified to be valid
  - Does not fully exploit temporal dependence in the data



Maximum-Likelihood Estimation
Goal: Find $\hat{\theta}_{ML} = \arg\max_{\theta} f(z_{1:T} \mid \theta)$, where recall $z_t = X_t\beta_t + B_t\eta_t + \gamma_t + \epsilon_t$
Problem: The likelihood $f(z_{1:T} \mid \theta)$ is quite complicated
Solution: Expectation-maximization (EM) algorithm (Dempster et al., 1977)
- Maximization: The complete-data likelihood $f(\eta_{1:T}, \gamma_{1:T} \mid \theta)$ is easy to maximize
- Expectation: $E_{\theta}\big( f(\eta_{1:T}, \gamma_{1:T} \mid \theta) \mid z_{1:T} \big)$ is obtained via FRS (fixed rank smoothing), a rapid sequential updating technique based on the Kalman filter (Kalman, 1960)
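The following sketch illustrates why a Kalman-type update can be rapid when the state $\eta_t$ has low dimension $r$. It is not the FRS algorithm itself. It assumes the trend has been subtracted and that the combined fine-scale and measurement noise has a diagonal covariance `diag(d)`; the function names are hypothetical. The low-rank form inverts only $r \times r$ matrices and is checked against the textbook update, which inverts an $n \times n$ matrix.

```python
import numpy as np

def update_low_rank(m, P, B, d, z):
    """Condition eta ~ N(m, P) on z = B eta + noise, noise ~ N(0, diag(d)).

    Only r x r matrices are inverted (r = number of basis functions),
    so the cost grows linearly in the number of observations n.
    """
    Bd = B / d[:, None]                                   # D^{-1} B, shape (n, r)
    P_post = np.linalg.inv(np.linalg.inv(P) + B.T @ Bd)   # (P^{-1} + B' D^{-1} B)^{-1}
    m_post = m + P_post @ (Bd.T @ (z - B @ m))
    return m_post, P_post

def update_dense(m, P, B, d, z):
    """Textbook Kalman update; inverts an n x n matrix (for comparison only)."""
    S = B @ P @ B.T + np.diag(d)
    K = P @ B.T @ np.linalg.inv(S)
    return m + K @ (z - B @ m), P - K @ B @ P

rng = np.random.default_rng(0)
n, r = 300, 4
B = rng.normal(size=(n, r))          # made-up basis-function matrix
d = rng.uniform(0.5, 1.5, size=n)    # made-up noise variances
A = rng.normal(size=(r, r))
P = A @ A.T + np.eye(r)              # a valid prior covariance
m = rng.normal(size=r)
z = rng.normal(size=n)

m_fast, P_fast = update_low_rank(m, P, B, d, z)
m_dense, P_dense = update_dense(m, P, B, d, z)
```

Both functions return the same posterior mean and covariance up to rounding error; the equivalence follows from the Sherman-Morrison-Woodbury identity.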

EM Estimation (Katzfuss & Cressie, 2010)
The EM algorithm:
Choose initial value $\theta^{[0]}$
For $l = 0, 1, 2, \ldots$ (until convergence):
1. E-Step: Run FRS with $\theta^{[l]}$ to obtain $E_{\theta^{[l]}}\big( f(\eta_{1:T}, \gamma_{1:T} \mid \theta) \mid z_{1:T} \big)$
2. M-Step: $\theta^{[l+1]} = \arg\max_{\theta} E_{\theta^{[l]}}\big( f(\eta_{1:T}, \gamma_{1:T} \mid \theta) \mid z_{1:T} \big)$
3. Go back to 1.
Properties of the resulting estimates:
- Parameter estimates guaranteed to be valid
- Here, convergence to a (possibly local) maximum of the likelihood function
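The loop above can be sketched on a toy linear Gaussian state-space model. This is a simplified stand-in, not the estimation procedure of Katzfuss & Cressie (2010): it assumes no trend and no fine-scale term, a known observation matrix `B` and noise variance `sigma2`, a fixed `K0`, and it updates only $H$ and $U$. The E-step uses a standard Kalman filter with a Rauch-Tung-Striebel smoother in place of FRS. All names and values are hypothetical.

```python
import numpy as np

def e_step(z, B, sigma2, H, U, K0):
    """Kalman filter + RTS smoother: smoothed moments and log-likelihood."""
    T, n = z.shape
    r = H.shape[0]
    m_f = np.zeros((T + 1, r)); P_f = np.zeros((T + 1, r, r))   # filtered
    m_p = np.zeros((T + 1, r)); P_p = np.zeros((T + 1, r, r))   # predicted
    P_f[0] = K0
    loglik = 0.0
    for t in range(1, T + 1):
        m_p[t] = H @ m_f[t - 1]
        P_p[t] = H @ P_f[t - 1] @ H.T + U
        S = B @ P_p[t] @ B.T + sigma2 * np.eye(n)
        resid = z[t - 1] - B @ m_p[t]
        K = np.linalg.solve(S, B @ P_p[t]).T                     # P_p B' S^{-1}
        m_f[t] = m_p[t] + K @ resid
        P_f[t] = P_p[t] - K @ B @ P_p[t]
        _, logdet = np.linalg.slogdet(S)
        loglik -= 0.5 * (logdet + resid @ np.linalg.solve(S, resid)
                         + n * np.log(2 * np.pi))
    m_s, P_s = m_f.copy(), P_f.copy()                            # smoothed
    C = np.zeros((T + 1, r, r))        # C[t] = Cov(eta_t, eta_{t-1} | z_{1:T})
    for t in range(T - 1, -1, -1):
        J = np.linalg.solve(P_p[t + 1], H @ P_f[t]).T            # P_f H' P_p^{-1}
        m_s[t] = m_f[t] + J @ (m_s[t + 1] - m_p[t + 1])
        P_s[t] = P_f[t] + J @ (P_s[t + 1] - P_p[t + 1]) @ J.T
        C[t + 1] = P_s[t + 1] @ J.T
    return m_s, P_s, C, loglik

def m_step(m_s, P_s, C):
    """Closed-form maximizers for H and U given the smoothed moments."""
    T = m_s.shape[0] - 1
    E = P_s + m_s[:, :, None] * m_s[:, None, :]                  # E[eta_t eta_t' | z]
    S11, S00 = E[1:].sum(axis=0), E[:-1].sum(axis=0)
    S10 = (C[1:] + m_s[1:, :, None] * m_s[:-1, None, :]).sum(axis=0)
    H = np.linalg.solve(S00, S10.T).T                            # S10 S00^{-1}
    U = (S11 - H @ S10.T) / T
    return H, 0.5 * (U + U.T)                                    # symmetrize U

# Simulate toy data (all values are made up for illustration).
rng = np.random.default_rng(1)
r, n, T = 2, 10, 60
B = rng.normal(size=(n, r))
H_true = np.array([[0.8, 0.1], [0.0, 0.6]])
U_true, K0, sigma2 = 0.3 * np.eye(r), np.eye(r), 0.5
eta = rng.multivariate_normal(np.zeros(r), K0)
z = np.empty((T, n))
for t in range(T):
    eta = H_true @ eta + rng.multivariate_normal(np.zeros(r), U_true)
    z[t] = B @ eta + rng.normal(0.0, np.sqrt(sigma2), size=n)

H, U = 0.5 * np.eye(r), np.eye(r)                                # theta^[0]
logliks = []
for l in range(20):
    m_s, P_s, C, ll = e_step(z, B, sigma2, H, U, K0)             # E-step
    logliks.append(ll)
    H, U = m_step(m_s, P_s, C)                                   # M-step
```

The two properties listed on the slide can be checked on this toy example: the log-likelihood values in `logliks` should not decrease from one iteration to the next, and the estimate `U` remains a valid (positive-definite) covariance matrix without any post hoc modification.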


Bayesian Inference
- Parameters $\theta$ have a prior distribution
- Obtain the posterior distribution of the unknowns $y_t(s_0)$ and $\theta$ given the data $z_{1:T}$ using Bayes' theorem
- In almost all cases, the posterior has to be approximated by sampling from it
- "Shrinkage": biased, but more efficient estimators

Priors and Posteriors
Prior distributions:
- Standard priors on $\{\beta_t\}$ and $\sigma^2_\gamma$
- Covariance matrices $K_0$ and $U$: multiresolutional Givens-angle prior (Kang & Cressie, 2009)
  - Control extreme eigenvalues
  - Shrink off-diagonal elements toward zero
- Propagator matrix $H$: shrink off-diagonal elements depending on how far apart the corresponding basis functions are
Posterior distribution:
- Samples from the posterior distribution obtained using MCMC
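As a rough illustration of the idea behind a Givens-angle parameterization, the sketch below builds a covariance matrix from a set of eigenvalues and rotation angles. It shows only the parameterization, not the multiresolutional prior of Kang & Cressie (2009); the function names and the ordering of the rotations are assumptions. Priors placed on the eigenvalues control their extremes, and angles near zero give a nearly diagonal matrix.

```python
import numpy as np

def givens_rotation(r, i, j, angle):
    """r x r rotation by `angle` in the (i, j) coordinate plane."""
    G = np.eye(r)
    c, s = np.cos(angle), np.sin(angle)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

def covariance_from_givens(eigenvalues, angles):
    """Build K = P diag(eigenvalues) P' with P a product of Givens rotations.

    Needs r (r - 1) / 2 angles, one per pair (i, j) with i < j.
    """
    r = len(eigenvalues)
    P = np.eye(r)
    k = 0
    for i in range(r - 1):
        for j in range(i + 1, r):
            P = P @ givens_rotation(r, i, j, angles[k])
            k += 1
    return P @ np.diag(eigenvalues) @ P.T

eig = np.array([4.0, 2.0, 1.0])
K_diag = covariance_from_givens(eig, np.zeros(3))                 # all angles zero
K_full = covariance_from_givens(eig, np.array([0.3, -0.2, 0.5]))  # made-up angles
```

With all angles equal to zero the result is exactly diagonal, and for any angles the matrix is symmetric with the prescribed positive eigenvalues, so it is always a valid covariance matrix.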


The Data
Mid-tropospheric CO2 on May 1-4, 2003, as measured by AIRS ($n_t \approx 14\text{K}$)
[Figure: maps of the AIRS CO2 measurements on Day 1, Day 2, Day 3, and Day 4; color scale from 350 to 400]

Statistical Analysis
- Trend: $x(s) = [1, \mathrm{lat}(s)]'$
- Make predictions on a hexagonal grid of size 57,065 for each day
- Basis functions: $r = 380$ bisquare functions at 3 spatial resolutions
[Figure: Bisquare function in one dimension; $b(s)$ (from 0 to 1) plotted against $s$ (from -1 to 1) for Res 1, Res 2, and Res 3]
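A one-dimensional bisquare basis at several resolutions, as in the figure, could be sketched as follows. The functional form $(1 - (|s - c|/\rho)^2)^2$ for $|s - c| \le \rho$ (and zero otherwise) is the commonly used bisquare; the centers and radii below are hypothetical and are not the ones used in the analysis.

```python
import numpy as np

def bisquare(s, center, radius):
    """Bisquare function: (1 - (|s - c| / radius)^2)^2 inside the radius, else 0."""
    d = np.abs(np.asarray(s, dtype=float) - center) / radius
    return np.where(d <= 1.0, (1.0 - d ** 2) ** 2, 0.0)

# Three illustrative resolutions: fewer, wider functions first.
s = np.linspace(-1.0, 1.0, 201)
resolutions = [
    (np.array([0.0]), 1.0),              # Res 1: one center, radius 1
    (np.array([-0.5, 0.5]), 0.5),        # Res 2: two centers, radius 0.5
    (np.linspace(-0.75, 0.75, 4), 0.25), # Res 3: four centers, radius 0.25
]
# Column k of B holds the k-th basis function evaluated at all locations s.
B = np.column_stack([bisquare(s, c, rad)
                     for centers, rad in resolutions
                     for c in centers])
```

Here `B` has one row per location and one column per basis function (7 in this toy setup, compared with $r = 380$ in the analysis). Each function has compact support, so `B` contains many exact zeros at the finer resolutions.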

EM Results
[Figures: Predictions using EM; Standard errors using EM]
EM computation time: 16 iterations × one minute each = 16 min total

Bayesian Results
[Figures: Posterior means; Posterior standard deviations]
1,500 MCMC iterations × 15 seconds each = 6.25 hours total

Estimates of the Propagator Matrix H
[Figure: image plots of the EM estimate and the Bayesian estimate of $H$; row and column indices from 50 to 350, color scale from -1 to 1]


Conclusions
STME model:
- Scalable and flexible technique for analysis of massive, nonstationary spatio-temporal data sets
- Provides uncertainty quantification
- Here, successful use on CO2 satellite data
Parameter estimation:
- EM estimation: fast, easy
- Bayesian estimation: better prediction (10% for AIRS data), more accurate uncertainty assessment

References
Cressie, N., Shi, T., & Kang, E. L. (2010). Fixed rank filtering for spatio-temporal data. Journal of Computational and Graphical Statistics. Forthcoming.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1), 1-38.
Kalman, R. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35-45.
Kang, E. L., & Cressie, N. (2009). Bayesian inference for the spatial random effects model. Department of Statistics Technical Report No. 830. The Ohio State University.
Kang, E. L., Cressie, N., & Shi, T. (2010). Using temporal variability to improve spatial mapping with application to satellite data. Canadian Journal of Statistics. Forthcoming.
Katzfuss, M., & Cressie, N. (2010). Spatio-temporal smoothing and EM estimation for massive remote-sensing data sets. Department of Statistics Technical Report No. 840. The Ohio State University.
Nychka, D. W., Wikle, C., & Royle, J. (2002). Multiresolution models for nonstationary spatial covariance functions. Statistical Modelling, 2, 315-331.