Multi-resolution models for large data sets

Douglas Nychka, National Center for Atmospheric Research
National Science Foundation
NORDSTAT, Umeå, June 2012

Credits

Steve Sain, NCAR
Tia LeRud, UC Davis
Dorit Hammerling, U Michigan
Soutir Bandyopadhyay, Lehigh
Finn Lindgren, NTNU, Norway

D. Nychka LatticeKrig 2

Outline

A climate data set
Kriging...
Compact basis functions (Φ), Markov random fields (H)
The multi-resolution model
Properties of the spatial process
A climate example

Key idea: Introduce sparse basis and precision matrices without compromising the spatial model.

Observed mean summer precipitation

1720 stations reporting, mean for 1950-2010.

[Figure: observed JJA precipitation (0.1 mm), color scale from 1000 to 7000.]

Kriging or Gaussian spatial process estimates

Estimating a curve or surface

An additive statistical model: given n pairs of observations $(x_i, y_i)$, $i = 1, \dots, n$,

$$y_i = g(x_i) + \epsilon_i$$

The $\epsilon_i$ are random errors and g is an unknown, smooth realization of a Gaussian process.

The goals:
Estimate g(x) based on the observations.
Quantify the uncertainty in the estimate.

Random effects / linear model for g

$\{\Phi_j\}$: m basis functions, with

$$g(x) = \sum_j \Phi_j(x)\, c_j$$

A linear model: $y = \Phi c + \epsilon$
Random effects: $c \sim MN(0, \rho P)$ and $\epsilon \sim MN(0, \sigma^2 I)$
Implied covariance:

$$E[g(x)\, g(x')] = \sum_{j,k} \Phi_j(x)\, \rho P_{j,k}\, \Phi_k(x')$$

$\lambda = \sigma^2/\rho$ plays an important role as a parameter.

Key ideas for large data sets

The inverse of P is chosen to be sparse.
The basis functions have compact support.
Still have a useful spatial model!

Why this works

Find c by ridge regression / conditional expectation / BLUE:

$$\hat g(x) = E[g(x) \mid y, P] = \sum_{k=1}^{m} \hat c_k\, \Phi_k(x)$$

$$\hat c = \left(\Phi^T \Phi + \lambda P^{-1}\right)^{-1} \Phi^T y, \qquad \lambda = \sigma^2/\rho$$

$\Phi^T$, $\Phi^T \Phi$, and $P^{-1}$ are sparse.
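The estimate of the coefficients can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the slides: the 1-d toy data, the ten basis centers, the tridiagonal matrix `H` (a 1-d nearest-neighbor analogue used to get a sparse $P^{-1} = H^T H$), the value of `lam`, and the dense solver, which stands in for the sparse methods a real implementation would use.

```python
import math

def bump(d):
    """Compactly supported bump of the Wendland form, zero for |d| >= 1."""
    d = abs(d)
    return (1 - d) ** 6 * (35 * d * d + 18 * d + 3) / 3 if d < 1 else 0.0

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (dense, small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ridge_coefficients(Phi, Q, y, lam):
    """c_hat = (Phi^T Phi + lam * Q)^{-1} Phi^T y, where Q plays the role of P^{-1}."""
    m = len(Q)
    A = [[sum(row[j] * row[k] for row in Phi) + lam * Q[j][k] for k in range(m)]
         for j in range(m)]
    b = [sum(row[j] * yi for row, yi in zip(Phi, y)) for j in range(m)]
    return solve(A, b)

# Toy 1-d problem (hypothetical data): 50 observations, 10 basis functions.
m = 10
centers = [j / (m - 1) for j in range(m)]
theta = 2.5 / (m - 1)                      # scale chosen to give some overlap
xs = [i / 49 for i in range(50)]
y = [math.sin(2 * math.pi * x) for x in xs]
Phi = [[bump((x - u) / theta) for u in centers] for x in xs]

# Assumed 1-d nearest-neighbor H; Q = H^T H is banded, hence sparse.
kappa = 0.5
H = [[2 + kappa ** 2 if j == k else (-1.0 if abs(j - k) == 1 else 0.0)
      for k in range(m)] for j in range(m)]
Q = [[sum(H[r][j] * H[r][k] for r in range(m)) for k in range(m)] for j in range(m)]

lam = 1e-3
c_hat = ridge_coefficients(Phi, Q, y, lam)
g_hat = [sum(p * c for p, c in zip(row, c_hat)) for row in Phi]  # fitted values
```

Because each basis function is nonzero only near its center and `Q` is banded, most entries of the matrix being solved are exactly zero, which is the property the slide relies on for large data sets.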

Choosing the basis and P

A recipe for radial basis functions

Basis function: $\Phi_j(x) = \varphi(\|x - u_j\| / \theta)$
$\varphi$ is a positive definite, compactly supported function: a nice bump.
$\{u_j\}$: basis centers on a regular grid.
$\theta$: scale, set to provide some overlap.

[Figure: 2-d Wendland bump function, standard Wendland (order = 2), radial 2-d function.]
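A minimal sketch of such a bump follows. The slide names a standard order-2 Wendland function but does not print its formula; the polynomial below is the usual order-2 Wendland form for dimensions up to three, scaled so that the value at zero is 1, and is an assumption about what the slide intends. The function names are hypothetical.

```python
import math

def wendland(d):
    """Order-2 Wendland bump: (1 - d)^6 (35 d^2 + 18 d + 3) / 3 for 0 <= d < 1, else 0."""
    d = abs(d)
    if d >= 1.0:
        return 0.0
    return (1.0 - d) ** 6 * (35.0 * d * d + 18.0 * d + 3.0) / 3.0

def basis(x, center, theta):
    """Radial basis function phi(||x - u_j|| / theta) for 2-d points given as (x1, x2) pairs."""
    dist = math.hypot(x[0] - center[0], x[1] - center[1])
    return wendland(dist / theta)

# The basis function equals 1 at its center and is exactly 0 beyond distance theta.
at_center = basis((0.3, 0.3), (0.3, 0.3), theta=0.25)
outside = basis((0.3, 0.6), (0.3, 0.3), theta=0.25)
```

The exact zero outside the support is what makes the matrix of basis function values sparse.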

A recipe for $P^{-1}$

Markov random field among coefficients: c is a spatial AR(1),

$$(4 + \kappa^2)\, c_j - \sum_{l \in N} c_l = e_j$$

$\{e_j\}$ are uncorrelated N(0,1) and N is the set of 4 nearest neighbors.

$Hc \sim MN(0, I)$, or $c \sim MN(0, (H^T H)^{-1})$, i.e. $P = (H^T H)^{-1}$.

Weights in lattice format:

    .    .       .       .    .
    .    .      -1       .    .
    .   -1   (4 + κ²)   -1    .
    .    .      -1       .    .
    .    .       .       .    .
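The lattice weights can be turned into a sparse matrix with a short sketch. The dictionary-of-dictionaries storage, the function names, and the treatment of boundary nodes (they simply have fewer neighbors) are assumptions for illustration; the slide does not say how edges are handled.

```python
def lattice_H(nrow, ncol, kappa):
    """Sparse H as {row: {col: value}}: 4 + kappa^2 on the diagonal, -1 for each nearest neighbor."""
    H = {}
    for i in range(nrow):
        for j in range(ncol):
            k = i * ncol + j
            row = {k: 4.0 + kappa ** 2}
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < nrow and 0 <= nj < ncol:   # assumed edge rule: drop missing neighbors
                    row[ni * ncol + nj] = -1.0
            H[k] = row
    return H

def precision(H):
    """Q = H^T H, accumulated row by row and stored in the same sparse format."""
    Q = {}
    for row in H.values():
        for a, va in row.items():
            Qa = Q.setdefault(a, {})
            for b, vb in row.items():
                Qa[b] = Qa.get(b, 0.0) + va * vb
    return Q

H = lattice_H(5, 5, kappa=0.5)
Q = precision(H)
center = 2 * 5 + 2                       # index of the middle node of the 5 x 5 lattice
nonzeros_in_row = len(Q[center])         # 13 of 25 possible entries
```

Each row of Q touches only nodes within two lattice steps, so the number of nonzeros per row stays fixed as the lattice grows.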

What about SPDEs?

An alternative way to define $Q = P^{-1}$ is

$$Q_{j,k} = \int L\Phi_j \; L\Phi_k \, dx$$

where L is the differential operator associated with the SPDE.

Realizations of c

Simulated fields on a 101 × 101 lattice.

[Figure: simulated fields for κ = 0.5, κ = 0.1, and κ = 0.01.]

Note: κ acts as a range parameter.

Combining basis and coefficients

$$g(x) = \sum_j \Phi_j(x)\, c_j$$

[Figure: coefficient field and evaluated surface.]

Normalized to give a constant marginal variance (taking $\rho = 1$):

$$E[g(x)^2] = \sum_{j,k} \Phi_j(x)\, P_{j,k}\, \Phi_k(x) = 1$$

Overlap set to 2.5 units of lattice.

A convolution interpretation

$$g(x) = \sum_j \Phi_j(x)\, c_j = \sum_j \varphi\big((x - u_j)/\theta\big)\, c(u_j) = \frac{1}{m} \sum_j \varphi\big((x - u_j)/\theta\big)\, m\, c(u_j) \approx \int \varphi\big((x - u)/\theta\big)\, c(u)\, du$$

Note: this only makes sense if the centers are on a uniform grid.

Generalizing to a multi-resolution basis

A 1-d multi-res cartoon...

First level: 8 basis functions
Second level: 16 basis functions
More levels: increasing by a factor of 2...

[Figure: basis functions at each level, plotted over the interval 0 to 8.]

Example of multi-resolution in 2d

An example on the unit square starting with an 11 × 11 grid:

First level: centers on an 11 × 11 grid, scale of 2.5/10 = 0.25, $11^2 = 121$ basis functions.
Second level: centers on a 21 × 21 grid, scale of 2.5/20 = 0.125, $21^2 = 441$ basis functions.

A four-level multi-resolution for this case has 8804 basis functions.
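The count of 8804 can be checked with a few lines. The slide only lists the first two levels; the sketch below assumes the grid spacing keeps halving, so the third and fourth levels are 41 × 41 and 81 × 81, which reproduces the stated total. The function name is hypothetical.

```python
def level_sizes(n_first, n_levels):
    """Grid points per side at each level when the spacing halves: n -> 2 * (n - 1) + 1."""
    sizes = [n_first]
    for _ in range(n_levels - 1):
        sizes.append(2 * (sizes[-1] - 1) + 1)
    return sizes

sizes = level_sizes(11, 4)                 # points per side: 11, 21, 41, 81
counts = [n * n for n in sizes]            # basis functions per level
scales = [2.5 / (n - 1) for n in sizes]    # overlap of 2.5 lattice units on the unit square
total = sum(counts)                        # 121 + 441 + 1681 + 6561 = 8804
```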

Some assumptions

The multiresolution:

$$g(x) = g_1(x) + g_2(x) + \dots + g_L(x) = \sum_j \Phi^1_j(x)\, c^1_j + \sum_j \Phi^2_j(x)\, c^2_j + \dots + \sum_j \Phi^L_j(x)\, c^L_j \qquad (1)$$

Coefficients at each level follow a Markov random field.
Coefficients between levels are independent.
At least two parameters, $(\kappa_l, \rho_l)$, at each level.

Correlation functions for each level

[Figure: correlation against distance, and log(1 − correlation) against distance, for each level.]

Properties of spatial process

Flexibility of LatticeKrig model

Fitting an exponential (minimizing mean squared error).
First level resolution of 10 × 10.

[Figure: correlation against distance for 3 levels, 4 levels, and the target exponential; error against distance for 3 and 4 levels.]

Also works well for approximating smoother covariances.

More flexibility of LatticeKrig model

Fitting a mixture of exponentials.
First level resolution of 10 × 10.
Target: 0.4 Exp(0.1) + 0.6 Exp(3)

[Figure: correlation against distance for 3 levels, 4 levels, and the target; error against distance for 3 and 4 levels.]

Some theory

Switch to an infinite sum of independent convolution processes:

$$g(x) = g_1(x) + g_2(x) + g_3(x) + \dots$$

$g_l(x)$ has marginal variance $\rho_l$ and spatial correlation range $\theta_l$.

What class of covariances can be approximated by letting $\{\rho_l\}$ and $\{\theta_l\}$ vary?

$\theta_l = 2^{-l}$ gives the power of 2 scaling of the multi-resolution.
$\rho_l = 2^{-l/2}$ gives a process with similar smoothness to an exponential (it matches the tail behavior of the spectral density).
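To make the scaling concrete, the sketch below tabulates the two sequences for the first few levels and the share of the total variance carried by the first L levels. It assumes the levels are indexed from l = 1 and that the lost minus signs are restored as $\theta_l = 2^{-l}$ and $\rho_l = 2^{-l/2}$; the variance share is a simple geometric-series calculation added for illustration and is not a statement from the slides about approximation accuracy.

```python
def level_weights(n_levels):
    """Ranges theta_l = 2^(-l) and marginal variances rho_l = 2^(-l/2) for l = 1..n_levels."""
    theta = [2.0 ** (-l) for l in range(1, n_levels + 1)]
    rho = [2.0 ** (-l / 2) for l in range(1, n_levels + 1)]
    return theta, rho

theta, rho = level_weights(6)

# Geometric series: sum over l >= 1 of 2^(-l/2) equals 1 + sqrt(2).
total_variance = 1.0 + 2.0 ** 0.5

# Share of the total variance in the first L levels; equals 1 - 2^(-L/2).
share = [sum(rho[:L]) / total_variance for L in range(1, 7)]
```

With these weights the range halves at every level while the variance shrinks by a factor of $\sqrt{2}$, so finer levels contribute progressively less to the total.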

The details: a theorem

$\theta_l = 2^{-l}$ and $\rho_l = \theta_l^{\beta_1}$.
$c_l(u)$ follows a Matern process with smoothness $\nu = 1$ and range $\theta_l$.
$\varphi$ is a K-th order Wendland function.
$\beta_1 > 0$, $(\beta_1 + 1) < 5 + 2K$.

If S(r) is the spectral density for the process g, then

$$C_1\, r^{-2(\beta_1 + 1)} < S(r) < C_2\, r^{-2(\beta_1 + 1)} \quad \text{as } r \to \infty.$$

Comments:
$\theta_l = 2^{-l}$ gives the power of 2 scaling of the multi-resolution.
$\beta_1 = 1/2$ gives an exponential covariance-like spectral density.
The tail behavior is not directly related to the smoothness of the basis functions or of the lattice process! They just need to be smooth enough.
The approximation is accurate after about 4-6 components.

Benefits of the multi-resolution

A mixture of differently scaled covariance functions can approximate standard covariance families.
A mixture has the flexibility to approximate more complex covariance functions.
For irregularly spaced observational data the distances among stations may vary widely, and a multi-scale covariance model will adjust to these differences.

Back to climate data

Some details

Used log transformation and weighted by number of observations.
Used stereographic projection for locations.
Elevation included as linear fixed effect.
Covariance parameters found by maximum likelihood using space filling designs and partial maximization over ρ and σ.

Estimated correlation functions

Matern MLE
Solid line: 4 level model beginning with 10 × 10 and a single κ
Dotted line: 1 level with 73 × 73 basis functions

In projected coordinates the size of the spatial domain is roughly 1.0 units.

[Figure: correlation against distance, for distances from 0 to 0.30.]

Why these are so different is still uncertain.

Predicted field and uncertainty

For the 4 level covariance model.

[Figure: two panels, mean summer total rainfall (cm) and pred. standard error/mean.]

Standard errors found by conditional simulation of 100 fields.

Summary

Multi-resolution can approximate standard covariance families (e.g. Matern).
Computational efficiency is gained by compact basis functions and a sparse precision matrix.
Flexibility in the model to account for nonstationary spatial dependence.
Transform to preserve nonnegative rainfall and add orographic covariates.
See the LatticeKrig package in R.

Thank you!