Introduction to Spatial Data and Models


Sudipto Banerjee (Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.) and Andrew O. Finley (Department of Forestry & Department of Geography, Michigan State University, East Lansing, Michigan, U.S.A.)
July 9, 2009

Researchers in diverse areas such as climatology, ecology, environmental health, and real estate marketing are increasingly faced with the task of analyzing data that are:
- highly multivariate, with many important predictors and response variables;
- geographically referenced, and often presented as maps; and
- temporally correlated, as in longitudinal or other time series structures.
This motivates hierarchical modeling and data analysis for complex spatial (and spatiotemporal) data sets.

Types of spatial data
- Point-referenced data, where $Y(s)$ is a random vector at a location $s \in \mathbb{R}^r$, and $s$ varies continuously over $D$, a fixed subset of $\mathbb{R}^r$ that contains an $r$-dimensional rectangle of positive volume.
- Areal data, where $D$ is again a fixed subset (of regular or irregular shape), but now partitioned into a finite number of areal units with well-defined boundaries.
- Point pattern data, where $D$ is itself random; its index set gives the locations of the random events that form the spatial point pattern. $Y(s)$ can simply equal 1 for all $s \in D$ (indicating occurrence of the event), or can carry additional covariate information (producing a marked point pattern process).

Exploration of spatial data
First Law of Geography: data = mean + error.
- Mean: first-order behavior.
- Error: second-order behavior (the covariance function).
EDA tools examine both first- and second-order behavior. Preliminary displays range from simple plots of the locations to surface displays (e.g., the scallops sites data).

Deterministic surface interpolation
- A spatial surface is observed at a finite set of locations $S = \{s_1, s_2, \ldots, s_n\}$.
- Tessellate the spatial domain (usually with the data locations as vertices).
- Fit an interpolating polynomial: $f(s) = \sum_i w_i(S; s) f(s_i)$.
- Interpolate at a new location by reading off $f(s_0)$ (a minimal computational sketch follows below).
Issues: sensitivity to the tessellation, the choice of multivariate interpolator, and numerical error analysis.

[Figures: scallops data displays over longitude and latitude: image and contour plots, a drop-line scatter plot, a surface plot of log catch (logsp), an image-contour plot, and a plot of the sampling locations (eastings vs. northings) showing that the locations form patterns.]
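A minimal sketch of tessellation-based interpolation, using SciPy's Delaunay-based linear interpolator; the locations and surface values here are synthetic placeholders, not the scallops data.

```python
# Tessellation-based interpolation of a surface observed at scattered locations.
# griddata(..., method="linear") triangulates the data locations (a Delaunay
# tessellation) and interpolates linearly within each triangle.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(42)
locs = rng.uniform(0, 1, size=(50, 2))            # observed locations s_1,...,s_n (synthetic)
vals = np.sin(3 * locs[:, 0]) + locs[:, 1] ** 2   # observed surface values f(s_i) (synthetic)

s0 = np.array([[0.5, 0.5]])                       # new location s_0
f_s0 = griddata(locs, vals, s0, method="linear")  # read off f(s_0)
print(f_s0)
```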

[Figures: further surface displays, e.g., shrub density plotted against N-S and E-W UTM coordinates, and other plot arrangements.]

Elements of point-level modelling
Point-level modelling refers to the modelling of spatial data collected at locations referenced by coordinates (e.g., lat-long, Easting-Northing). The fundamental concept: the data come from a spatial process $\{Y(s) : s \in D\}$, where $D$ is a fixed subset of Euclidean space.
Example: $Y(s)$ is the pollutant level at site $s$.
- Conceptually: the pollutant level exists at all possible sites.
- Practically: the data are a partial realization of the spatial process, observed at $\{s_1, \ldots, s_n\}$.
Statistical objectives: inference about the process $Y(s)$, and prediction at new locations.

Stationary Gaussian processes
Suppose our spatial process has a mean $\mu(s) = E(Y(s))$, and that the variance of $Y(s)$ exists for all $s \in D$.
- Strong stationarity: for any given set of sites and any displacement $h$, the distribution of $(Y(s_1), \ldots, Y(s_n))$ is the same as that of $(Y(s_1 + h), \ldots, Y(s_n + h))$.
- Weak stationarity: constant mean $\mu(s) = \mu$, and $\mathrm{Cov}(Y(s), Y(s+h)) = C(h)$: the covariance depends only on the displacement (or separation) vector $h$.
Strong stationarity implies weak stationarity. The process is Gaussian if $Y = (Y(s_1), \ldots, Y(s_n))$ has a multivariate normal distribution for every finite set of sites. For Gaussian processes, strong and weak stationarity are equivalent.

Variograms
Suppose we assume $E[Y(s+h) - Y(s)] = 0$ and define
$$E[Y(s+h) - Y(s)]^2 = \mathrm{Var}(Y(s+h) - Y(s)) = 2\gamma(h).$$
This is sensible if the left-hand side depends only on $h$; in that case we say the process is intrinsically stationary. $\gamma(h)$ is called the semivariogram and $2\gamma(h)$ the variogram. Note that intrinsic stationarity defines only the first and second moments of the differences $Y(s+h) - Y(s)$. It says nothing about the joint distribution of a collection of variables $Y(s_1), \ldots, Y(s_n)$, and thus provides no likelihood.

Intrinsic stationarity and ergodicity
Relationship between $\gamma(h)$ and $C(h)$:
$$2\gamma(h) = \mathrm{Var}(Y(s+h)) + \mathrm{Var}(Y(s)) - 2\,\mathrm{Cov}(Y(s+h), Y(s)) = C(0) + C(0) - 2C(h) = 2[C(0) - C(h)].$$
It is thus easy to recover $\gamma$ from $C$. The converse requires the additional assumption of ergodicity: $\lim_{\|u\| \to \infty} C(u) = 0$. Then $\lim_{\|u\| \to \infty} \gamma(u) = C(0)$, and we can recover $C$ from $\gamma$ as long as this limit exists: $C(h) = \lim_{\|u\| \to \infty} \gamma(u) - \gamma(h)$.
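A quick numerical check of the relation $\gamma(h) = C(0) - C(h)$, using an exponential covariance with no nugget; the parameter values are illustrative only.

```python
# Verify gamma(h) = C(0) - C(h) for C(h) = sigma^2 * exp(-phi * h),
# and that gamma(h) approaches the sill C(0) = sigma^2 as h grows.
import numpy as np

sigma2, phi = 2.0, 1.5                # illustrative values
C = lambda h: sigma2 * np.exp(-phi * h)

h = np.linspace(0.01, 5, 100)
gamma = C(0.0) - C(h)                 # semivariogram implied by the covariance
print(gamma[-1], sigma2)              # nearly equal at large h (ergodic limit)
```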

Isotropy
When $\gamma(h)$ or $C(h)$ depends on the separation vector only through its length $\|h\|$, we say the process is isotropic; we then write $\gamma(\|h\|)$ or $C(\|h\|)$. Otherwise the process is anisotropic. If the process is intrinsically stationary and isotropic, it is also called homogeneous. Isotropic processes are popular because of their simplicity and interpretability, and because a number of relatively simple parametric forms are available as candidates for $C$ (and $\gamma$). Denoting $\|h\|$ by $t$ for notational simplicity, the next two tables provide a few examples.

Some common isotropic variograms (each with $\gamma(0) = 0$):
- Linear: $\gamma(t) = \tau^2 + \sigma^2 t$ for $t > 0$.
- Spherical: $\gamma(t) = \tau^2 + \sigma^2$ for $t \ge 1/\phi$; $\gamma(t) = \tau^2 + \sigma^2\left[\tfrac{3}{2}\phi t - \tfrac{1}{2}(\phi t)^3\right]$ for $0 < t \le 1/\phi$.
- Exponential: $\gamma(t) = \tau^2 + \sigma^2(1 - \exp(-\phi t))$ for $t > 0$.
- Powered exponential: $\gamma(t) = \tau^2 + \sigma^2(1 - \exp(-\phi t^p))$ for $t > 0$.
- Matérn at $\nu = 3/2$: $\gamma(t) = \tau^2 + \sigma^2\left[1 - (1 + \phi t)\,e^{-\phi t}\right]$ for $t > 0$.

Example: the spherical variogram
$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 & t \ge 1/\phi \\ \tau^2 + \sigma^2\left[\tfrac{3}{2}\phi t - \tfrac{1}{2}(\phi t)^3\right] & 0 < t \le 1/\phi \\ 0 & t = 0. \end{cases}$$
While $\gamma(0) = 0$ by definition, $\gamma(0^+) \equiv \lim_{t \to 0^+} \gamma(t) = \tau^2$; this quantity is the nugget. $\lim_{t \to \infty} \gamma(t) = \tau^2 + \sigma^2$; this asymptotic value of the semivariogram is called the sill. (The sill minus the nugget, $\sigma^2$ in this case, is called the partial sill.) Finally, the value $t = 1/\phi$ at which $\gamma(t)$ first reaches its ultimate level (the sill) is called the range, $R \equiv 1/\phi$.
[Figure: example spherical semivariogram with nugget 0.2, partial sill 1, and range R.]

Some common isotropic covariograms:
- Linear: $C(t)$ does not exist.
- Spherical: $C(t) = 0$ for $t \ge 1/\phi$; $C(t) = \sigma^2\left[1 - \tfrac{3}{2}\phi t + \tfrac{1}{2}(\phi t)^3\right]$ for $0 < t \le 1/\phi$; $C(0) = \tau^2 + \sigma^2$.
- Exponential: $C(t) = \sigma^2 \exp(-\phi t)$ for $t > 0$; $C(0) = \tau^2 + \sigma^2$.
- Powered exponential: $C(t) = \sigma^2 \exp(-\phi t^p)$ for $t > 0$; $C(0) = \tau^2 + \sigma^2$.
- Matérn at $\nu = 3/2$: $C(t) = \sigma^2 (1 + \phi t) \exp(-\phi t)$ for $t > 0$; $C(0) = \tau^2 + \sigma^2$.

Notes on the exponential model
$$C(t) = \begin{cases} \tau^2 + \sigma^2 & t = 0 \\ \sigma^2 \exp(-\phi t) & t > 0. \end{cases}$$
We define the effective range $t_0$ as the distance at which the correlation has dropped to only 0.05. Setting $\exp(-\phi t_0)$ equal to this value gives $t_0 \approx 3/\phi$, since $-\log(0.05) \approx 3$. Finally, the form of $C(t)$ shows why the nugget $\tau^2$ is often viewed as a nonspatial effect variance, and the partial sill $\sigma^2$ as a spatial effect variance.
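A small sketch of two of the semivariogram families above, with the nugget, sill, and effective range made explicit; all parameter values are illustrative.

```python
import numpy as np

def gamma_exponential(t, tau2, sigma2, phi):
    """Exponential semivariogram: nugget tau2, partial sill sigma2."""
    t = np.asarray(t, dtype=float)
    g = tau2 + sigma2 * (1.0 - np.exp(-phi * t))
    return np.where(t > 0, g, 0.0)              # gamma(0) = 0 by definition

def gamma_spherical(t, tau2, sigma2, phi):
    """Spherical semivariogram; reaches its sill exactly at the range t = 1/phi."""
    t = np.asarray(t, dtype=float)
    rising = tau2 + sigma2 * (1.5 * phi * t - 0.5 * (phi * t) ** 3)
    g = np.where(t >= 1.0 / phi, tau2 + sigma2, rising)
    return np.where(t > 0, g, 0.0)

tau2, sigma2, phi = 0.2, 1.0, 2.0               # illustrative values
print(gamma_spherical([1.0 / phi], tau2, sigma2, phi))  # equals sill = tau2 + sigma2
print(3.0 / phi)   # effective range of the exponential model, t0 ~ 3/phi
```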

The Matérn correlation function
Much of statistical modelling is carried out through correlation functions rather than variograms. The Matérn is a very versatile family:
$$C(t) = \begin{cases} \dfrac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}\,(2\sqrt{\nu}\,t\phi)^{\nu}\,K_{\nu}(2\sqrt{\nu}\,t\phi) & t > 0 \\ \tau^2 + \sigma^2 & t = 0, \end{cases}$$
where $K_{\nu}$ is the modified Bessel function of order $\nu$ (computationally tractable), and $\nu$ is a smoothness parameter (related to the fractal dimension of realizations) controlling the smoothness of the process.

Variogram model fitting
How do we select a variogram? Can the data really distinguish between variograms? The empirical variogram is
$$\hat{\gamma}(t_k) = \frac{1}{2|N(t_k)|} \sum_{(s_i, s_j) \in N(t_k)} (Y(s_i) - Y(s_j))^2,$$
where $N(t_k) = \{(s_i, s_j) : \|s_i - s_j\| \in I_k\}$ and $|N(t_k)|$ is the number of pairs in $N(t_k)$. Here we grid up the $t$ space into intervals $I_1 = (0, t_1)$, $I_2 = (t_1, t_2)$, and so forth, up to $I_K = (t_{K-1}, t_K)$, representing the $t$ values in each interval by its midpoint. (A computational sketch of this binned estimator follows below.)

[Figures: empirical variograms for the scallops data, and parametric semivariogram families plotted against distance: exponential, Gaussian, Cauchy, spherical, Bessel-J, and Bessel mixtures with random weights and random $\phi$'s.]
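A minimal implementation of the binned (method-of-moments) empirical semivariogram; the coordinates and response below are synthetic placeholders, and the half-max-distance cutoff is a common rule of thumb rather than anything prescribed by the slides.

```python
import numpy as np
from scipy.spatial.distance import pdist

def empirical_variogram(coords, y, n_bins=15, max_dist=None):
    """Binned empirical semivariogram gamma_hat(t_k)."""
    d = pdist(coords)                                    # all pairwise distances
    sq = pdist(y.reshape(-1, 1), metric="sqeuclidean")   # (Y(s_i) - Y(s_j))^2
    if max_dist is None:
        max_dist = d.max() / 2.0                         # rule-of-thumb cutoff
    edges = np.linspace(0.0, max_dist, n_bins + 1)
    mids, gammas = [], []
    for k in range(n_bins):
        in_bin = (d > edges[k]) & (d <= edges[k + 1])    # pairs with dist in I_k
        if in_bin.sum() > 0:
            mids.append(0.5 * (edges[k] + edges[k + 1]))
            gammas.append(0.5 * sq[in_bin].mean())       # (1/(2|N(t_k)|)) * sum
    return np.array(mids), np.array(gammas)

rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(200, 2))               # synthetic locations
y = rng.normal(size=200)                                 # placeholder response
print(empirical_variogram(coords, y))
```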

Principles of Bayesian Inference
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

Bayesian principles
Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters as random, and thus as having distributions (just like the data). We can therefore think about unknowns for which no reliable frequentist experiment exists, e.g., $\theta$ = the proportion of US men with untreated prostate cancer. A Bayesian writes down a prior guess for the parameter(s) $\theta$, say $p(\theta)$, and combines it with the information provided by the observed data $y$ to obtain the posterior distribution of $\theta$, denoted $p(\theta|y)$. All statistical inferences (point and interval estimates, hypothesis tests) then follow from posterior summaries. For example, the posterior mean/median/mode offer point estimates of $\theta$, while the quantiles yield credible intervals.

The key to Bayesian inference is learning, or updating, of prior beliefs; thus, posterior information $\ge$ prior information. Is the classical approach wrong? That may be a controversial statement, but it certainly is fair to say that the classical approach is limited in scope. The Bayesian approach expands the class of models and easily handles:
- repeated measures,
- unbalanced or missing data,
- nonhomogeneous variances,
- multivariate data,
and many other settings that are precluded (or much more complicated) in classical settings.

Basics of Bayesian inference
We start with a model (likelihood) $f(y|\theta)$ for the observed data $y = (y_1, \ldots, y_n)$ given unknown parameters $\theta$ (perhaps a collection of several parameters), and add a prior distribution $p(\theta|\lambda)$, where $\lambda$ is a vector of hyperparameters. The posterior distribution of $\theta$ is given by Bayes' Theorem:
$$p(\theta|y, \lambda) = \frac{p(\theta|\lambda)\, f(y|\theta)}{p(y|\lambda)} = \frac{p(\theta|\lambda)\, f(y|\theta)}{\int f(y|\theta)\, p(\theta|\lambda)\, d\theta}.$$

Calculations (numerical and algebraic) are usually required only up to a proportionality constant, so we write the posterior as
$$p(\theta|y, \lambda) \propto p(\theta|\lambda)\, f(y|\theta).$$
If $\lambda$ is known/fixed, the above represents the desired posterior. If $\lambda$ is unknown, we assign it a prior $p(\lambda)$ and seek
$$p(\theta, \lambda|y) \propto p(\lambda)\, p(\theta|\lambda)\, f(y|\theta).$$
The proportionality constant does not depend on $\theta$ or $\lambda$:
$$p(y) = \int p(\lambda)\, p(\theta|\lambda)\, f(y|\theta)\, d\lambda\, d\theta.$$
The above represents a joint posterior from a hierarchical model; the marginal posterior distribution for $\theta$ is
$$p(\theta|y) \propto \int p(\lambda)\, p(\theta|\lambda)\, f(y|\theta)\, d\lambda.$$

Bayesian inference: point estimation
Point estimation is easy: simply choose an appropriate distribution summary, such as the posterior mean, median, or mode.
- Mode: sometimes easy to compute (no integration, simply optimization), but it often misrepresents the middle of the distribution, especially for one-tailed distributions.
- Mean: easy to compute, but it has the opposite effect of the mode: it chases tails.
- Median: probably the best compromise in being robust to tail behaviour, although it may be awkward to compute since it solves
$$\int_{-\infty}^{\theta^{\mathrm{median}}} p(\theta|y)\, d\theta = \frac{1}{2}.$$

Bayesian inference: interval estimation
The most popular method of inference in practical Bayesian modelling is interval estimation using credible sets. A $100(1-\alpha)\%$ credible set $C$ for $\theta$ is a set satisfying
$$P(\theta \in C \,|\, y) = \int_C p(\theta|y)\, d\theta \ge 1 - \alpha.$$
The most popular credible set is the simple equal-tail interval estimate $(q_L, q_U)$ such that
$$\int_{-\infty}^{q_L} p(\theta|y)\, d\theta = \frac{\alpha}{2} = \int_{q_U}^{\infty} p(\theta|y)\, d\theta.$$
Then clearly $P(\theta \in (q_L, q_U) \,|\, y) = 1 - \alpha$. This interval is relatively easy to compute and has a direct interpretation: the probability that $\theta$ lies between $q_L$ and $q_U$ is $1 - \alpha$. The frequentist interpretation is extremely convoluted by comparison.

A simple example: normal data and normal priors
Consider a single data point $y$ from a normal distribution, $y \sim N(\theta, \sigma^2)$, with $\sigma$ known:
$$f(y|\theta) = N(y|\theta, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{1}{2\sigma^2}(y - \theta)^2\right).$$
Take the prior $\theta \sim N(\mu, \tau^2)$, i.e., $p(\theta) = N(\theta|\mu, \tau^2)$, with $\mu$ and $\tau^2$ known. The posterior distribution of $\theta$ is
$$p(\theta|y) \propto N(\theta|\mu, \tau^2)\, N(y|\theta, \sigma^2) = N\!\left(\theta \,\Big|\, \frac{\sigma^2}{\sigma^2 + \tau^2}\mu + \frac{\tau^2}{\sigma^2 + \tau^2}y,\; \frac{\sigma^2\tau^2}{\sigma^2 + \tau^2}\right).$$
Interpretation: the posterior mean is a weighted average of the prior mean and the data point; the direct estimate is shrunk towards the prior.

What if we had $n$ observations instead of one in the earlier setup? Say $y = (y_1, \ldots, y_n)$ iid with $y_i \sim N(\theta, \sigma^2)$. Then $\bar{y}$ is a sufficient statistic for $\theta$, with $\bar{y} \sim N(\theta, \sigma^2/n)$, and the posterior distribution of $\theta$ is
$$p(\theta|y) \propto N(\theta|\mu, \tau^2)\, N\!\left(\bar{y} \,\Big|\, \theta, \frac{\sigma^2}{n}\right) = N\!\left(\theta \,\Big|\, \frac{\sigma^2}{\sigma^2 + n\tau^2}\mu + \frac{n\tau^2}{\sigma^2 + n\tau^2}\bar{y},\; \frac{\sigma^2\tau^2}{\sigma^2 + n\tau^2}\right).$$

Another simple example: the Beta-Binomial model
Let $Y$ be the number of successes in $n$ independent trials:
$$P(Y = y \,|\, \theta) = f(y|\theta) = \binom{n}{y}\theta^y(1-\theta)^{n-y}.$$
Prior: $p(\theta) = \mathrm{Beta}(\theta|a, b)$, i.e., $p(\theta) \propto \theta^{a-1}(1-\theta)^{b-1}$, with prior mean $\mu = a/(a+b)$ and variance $ab/((a+b)^2(a+b+1))$. The posterior distribution of $\theta$ is then
$$p(\theta|y) = \mathrm{Beta}(\theta \,|\, a + y,\; b + n - y).$$
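A small sketch of both conjugate updates above; all numeric inputs are illustrative.

```python
import numpy as np
from scipy.stats import beta

# Normal data, normal prior: the posterior mean is a precision-weighted
# average of the prior mean and the sample mean.
def normal_posterior(ybar, n, sigma2, mu, tau2):
    """Posterior of theta given ybar ~ N(theta, sigma2/n), theta ~ N(mu, tau2)."""
    post_var = 1.0 / (n / sigma2 + 1.0 / tau2)
    post_mean = post_var * (n * ybar / sigma2 + mu / tau2)
    return post_mean, post_var

print(normal_posterior(ybar=1.2, n=25, sigma2=4.0, mu=0.0, tau2=1.0))

# Beta-Binomial: Beta(a, b) prior with y successes in n trials gives a
# Beta(a + y, b + n - y) posterior; equal-tail 95% credible interval below.
a, b, y, n = 1.0, 1.0, 7, 20
print(beta.ppf([0.025, 0.975], a + y, b + n - y))
```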

Sampling-based inference
We will compute the posterior distribution $p(\theta|y)$ by drawing samples from it, replacing numerical integration (quadrature) by Monte Carlo integration. One important advantage: we only need to know $p(\theta|y)$ up to the proportionality constant.

Suppose $\theta = (\theta_1, \theta_2)$ and we know how to sample from the marginal posterior distribution $p(\theta_2|y)$ and the conditional distribution $p(\theta_1|\theta_2, y)$. How do we draw samples from the joint distribution $p(\theta_1, \theta_2|y)$? We do this in two stages using composition sampling:
- First draw $\theta_2^{(j)} \sim p(\theta_2|y)$, $j = 1, \ldots, M$.
- Next draw $\theta_1^{(j)} \sim p(\theta_1 \,|\, \theta_2^{(j)}, y)$.
This sampling scheme produces exact samples $\{\theta_1^{(j)}, \theta_2^{(j)}\}_{j=1}^M$ from the posterior distribution $p(\theta_1, \theta_2|y)$. Gelfand and Smith (JASA, 1990) demonstrated automatic marginalization: $\{\theta_1^{(j)}\}_{j=1}^M$ are samples from $p(\theta_1|y)$ and (of course!) $\{\theta_2^{(j)}\}_{j=1}^M$ are samples from $p(\theta_2|y)$. In effect, composition sampling has performed the integration
$$p(\theta_1|y) = \int p(\theta_1|\theta_2, y)\, p(\theta_2|y)\, d\theta_2.$$

Bayesian predictions
Suppose we want to predict new observations, say $\tilde{y}$, based upon the observed data $y$. We specify a joint probability model $p(\tilde{y}, y, \theta)$, which defines the conditional predictive distribution
$$p(\tilde{y}|y, \theta) = \frac{p(\tilde{y}, y \,|\, \theta)}{p(y|\theta)}.$$
Bayesian predictions follow from the posterior predictive distribution, which averages $\theta$ out of the conditional predictive distribution with respect to the posterior:
$$p(\tilde{y}|y) = \int p(\tilde{y}|y, \theta)\, p(\theta|y)\, d\theta.$$
This can be evaluated using composition sampling (a small numerical sketch follows this section):
- First obtain $\theta^{(j)} \sim p(\theta|y)$, $j = 1, \ldots, M$.
- For $j = 1, \ldots, M$, sample $\tilde{y}^{(j)} \sim p(\tilde{y}|y, \theta^{(j)})$.
The $\{\tilde{y}^{(j)}\}_{j=1}^M$ are samples from the posterior predictive distribution $p(\tilde{y}|y)$.

Some remarks on sampling-based inference
- Direct Monte Carlo: some algorithms (e.g., composition sampling) generate independent samples exactly from the posterior distribution. In these situations there are NO convergence problems or issues; sampling is called exact.
- Markov chain Monte Carlo (MCMC): in general, exact sampling may not be possible or feasible. MCMC is a far more versatile set of algorithms that can be invoked to fit more general models. Note: anywhere direct Monte Carlo applies, MCMC will provide excellent results too.
- Convergence issues: there is no free lunch! The power of MCMC comes at a cost. The initial samples do not necessarily come from the desired posterior distribution; rather, they need to converge to the true posterior distribution. Therefore, one needs to assess convergence, discard output from before convergence, and retain only post-convergence samples. The period up to convergence is called burn-in.
- Diagnosing convergence: usually a few parallel chains are run from rather different starting points, and the sampled values are plotted (trace plots) for each chain. The time for the chains to mix together is taken as the time to convergence.
Good news: all this is automated in WinBUGS. So, as users, we need only specify good Bayesian models and implement them in WinBUGS.
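A minimal composition-sampling sketch for a posterior predictive distribution, continuing the normal-normal setup; the posterior mean and variance plugged in here are illustrative stand-ins, not computed from real data.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 5000
sigma2 = 4.0                        # known data variance (illustrative)
post_mean, post_var = 1.1, 0.15     # posterior of theta (illustrative values)

theta = rng.normal(post_mean, np.sqrt(post_var), size=M)  # theta^(j) ~ p(theta | y)
ytilde = rng.normal(theta, np.sqrt(sigma2))               # ytilde^(j) ~ p(ytilde | theta^(j))

# {ytilde^(j)} are exact draws from p(ytilde | y); summarize as needed.
print(ytilde.mean(), np.quantile(ytilde, [0.025, 0.975]))
```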

Bayesian Linear Models
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

Linear regression models: a Bayesian perspective
The ingredients of a linear model include an $n \times 1$ response vector $y = (y_1, \ldots, y_n)^T$ and an $n \times p$ design matrix (e.g., including regressors) $X = [x_1, \ldots, x_p]$, assumed to have been observed without error. The linear model:
$$y = X\beta + \epsilon; \quad \epsilon \sim N(0, \sigma^2 I).$$
The linear model is the most fundamental of all serious statistical models, encompassing:
- ANOVA: $y$ is continuous, the $x_i$'s are categorical;
- REGRESSION: $y$ is continuous, the $x_i$'s are continuous;
- ANCOVA: $y$ is continuous, some $x_i$'s are continuous, some categorical.
The unknown parameters are the regression parameters $\beta$ and the variance $\sigma^2$; all inference is conditional on $X$.

The classical unbiased estimates of $\beta$ and $\sigma^2$ are
$$\hat{\beta} = (X^T X)^{-1} X^T y; \quad \hat{\sigma}^2 = \frac{1}{n-p}(y - X\hat{\beta})^T (y - X\hat{\beta}).$$
The above estimate of $\beta$ is also the least-squares estimate. The fitted values are $\hat{y} = X\hat{\beta} = P_X y$, where $P_X = X(X^T X)^{-1} X^T$. $P_X$ is called the projector of $X$: it projects any vector onto the space spanned by the columns of $X$. The model residual sum of squares is
$$\hat{e} = (y - X\hat{\beta})^T (y - X\hat{\beta}) = y^T (I - P_X)\, y.$$

Bayesian regression with flat reference priors
For Bayesian analysis, we need to specify priors for the unknown regression parameters $\beta$ and the variance $\sigma^2$. Consider independent flat priors on $\beta$ and $\log \sigma^2$:
$$p(\beta) \propto 1; \quad p(\log(\sigma^2)) \propto 1, \quad \text{or equivalently} \quad p(\beta, \sigma^2) \propto \frac{1}{\sigma^2}.$$
Neither of these distributions is a valid probability distribution (they do not integrate to any finite number). So why are we even discussing them? It turns out that even if the priors are improper (that is what we call them), as long as the resulting posterior distributions are valid, we can still conduct legitimate statistical inference with them.

Marginal and conditional distributions
With a flat prior on $\beta$ we obtain, after some algebra, the conditional posterior distribution
$$p(\beta|\sigma^2, y) = N(\beta \,|\, (X^T X)^{-1} X^T y,\; \sigma^2 (X^T X)^{-1}).$$
This would have been the desired posterior had $\sigma^2$ been known. Since it is not, we obtain the marginal posterior distribution by integrating out $\sigma^2$:
$$p(\beta|y) = \int p(\beta|\sigma^2, y)\, p(\sigma^2|y)\, d\sigma^2.$$
Can we solve this integration using composition sampling? YES, if we can generate samples from $p(\sigma^2|y)$. So we need the marginal posterior distribution of $\sigma^2$. With the flat prior we obtain
$$p(\sigma^2|y) \propto (\sigma^2)^{-((n-p)/2 + 1)} \exp\!\left(-\frac{(n-p)s^2}{2\sigma^2}\right) = IG\!\left(\sigma^2 \,\Big|\, \frac{n-p}{2},\; \frac{(n-p)s^2}{2}\right),$$
where $s^2 = \hat{\sigma}^2 = \frac{1}{n-p} y^T (I - P_X) y$. This is an inverted Gamma distribution (equivalently, a scaled inverse chi-square distribution): in other words, $[(n-p)s^2/\sigma^2 \,|\, y] \sim \chi^2_{n-p}$ (with $n - p$ degrees of freedom). Note the striking similarity with the classical result: the sampling distribution of $\hat{\sigma}^2$ is also characterized by $(n-p)s^2/\sigma^2$ following a $\chi^2_{n-p}$ distribution.

Composition sampling for linear regression
Now we are ready to carry out composition sampling from $p(\beta, \sigma^2|y)$:
- Draw $M$ samples from $p(\sigma^2|y)$: $\sigma^{2(j)} \sim IG\!\left(\frac{n-p}{2}, \frac{(n-p)s^2}{2}\right)$, $j = 1, \ldots, M$.
- For $j = 1, \ldots, M$, draw from $p(\beta|\sigma^{2(j)}, y)$: $\beta^{(j)} \sim N\!\left((X^T X)^{-1} X^T y,\; \sigma^{2(j)}(X^T X)^{-1}\right)$.
The resulting samples $\{\beta^{(j)}, \sigma^{2(j)}\}_{j=1}^M$ represent $M$ samples from $p(\beta, \sigma^2|y)$, and $\{\beta^{(j)}\}_{j=1}^M$ are samples from the marginal posterior $p(\beta|y)$. This marginal is a multivariate t density:
$$p(\beta|y) = \frac{\Gamma(n/2)}{(\pi(n-p))^{p/2}\,\Gamma((n-p)/2)\,|s^2(X^T X)^{-1}|^{1/2}}\left[1 + \frac{(\beta - \hat{\beta})^T (X^T X)(\beta - \hat{\beta})}{(n-p)s^2}\right]^{-n/2}.$$

The marginal distribution of each individual regression parameter $\beta_j$ is a non-central univariate $t_{n-p}$ distribution; in fact,
$$\frac{\beta_j - \hat{\beta}_j}{s\sqrt{(X^T X)^{-1}_{jj}}} \sim t_{n-p}.$$
The 95% credible interval for each $\beta_j$ is constructed from the quantiles of this t-distribution. These credible intervals exactly coincide with the 95% classical confidence intervals, but the interpretation is direct: the probability that $\beta_j$ falls in the interval, given the observed data, is 0.95. Note: an intercept-only linear model reduces to the simple univariate $N(\bar{y}|\mu, \sigma^2/n)$ likelihood, for which the marginal posterior of $\mu$ satisfies
$$\frac{\mu - \bar{y}}{s/\sqrt{n}} \sim t_{n-1}.$$

Bayesian predictions from the linear model
Suppose we have observed the new predictors $\tilde{X}$ and wish to predict the outcome $\tilde{y}$. We specify the joint model for $(\tilde{y}, y)$ to be normal:
$$\begin{pmatrix} \tilde{y} \\ y \end{pmatrix} \sim N\!\left(\begin{pmatrix} \tilde{X} \\ X \end{pmatrix}\beta,\; \sigma^2 I\right).$$
Note $p(\tilde{y}|y, \beta, \sigma^2) = p(\tilde{y}|\beta, \sigma^2) = N(\tilde{y}|\tilde{X}\beta, \sigma^2 I)$. The posterior predictive distribution is
$$p(\tilde{y}|y) = \int p(\tilde{y}|y, \beta, \sigma^2)\, p(\beta, \sigma^2|y)\, d\beta\, d\sigma^2 = \int p(\tilde{y}|\beta, \sigma^2)\, p(\beta, \sigma^2|y)\, d\beta\, d\sigma^2.$$
By now we are comfortable evaluating such integrals by composition sampling:
- First obtain $(\beta^{(j)}, \sigma^{2(j)}) \sim p(\beta, \sigma^2|y)$, $j = 1, \ldots, M$.
- Next draw $\tilde{y}^{(j)} \sim N(\tilde{X}\beta^{(j)}, \sigma^{2(j)} I)$.

The Gibbs sampler
Suppose $\theta = (\theta_1, \theta_2)$ and we seek the posterior distribution $p(\theta_1, \theta_2|y)$. For many interesting hierarchical models, we have access to the full conditional distributions $p(\theta_1|\theta_2, y)$ and $p(\theta_2|\theta_1, y)$. The Gibbs sampler proposes the following scheme:
- Set starting values $\theta^{(0)} = (\theta_1^{(0)}, \theta_2^{(0)})$.
- For $j = 1, \ldots, M$: draw $\theta_1^{(j)} \sim p(\theta_1|\theta_2^{(j-1)}, y)$, then draw $\theta_2^{(j)} \sim p(\theta_2|\theta_1^{(j)}, y)$.
This constructs a Markov chain and, after an initial burn-in period when the chains are trying to find their way, the algorithm guarantees that $\{\theta_1^{(j)}, \theta_2^{(j)}\}_{j=M_0+1}^{M}$ are samples from $p(\theta_1, \theta_2|y)$, where $M_0$ is the burn-in period.

More generally, if $\theta = (\theta_1, \ldots, \theta_p)$ are the parameters in our model, we provide a set of initial values $\theta^{(0)} = (\theta_1^{(0)}, \ldots, \theta_p^{(0)})$ and then perform the $j$-th iteration, for $j = 1, \ldots, M$, by updating successively from the full conditional distributions:
$$\theta_1^{(j)} \sim p(\theta_1 \,|\, \theta_2^{(j-1)}, \ldots, \theta_p^{(j-1)}, y),$$
$$\theta_2^{(j)} \sim p(\theta_2 \,|\, \theta_1^{(j)}, \theta_3^{(j-1)}, \ldots, \theta_p^{(j-1)}, y),$$
$$\theta_k^{(j)} \sim p(\theta_k \,|\, \theta_1^{(j)}, \ldots, \theta_{k-1}^{(j)}, \theta_{k+1}^{(j-1)}, \ldots, \theta_p^{(j-1)}, y) \quad \text{(the generic $k$-th element)},$$
$$\theta_p^{(j)} \sim p(\theta_p \,|\, \theta_1^{(j)}, \ldots, \theta_{p-1}^{(j)}, y).$$

Example: consider the linear model, with $p(\sigma^2) = IG(\sigma^2|a, b)$ and $p(\beta) \propto 1$. The full conditional distributions are
$$p(\beta|y, \sigma^2) = N(\beta \,|\, (X^T X)^{-1} X^T y,\; \sigma^2 (X^T X)^{-1}),$$
$$p(\sigma^2|y, \beta) = IG\!\left(\sigma^2 \,\Big|\, a + n/2,\; b + \tfrac{1}{2}(y - X\beta)^T (y - X\beta)\right).$$
Thus, the Gibbs sampler initializes $(\beta^{(0)}, \sigma^{2(0)})$ and, for $j = 1, \ldots, M$:
- draws $\beta^{(j)} \sim N((X^T X)^{-1} X^T y,\; \sigma^{2(j-1)}(X^T X)^{-1})$;
- draws $\sigma^{2(j)} \sim IG\!\left(a + n/2,\; b + \tfrac{1}{2}(y - X\beta^{(j)})^T (y - X\beta^{(j)})\right)$.
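A runnable sketch of exactly this two-block Gibbs sampler, exercised on synthetic data; the hyperparameters $a, b$ and the initial value are illustrative choices.

```python
import numpy as np

def gibbs_linear_model(y, X, a=2.0, b=1.0, M=2000, seed=0):
    """Gibbs sampler for y = X beta + eps, eps ~ N(0, sigma2 I), with a flat
    prior on beta and p(sigma2) = IG(a, b), cycling the two full conditionals."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    sigma2 = 1.0                                   # initial value sigma2^(0)
    betas, sigma2s = [], []
    for _ in range(M):
        # beta | y, sigma2 ~ N((X'X)^{-1} X'y, sigma2 (X'X)^{-1})
        beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)
        # sigma2 | y, beta ~ IG(a + n/2, b + (y - X beta)'(y - X beta)/2);
        # an IG(shape, rate) draw is 1 / Gamma(shape, scale = 1/rate).
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a + n / 2.0, 1.0 / (b + 0.5 * resid @ resid))
        betas.append(beta); sigma2s.append(sigma2)
    return np.array(betas), np.array(sigma2s)

# Synthetic data to exercise the sampler (true beta = (1, 2), sigma = 0.5).
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=100)
betas, sigma2s = gibbs_linear_model(y, X)
print(betas[1000:].mean(axis=0), sigma2s[1000:].mean())  # post burn-in summaries
```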

The Metropolis-Hastings algorithm
In principle, the Gibbs sampler will work for extremely complex hierarchical models. The only issue is sampling from the full conditionals: they may not be amenable to easy sampling when they are not available in closed form. A more general and extremely powerful algorithm, which is often easier to code, is the Metropolis-Hastings (MH) algorithm. It also constructs a Markov chain, but does not require full conditionals.

The Metropolis-Hastings algorithm:
- Start with an initial value $\theta^{(0)}$.
- Select a candidate or proposal distribution $q$ from which to propose a value $\theta^*$ at the $j$-th iteration: $\theta^* \sim q(\theta^{(j-1)}, \nu)$. For example, $q(\theta^{(j-1)}, \nu) = N(\theta^{(j-1)}, \nu)$ with $\nu$ fixed.
- Compute
$$r = \frac{p(\theta^*|y)\, q(\theta^{(j-1)} \,|\, \theta^*, \nu)}{p(\theta^{(j-1)}|y)\, q(\theta^* \,|\, \theta^{(j-1)}, \nu)}.$$
- If $r \ge 1$, set $\theta^{(j)} = \theta^*$. If $r < 1$, draw $U \sim U(0, 1)$; if $U \le r$, set $\theta^{(j)} = \theta^*$; otherwise set $\theta^{(j)} = \theta^{(j-1)}$.
- Repeat for $j = 1, \ldots, M$.
This yields $\theta^{(1)}, \ldots, \theta^{(M)}$, which, after a burn-in period, are samples from the true posterior distribution. It is important to monitor the acceptance rate of the sampler through the iterations. Rough recommendations: around 20% acceptance for vector updates and around 40% for scalar updates; this can be controlled by tuning $\nu$. A popular approach embeds Metropolis steps within Gibbs to draw from full conditionals that cannot be sampled directly.

Example: for the linear model, our parameters are $(\beta, \sigma^2)$. We write $\theta = (\beta, \log(\sigma^2))$ and, at the $j$-th iteration, propose $\theta^* \sim N(\theta^{(j-1)}, \Sigma)$. The log transformation of $\sigma^2$ ensures that all components of $\theta$ have support on the entire real line, so the multivariate normal yields meaningful proposed values. But we need to transform the prior to $p(\beta, \log(\sigma^2))$. Let $z = \log(\sigma^2)$ and assume $p(\beta, z) = p(\beta)\,p(z)$. To derive $p(z)$, REMEMBER: we need to adjust for the Jacobian:
$$p(z) = p(\sigma^2)\left|\frac{d\sigma^2}{dz}\right| = p(e^z)\, e^z;$$
the Jacobian here is $e^z = \sigma^2$. With $p(\beta) \propto 1$ and $p(\sigma^2) = IG(\sigma^2|a, b)$, the log-posterior is
$$-(a + n/2 + 1)z + z - e^{-z}\left\{b + \tfrac{1}{2}(y - X\beta)^T (y - X\beta)\right\}$$
up to an additive constant. A symmetric proposal distribution, say $q(\theta^*|\theta^{(j-1)}, \Sigma) = N(\theta^{(j-1)}, \Sigma)$, cancels out in $r$. In practice it is better to compute $\log(r)$:
$$\log(r) = \log p(\theta^*|y) - \log p(\theta^{(j-1)}|y).$$
For the proposal $N(\theta^{(j-1)}, \Sigma)$, $\Sigma$ is a $d \times d$ variance-covariance matrix with $d = \dim(\theta) = p + 1$. If $\log r \ge 0$, set $\theta^{(j)} = \theta^*$; if $\log r < 0$, draw $U \sim U(0, 1)$ and set $\theta^{(j)} = \theta^*$ if $\log U \le \log r$, otherwise $\theta^{(j)} = \theta^{(j-1)}$. Repeat for $j = 1, \ldots, M$ to obtain samples $\theta^{(1)}, \ldots, \theta^{(M)}$.
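A compact sketch of this random-walk Metropolis sampler on $(\beta, \log\sigma^2)$, including the Jacobian term; the step size, hyperparameters, and synthetic data are illustrative, and the spherical proposal stands in for a general $\Sigma$.

```python
import numpy as np

def log_post(theta, y, X, a=2.0, b=1.0):
    """Log posterior (up to a constant) of (beta, z = log sigma2) for the
    linear model with flat p(beta) and IG(a, b) prior on sigma2; the '+ z'
    term is the Jacobian e^z from the log transformation."""
    beta, z = theta[:-1], theta[-1]
    resid = y - X @ beta
    return -(a + len(y) / 2.0 + 1.0) * z + z - np.exp(-z) * (b + 0.5 * resid @ resid)

def metropolis(y, X, M=5000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1] + 1
    theta = np.zeros(d)                               # theta^(0)
    lp = log_post(theta, y, X)
    samples, accepted = [], 0
    for _ in range(M):
        prop = theta + step * rng.normal(size=d)      # symmetric N(theta, step^2 I)
        lp_prop = log_post(prop, y, X)
        if np.log(rng.uniform()) <= lp_prop - lp:     # accept w.p. min(1, r)
            theta, lp = prop, lp_prop
            accepted += 1
        samples.append(theta)
    print("acceptance rate:", accepted / M)           # tune `step` toward ~20-40%
    return np.array(samples)

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=100)
out = metropolis(y, X)
print(out[2500:].mean(axis=0))                        # post burn-in means
```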

Hierarchical Modelling for Univariate Spatial Data
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

[Figure: illustration of a spatial domain, with a response surface $y$ over spatial coordinates.]

Algorithmic modelling vs. a spatial process
Recall the algorithmic approach: a spatial surface is observed at a finite set of locations $S = \{s_1, s_2, \ldots, s_n\}$; tessellate the spatial domain (usually with the data locations as vertices); fit an interpolating polynomial $f(s) = \sum_i w_i(S; s) f(s_i)$; and interpolate by reading off $f(s_0)$. Issues: sensitivity to the tessellation, the choice of multivariate interpolator, and numerical error analysis. A spatial process instead treats the observations $Y(s_1), Y(s_2), \ldots, Y(s_n)$ as a partial realization of a stochastic surface.

Simple linear model
$$Y(s) = \mu(s) + \epsilon(s),$$
- Response: $Y(s)$ at location $s$, with covariates $x(s)$.
- Mean: $\mu(s) = x^T(s)\beta$.
- Error: $\epsilon(s) \overset{iid}{\sim} N(0, \tau^2)$.
The key assumption regarding $\epsilon(s)$: $\epsilon(s_i)$ and $\epsilon(s_j)$ are uncorrelated for all $i \neq j$, regardless of where the sites lie in $D$.

Sources of variation: spatial Gaussian processes
Spatial Gaussian processes (GP): say $w(s) \sim GP(0, \sigma^2\rho(\cdot))$ with $\mathrm{Cov}(w(s_1), w(s_2)) = \sigma^2\rho(\phi; s_1 - s_2)$. Let $w = [w(s_i)]_{i=1}^n$; then $w \sim N(0, \sigma^2 R(\phi))$, where $R(\phi) = [\rho(\phi; s_i - s_j)]_{i,j=1}^n$.
- Correlation model for $R(\phi)$: e.g., exponential decay, $\rho(\phi; t) = \exp(-\phi t)$ for $t > 0$.
- Other valid models: e.g., Gaussian, spherical, Matérn.
- Effective range: $t_0 = -\ln(0.05)/\phi \approx 3/\phi$.
[Figure: realizations of a Gaussian process for different values of $\phi$, holding $\sigma^2 = 1$.]

$w \sim N(0, \sigma_w^2 R(\phi))$ can define complex spatial dependence structures, e.g., the anisotropic Matérn correlation function
$$\rho(s_i, s_j; \phi) = \frac{1}{\Gamma(\nu)2^{\nu-1}}\,(2\sqrt{\nu}\,d_{ij})^{\nu}\,\kappa_{\nu}(2\sqrt{\nu}\,d_{ij}), \quad d_{ij} = \sqrt{(s_i - s_j)^T\Sigma^{-1}(s_i - s_j)}, \quad \Sigma = G(\psi)\Lambda^2 G(\psi)^T,$$
so that $\phi = (\nu, \psi, \Lambda)$.

Univariate spatial regression
Simple linear model plus random spatial effects:
$$Y(s) = \mu(s) + w(s) + \epsilon(s),$$
- Response: $Y(s)$ at site $s$.
- Mean: $\mu(s) = x^T(s)\beta$.
- Spatial random effects: $w(s) \sim GP(0, \sigma^2\rho(\phi; s_1 - s_2))$.
- Non-spatial variance: $\epsilon(s) \overset{iid}{\sim} N(0, \tau^2)$.
[Figure: simulated vs. predicted spatial surfaces.]

Hierarchical modelling
- First stage: $y \,|\, \beta, w, \tau^2 \sim \prod_{i=1}^n N(Y(s_i) \,|\, x^T(s_i)\beta + w(s_i),\; \tau^2)$.
- Second stage: $w \,|\, \sigma^2, \phi \sim N(0, \sigma^2 R(\phi))$.
- Third stage: priors on $\Omega = (\beta, \tau^2, \sigma^2, \phi)$.
Marginalized likelihood (a small numerical sketch follows this section):
$$y \,|\, \Omega \sim N(X\beta,\; \sigma^2 R(\phi) + \tau^2 I).$$
Note: the spatial process parametrizes $\Sigma$ in $y = X\beta + \epsilon$, $\epsilon \sim N(0, \Sigma)$, with $\Sigma = \sigma^2 R(\phi) + \tau^2 I$.

Bayesian computations
Choice: fit $[y|\Omega][\Omega]$ or $[y|\beta, w, \tau^2][w|\sigma^2, \phi][\Omega]$.
- Conditional model: conjugate full conditionals for $\sigma^2$, $\tau^2$, and $w$; easier to program.
- Marginalized model: needs Metropolis or slice sampling for $\sigma^2$, $\tau^2$, and $\phi$; harder to program. But the reduced parameter space yields faster convergence, and $\sigma^2 R(\phi) + \tau^2 I$ is more stable than $\sigma^2 R(\phi)$.
- But what about $R^{-1}(\phi)$? EXPENSIVE!
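A minimal sketch that simulates one realization from the marginalized model $y \sim N(X\beta, \sigma^2 R(\phi) + \tau^2 I)$ with exponential correlation and evaluates its log-likelihood, the target an MCMC sampler would repeatedly compute; all locations, covariates, and parameter values are synthetic and illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import multivariate_normal

rng = np.random.default_rng(7)
n = 150
coords = rng.uniform(0, 10, size=(n, 2))                  # synthetic sites
X = np.column_stack([np.ones(n), rng.normal(size=n)])     # synthetic design
beta, sigma2, tau2, phi = np.array([1.0, 0.5]), 1.0, 0.25, 0.6

R = np.exp(-phi * cdist(coords, coords))                  # exponential R(phi)
Sigma = sigma2 * R + tau2 * np.eye(n)                     # marginal covariance

y = rng.multivariate_normal(X @ beta, Sigma)              # one realization
loglik = multivariate_normal(mean=X @ beta, cov=Sigma).logpdf(y)
print(loglik)   # each MCMC iteration over (beta, sigma2, tau2, phi) costs one such O(n^3) solve
```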

Where are the $w$'s?
Interest often lies in the spatial surface $w|y$. It is recovered from
$$[w \,|\, y, X] = \int [w \,|\, \Omega, y, X]\,[\Omega \,|\, y, X]\, d\Omega$$
using posterior samples:
- Obtain $\Omega^{(1)}, \ldots, \Omega^{(G)} \sim [\Omega|y, X]$.
- For each $\Omega^{(g)}$, draw $w^{(g)} \sim [w \,|\, \Omega^{(g)}, y, X]$.
NOTE: with Gaussian likelihoods, $[w|\Omega, y, X]$ is also Gaussian. With other likelihoods this may not be easy, and the conditional updating scheme is often preferred.
[Figures: residual plot of $[w(s)|y]$, i.e., $w$ after removing the mean of $Y$, over easting and northing, and two further views of the recovered surface $[w(s)|y]$.]

Spatial prediction
Often we need to predict $Y(s)$ at a new set of locations $\{\tilde{s}_0, \ldots, \tilde{s}_m\}$ with associated predictor matrix $\tilde{X}$. We sample from the predictive distribution
$$[\tilde{y} \,|\, y, X, \tilde{X}] = \int [\tilde{y}, \Omega \,|\, y, X, \tilde{X}]\, d\Omega = \int [\tilde{y} \,|\, y, \Omega, X, \tilde{X}]\,[\Omega \,|\, y, X]\, d\Omega,$$
where $[\tilde{y}|y, \Omega, X, \tilde{X}]$ is multivariate normal. Sampling scheme:
- Obtain $\Omega^{(1)}, \ldots, \Omega^{(G)} \sim [\Omega|y, X]$.
- For each $\Omega^{(g)}$, draw $\tilde{y}^{(g)} \sim [\tilde{y} \,|\, y, \Omega^{(g)}, X, \tilde{X}]$.
[Figure: summary of the predictive surface $[Y(s)|y]$ over easting and northing.]

Colorado data illustration
Modelling temperature: 507 locations in Colorado. Simple spatial regression model:
$$Y(s) = x^T(s)\beta + w(s) + \epsilon(s), \quad w(s) \sim GP(0, \sigma^2\rho(\cdot; \phi, \nu)), \quad \epsilon(s) \overset{iid}{\sim} N(0, \tau^2).$$

Posterior summaries, 50% (2.5%, 97.5%) [several posterior medians and digits were lost in this transcription]:
- Intercept: (2.3, 3.866)
- [Elevation]: (-0.527, -0.333)
- Precipitation: (0.002, 0.072)
- σ²: (0.05, 1.245)
- φ: 7.39E-3 (4.7E-3, 15.2E-3)
- Range: (38.8, 476.3)
- τ²: (0.022, 0.092)

[Figures: temperature residual map (northing vs. easting), elevation map (latitude vs. longitude), and residual map with elevation as a covariate.]

Hierarchical Modelling for non-Gaussian Spatial Data
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

Spatial generalized linear models
Often data sets preclude Gaussian modelling: $Y(s)$ may not even be continuous. Example: $Y(s)$ is a binary or count variable:
- species presence or absence at location $s$;
- species abundance from a count at location $s$;
- whether a continuous forest variable is high or low at location $s$.
We replace the Gaussian likelihood by an exponential family member (Diggle, Tawn, and Moyeed, 1998).

- First stage: the $Y(s_i)$ are conditionally independent given $\beta$ and $w(s_i)$, so $f(y(s_i)|\beta, w(s_i), \gamma)$ equals
$$h(y(s_i), \gamma)\,\exp\!\left(\gamma[y(s_i)\eta(s_i) - \psi(\eta(s_i))]\right),$$
where $g(E(Y(s_i))) = \eta(s_i) = x^T(s_i)\beta + w(s_i)$ for a canonical link function $g$, and $\gamma$ is a dispersion parameter.
- Second stage: model $w(s)$ as a Gaussian process, $w \sim N(0, \sigma^2 R(\phi))$.
- Third stage: priors and hyperpriors.

Comments
- There is no process for $Y(s)$, only a valid joint distribution.
- It is not sensible to add a pure error term $\epsilon(s)$: we are modelling with spatial random effects.
- Introducing these in the transformed mean encourages the means of spatial variables at proximate locations to be close to each other.
- Marginal spatial dependence is induced between, say, $Y(s)$ and $Y(s')$, but the observed $Y(s)$ and $Y(s')$ need not be close to each other.
- Second-stage spatial modelling is attractive for spatial explanation in the mean; first-stage spatial modelling is more appropriate for encouraging proximate observations to be close.

Binary spatial regression: forest/non-forest
From: Finley, A.O., S. Banerjee, and R.E. McRoberts (2008). A Bayesian approach to quantifying uncertainty in multi-source forest area estimates. Environmental and Ecological Statistics, 15.

We illustrate a non-Gaussian model for point-referenced spatial data. The objective is to make pixel-level predictions of forest/non-forest across the domain. Data: observations from 500 georeferenced USDA Forest Service Forest Inventory and Analysis (FIA) inventory plots within a 32 km radius circle in MN, USA. The response $Y(s)$ is binary:
$$Y(s) = \begin{cases} 1 & \text{if the inventory plot is forested} \\ 0 & \text{if the inventory plot is not forested.} \end{cases}$$
Observed covariates include the coinciding pixel values for 3 dates of Landsat imagery.
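A small sketch of the first two stages of this spatial GLM for binary data, simulating from $Y(s_i) \sim \mathrm{Bernoulli}(p(s_i))$ with $\mathrm{logit}(p(s_i)) = x^T(s_i)\beta + w(s_i)$ and $w \sim N(0, \sigma^2 R(\phi))$; the locations, covariates, and parameter values are synthetic stand-ins, not the FIA data.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
n = 200
coords = rng.uniform(0, 1, size=(n, 2))               # synthetic plot locations
X = np.column_stack([np.ones(n), rng.normal(size=n)]) # synthetic covariate
beta, sigma2, phi = np.array([-0.5, 1.0]), 1.0, 6.0   # illustrative values

R = np.exp(-phi * cdist(coords, coords))              # exponential correlation
L = np.linalg.cholesky(sigma2 * R + 1e-10 * np.eye(n))
w = L @ rng.normal(size=n)                            # w ~ N(0, sigma2 R(phi))

eta = X @ beta + w                                    # linear predictor
p = 1.0 / (1.0 + np.exp(-eta))                        # inverse logit link
y = rng.binomial(1, p)                                # conditionally independent Y(s_i)
print(y.mean())
```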

Binary spatial regression: forest/non-forest, continued
We fit a generalized linear model where $Y(s_i) \sim \mathrm{Bernoulli}(p(s_i))$ and $\mathrm{logit}(p(s_i)) = x^T(s_i)\beta + w(s_i)$. We assume a vague flat prior for $\beta$, a Uniform(3/32, 3/0.5) prior for $\phi$, and an inverse-Gamma(2, $\cdot$) prior for $\sigma^2$. The parameters are updated with a Metropolis algorithm using the target log density (up to additive constants)
$$\ln p(\Omega|Y) \propto -\left(a_\sigma + \frac{n}{2} + 1\right)\ln\sigma^2 - \frac{b_\sigma}{\sigma^2} - \frac{1}{2}\ln|R(\phi)| - \frac{w^T R(\phi)^{-1} w}{2\sigma^2} + \sum_{i=1}^n Y(s_i)\left(x^T(s_i)\beta + w(s_i)\right) - \sum_{i=1}^n \ln\!\left(1 + \exp(x^T(s_i)\beta + w(s_i))\right) + \ln(\sigma^2) + \ln(\phi - \phi_a) + \ln(\phi_b - \phi),$$
where the last three terms are Jacobian adjustments from sampling $\ln\sigma^2$ and a transformed $\phi$ on $(\phi_a, \phi_b)$. Covariates and proximity to observed FIA plots contribute to increased precision of prediction.

Posterior parameter estimates (posterior medians and upper and lower 2.5 percentiles) [several medians and digits were lost in this transcription]:
- Intercept (θ0): (49.56, 120.46)
- AprilTC1 (θ1): (-0.45, -0.1)
- AprilTC2 (θ2): 0.17 (0.07, 0.29)
- AprilTC3 (θ3): (-0.43, -0.08)
- JulyTC1 (θ4): (-0.25, 0.7)
- JulyTC2 (θ5): 0.09 (-0.0, 0.9)
- JulyTC3 (θ6): 0.0 (-0.5, 0.6)
- OctTC1 (θ7): (-0.68, -0.22)
- OctTC2 (θ8): (-0.9, 0.4)
- OctTC3 (θ9): (-0.46, -0.07)
- σ²: 1.358 (0.39, 2.42)
- φ: (illegible)
- Effective range, -log(0.05)/φ (meters): (932.33, )

[Figures: median and 97.5%-2.5% range of the posterior predictive distributions; CDF of a holdout area's posterior predictive distribution of P(Y(A) = 1) with a classification cut point; classification of pixel areas, based on visual inspection of imagery, into non-forest, moderately forested, and forested.]

Hierarchical Modelling for Multivariate Spatial Data
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

Multivariate spatial modelling
Point-referenced spatial data often come as multivariate measurements at each location. Examples:
- Environmental monitoring: stations yield measurements on ozone, NO, CO, and PM2.5.
- Community ecology: assemblages of plant species driven by water availability, temperature, and light requirements.
- Forestry: measurements of stand characteristics such as age, total biomass, and average tree diameter.
- Atmospheric modeling: at a given site we observe surface temperature, precipitation, and wind speed.
We anticipate dependence between measurements, both at a particular location and across locations.

Multivariate Gaussian process
Each location hosts $m$ spatial regressions:
$$Y_k(s) = \mu_k(s) + w_k(s) + \epsilon_k(s), \quad k = 1, \ldots, m.$$
- Mean: $\mu(s) = [\mu_k(s)]_{k=1}^m = [x_k^T(s)\beta_k]_{k=1}^m$.
- Cov: $w(s) = [w_k(s)]_{k=1}^m \sim MVGP(0, \Gamma_w(\cdot, \cdot))$ with $\Gamma_w(s, s') = [\mathrm{Cov}(w_k(s), w_{k'}(s'))]_{k,k'=1}^m$.
- Error: $\epsilon(s) = [\epsilon_k(s)]_{k=1}^m \sim MVN(0, \Psi)$, where $\Psi$ is an $m \times m$ positive definite matrix, usually $\mathrm{Diag}(\tau_k^2)_{k=1}^m$.
Example with $m = 2$:
$$\Gamma_w(s, s') = \begin{pmatrix} \mathrm{Cov}(w_1(s), w_1(s')) & \mathrm{Cov}(w_1(s), w_2(s')) \\ \mathrm{Cov}(w_2(s), w_1(s')) & \mathrm{Cov}(w_2(s), w_2(s')) \end{pmatrix}.$$
For a finite set of locations $S = \{s_1, \ldots, s_n\}$:
$$\mathrm{Var}([w(s_i)]_{i=1}^n) = \Sigma_w = [\Gamma_w(s_i, s_j)]_{i,j=1}^n.$$

Properties of the cross-covariance:
- $\Gamma_w(s', s) = \Gamma_w^T(s, s')$.
- $\lim_{s' \to s}\Gamma_w(s, s')$ is positive definite, and $\Gamma_w(s, s) = \mathrm{Var}(w(s))$.
- For sites in any finite collection $S = \{s_1, \ldots, s_n\}$:
$$\sum_{i=1}^n\sum_{j=1}^n u_i^T\,\Gamma_w(s_i, s_j)\,u_j \ge 0 \quad \text{for all } u_i, u_j \in \mathbb{R}^m.$$
Any valid $\Gamma_w$ must satisfy these conditions; the last property implies that $\Sigma_w$ is positive definite. In complete generality, $\Gamma_w(s, s')$ need not be symmetric and need not be positive definite for $s \neq s'$.

Modelling cross-covariances
- Moving average or kernel convolution of a process: let $Z(s) \sim GP(0, \rho(\cdot))$ and use kernels to form
$$w_j(s) = \int \kappa_j(u)\,Z(s + u)\,du = \int \kappa_j(s - s')\,Z(s')\,ds'.$$
Then $\Gamma_w(s - s')$ has $(i, j)$-th element
$$[\Gamma_w(s - s')]_{i,j} = \int\!\!\int \kappa_i(s - s' + u)\,\kappa_j(u')\,\rho(u - u')\,du\,du'.$$
- Convolution of covariance functions: if $\rho_1, \rho_2, \ldots, \rho_m$ are valid covariance functions, form
$$[\Gamma_w(s - s')]_{i,j} = \int \rho_i(s - s' - t)\,\rho_j(t)\,dt.$$

Constructive approach
Let $v_k(s) \sim GP(0, \rho_k(s, s'))$, for $k = 1, \ldots, m$, be $m$ independent GPs with unit variance. Form the simple multivariate process $v(s) = [v_k(s)]_{k=1}^m$: $v(s) \sim MVGP(0, \Gamma_v(\cdot, \cdot))$ with $\Gamma_v(s, s') = \mathrm{Diag}(\rho_k(s, s'))_{k=1}^m$. Assume $w(s) = A(s)v(s)$ arises as a space-varying linear transformation of $v(s)$. Then
$$\Gamma_w(s, s') = A(s)\,\Gamma_v(s, s')\,A^T(s')$$
is a valid cross-covariance function. When $s = s'$, $\Gamma_v(s, s) = I_m$, so $\Gamma_w(s, s) = A(s)A^T(s)$: $A(s)$ identifies with any square root of $\Gamma_w(s, s)$ and can be taken as lower triangular (Cholesky). But $A(s)$ is unknown! Should we first model $A(s)$ to obtain $\Gamma_w(s, s)$, or model $\Gamma_w(s, s')$ first and derive $A(s)$? Note that $A(s)$ is completely determined from the within-site associations.

If $A(s) = A$: $w(s)$ is stationary when $v(s)$ is, and $\Gamma_w(s, s')$ is symmetric. Taking $\Gamma_v(s, s') = \rho(s, s')I_m$ gives $\Gamma_w = \rho(s, s')AA^T$. This last specification is called intrinsic and leads to separable models (a small numerical sketch of this construction follows at the end of this section):
$$\Sigma_w = H(\phi) \otimes \Lambda; \quad \Lambda = AA^T.$$

Hierarchical modelling
Let $y = [Y(s_i)]_{i=1}^n$ and $w = [w(s_i)]_{i=1}^n$.
- First stage: $y \,|\, \beta, w, \Psi \sim \prod_{i=1}^n MVN(Y(s_i) \,|\, X(s_i)^T\beta + w(s_i),\; \Psi)$.
- Second stage: $w \,|\, \Phi \sim MVN(0, \Sigma_w(\Phi))$, where $\Sigma_w(\Phi) = [\Gamma_w(s_i, s_j; \Phi)]_{i,j=1}^n$.
- Third stage: priors on $\Omega = (\beta, \Psi, \Phi)$.
Marginalized likelihood:
$$y \,|\, \beta, \Phi, \Psi \sim MVN(X\beta,\; \Sigma_w(\Phi) + I \otimes \Psi).$$

Bayesian computations
Choice: fit as $[y|\Omega][\Omega]$ or as $[y|\beta, w, \Psi][w|\Phi][\Omega]$.
- Conditional model: conjugate distributions are available for $\Psi$ and other variance parameters; easy to program.
- Marginalized model: needs Metropolis or slice sampling for most variance-covariance parameters; harder to program. But the reduced parameter space (no $w$'s) results in faster convergence, and $\Sigma_w(\Phi) + I \otimes \Psi$ is more stable than $\Sigma_w(\Phi)$.
- But what about $\Sigma_w(\Phi)^{-1}$? Matrix inversion is EXPENSIVE: $O(n^3)$.

Recovering the $w$'s
Interest often lies in the spatial surface $w|y$. It is recovered from
$$[w \,|\, y, X] = \int [w \,|\, \Omega, y, X]\,[\Omega \,|\, y, X]\, d\Omega$$
using posterior samples: obtain $\Omega^{(1)}, \ldots, \Omega^{(G)} \sim [\Omega|y, X]$, and for each $\Omega^{(g)}$ draw $w^{(g)} \sim [w|\Omega^{(g)}, y, X]$. NOTE: with Gaussian likelihoods $[w|\Omega, y, X]$ is also Gaussian; with other likelihoods this may not be easy, and the conditional updating scheme is often preferred.
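A minimal sketch of the intrinsic/separable specification $\Sigma_w = H(\phi) \otimes \Lambda$ with $\Lambda = AA^T$, using an exponential spatial correlation; the number of outcomes, locations, and the entries of $A$ are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(11)
n, phi = 50, 2.0
coords = rng.uniform(0, 5, size=(n, 2))        # synthetic locations
H = np.exp(-phi * cdist(coords, coords))       # n x n spatial correlation H(phi)

A = np.array([[1.0, 0.0],
              [0.5, 0.8]])                     # lower-triangular square root (m = 2)
Lam = A @ A.T                                  # Lambda = A A', within-site covariance

Sigma_w = np.kron(H, Lam)                      # (nm) x (nm) separable covariance
# Positive definiteness is inherited from H and Lambda; quick numerical check:
print(Sigma_w.shape, bool(np.all(np.linalg.eigvalsh(Sigma_w) > -1e-8)))
```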

Spatial prediction
Often we need to predict $Y(s)$ at a new set of locations $\{\tilde{s}_0, \ldots, \tilde{s}_m\}$ with associated predictor matrix $\tilde{X}$. Sample from the predictive distribution
$$[\tilde{y} \,|\, y, X, \tilde{X}] = \int [\tilde{y}, \Omega \,|\, y, X, \tilde{X}]\, d\Omega = \int [\tilde{y} \,|\, y, \Omega, X, \tilde{X}]\,[\Omega \,|\, y, X]\, d\Omega,$$
where $[\tilde{y}|y, \Omega, X, \tilde{X}]$ is multivariate normal. Sampling scheme: obtain $\Omega^{(1)}, \ldots, \Omega^{(G)} \sim [\Omega|y, X]$; for each $\Omega^{(g)}$, draw $\tilde{y}^{(g)} \sim [\tilde{y}|y, \Omega^{(g)}, X, \tilde{X}]$.

Bartlett Experimental Forest
From: Finley, A.O., S. Banerjee, A.R. Ek, and R.E. McRoberts (2008). Bayesian multivariate process modeling for prediction of forest attributes. Journal of Agricultural, Biological, and Environmental Statistics, 13.

A slight digression: why we fit a model at all:
- association between response and covariates, $\beta$ (e.g., ecological interpretation);
- residual spatial and/or non-spatial associations and patterns (i.e., given covariates);
- subsequent prediction.
Study objectives:
- Evaluate methods for multi-source forest attribute mapping.
- Find the best model, given the data.
- Produce maps of biomass and uncertainty, by tree species.
Study area: USDA FS Bartlett Experimental Forest (BEF), NH; 1,053 ha, heavily forested. Major tree species: American beech (BE), eastern hemlock (EH), red maple (RM), sugar maple (SM), and yellow birch (YB).

Response variables: metric tons of total tree biomass per ha, measured on inventory plots. Models were fit using a random subset of 28 plots, with prediction at the remaining 29 plots.
[Figures: inventory plot locations and biomass surfaces by species (BE, EH, RM, SM, YB) over longitude and latitude (meters); image attribution truncated in the source.]

Bartlett Experimental Forest: covariates
DEM-derived elevation and slope; spring, summer, and fall Landsat ETM+ tasseled cap features (brightness, greenness, wetness).
[Figures: elevation and slope maps over longitude (meters).]

Candidate models
Each model includes 55 covariates and 5 intercepts. Different specifications of the variance structures:
1. Non-spatial multivariate, $\mathrm{Diag}(\Psi)$.
2. $\mathrm{Diag}(K)$, same $\phi$, $\mathrm{Diag}(\Psi)$.
3. $K$, same $\phi$, $\mathrm{Diag}(\Psi)$.
4. $\mathrm{Diag}(K)$, different $\phi$, $\mathrm{Diag}(\Psi)$.
5. $K$, different $\phi$, $\mathrm{Diag}(\Psi)$.
6. $K$, different $\phi$, full $\Psi$.

Model comparison
Deviance Information Criterion (DIC):
$$D(\Omega) = -2\log L(\text{Data}|\Omega); \quad \overline{D(\Omega)} = E_{\Omega|Y}[D(\Omega)]; \quad p_D = \overline{D(\Omega)} - D(\bar{\Omega}),\ \bar{\Omega} = E_{\Omega|Y}[\Omega]; \quad DIC = \overline{D(\Omega)} + p_D.$$
Lower DIC is better.
[Table: $p_D$ and DIC for models 1-6; numeric entries lost in this transcription.]

Selected model
Model 5: $K$, different $\phi$, $\mathrm{Diag}(\Psi)$. Parameter counts: $K$: 15, $\phi$: 5, $\mathrm{Diag}(\Psi)$: 5. We focus on the spatial cross-covariance matrix $K$ (for brevity).
[Table: posterior 50% (2.5%, 97.5%) summaries of $\mathrm{cor}(K)$ among BE, EH, RM, SM, and YB; entries partially legible, e.g., $\mathrm{cor}(\mathrm{RM}, \mathrm{EH}) = 0.45\ (0.26, 0.66)$ and $\mathrm{cor}(\mathrm{YB}, \mathrm{EH}) = 0.22\ (0.20, 0.25)$.]
These relationships are expressed in the mapped random spatial effects, $w$.

[Figures: maps of $E[w \,|\, \text{Data}]$ at inventory and validation plots, $E[Y \,|\, \text{Data}]$ at validation plots, and $P(Y_{2.5} < Y < Y_{97.5} \,|\, \text{Data})$ at validation plots, by species (BE, EH, RM, SM, YB) over longitude and latitude (meters).]

Bartlett Experimental Forest: summary
The proposed Bayesian hierarchical spatial methodology:
- partitions sources of uncertainty;
- provides hypothesis testing;
- reveals spatial patterns and missing covariates;
- allows flexible inference: access to the parameters' posterior distributions and to the posterior predictive distribution;
- provides consistent prediction of multiple variables, maintaining spatial and non-spatial association.
Extendable model template:
- cluster-plot sample designs: multiresolution models;
- non-continuous responses: generalized linear models;
- observations over time and space: spatiotemporal models.

Hierarchical Modelling for Spatiotemporal Data
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

Spatio-temporal models
Specification: again point-referenced vs. areal unit data; continuous time vs. discretized time; association in space and association in time. For point-referenced data with $t$ continuous and a Gaussian response,
$$Y(s, t) = \mu(s, t) + w(s, t) + \epsilon(s, t),$$
and for non-Gaussian data, $g(E[Y(s, t)]) = \mu(s, t) + w(s, t)$. Don't treat time as simply a third coordinate; instead,
$$\mathrm{Cov}(Y(s, t), Y(s', t')) = C(s - s', t - t').$$

With time discretized, $Y_t(s)$, $t = 1, 2, \ldots, T$:
- Separable form: $C(s - s', t - t') = \sigma^2\rho_1(s - s'; \phi_1)\,\rho_2(t - t'; \phi_2)$.
- Nonseparable forms: sums of independent separable processes, mixing of separable covariance functions, spectral domain approaches.
Type of data: time series or cross-sectional. For time series data, exploratory analysis:
- Arrange the data into an $n \times T$ matrix $Y$ with entries $Y_t(s_i)$.
- Centering by the row averages of $Y$ yields $Y_{\mathrm{rows}}$; centering by the column averages yields $Y_{\mathrm{cols}}$.
- Sample spatial covariance matrix: $\frac{1}{T}Y_{\mathrm{rows}}Y_{\mathrm{rows}}^T$; sample autocorrelation matrix: $\frac{1}{n}Y_{\mathrm{cols}}^T Y_{\mathrm{cols}}$.
- The residual matrix $E$ after a regression fit yields empirical orthogonal functions (EOFs).

Modeling: $Y_t(s) = \mu_t(s) + w_t(s) + \epsilon_t(s)$, or perhaps $g(E(Y_t(s))) = \mu_t(s) + w_t(s)$, with $\epsilon_t(s) \overset{iid}{\sim} N(0, \tau_t^2)$. Options for $w_t(s)$:
- $w_t(s) = \alpha_t + w(s)$;
- $w_t(s)$ independent for each $t$;
- $w_t(s) = w_{t-1}(s) + \eta_t(s)$, with independent spatial process innovations $\eta_t(s)$.

Dynamic spatiotemporal models
Measurement equation:
$$Y(s, t) = \mu(s, t) + \epsilon(s, t); \quad \epsilon(s, t) \overset{ind}{\sim} N(0, \sigma_\epsilon^2); \quad \mu(s, t) = x(s, t)^T\tilde{\beta}(s, t); \quad \tilde{\beta}(s, t) = \beta_t + \beta(s, t).$$
Transition equations:
$$\beta_t = \beta_{t-1} + \eta_t, \quad \eta_t \overset{ind}{\sim} N_p(0, \Sigma_\eta); \qquad \beta(s, t) = \beta(s, t-1) + w(s, t).$$

Hierarchical Modelling for Large Spatial Datasets
Sudipto Banerjee and Andrew O. Finley (affiliations as above), July 9, 2009

The big n issue
In the univariate spatial regression $Y = X\beta + w + \epsilon$, estimation involves $(\sigma^2 R(\phi) + \tau^2 I)^{-1}$, which is $n \times n$, and these matrix computations occur in each MCMC iteration. This is known as the big-N problem in geostatistics. Approach: use a model $Y = X\beta + Zw^* + \epsilon$. But what $Z$?

Predictive process models
Consider knots $S^* = \{s_1^*, \ldots, s_{n^*}^*\}$ with $n^* \ll n$, and let $w^* = \{w(s_i^*)\}_{i=1}^{n^*}$. Then
$$Z(\theta) = \{\mathrm{cov}(w(s_i), w(s_j^*))\}\,\{\mathrm{var}(w^*)\}^{-1}$$
is $n \times n^*$, and the predictive process regression model $Y = X\beta + Z(\theta)w^* + \epsilon$ requires only $n^* \times n^*$ matrix computations ($n^* \ll n$). Key attraction: the above arises as a process model, $\tilde{w}(s) \sim GP(0, \sigma_w^2\tilde{\rho}(\cdot; \phi))$ in place of $w(s)$, with
$$\tilde{\rho}(s_1, s_2; \phi) = \mathrm{cov}(w(s_1), w^*)\,\mathrm{var}(w^*)^{-1}\,\mathrm{cov}(w^*, w(s_2)).$$
(A small computational sketch of this low-rank construction follows below.)

Knots: a knotty problem?
Knot selection:
- A regular grid? More knots near locations we have sampled more densely?
- Formal spatial design paradigm: maximize information metrics (Zhu and Stein, 2006; Diggle and Lophaven, 2006).
- Geometric considerations: space-filling designs (Royle and Nychka, 1998); various clustering algorithms.
- Compare the performance of estimation of the range and smoothness while varying the number of knots.
- Stein (2007, 2008): the method may not work for fine-scale spatial data. Still a popular choice; it seamlessly adapts to multivariate and spatiotemporal settings.

Comparisons: unrectified vs. rectified
A rectified predictive process is defined as $\tilde{w}_\epsilon(s) = \tilde{w}(s) + \tilde{\epsilon}(s)$, where $\tilde{\epsilon}(s) \overset{indep}{\sim} N\!\left(0,\; \sigma_w^2\left(1 - r(s, \phi)^T R^*(\phi)^{-1} r(s, \phi)\right)\right)$.
[Figure/table: maximum likelihood estimates of $\tau^2$ against the number of knots for the predictive process and the rectified predictive process, compared with the exact value.]
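A minimal sketch of the knot-based predictive process: the induced covariance is the low-rank matrix $c\,C^{*-1}c^T$, so only $n^* \times n^*$ factorizations are needed. The exponential correlation, location layout, knot grid, and parameter values are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(5)
n, m_side, sigma2, phi = 1000, 8, 1.0, 1.5
coords = rng.uniform(0, 10, size=(n, 2))             # synthetic data locations
g = np.linspace(0.5, 9.5, m_side)
knots = np.array([[u, v] for u in g for v in g])     # regular grid of m = 64 knots

C_star = sigma2 * np.exp(-phi * cdist(knots, knots)) # m x m  var(w*)
c = sigma2 * np.exp(-phi * cdist(coords, knots))     # n x m  cov(w(s_i), w*)
C_star += 1e-10 * np.eye(len(knots))                 # jitter for numerical stability

# Draw w* ~ N(0, C*) and project: wt(s) = c(s)' C*^{-1} w*.
L = np.linalg.cholesky(C_star)
w_star = L @ rng.normal(size=len(knots))
w_tilde = c @ np.linalg.solve(C_star, w_star)        # only m x m solves required
print(w_tilde.shape)
```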

From: Finley, A.O., S. Banerjee, P. Waldmann, and T. Ericsson (2008). Hierarchical spatial modeling of additive and dominance genetic variance for large spatial trial datasets. Biometrics.

Modeling genetic variation in Scots pine (Pinus sylvestris L.): a long-term progeny study in northern Sweden. Quantitative genetics studies the inheritance of polygenic traits, focusing on estimation of the additive genetic variance, $\sigma_a^2$, and the heritability $h^2 = \sigma_a^2/\sigma_{Tot}^2$, where $\sigma_{Tot}^2$ represents the total genetic and unexplained variation. A high heritability, $h^2$, should result in a larger selection response (i.e., a higher probability of genetic gain in future generations).

Data overview:
- established in 1971 (by Skogforsk);
- partial diallel design of 52 parent trees;
- 8,160 trees planted randomly on 2.2 m squares;
- 1997 re-inventory of 4,970 surviving trees: height, DBH, branch angle, etc.
[Figures: maps of the observed trees and observed heights.]

Genetic effects model:
$$Y_i = x_i^T\beta + a_i + d_i + \epsilon_i,$$
$$a = [a_i]_{i=1}^n \sim MVN(0, \sigma_a^2 A), \quad d = [d_i]_{i=1}^n \sim MVN(0, \sigma_d^2 D), \quad \epsilon = [\epsilon_i]_{i=1}^n \sim N(0, \tau^2 I_n),$$
where $A$ and $D$ are fixed relationship matrices (see, e.g., Henderson, 1985; Lynch and Walsh, 1998). Note that the genetic variance is further partitioned into an additive component and the non-additive dominance component $\sigma_d^2$.

A common feature of such trials is systematic heterogeneity among observational units (i.e., violation of $\epsilon \sim N(0, \tau^2 I_n)$). Spatial heterogeneity arises from soil characteristics, micro-climates, and light availability. Residual correlation among units as a function of distance and/or direction leads to erroneous parameter estimates (e.g., a biased $h^2$).

Genetic model results
Parameter credible intervals, 50% (2.5%, 97.5%), for the non-spatial models in the Scots pine trial [several medians and leading digits were lost in this transcription]:

Parameter | Additive | Additive + Dominance
β | (69.66, 75.08) | (70.04, 74.57)
σ²a | (8.30, 49.85) | (4.2, 43.96)
σ²d | n/a | (1.24, 40.1)
τ² | (2.8, 44.70) | 116.4 (100.5, 127.76)
h² | (0.2, 0.28) | 0.15 (0.09, 0.26)

Genetic model results, continued
[Figures: residual surface and residual semivariogram from the non-spatial genetic model.]
So $\epsilon \not\sim N(0, \tau^2 I_n)$: consider a spatial model. Previous approaches to accommodating residual spatial dependence:
- Manipulate the mean function by constructing covariates from the residuals of neighboring units (see, e.g., Wilkinson et al., 1983; Besag and Kempton, 1986; Williams, 1986).
- Geostatistical spatial processes: separable AR(1)-column by AR(1)-row structures (Martin, 1990; Cullis et al., 1998), or the classical geostatistical method (Zimmerman and Harville, 1991).
All are computationally feasible, but ad hoc and/or restrictive from a modeling perspective.

Spatial model for genetic trials:
$$Y(s_i) = x^T(s_i)\beta + a_i + d_i + w(s_i) + \epsilon_i,$$
$$a \sim MVN(0, \sigma_a^2 A), \quad d \sim MVN(0, \sigma_d^2 D), \quad w = [w(s_i)]_{i=1}^n \sim MVN(0, \sigma_w^2 C(\theta)), \quad \epsilon \sim N(0, \tau^2 I_n).$$
Tools used to estimate the parameters: Markov chain Monte Carlo (MCMC), with an iterative Gibbs sampler for $(\beta, a, d, w)$ and Metropolis-Hastings and slice samplers for $\theta$.

Trick to sample the genetic effects: the Gibbs draw for a random effect, e.g., $a \sim MVN(\mu_a, \Sigma_a)$, requires
$$\Sigma_a = \left[\frac{A^{-1}}{\sigma_a^2} + \frac{I_n}{\tau^2}\right]^{-1},$$
which is computationally expensive. However, $A$ and $D$ are known, so use an initial spectral decomposition, $A = P^T\Lambda P$; then
$$\Sigma_a = P^T\left[\frac{\Lambda^{-1}}{\sigma_a^2} + \frac{I_n}{\tau^2}\right]^{-1} P$$
achieves computational benefits. Unfortunately, this trick does not work for $w$, and here plain MCMC is computationally infeasible because of big N. Instead, we proposed the knot-based predictive process,
$$\tilde{w}(s_i) = c(s_i; \theta)^T C^*(\theta)^{-1} w^*,$$
where $w^* = [w(s_i^*)]_{i=1}^m \sim MVN(0, C^*(\theta))$ and $C^*(\theta) = [C(s_i^*, s_j^*; \theta)]_{i,j=1}^m$. The corresponding predictive process model is
$$Y(s_i) = x^T(s_i)\beta + a_i + d_i + \tilde{w}(s_i) + \epsilon_i.$$
$w$ can accommodate complex spatial dependence structures, e.g., the anisotropic Matérn correlation function
$$\rho(s_i, s_j; \theta) = \frac{1}{\Gamma(\nu)2^{\nu-1}}\,(2\sqrt{\nu}\,d_{ij})^{\nu}\,\kappa_{\nu}(2\sqrt{\nu}\,d_{ij}), \quad d_{ij} = \sqrt{(s_i - s_j)^T\Sigma^{-1}(s_i - s_j)}, \quad \Sigma = G(\psi)\Lambda^2 G^T(\psi),$$
so $\theta = (\nu, \psi, \Lambda)$.
[Figure: simulated vs. predicted surfaces under the predictive process projection.]

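As promised above, a numerical check of the spectral decomposition trick for the genetic effects: after a one-time eigendecomposition of A, updating Σ_a within each MCMC iteration involves only diagonal operations. The matrix A here is a random positive definite surrogate, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
M = rng.normal(size=(n, n))
A = M @ M.T / n + 0.1 * np.eye(n)   # surrogate for the known pedigree matrix

# One-time decomposition, done before the MCMC run starts.
lam, P = np.linalg.eigh(A)          # A = P diag(lam) P^T (numpy's convention)

sigma2_a, tau2 = 30.0, 100.0        # current values within an MCMC iteration

# Direct evaluation: an O(n^3) inversion at every iteration.
Sigma_direct = np.linalg.inv(np.linalg.inv(A) / sigma2_a + np.eye(n) / tau2)

# Trick: only a diagonal changes across iterations.
d = 1.0 / (1.0 / (sigma2_a * lam) + 1.0 / tau2)
Sigma_trick = (P * d) @ P.T         # P diag(d) P^T

print(np.allclose(Sigma_direct, Sigma_trick))  # True
```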
Genetic + spatial effects models.

Candidate spatial models (i.e., specifications of C(θ)):
1. AR(1) col × AR(1) row
2. isotropic Matérn
3. anisotropic Matérn
Each model was evaluated using 64, 144, and 256 knot grids. Model choice used the Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002); a sketch of the DIC computation appears at the end of this section.

(Table: model comparisons using p_D and DIC for the Scots pine dataset, covering the non-spatial additive and additive + dominance models and the isotropic and anisotropic spatial models at 64, 144, and 256 knots.)

Genetic + spatial effects model results. Parameter credible intervals, 50% (2.5%, 97.5%), for the isotropic Matérn model with 64 and 256 knots, Scots pine trial:

Parameter    64 knots                 256 knots
β            (69.00, 76.05)           74.2 (69.66, 79.66)
σ²_a         (7.4, 41.82)             (8.9, 53.69)
σ²_d         12.69 (6.00, 34.27)      13.96 (7.65, 27.05)
σ²_w         (23.7, 73.34)            (30.24, 88.0)
τ²           (72.1, 99.65)            (67.90, 96.6)
ν            0.83 (0.3, 1.46)         0.47 (0.26, 1.28)
φ            0.05 (0.02, 0.09)        0.04 (0.02, 0.09)
Eff. range   71.00 (44.66, 127.93)    (45.22, 129.83)
h²           (0.13, 0.31)             0.25 (0.15, 0.39)

The decrease in τ², due to the removal of spatial variation, results in an increase in h² (i.e., 0.25 vs. 0.15 with confounding).

(Figures: genetic model residuals; w(s), 64 knots; w(s), 256 knots)

The predictive process balances model richness against computational feasibility (e.g., factorizing a 4,970 × 4,970 covariance matrix vs. an m × m one, with m at most 256).

Summary. The challenge is to meet modeling needs:
- ensure computational feasibility:
  reduce algorithmic complexity ⇒ cheap tricks (e.g., spectral decomposition of A prior to MCMC);
  reduce dimensionality ⇒ the predictive process;
- maintain richness and flexibility:
  focus on the model, not on how to estimate the parameters ⇒ embrace new tools (MCMC) for estimating highly flexible hierarchical models;
- truly acknowledge sources of uncertainty:
  propagate uncertainty through the hierarchical structure (e.g., recognize the uncertainty in C(θ)).
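As referenced above, DIC = D̄ + p_D with p_D = D̄ - D(θ̄), where D̄ is the posterior mean deviance and D(θ̄) the deviance at the posterior means (Spiegelhalter et al., 2002). Below is a minimal Python sketch for a Gaussian likelihood; the function name and sample layout are hypothetical conveniences, not part of the original analysis code.

```python
import numpy as np
from scipy.stats import norm

def dic(y, mu_samples, tau2_samples):
    """DIC from posterior samples of a Gaussian model.

    y:            (n,)   observed responses
    mu_samples:   (S, n) posterior draws of the fitted means
    tau2_samples: (S,)   posterior draws of the residual variance
    """
    # Deviance D(theta) = -2 log p(y | theta) for each posterior draw.
    dev = np.array([
        -2.0 * norm.logpdf(y, loc=mu, scale=np.sqrt(t2)).sum()
        for mu, t2 in zip(mu_samples, tau2_samples)
    ])
    dev_bar = dev.mean()                       # posterior mean deviance
    dev_hat = -2.0 * norm.logpdf(
        y,
        loc=mu_samples.mean(axis=0),
        scale=np.sqrt(tau2_samples.mean()),
    ).sum()                                    # deviance at posterior means
    p_d = dev_bar - dev_hat                    # effective number of parameters
    return p_d, dev_bar + p_d                  # (p_D, DIC); smaller DIC is better
```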
