Multivariate spatial models and the multikrig class
1 Multivariate spatial models and the multikrig class
Stephan R. Sain, IMAGe, NCAR
ENAR Spring Meetings, March 15, 2009
2 Outline
- Overview of multivariate spatial regression models
- Case study: pedotransfer functions and soil water profiles
- The multikrig class
- Case study: NC temperature and precipitation
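3 A Spatial Regression Model
A spatial regression model:
$$\underset{(n \times 1)}{Y} = \underset{(n \times q)}{X}\,\underset{(q \times 1)}{\beta} + \underset{(n \times 1)}{h} + \underset{(n \times 1)}{\epsilon}$$
where
- $E[h] = 0$, $\mathrm{Var}[h] = \Sigma_h$
- $E[\epsilon] = 0$, $\mathrm{Var}[\epsilon] = \sigma^2 I$
- $h$ and $\epsilon$ are independent
Then $Y \sim N(X\beta, V)$ with $V = \Sigma_h + \sigma^2 I$, and
$$\hat\beta = (X' V^{-1} X)^{-1} X' V^{-1} Y, \qquad \hat h = \Sigma_h V^{-1} (Y - X\hat\beta)$$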
4 Multivariate Regression
A multivariate, multiple regression model:
$$\underset{(n \times p)}{Y} = \underset{(n \times q)}{X}\,\underset{(q \times p)}{\beta} + \underset{(n \times p)}{\epsilon}$$
where
- Each of the $n$ rows of $Y$ represents a $p$-vector observation
- Each of the $p$ columns of $\beta$ represents the regression coefficients for one variable
- The rows of $\epsilon$ represent a collection of iid error vectors with zero mean and common covariance matrix $\Sigma$
5 Multivariate Regression
MLEs are straightforward to obtain:
$$\underset{(q \times p)}{\hat\beta} = (X'X)^{-1} X'Y, \qquad \underset{(p \times p)}{\hat\Sigma} = n^{-1}\, Y'PY$$
where $P = I - X(X'X)^{-1}X'$.
Note that the columns of $\hat\beta$ can be obtained through $p$ univariate regressions.
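The remark about $p$ univariate regressions can be checked numerically. The following is a minimal R sketch on simulated data; the dimensions, seed, and all variable names are illustrative assumptions and do not come from the talk.

```r
# Simulated data: n observations, q regressors, p responses (sizes are arbitrary)
set.seed(1)
n <- 50; q <- 3; p <- 2
X <- cbind(1, matrix(rnorm(n * (q - 1)), n, q - 1))   # n x q design matrix
B <- matrix(c(1, 2, -1, 0.5, 0, 3), q, p)             # assumed "true" q x p coefficients
Y <- X %*% B + matrix(rnorm(n * p), n, p)             # n x p responses

# Multivariate MLEs as written on the slide
beta_hat  <- solve(crossprod(X), crossprod(X, Y))     # (X'X)^{-1} X'Y, q x p
P         <- diag(n) - X %*% solve(crossprod(X), t(X))
Sigma_hat <- crossprod(Y, P %*% Y) / n                # n^{-1} Y'PY, p x p

# Column-by-column univariate regressions give the same coefficients
beta_cols <- sapply(seq_len(p), function(j) coef(lm(Y[, j] ~ X - 1)))
max_diff  <- max(abs(beta_hat - beta_cols))
```

Here `max_diff` should be zero up to rounding error, which is the sense in which the multivariate fit decomposes into separate fits.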
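6 Vec and Kronecker
The Kronecker product of an $m \times n$ matrix $A$ and an $r \times q$ matrix $B$ is an $mr \times nq$ matrix:
$$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}$$
Some properties:
- $A \otimes (B + C) = A \otimes B + A \otimes C$
- $A \otimes (B \otimes C) = (A \otimes B) \otimes C$
- $(A \otimes B)(C \otimes D) = AC \otimes BD$
- $(A \otimes B)' = A' \otimes B'$
- $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$
- $|A \otimes B| = |A|^m |B|^n$ (for square $A$, $n \times n$, and square $B$, $m \times m$)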
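7 Vec and Kronecker
The vec-operator stacks the columns of a matrix:
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \mathrm{vec}(A) = \begin{bmatrix} a_{11} \\ a_{21} \\ a_{12} \\ a_{22} \end{bmatrix}$$
Some properties:
- $\mathrm{vec}(AXB) = (B' \otimes A)\,\mathrm{vec}(X)$
- $\mathrm{tr}(A'B) = \mathrm{vec}(A)'\,\mathrm{vec}(B)$
- $\mathrm{vec}(A + B) = \mathrm{vec}(A) + \mathrm{vec}(B)$
- $\mathrm{vec}(\alpha A) = \alpha\,\mathrm{vec}(A)$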
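A few of the vec and Kronecker identities can be verified numerically with base R's `kronecker()`. This is only a sketch on random matrices; the matrix sizes and names are arbitrary choices made for illustration.

```r
set.seed(2)
A <- matrix(rnorm(6), 2, 3)    # 2 x 3
X <- matrix(rnorm(12), 3, 4)   # 3 x 4
B <- matrix(rnorm(8), 4, 2)    # 4 x 2
vec <- function(M) c(M)        # c() on a matrix stacks its columns

# vec(AXB) = (B' (x) A) vec(X)
lhs <- vec(A %*% X %*% B)
rhs <- c(kronecker(t(B), A) %*% vec(X))
err_vec <- max(abs(lhs - rhs))

# tr(C'A) = vec(C)' vec(A) for matrices of the same shape
C <- matrix(rnorm(6), 2, 3)
err_tr <- abs(sum(diag(t(C) %*% A)) - sum(vec(C) * vec(A)))

# Determinant rule for square factors: P is 3 x 3 and Q is 2 x 2,
# so |P (x) Q| = |P|^2 |Q|^3
P <- matrix(rnorm(9), 3, 3); Q <- matrix(rnorm(4), 2, 2)
err_det <- abs(det(kronecker(P, Q)) - det(P)^2 * det(Q)^3)
```

All three error terms should be at the level of floating-point rounding.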
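8 Multivariate Regression Revisited
Rewrite the multivariate, multiple regression model:
$$\underset{(np \times 1)}{\mathrm{vec}(Y)} = \underset{(np \times qp)}{(I_p \otimes X)}\,\underset{(qp \times 1)}{\mathrm{vec}(\beta)} + \underset{(np \times 1)}{\mathrm{vec}(\epsilon)}$$
- What is $\mathrm{Var}[\mathrm{vec}(\epsilon)]$?
- What is the GLS estimator for $\mathrm{vec}(\beta)$?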
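One way to answer the two questions, assuming (as stated for the multivariate regression model) that the rows of $\epsilon$ are iid with covariance $\Sigma$, uses only the Kronecker properties listed earlier. The worked steps below are a sketch and are not taken from the slides.

```latex
\mathrm{Var}[\mathrm{vec}(\epsilon)] = \Sigma \otimes I_n
\\[4pt]
\widehat{\mathrm{vec}(\beta)}_{GLS}
  = \left[(I_p \otimes X)'(\Sigma \otimes I_n)^{-1}(I_p \otimes X)\right]^{-1}
    (I_p \otimes X)'(\Sigma \otimes I_n)^{-1}\,\mathrm{vec}(Y)
\\[4pt]
  = (\Sigma^{-1} \otimes X'X)^{-1}(\Sigma^{-1} \otimes X')\,\mathrm{vec}(Y)
  = \left(I_p \otimes (X'X)^{-1}X'\right)\mathrm{vec}(Y)
  = \mathrm{vec}\!\left((X'X)^{-1}X'Y\right)
```

So, under this error structure, the GLS estimator does not depend on $\Sigma$ and coincides with the column-by-column least squares estimator.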
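9 A Multivariate Spatial Model
Extend the multivariate, multiple regression model:
$$\underset{(np \times 1)}{\mathrm{vec}(Y)} = \underset{(np \times qp)}{(I_p \otimes X)}\,\underset{(qp \times 1)}{\mathrm{vec}(\beta)} + \underset{(np \times 1)}{\mathrm{vec}(h)} + \underset{(np \times 1)}{\mathrm{vec}(\epsilon)}$$
where
$$\mathrm{Var}[\mathrm{vec}(\epsilon)] = \Sigma \otimes I_n, \qquad \mathrm{Var}[\mathrm{vec}(h)] = \Sigma_h = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1p} \\ \Sigma_{12}' & \Sigma_{22} & \cdots & \Sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{1p}' & \Sigma_{2p}' & \cdots & \Sigma_{pp} \end{bmatrix}$$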
10 A Multivariate Spatial Model
One simplification to the spatial covariance matrix is to use a Kronecker form:
$$\Sigma_h = \rho \otimes K$$
where
- $\rho$ is a $p \times p$ matrix of scale parameters
- $K$ is an $n \times n$ spatial covariance
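11 A Multivariate Spatial Model
Extend the multivariate, multiple regression model:
$$\underset{(np \times 1)}{\mathrm{vec}(Y)} = \underset{(np \times qp)}{(I_p \otimes X)}\,\underset{(qp \times 1)}{\mathrm{vec}(\beta)} + \underset{(np \times 1)}{\mathrm{vec}(h)} + \underset{(np \times 1)}{\mathrm{vec}(\epsilon)}$$
or, in stacked notation,
$$Y = X\beta + h + \epsilon$$
Now everything follows as in the univariate spatial regression model.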
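To make "everything follows" concrete, the stacked model can be fitted with the same GLS formulas as the univariate case. The sketch below simulates from the model with the Kronecker form $\Sigma_h = \rho \otimes K$; the exponential form of $K$, the parameter values, and all names are assumptions chosen for illustration only.

```r
set.seed(3)
n <- 30; p <- 2; q <- 2
s   <- cbind(runif(n), runif(n))                 # hypothetical spatial locations
X   <- cbind(1, rnorm(n))                        # n x q design matrix
K   <- exp(-as.matrix(dist(s)) / 0.3)            # n x n spatial covariance (assumed exponential)
rho <- matrix(c(1.0, 0.4, 0.4, 0.5), p, p)       # p x p scale matrix
Sig <- matrix(c(0.2, 0.05, 0.05, 0.1), p, p)     # p x p error covariance

V  <- kronecker(rho, K) + kronecker(Sig, diag(n))  # Var[vec(Y)], np x np
TX <- kronecker(diag(p), X)                        # I_p (x) X, np x qp

# Simulate vec(Y) = (I_p (x) X) vec(beta) + vec(h) + vec(eps)
beta <- c(1, 2, -1, 0.5)                           # vec(beta), length qp
y    <- c(TX %*% beta + t(chol(V)) %*% rnorm(n * p))

# GLS estimate of vec(beta) and the predictor of vec(h)
Vinv_TX  <- solve(V, TX)
beta_hat <- solve(crossprod(TX, Vinv_TX), crossprod(Vinv_TX, y))
h_hat    <- kronecker(rho, K) %*% solve(V, y - TX %*% beta_hat)
```

In practice the covariance parameters are unknown and must be estimated; here they are treated as known so that only the GLS step is shown.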
12 Case Study: Pedotransfer Functions
- Soil characteristics such as composition (clay, silt, sand) are commonly measured and easily obtainable
- Unfortunately, crop models require water holding characteristics, such as the wilting point or lower limit (LL) and the drained upper limit (DUL), which are not so easy to obtain
- Often the LL and DUL are a function of depth: the soil water profile
13 Case Study: Pedotransfer Functions
- Pedotransfer functions are commonly used to estimate LL and DUL
- Approaches include differential equations, regression, nearest neighbors, neural networks, etc.
- Often specialized by soil type and/or region
- Goal: develop a new type of pedotransfer function that can capture the entire soil water profile (LL and DUL as a function of depth)
- Characterize the variation!
14 Soil Water Profiles
[Figure: soil water profiles for the soils SIL, SICL, CL, and S; axes: depth, proportion water]
15 The Big Picture
Soil, weather, and many, many other things go into a big, black box (the CERES crop model), which returns yield.
- Soil: water holding characteristics, bulk density, etc.
- Weather (20 years): solar radiation, temperature max/min, precipitation
16 The Big Picture
- Given a complicated array of inputs, the CERES crop model will give the yields of, for example, maize
- The output is deterministic; variation in yields is also of interest
Goals:
- Establish a framework to study sources of variation in crop yields
- Assess impacts of climate change on crop yields
17 Data
- $n = 272$ measurements on $N = 63$ soil samples
- Sources: Gijsman et al. (2002), Ratliff et al. (1983), Ritchie et al. (1987)
- Includes measurements of: depth; soil composition and texture; percentages of clay, sand, and silt; bulk density; organic matter; and field measured values of LL and DUL
18 Data
The soil texture measurements form a composition:
$$Z_{clay} + Z_{silt} + Z_{sand} = 1$$
where $Z_{clay}$, $Z_{silt}$, $Z_{sand}$ are the proportions of each soil component. These are not really three variables.
To remove the dependence, the additive log-ratio transformation (Aitchison, 1986) is applied, defining two new variables:
$$X_1 = \log\frac{Z_{sand}}{Z_{clay}}, \qquad X_2 = \log\frac{Z_{silt}}{Z_{clay}}$$
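The additive log-ratio transformation and its inverse take only a few lines of R. The composition values below are made up for illustration, and the helper names `alr` and `alr_inv` are hypothetical, not from the talk.

```r
# Hypothetical composition proportions for three soil samples (rows sum to 1)
Z <- rbind(c(clay = 0.20, silt = 0.50, sand = 0.30),
           c(clay = 0.40, silt = 0.40, sand = 0.20),
           c(clay = 0.05, silt = 0.10, sand = 0.85))

# Additive log-ratio transform with clay as the reference component
alr <- function(Z) cbind(X1 = log(Z[, "sand"] / Z[, "clay"]),
                         X2 = log(Z[, "silt"] / Z[, "clay"]))
X <- alr(Z)

# Inverse transform: recover the three proportions from (X1, X2)
alr_inv <- function(X) {
  e <- cbind(clay = 1, silt = exp(X[, "X2"]), sand = exp(X[, "X1"]))
  e / rowSums(e)
}
Z_back <- alr_inv(X)
```

The two transformed variables are unconstrained, which is what makes them suitable as coordinates for the spatial process later in the talk.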
19 Data - Composition vs LL
[Figure: LL plotted against composition; axes: CLAY, SAND, SILT]
20 Data - Composition vs LL
[Figure: LL plotted against transformed composition; axes: log(silt/clay), log(sand/clay)]
21 A Multi-objective Pedotransfer Function
The model for the multi-objective pedotransfer function for a particular soil is
$$Y_0 = T_0\beta + h(X_0) + \epsilon(D_0)$$
where
$$Y_0 = \log\,[\,LL_1 \;\cdots\; LL_d \;\; \Delta_1 \;\cdots\; \Delta_d\,]'$$
$d$ is the number of measurements (depths), and $\Delta_i = DUL_i - LL_i$.
22 A Multi-objective Pedotransfer Function
The model for the multi-objective pedotransfer function for a particular soil is
$$Y_0 = T_0\beta + h(X_0) + \epsilon(D_0)$$
where
$$T_0 = \begin{bmatrix} [\,1 \;\; X_0 \;\; Z_{LL,0}\,] & 0 \\ 0 & [\,1 \;\; X_0 \;\; Z_{\Delta,0}\,] \end{bmatrix}$$
and
- $X_0$ is the transformed soil composition information
- $Z_{LL}$ and $Z_{\Delta}$ are additional covariates for LL and $\Delta$
- $Z_{LL}$ includes organic carbon
- $Z_{\Delta}$ includes linear and quadratic terms for depth
23 A Multi-objective Pedotransfer Function
The model for the multi-objective pedotransfer function for a particular soil is
$$Y_0 = T_0\beta + h(X_0) + \epsilon(D_0)$$
where
- $h(X_0)$ is a two-dimensional spatial process that controls the smoothness of the contribution of $X$
- $\epsilon(D_0)$ is an error process that accounts for the dependence between LL and $\Delta$ at a particular depth and for the dependence across depths (a one-dimensional spatial process)
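24 A Multi-objective Pedotransfer Function
Letting
$$Y = \log\,[\,LL_{11} \cdots LL_{1d_1} \; LL_{21} \cdots LL_{Nd_N} \;\; \Delta_{11} \cdots \Delta_{1d_1} \; \Delta_{21} \cdots \Delta_{Nd_N}\,]'$$
then $Y$ is multivariate normal with
$$E[Y] = T\beta, \qquad \mathrm{Var}[Y] = \Sigma_h + \Sigma_\epsilon$$
$$\Sigma_h = \begin{bmatrix} \rho_1 & 0 \\ 0 & \rho_2 \end{bmatrix} \otimes K, \qquad \Sigma_\epsilon = S \otimes R$$
with
- $K_{ij} = k(X_i, X_j)$
- $S$ is the covariance of (LL, $\Delta$) at a fixed depth
- $R$ is the (spatial) covariance across depths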
25 Covariance Structures
The covariance function for $h$ is from the Matern family:
$$C(d) = \sigma^2\,\frac{2\,(\theta d/2)^{\nu}\,K_\nu(\theta d)}{\Gamma(\nu)}$$
where $\sigma^2$ is a scale parameter, $\theta$ represents the range, and $\nu$ controls the smoothness.
- $\sigma^2 = 1$ (the $\rho$ controls the variances), $\nu = 1$, and $\theta$ is taken to be approximately the range of the data
- These choices represent a covariance structure that is consistent with the thin-plate spline estimator (large range, more smoothness)
- Analogous to fixing the kernel and estimating the bandwidth with kernel estimators
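The Matern form on this slide can be evaluated with base R's `besselK()`. The function below is a minimal sketch in the slide's parameterization; the name `matern_cov` and the test distances are assumptions for illustration.

```r
# Matern covariance in the slide's parameterization:
# C(d) = sigma2 * 2 * (theta*d/2)^nu * K_nu(theta*d) / Gamma(nu)
matern_cov <- function(d, sigma2 = 1, theta = 1, nu = 1) {
  x   <- theta * d
  out <- sigma2 * 2 * (x / 2)^nu * besselK(x, nu) / gamma(nu)
  out[d == 0] <- sigma2          # limiting value as d -> 0
  out
}

d <- c(0, 0.01, 0.5, 1, 2, 5)
C_nu1 <- matern_cov(d, nu = 1)   # the smoothness used in the talk

# With nu = 1/2 this form reduces to the exponential covariance exp(-theta*d)
C_half <- matern_cov(d, nu = 0.5)
```

The explicit assignment at `d == 0` is needed because the Bessel function is infinite at zero, so the product would otherwise be `NaN` there.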
26 Covariance Structures
The covariance function across depths is exponential:
$$C(d) = \sigma^2 \exp(-d/\theta)$$
where again $\sigma^2$ is a scale parameter and $\theta$ represents the range.
- We fix $\sigma^2 = 1$ (the matrix $S$ controls the variances) and estimate $\theta$ from the data
- The multiple realizations of the soils allow for improved ability to estimate both scale and range parameters
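As a small illustration of how the across-depth covariance enters the error structure $\Sigma_\epsilon = S \otimes R$, the sketch below builds $R$ for one soil. The depth values are the set quoted later in the talk; the range `theta` and the entries of `S` are made-up values, since the talk estimates them from data.

```r
depths <- c(5, 15, 30, 45, 60, 90, 120, 150)   # depth set D quoted later in the talk
theta  <- 40                                   # hypothetical range parameter

# Exponential covariance across depths with sigma^2 = 1
R <- exp(-abs(outer(depths, depths, "-")) / theta)

# Hypothetical covariance of (LL, Delta) at a fixed depth
S <- matrix(c(0.10, 0.02, 0.02, 0.05), 2, 2)

# Error covariance for one soil: all LL depths first, then all Delta depths
Sigma_eps <- kronecker(S, R)
```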
27 Covariance Structures
$$\Sigma_h = \begin{bmatrix} \rho_1 & 0 \\ 0 & \rho_2 \end{bmatrix} \otimes K, \qquad \Sigma_\epsilon = S \otimes R$$
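28 Spatial Smoothing
Write
$$\Sigma_h + \Sigma_\epsilon = \begin{bmatrix} \rho_1 & 0 \\ 0 & \rho_2 \end{bmatrix} \otimes K + \begin{bmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{bmatrix} \otimes R
= s_{11}\left( \begin{bmatrix} \eta_1 & 0 \\ 0 & \eta_2 \end{bmatrix} \otimes K + \begin{bmatrix} 1 & v_{12} \\ v_{12} & v_{22} \end{bmatrix} \otimes R \right) = s_{11}\,\Omega$$
so that $\eta_i = \rho_i / s_{11}$ and $v_{ij} = s_{ij} / s_{11}$.
- The amount of smoothing is due to the relative contributions of the variance components, i.e. $\eta_1$ and $\eta_2$
- Different degrees of smoothing are allowed for LL and $\Delta$
- This construction also allows for different degrees of variation in the error terms for the LL and $\Delta$ variables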
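29 The Estimator
The model suggests an estimator of the form
$$\hat Y_0 = T_0\hat\beta + K_0\hat\delta, \qquad \text{where } K_0 = \begin{bmatrix} \eta_1 & 0 \\ 0 & \eta_2 \end{bmatrix} \otimes K$$
To fit the model, we must estimate:
- $\eta_1$, $\eta_2$, and $s_{11}$
- $\beta$, $\delta$
- $R$ and the other entries of $S$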
30 REML
Take the QR decomposition of $T$:
$$T = [\,Q_1 \;\; Q_2\,] \begin{bmatrix} R \\ 0 \end{bmatrix}$$
- Then $Q_2'Y$ has zero mean and covariance matrix $Q_2'(\Sigma_h + \Sigma_\epsilon)Q_2$
- Maximize (numerically) the likelihood based on $Q_2'Y$, which is only a function of the covariance parameters
- Estimates of $\beta$ and $\delta$ follow directly:
$$\hat\beta = (T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1}Y, \qquad \hat\delta = \hat\Omega^{-1}(Y - T\hat\beta)$$
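The QR construction behind REML can be sketched with base R's `qr()`. The example below uses a simulated one-dimensional problem with an assumed exponential-plus-nugget covariance and a grid search over a single range parameter; the data, the covariance form, the fixed values of `rho` and `sigma2`, and all names are assumptions, not the model fitted in the talk.

```r
set.seed(4)
n <- 40
Tmat <- cbind(1, rnorm(n), rnorm(n))            # hypothetical n x 3 fixed-effects matrix T
d    <- as.matrix(dist(seq_len(n)))             # distances on a 1-d grid
Y    <- c(Tmat %*% c(2, -1, 0.5) +
          t(chol(exp(-d / 5) + 0.5 * diag(n))) %*% rnorm(n))

Q  <- qr.Q(qr(Tmat), complete = TRUE)           # full n x n orthogonal factor [Q1 Q2]
Q2 <- Q[, (ncol(Tmat) + 1):n]                   # columns orthogonal to the columns of T
W  <- crossprod(Q2, Y)                          # Q2'Y: zero mean, free of beta

# Negative restricted log-likelihood (up to a constant) as a function of theta,
# with V = rho * exp(-d/theta) + sigma2 * I and rho, sigma2 held fixed
neg_reml <- function(theta, rho = 1, sigma2 = 0.5) {
  V  <- rho * exp(-d / theta) + sigma2 * diag(n)
  Vq <- crossprod(Q2, V %*% Q2)                 # Q2' V Q2
  R  <- chol(Vq)                                # upper triangular, R'R = Vq
  0.5 * (2 * sum(log(diag(R))) + sum(backsolve(R, W, transpose = TRUE)^2))
}

theta_grid <- seq(1, 20, by = 0.5)
theta_hat  <- theta_grid[which.min(sapply(theta_grid, neg_reml))]
```

Because $Q_2'T = 0$, the criterion involves no regression coefficients, which is why only covariance parameters appear in `neg_reml`.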
31 An Iterative Approach
0. Initialize: compute $K$ and set $S = I$ and $R = I$
1. Estimate $\eta_1$ and $\eta_2$ (and $s_{11}$) via a simplified type of REML (grid search)
2. Then compute
$$\hat\beta = (T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1}Y, \qquad \hat\delta = \hat\Omega^{-1}(Y - T\hat\beta)$$
3. Compute residuals and
   a. Update $S$ ($R$ fixed): closed form solution
   b. Update $R$ ($S$ fixed): grid search for $\theta$
4. Repeat items 1-3 until convergence
32 An Iterative Approach
Let $Y = \mu + h + \epsilon$, where $h$ and $\epsilon$ are independent Gaussian random variables; the conditional distribution of $Y - \mu - h$ given $h$ is a zero-mean Gaussian with covariance matrix $\Sigma_\epsilon$.
Thus, up to an additive constant, the log-likelihood associated with the residuals is given by
$$-\frac{n}{2}\log|S| - \log|R| - \frac{1}{2}\,\mathrm{vec}(U)'\,(S^{-1} \otimes R^{-1})\,\mathrm{vec}(U)$$
The quadratic form can be written as
$$\mathrm{tr}\Big(S^{-1} \sum_i \sum_j r^{ij}\, u_j u_i'\Big)$$
where $r^{ij}$ is the $ij$th element of $R^{-1}$ and $u_i$ is the bivariate, unstacked residual for the $i$th observation.
33 An Iterative Approach
An update for $S$ can be written as
$$\hat S = \frac{1}{n}\sum_i \sum_j r^{ij}\, u_j u_i' = \frac{1}{n}\, U' R^{-1} U$$
where $U$ is the $n \times 2$ matrix of unstacked residuals.
Again, a simple grid search for $\theta$ is used to obtain a new value for $R$.
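The closed-form update for $S$ and the trace form of the quadratic form can both be checked numerically. The sketch below uses simulated residuals and an assumed exponential $R$; none of the numbers come from the talk.

```r
set.seed(5)
n <- 6
depths <- c(5, 15, 30, 45, 60, 90)
R    <- exp(-abs(outer(depths, depths, "-")) / 40)   # hypothetical across-depth correlation
Rinv <- solve(R)
U    <- matrix(rnorm(n * 2), n, 2)                   # n x 2 unstacked residuals (simulated)

# Closed-form update: S_hat = U' R^{-1} U / n
S_hat <- crossprod(U, Rinv %*% U) / n

# Same quantity written as the double sum over r^{ij} u_j u_i'
S_sum <- matrix(0, 2, 2)
for (i in 1:n) for (j in 1:n)
  S_sum <- S_sum + Rinv[i, j] * tcrossprod(U[j, ], U[i, ])
S_sum <- S_sum / n

# Quadratic form identity:
# vec(U)' (S^{-1} (x) R^{-1}) vec(U) = tr(S^{-1} U' R^{-1} U)
S  <- matrix(c(1, 0.3, 0.3, 2), 2, 2)
qf <- c(crossprod(c(U), kronecker(solve(S), Rinv) %*% c(U)))
tr <- sum(diag(solve(S) %*% crossprod(U, Rinv %*% U)))
```

The matrix form avoids the $n^2$ loop, which matters once the number of observations grows.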
34 Parameter Estimates
[Table: estimates of $\eta_1$, $\eta_2$, $S_{11}$, $S_{22}$, $S_{12}$, and $\theta$ for the REML and iterative approaches]
35 Soil Composition and LL
[Figure: axes X1, X2, Y]
36 Soil Composition and $\Delta$
[Figure: axes X1, X2, Y]
37 Soil Composition and LL/$\Delta$
[Figure: two panels, each with axes SAND, CLAY, SILT]
38 Organic Carbon and LL
[Figure: axes orgc, partial residuals]
39 Depth and $\Delta$
[Figure: axes depth, partial residuals]
40 Residuals (Within Depth)
[Figure: axes LL residuals, Delta residuals]
41 Spatial Covariance Across Depth
[Figure: axes distance, covariance]
42 Prediction Error
The real benefit of considering our estimator as a spatial process is in the interpretation with respect to the uncertainty of the estimator:
- The thin-plate spline is a biased estimator with uncorrelated error; it is not easy to quantify the bias (interpolation error and smoothing error)
- The spatial process estimator is unbiased, but with correlated error; the error structure is more complicated but conceptually straightforward to work with
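43 Prediction Error
The estimator can be written as
$$\hat Y_0 = T_0\hat\beta + K_0\hat\delta = A_0 Y$$
where
$$A_0 = T_0(T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1} + K_0\left(\hat\Omega^{-1} - \hat\Omega^{-1}T(T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1}\right)$$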
44 Prediction Error
Hence,
$$\mathrm{Var}(Y_0 - \hat Y_0) = \mathrm{Var}(Y_0 - A_0 Y) = \mathrm{Var}(Y_0) + A_0\,\mathrm{Var}(Y)\,A_0' - 2A_0\,\mathrm{Cov}(Y, Y_0)$$
- $\mathrm{Var}(Y_0)$ and $\mathrm{Var}(Y)$ are computed by plugging in parameter estimates for $\Sigma_h$ and $\Sigma_\epsilon$
- The covariance between $Y_0$ and $Y$ comes from $h$ and is based on the distance between the transformed composition data
45 Generation of Soil Profiles
- Simulations of $\log LL$ and $\log \Delta$ were generated from a multivariate normal with mean $A_0 Y$ and variance given by the prediction error
- We use an average soil composition profile computed from the data and assumed to be constant across all depths, $D = \{5, 15, 30, 45, 60, 90, 120, 150\}$
- Each simulation represents a draw from the posterior distribution of soil water profiles based on the estimated mean and covariance structure and the uncertainty gleaned from the data
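Drawing profiles from a multivariate normal can be sketched with a Cholesky factor. In the example below the mean and covariance are placeholders standing in for $A_0 Y$ and the prediction error variance; their numerical values, like all names, are assumptions for illustration.

```r
set.seed(6)
depths <- c(5, 15, 30, 45, 60, 90, 120, 150)
k <- length(depths)

# Placeholder predictive mean and covariance for (log LL, log Delta) at the 8 depths
mu    <- c(rep(log(0.15), k), rep(log(0.12), k))
Sigma <- kronecker(matrix(c(0.04, 0.01, 0.01, 0.03), 2, 2),
                   exp(-abs(outer(depths, depths, "-")) / 40))

# Draw 100 profiles as mu + L z, where L L' = Sigma and z is standard normal
L     <- t(chol(Sigma))
draws <- mu + L %*% matrix(rnorm(2 * k * 100), 2 * k, 100)   # 16 x 100

LL  <- exp(draws[1:k, ])                     # lower limit by depth
DUL <- LL + exp(draws[(k + 1):(2 * k), ])    # DUL = LL + Delta
```

Working on the log scale guarantees that every simulated profile satisfies DUL > LL > 0.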
46 Generation of Soil Profiles
[Figure: simulated soil water profiles for SIL and S; axes: depth, proportion water]
47 Application: Crop Models
- Two soils (SIL, S)
- Given the soil composition, organic carbon, depth, etc., 100 soil profiles (LL, DUL) were generated
- Twenty years of weather (solar radiation, temperature min/max, and precipitation)
- Yield output generated from the CERES-Maize crop model
48 Crop Yields
[Figure: crop yields for SIL (red) and S (blue), with total annual precipitation (solid line)]
49 Crop Yields
[Figure: crop yields for SIL (red) and S (blue), with average annual temperature (solid line)]
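50 The multikrig Class
Create a multivariate spatial regression within the univariate Krig framework:
$$Y = T\beta + h + \epsilon$$
$$Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_p \end{bmatrix}, \qquad T = I_p \otimes X, \qquad h = \begin{bmatrix} h_1 \\ \vdots \\ h_p \end{bmatrix}, \qquad \epsilon = \begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_p \end{bmatrix}$$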
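51 The multikrig Class
Create a multivariate spatial regression within the univariate Krig framework:
$$Y = T\beta + h + \epsilon, \qquad E[Y] = T\beta$$
$$\mathrm{Var}[Y] = \Sigma_h + \Sigma_\epsilon = \begin{bmatrix} \rho_1 & & \\ & \ddots & \\ & & \rho_p \end{bmatrix} \otimes V(\theta) + \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{21} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{p1} & s_{p2} & \cdots & s_{pp} \end{bmatrix} \otimes I_n$$
$$= \rho_1\, R \otimes V(\theta) + s_{11}\, S \otimes I_n = \rho_1\left( R \otimes V(\theta) + \lambda\, S \otimes I_n \right)$$
Here $R = \mathrm{diag}(1, \rho_2/\rho_1, \ldots, \rho_p/\rho_1)$, $S$ denotes the error covariance matrix rescaled by $s_{11}$, and $\lambda = s_{11}/\rho_1$.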
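52 The multikrig Class
Create a multivariate spatial regression within the univariate Krig framework:
$$Y = T\beta + h + \epsilon, \qquad E[Y] = T\beta$$
$$\mathrm{Var}[Y] = \Sigma_h + \Sigma_\epsilon = \rho_1\left( R \otimes V(\theta) + \lambda\, S \otimes I_n \right)$$
Given $S$, $R$, and $\theta$, use Krig to estimate $\beta$, $\rho_1$, and $\lambda$.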
53 The multikrig Class
Issues:
- Specifying x, Y, and Z
- Mean function (null.function)
- Covariance function (cov.function)
- Error function (wght.function)
- Estimation ($S$, $R$, and $\theta$)
54 Krig Function
    Krig <- function(x, Y, Z,
                     null.function = "Krig.null.function",
                     cov.function = "stationary.cov",
                     wght.function = NULL,
                     null.args = NULL, cov.args = NULL, wght.args = NULL)
- x is an $n \times q$ matrix of spatial locations
- Y is an $n$-vector of observations
- Z is an $n \times q$ matrix of additional covariates
55 multikrig Function
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL)
- s is an $n \times q$ matrix of spatial locations
- Y is an $n \times p$ matrix of observations
- Z is either an $n \times q$ matrix of additional covariates, or a list of $n \times q_i$ matrices of additional covariates
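56 multikrig Function
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL){
      d <- ncol(Y)                             # number of response variables
      n <- nrow(Y)                             # number of locations
      Y <- c(Y)                                # stack the columns of Y
      x <- expand.grid(1:n, 1:d)               # (location index, variable index)
      nZ <- kronecker(diag(d), cbind(s, Z))    # block-diagonal covariates
      obj <- Krig(x = x, Y = c(Y), Z = nZ, method = "REML",
                  null.function = "multinull",
                  cov.function = cov.function, cov.args = cov.args,
                  wght.function = wght.function, wght.args = wght.args)
      obj                                      # return the fitted Krig object
    }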
57 Y <- c(Y)
Inside multikrig, this line stacks the columns of the $n \times p$ response matrix into a single vector:
$$Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_p \end{bmatrix}$$
58 x <- expand.grid(1:n, 1:d)
Inside multikrig, this line builds the index matrix whose first column is the location index and whose second column is the variable index:
$$x = \begin{bmatrix} 1 & 1 \\ \vdots & \vdots \\ n & 1 \\ \vdots & \vdots \\ 1 & p \\ \vdots & \vdots \\ n & p \end{bmatrix}$$
59 nZ <- kronecker(diag(d), cbind(s, Z))
Inside multikrig, this line builds the block-diagonal covariate matrix with one $[\,s \;\; Z\,]$ block per variable:
$$Z = \begin{bmatrix} [\,s \;\; Z\,] & & \\ & \ddots & \\ & & [\,s \;\; Z\,] \end{bmatrix}$$
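The three data-preparation lines highlighted on these slides can be run on a toy example with base R alone (no call to Krig is made). The locations, covariate, and responses below are invented for illustration.

```r
n <- 3; d <- 2
s <- cbind(lon = c(0, 1, 2), lat = c(0, 0, 1))      # hypothetical locations
Z <- cbind(elev = c(10, 20, 30))                    # hypothetical covariate
Y <- cbind(temp = c(1, 2, 3), precip = c(4, 5, 6))  # hypothetical n x d responses

Yvec <- c(Y)                                  # stacks columns: variable 1, then variable 2
x    <- expand.grid(1:n, 1:d)                 # col 1: location index, col 2: variable index
nZ   <- kronecker(diag(d), cbind(s, Z))       # block diagonal with [s Z] blocks
```

Row `i` of `x` tells the covariance and weight functions which location and which variable the `i`th element of `Yvec` belongs to.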
60 The mean function passed to Krig as null.function = "multinull":
    multinull <- function(x, Z = NULL, drop.Z = FALSE){
      # the second column of x is the variable index; treat it as a factor
      data <- data.frame(a = as.factor(x[, 2]))
      # intercept plus treatment-contrast indicators for variables 2, ..., p
      X <- model.matrix(~a, data = data, contrasts = list(a = "contr.treatment"))
      return(cbind(X, Z))
    }
The resulting matrix $T$ combines the intercept and indicator columns for the variable index with the block-diagonal matrix of $[\,s \;\; Z\,]$ blocks.
61 Covariance Function
- Issue: x is now a matrix of indices
- Solution: pass the spatial locations as an argument to the covariance function
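62 multikrig adds the spatial locations to cov.args before calling Krig:
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL){
      cov.args$s <- s
      obj <- Krig(x = x, Y = c(Y), Z = nZ, method = "REML",
                  null.function = "multinull",
                  cov.function = cov.function, cov.args = cov.args,
                  wght.function = wght.function, wght.args = wght.args)
    }

    multicov <- function(x1, x2, marginal = FALSE, C = NA, s, theta, rho, smoothness){
      ind <- unique(x1[, 2])
      # covariance block for the first variable
      temp <- stationary.cov(s[x1[, 1][x1[, 2] == 1], ],
                             s[x2[, 1][x2[, 2] == 1], ],
                             Covariance = "Matern", theta = theta, smoothness = smoothness)
      if (length(ind) > 1){
        for (i in 2:length(ind)){
          # covariance block for variable i, scaled by rho[i-1]
          temp2 <- rho[i-1] * stationary.cov(s[x1[, 1][x1[, 2] == i], ],
                                             s[x2[, 1][x2[, 2] == i], ],
                                             Covariance = "Matern", theta = theta, smoothness = smoothness)
          d1 <- dim(temp)
          d2 <- dim(temp2)
          # append temp2 as a new diagonal block; the lower-left zero block is
          # d2[1] x d1[2] so that the binding also works for more than two variables
          temp <- rbind(cbind(temp, matrix(0, d1[1], d2[2])),
                        cbind(matrix(0, d2[1], d1[2]), temp2))
        }
      }
      return(temp)
    }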
63 Weight Function
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL){
      obj <- Krig(x = x, Y = c(Y), Z = nZ, method = "REML",
                  null.function = "multinull",
                  cov.function = cov.function, cov.args = cov.args,
                  wght.function = wght.function, wght.args = wght.args)
    }

    multiwght <- function(x, S){
      n <- length(unique(x[, 1]))             # number of locations
      return(kronecker(solve(S), diag(n)))    # inverse of S (x) I_n
    }
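The weight matrix returned by multiwght is the inverse of the error covariance $S \otimes I_n$, which follows from the Kronecker inverse property. A short base R check, with a made-up $S$ and $n$:

```r
S <- matrix(c(1, 0.3, 0.3, 2), 2, 2)   # hypothetical 2 x 2 error covariance
n <- 4                                 # hypothetical number of locations

W   <- kronecker(solve(S), diag(n))    # weight matrix as built in multiwght
err <- max(abs(W %*% kronecker(S, diag(n)) - diag(2 * n)))
```

Since only the small matrix `S` is inverted, the cost does not grow with the number of locations.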
64 Estimation
Krig will estimate $\beta$, $\rho_1$, and $\lambda$ by:
- REML
- GCV (not quite there)
How to estimate $S$, $R$, and $\theta$?
65 Thanks!
www.image.ucar.edu/~ssain
Sain, S.R., Mearns, L., Jagtap, S., and Nychka, D. (2006), A multivariate spatial model for soil water profiles, Journal of Agricultural, Biological, and Environmental Statistics, 11.
More informationFunctional SVD for Big Data
Functional SVD for Big Data Pan Chao April 23, 2014 Pan Chao Functional SVD for Big Data April 23, 2014 1 / 24 Outline 1 One-Way Functional SVD a) Interpretation b) Robustness c) CV/GCV 2 Two-Way Problem
More informationModeling Real Estate Data using Quantile Regression
Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices
More information1 Mixed effect models and longitudinal data analysis
1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between
More informationGaussian Processes 1. Schedule
1 Schedule 17 Jan: Gaussian processes (Jo Eidsvik) 24 Jan: Hands-on project on Gaussian processes (Team effort, work in groups) 31 Jan: Latent Gaussian models and INLA (Jo Eidsvik) 7 Feb: Hands-on project
More informationLinear Algebra Review
Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and
More informationFitting Large-Scale Spatial Models with Applications to Microarray Data Analysis
Fitting Large-Scale Spatial Models with Applications to Microarray Data Analysis Stephan R Sain Department of Mathematics University of Colorado at Denver Denver, Colorado ssain@mathcudenveredu Reinhard
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationRegression. ECO 312 Fall 2013 Chris Sims. January 12, 2014
ECO 312 Fall 2013 Chris Sims Regression January 12, 2014 c 2014 by Christopher A. Sims. This document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License What
More informationX t = a t + r t, (7.1)
Chapter 7 State Space Models 71 Introduction State Space models, developed over the past 10 20 years, are alternative models for time series They include both the ARIMA models of Chapters 3 6 and the Classical
More informationClimate Change: the Uncertainty of Certainty
Climate Change: the Uncertainty of Certainty Reinhard Furrer, UZH JSS, Geneva Oct. 30, 2009 Collaboration with: Stephan Sain - NCAR Reto Knutti - ETHZ Claudia Tebaldi - Climate Central Ryan Ford, Doug
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationUncertainty and regional climate experiments
Uncertainty and regional climate experiments Stephan R. Sain Geophysical Statistics Project Institute for Mathematics Applied to Geosciences National Center for Atmospheric Research Boulder, CO Linda Mearns,
More informationWeather generators for studying climate change
Weather generators for studying climate change Assessing climate impacts Generating Weather (WGEN) Conditional models for precip Douglas Nychka, Sarah Streett Geophysical Statistics Project, National Center
More informationMachine Learning - MT & 5. Basis Expansion, Regularization, Validation
Machine Learning - MT 2016 4 & 5. Basis Expansion, Regularization, Validation Varun Kanade University of Oxford October 19 & 24, 2016 Outline Basis function expansion to capture non-linear relationships
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationQuestions and Answers on Heteroskedasticity, Autocorrelation and Generalized Least Squares
Questions and Answers on Heteroskedasticity, Autocorrelation and Generalized Least Squares L Magee Fall, 2008 1 Consider a regression model y = Xβ +ɛ, where it is assumed that E(ɛ X) = 0 and E(ɛɛ X) =
More informationSpatio-temporal prediction of site index based on forest inventories and climate change scenarios
Forest Research Institute Spatio-temporal prediction of site index based on forest inventories and climate change scenarios Arne Nothdurft 1, Thilo Wolf 1, Andre Ringeler 2, Jürgen Böhner 2, Joachim Saborowski
More informationStatistics 135 Fall 2008 Final Exam
Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations
More informationMassachusetts Institute of Technology Department of Economics Time Series Lecture 6: Additional Results for VAR s
Massachusetts Institute of Technology Department of Economics Time Series 14.384 Guido Kuersteiner Lecture 6: Additional Results for VAR s 6.1. Confidence Intervals for Impulse Response Functions There
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationGaussian predictive process models for large spatial data sets.
Gaussian predictive process models for large spatial data sets. Sudipto Banerjee, Alan E. Gelfand, Andrew O. Finley, and Huiyan Sang Presenters: Halley Brantley and Chris Krut September 28, 2015 Overview
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 51 Outline 1 Matrix Expression 2 Linear and quadratic forms 3 Properties of quadratic form 4 Properties of estimates 5 Distributional properties 3 / 51 Matrix
More informationThe Multivariate Gaussian Distribution
The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance
More informationModelling Non-linear and Non-stationary Time Series
Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September
More informationAsymptotic standard errors of MLE
Asymptotic standard errors of MLE Suppose, in the previous example of Carbon and Nitrogen in soil data, that we get the parameter estimates For maximum likelihood estimation, we can use Hessian matrix
More informationThis model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that
Linear Regression For (X, Y ) a pair of random variables with values in R p R we assume that E(Y X) = β 0 + with β R p+1. p X j β j = (1, X T )β j=1 This model of the conditional expectation is linear
More informationStatistics for analyzing and modeling precipitation isotope ratios in IsoMAP
Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP The IsoMAP uses the multiple linear regression and geostatistical methods to analyze isotope data Suppose the response variable
More informationMath 5305 Notes. Diagnostics and Remedial Measures. Jesse Crawford. Department of Mathematics Tarleton State University
Math 5305 Notes Diagnostics and Remedial Measures Jesse Crawford Department of Mathematics Tarleton State University (Tarleton State University) Diagnostics and Remedial Measures 1 / 44 Model Assumptions
More informationGauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA
JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter
More informationMultilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2
Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do
More informationPh.D. Qualifying Exam Monday Tuesday, January 4 5, 2016
Ph.D. Qualifying Exam Monday Tuesday, January 4 5, 2016 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Find the maximum likelihood estimate of θ where θ is a parameter
More informationForecasting and data assimilation
Supported by the National Science Foundation DMS Forecasting and data assimilation Outline Numerical models Kalman Filter Ensembles Douglas Nychka, Thomas Bengtsson, Chris Snyder Geophysical Statistics
More informationBayesian data analysis in practice: Three simple examples
Bayesian data analysis in practice: Three simple examples Martin P. Tingley Introduction These notes cover three examples I presented at Climatea on 5 October 0. Matlab code is available by request to
More informationGLS and FGLS. Econ 671. Purdue University. Justin L. Tobias (Purdue) GLS and FGLS 1 / 22
GLS and FGLS Econ 671 Purdue University Justin L. Tobias (Purdue) GLS and FGLS 1 / 22 In this lecture we continue to discuss properties associated with the GLS estimator. In addition we discuss the practical
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More information