A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging
1 A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging Cheng Li DSAP, National University of Singapore Joint work with Rajarshi Guhaniyogi (UC Santa Cruz), Terrance D. Savitsky (US Bureau of Labor Statistics), and Sanvesh Srivastava (U Iowa)
2 Outline Background of GP D&C Bayes Method Theory Experiments Future Work
3 Background: Gaussian Process A Gaussian process (GP) is a commonly used tool for modeling functions (surfaces). Let s_1, …, s_n be n arbitrary distinct locations in a spatial domain D. f(·) is the mean function (which we want to model). f_i = f(s_i) (i = 1, …, n) are the functional values at the locations s_i. A Gaussian process assumes that f = (f_1, …, f_n)ᵀ jointly follows N(0, C), with (C)_ij = C(s_i, s_j), 1 ≤ i, j ≤ n. C(·,·) is the covariance kernel function. Examples of covariance kernel functions: the exponential kernel C(s, s′) = τ² e^{−φ‖s−s′‖}, the squared exponential kernel C(s, s′) = τ² e^{−φ‖s−s′‖²}, and the Matérn kernel C(s, s′) = τ² 2^{1−ν}/Γ(ν) (φ‖s−s′‖)^ν K_ν(φ‖s−s′‖).
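The three kernels above can be evaluated numerically; a minimal sketch (not part of the talk) in Python, using scipy for the modified Bessel function K_ν:

```python
import numpy as np
from scipy.special import gamma, kv

def exponential_kernel(d, tau2=1.0, phi=1.0):
    # C(s, s') = tau^2 * exp(-phi * ||s - s'||), with d = ||s - s'||
    return tau2 * np.exp(-phi * np.asarray(d, dtype=float))

def squared_exponential_kernel(d, tau2=1.0, phi=1.0):
    # C(s, s') = tau^2 * exp(-phi * ||s - s'||^2)
    return tau2 * np.exp(-phi * np.asarray(d, dtype=float) ** 2)

def matern_kernel(d, tau2=1.0, phi=1.0, nu=0.5):
    # C(s, s') = tau^2 * 2^(1-nu) / Gamma(nu) * (phi d)^nu * K_nu(phi d)
    d = np.asarray(d, dtype=float)
    out = np.full(d.shape, tau2)          # limit as d -> 0 is tau^2
    pos = d > 0
    x = phi * d[pos]
    out[pos] = tau2 * (2.0 ** (1.0 - nu) / gamma(nu)) * x ** nu * kv(nu, x)
    return out
```

At ν = 1/2 the Matérn kernel reduces to the exponential kernel, which gives a quick sanity check of the implementation.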
4 Fitting Gaussian Process Regression Consider fitting a function f with noise: y(s) = f(s) + ε(s). Observed locations S = (s_1, …, s_n); y = (y(s_1), …, y(s_n))ᵀ; f = (f(s_1), …, f(s_n))ᵀ. We impose a Gaussian process (GP) prior on f: f ~ N(0, C(S, S)), with (C(S, S))_ij = C(s_i, s_j), 1 ≤ i, j ≤ n. ε(s) is a white noise process independent of f, with Var(ε(s)) = σ². Let f* = f(s*) be the functional value at a testing location s*. Then the joint distribution of (y, f*) is
(y, f*)ᵀ ~ N(0, Σ), Σ = [[C(S, S) + σ²I_n, C(S, s*)], [C(s*, S), C(s*, s*)]].
The GP posterior of f* given the observed data y is Gaussian, with
E[f* | y] = C(s*, S)[C(S, S) + σ²I_n]⁻¹ y,
Var[f* | y] = C(s*, s*) − C(s*, S)[C(S, S) + σ²I_n]⁻¹ C(S, s*).
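The posterior mean and variance formulas above can be sketched numerically; a minimal illustration (assumptions: an exponential kernel and a Cholesky solve; this is our sketch, not the talk's code):

```python
import numpy as np

def exp_cov(A, B, tau2=1.0, phi=1.0):
    # Pairwise C(s, s') = tau^2 * exp(-phi * ||s - s'||)
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return tau2 * np.exp(-phi * D)

def gp_posterior(S, y, s_star, sigma2, cov=exp_cov):
    """E[f* | y] and Var[f* | y] under a zero-mean GP prior with noise variance sigma2."""
    K = cov(S, S) + sigma2 * np.eye(len(S))   # C(S, S) + sigma^2 I_n
    L = np.linalg.cholesky(K)                 # the O(n^3) bottleneck
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    k_star = cov(S, s_star[None, :])[:, 0]    # C(S, s*)
    mean = k_star @ alpha
    v = np.linalg.solve(L, k_star)
    var = cov(s_star[None, :], s_star[None, :])[0, 0] - v @ v
    return mean, var
```

With a tiny noise variance the posterior mean nearly interpolates the training data, which is a convenient check of the algebra.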
5 Long-standing Challenge in Fitting GP Inverting the matrix C(S, S) + σ²I_n requires O(n³) computation and O(n²) storage. For Bayesian methods, learning the parameters in the model (such as the parameters in the kernel C(·,·) and σ²) often requires inverting the matrix C(S, S) + σ²I_n repeatedly. For commonly used kernels, such as the squared exponential and Matérn kernels, the matrix C(S, S) tends to become nearly singular when n grows large, say n ≥ 10⁴. This is one of the main motivations driving research on scalable GP methods.
6 Existing Methods to Speed up GP The general idea is to simplify the structure of C(·,·). Low-rank methods: fit the spatial surface f using r chosen basis functions or knots. Fixed-rank kriging (Cressie & Johannesson 2008 JRSSB), predictive process (Banerjee et al. 2008 JRSSB; Finley et al. 2009 CSDA). Compactly supported covariance kernels: covariance tapering (Kaufman, Schervish & Nychka 2008 JASA). Sparsity on the inverse covariance matrix: composite likelihood (Stein, Chi & Welty 2004 JRSSB), nearest-neighbor GP (Datta et al. 2016 JASA; Datta et al. 2016 AOAS). Block-structured covariance: fitting GPs on sub-regions (Nychka et al. 2015 JCGS, Katzfuss 2017 JASA, Guhaniyogi & Banerjee 2018 JCGS). Related methods in machine learning: subset of regressors (Quiñonero-Candela & Rasmussen 2005 JMLR); inducing point methods (Wilson & Nickisch 2015 ICML), etc.
7 Problems with Existing Methods Low-rank methods are popular for large spatial datasets. In this work, we compare mainly with the modified predictive process (MPP) (Finley et al. 2009 CSDA). MPP reduces the computational complexity of GP from O(n³) to O(nr²), where r is the number of knots (chosen basis functions). However, a very large n also requires a much larger r (though still ≪ n) in order to achieve the same level of accuracy. For example, if n is about 1 million, then the number of knots needs to be a few thousand. The nearest-neighbor GP (NNGP, Datta et al. 2016 JASA; Datta et al. 2016 AOAS) works well for rough surfaces but not as well for smooth surfaces.
8 Goals We need fast and scalable computational methods to fit GPs to spatial datasets with 10⁶ observations in a reasonable amount of time (e.g., in about 1 day). We develop under the Bayesian framework, with posterior samplers that are practically easy to implement. We need theoretical guarantees for the validity of uncertainty quantification.
9 Outline Background of GP D&C Bayes Method Theory Experiments Future Work
10 Divide-and-Conquer Approach for Big Data General steps: Divide the massive dataset of size n into k non-overlapping subsets; Perform statistical inference (estimation, posterior sampling, etc.) on each of the k subsets; Combine the k subset results into a global result (averaged estimator, posterior distribution, etc.). Divide-and-conquer is simple and effective: it allows parallel computing and significantly reduces computational costs locally. Each machine only handles data of size m = n/k.
11 Divide and Conquer Bayes Consensus Monte Carlo (CMC, Scott et al. 2016); Semiparametric density product (SDP, Neiswanger et al. 2014); Wasserstein posterior (WASP, Srivastava et al. 2015), PIE (Li et al. 2017); Double-parallel MCMC (DP-MCMC, Xue & Liang 2017). Basic idea: when the data are all independent,
Posterior ∝ Likelihood × Prior ∝ [∏_{j=1}^k (jth subset likelihood)] × prior
= ∏_{j=1}^k [(jth subset likelihood) × prior^{1/k}]  (CMC, SDP)
= ∏_{j=1}^k [(jth subset likelihood)^k × prior]^{1/k}  (WASP, PIE, DP-MCMC).
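The prior-tempering identity above can be checked on a toy conjugate model. A sketch for a normal-mean model (made-up numbers): CMC-style subset posteriors use the flattened prior N(0, k·τ²) = prior^{1/k}, and combining them by precision weighting recovers the full-data posterior exactly in the Gaussian case.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, tau2 = 1.0, 4.0          # known noise variance; prior theta ~ N(0, tau2)
n, k = 1200, 6
y = rng.normal(0.7, np.sqrt(sigma2), n)
subsets = np.array_split(y, k)

# Full-data conjugate posterior: theta | y ~ N(m_full, v_full)
v_full = 1.0 / (n / sigma2 + 1.0 / tau2)
m_full = v_full * y.sum() / sigma2

# CMC/SDP-style subset posteriors, each with prior^(1/k) = N(0, k * tau2);
# combine by precision weighting
ms, vs = [], []
for yj in subsets:
    vj = 1.0 / (len(yj) / sigma2 + 1.0 / (k * tau2))
    vs.append(vj)
    ms.append(vj * yj.sum() / sigma2)
prec = sum(1.0 / vj for vj in vs)
m_comb = sum(mj / vj for mj, vj in zip(ms, vs)) / prec
v_comb = 1.0 / prec
```

Here the combined precision is Σ_j (m/σ² + 1/(kτ²)) = n/σ² + 1/τ², exactly the full-data posterior precision.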
12 Divide and Conquer Bayes Idea: Run MCMC in parallel on disjoint subset data, and combine the subset posterior distributions into a global posterior.
13 Divide and Conquer Bayes PIE (Li, Srivastava, and Dunson 2017): Average the subset posterior empirical quantiles, leading to the Wasserstein-2 barycenter as an approximation to the full data posterior.
14 PIE Algorithm in 3 Steps For a 1-dimensional parameter ξ ∈ R: Step 1: Run MCMC on the k subsets on k machines in parallel; draw posterior samples from each subset posterior, yielding posterior samples for ξ. Step 2: Find the qth empirical quantiles of ξ for each of the k subset posterior distributions; q ∈ (0, 1) can come from a fine grid on (0, 1). Step 3: Average the k subset quantiles. Output: Averaged quantiles for ξ = a combined distribution for ξ, as an approximation to the true posterior of ξ based on the full data; posterior credible intervals. For independent data, this method (as well as many other similar ones) is asymptotically valid for approximating the true posterior of ξ given the full data.
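Steps 2–3 above amount to averaging quantile functions on a grid; a sketch (the function name `pie_combine` is ours, not from the PIE paper):

```python
import numpy as np

def pie_combine(subset_draws, qs):
    """Average the empirical quantiles of k subset posteriors on a grid qs in (0, 1)."""
    Q = np.stack([np.quantile(draws, qs) for draws in subset_draws])
    return Q.mean(axis=0)   # quantile function of the combined distribution

qs = np.linspace(0.005, 0.995, 199)   # fine grid on (0, 1); q = 0.025 and 0.975 give a 95% interval
```

The averaged quantiles remain monotone in q, so they define a valid quantile function for the combined distribution.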
15 Wasserstein Barycenter The Wasserstein-2 (W₂) distance between two measures µ and ν:
W₂(µ, ν) = inf { √(E‖X − Y‖²) : law(X) = µ, law(Y) = ν }.
For univariate distribution functions F₁ and F₂, their W₂ distance has an equivalent form:
W₂²(F₁, F₂) = ∫₀¹ [F₁⁻¹(q) − F₂⁻¹(q)]² dq,
where F₁⁻¹, F₂⁻¹ are the quantile functions of F₁, F₂. The Wasserstein-2 barycenter of k distributions µ₁, …, µ_k is
µ̄ = argmin_{µ ∈ P₂} (1/k) ∑_{j=1}^k W₂²(µ, µ_j),
where P₂ is the space of all distributions with finite 2nd moment.
16 Outline Background of GP D&C Bayes Method Theory Experiments Future Work
17 A GP-based Spatial Model In this work, we consider a generalized GP regression model with spatial predictors: y(s) = x(s)ᵀβ + w(s) + ε(s). Observed locations S = {s_1, …, s_n}; y = (y(s_1), …, y(s_n))ᵀ; w = (w(s_1), …, w(s_n))ᵀ. x(s) is a p-dimensional predictor vector at location s; X = (x(s_1), …, x(s_n))ᵀ. We impose a Gaussian process (GP) prior on w: w ~ N(0, C_α(S, S)), with (C_α(S, S))_ij = C_α(s_i, s_j), 1 ≤ i, j ≤ n. ε(s) is a white noise process independent of w and x, with Var(ε(s)) = σ². Let s* be a testing location with s* ∉ S, and let w* = w(s*), y* = y(s*), x* = x(s*). The prior for β is N(µ_β, Σ_β). Priors are also imposed on α and σ².
18 Posterior Sampling Algorithm Iterate the following steps:
Sample β given y, X, α, σ² from N(m_β, V_β), where
V_β = (Xᵀ V(α, σ²)⁻¹ X + Σ_β⁻¹)⁻¹,
m_β = V_β (Xᵀ V(α, σ²)⁻¹ y + Σ_β⁻¹ µ_β),
V(α, σ²) = C_α(S, S) + σ² I_n.
Sample (α, σ²) given y, X, β using random walk Metropolis-Hastings.
Then w*, y* can be drawn as w* | y, X, β, α, σ² ~ N(m*, V*), where
m* = C_α(s*, S) V(α, σ²)⁻¹ (y − Xβ),
V* = C_α(s*, s*) − C_α(s*, S) V(α, σ²)⁻¹ C_α(S, s*),
y* | y, X, x*, w*, β, α, σ² ~ N(x*ᵀβ + w*, σ²).
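The β-update above can be sketched as a generic Gibbs draw from N(m_β, V_β) (the name `sample_beta` is ours, and explicit inverses are used for clarity rather than efficiency):

```python
import numpy as np

def sample_beta(rng, X, y, V, mu_b, Sigma_b):
    """One Gibbs draw of beta | y, X, alpha, sigma2 ~ N(m_beta, V_beta)."""
    Vinv = np.linalg.inv(V)                              # V(alpha, sigma2)^{-1}
    Sbinv = np.linalg.inv(Sigma_b)
    V_beta = np.linalg.inv(X.T @ Vinv @ X + Sbinv)
    m_beta = V_beta @ (X.T @ Vinv @ y + Sbinv @ mu_b)
    draw = m_beta + np.linalg.cholesky(V_beta) @ rng.standard_normal(len(m_beta))
    return draw, m_beta, V_beta
```

With a nearly flat prior and V = I_n, m_β collapses to the ordinary least-squares estimate, which is a convenient sanity check.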
19 Distributed Kriging (DISK) What can we do for big n, like n = 10⁶? We propose a D&C Bayesian method called DISK. In short, DISK = (Existing D&C Bayes) + (Spatially Dependent Data). We fit the GP-based spatial model on k different subsets of the data, and then combine the results. Partition scheme: uniform sampling, so that every subset is representative of the full data. For example, if PIE (posterior interval estimation, Li et al. 2017) is used, then for each parameter in (β, α, σ², w*, y*), the marginal posterior is obtained by the following steps:
20 DISK with PIE Partition the locations into k subsets: S = ∪_{j=1}^k S_j, with S_j ∩ S_{j′} = ∅ for j ≠ j′. Let n = km, where n is the total sample size and m is the subset size. Fit the GP regression model for the jth subset (j = 1, …, k),
y(s_ji) = x(s_ji)ᵀβ + w(s_ji) + ε(s_ji), i = 1, …, m,
by using the modified posterior
π_m(β, α, σ² | y_j, X_j) ∝ {l_j(β, α, σ²)}^k π(β, α, σ²),
l_j(β, α, σ²) = N(y_j | X_j β, V_j(α, σ²)), V_j(α, σ²) = C_α(S_j, S_j) + σ² I_m,
where (y_j, X_j) is the jth subset data at locations S_j. Take the qth empirical quantiles of the marginals of π_m(β, α, σ² | y_j, X_j) for j = 1, …, k, and then average them. Perform the same combination for the predictive posteriors of w* and y*.
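The effect of raising the subset likelihood to the power k can be seen in a conjugate toy model (made-up numbers, not the spatial model): each subset posterior then has the same precision as the full-data posterior, so the subset posteriors differ mainly in location, which the quantile averaging then combines.

```python
import numpy as np

sigma2, tau2 = 1.0, 4.0        # known noise variance; prior theta ~ N(0, tau2)
n, k = 1200, 6
m = n // k

# Full-data posterior precision in the conjugate normal-mean model
prec_full = n / sigma2 + 1.0 / tau2

# DISK/WASP-style subset posterior: likelihood raised to the power k with the
# ordinary prior, so a subset of size m behaves like a sample of size k*m = n
prec_subset = k * m / sigma2 + 1.0 / tau2

rng = np.random.default_rng(4)
y = rng.normal(0.7, 1.0, n)
# Each subset posterior mean centers near its own subset sample mean
subset_means = [(k * yj.sum() / sigma2) / prec_subset
                for yj in np.array_split(y, k)]
```

Averaging the subset posterior means recovers the full-data posterior mean exactly in this Gaussian case.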
21 Outline Background of GP D&C Bayes Method Theory Experiments Future Work
22 Theory Our theoretical results mainly focus on the posterior of w*, the value of the function w at a testing location s*. We present upper bounds for the Bayes L₂ risk, and a Bayesian version of the bias-variance tradeoff. We first show the results for the simplified model y(s) = w(s) + ε(s) without the x(s)ᵀβ part, and then generalize to the full model y(s) = x(s)ᵀβ + w(s) + ε(s). To ease the technical difficulty, we assume that the finite-dimensional parameters (α, σ²) are fixed (because their estimation is notoriously difficult...). Furthermore, we assume that the testing location s* is drawn independently from the same distribution that generates the training locations S = {s_1, …, s_n}.
23 Bayes L₂ Risk The Bayes L₂ risk of the DISK posterior is defined as
Bayes L₂ risk = E_S E^S E_{s*} [w(s*) − w₀(s*)]²,
where w₀ is the true surface, E_{s*} is the expectation w.r.t. the sampled testing location, E^S is the DISK posterior expectation given the training locations S, and E_S is the expectation w.r.t. the training locations. This Bayes L₂ risk can be decomposed into 3 parts:
Bayes L₂ risk = bias² + var_mean + var_DISK.
They are the squared bias, the variance of the subset posterior means (var_mean), and the mean of the subset posterior variances (var_DISK). The reason for two variance terms is the law of total variance: var(Y) = var[E(Y | X)] + E[var(Y | X)].
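The law of total variance behind the decomposition can be verified by simulation. A toy example (ours, not from the talk): X ~ N(0, 1) and Y | X ~ N(X, 2), so var(Y) = var[E(Y|X)] + E[var(Y|X)] = 1 + 2 = 3.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, 400_000)     # X ~ N(0, 1)
y = rng.normal(x, np.sqrt(2.0))       # Y | X ~ N(X, 2)

# var(Y) should be close to var[E(Y|X)] + E[var(Y|X)] = 1 + 2 = 3
total_var = y.var()
```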
24 Some Notation Let P_s be the sampling distribution of s_1, …, s_n, s*. Let {φ_i}_{i=1}^∞ be an orthonormal basis of L²(P_s). The kernel has the series expansion C_α(s, s′) = ∑_{i=1}^∞ µ_i φ_i(s) φ_i(s′) w.r.t. P_s for any s, s′ ∈ D, where µ₁ ≥ µ₂ ≥ … ≥ 0 are the eigenvalues of C_α. The trace of the kernel C_α is defined as tr(C_α) = ∑_{i=1}^∞ µ_i. For any f ∈ L²(P_s), f(s) = ∑_{i=1}^∞ θ_i φ_i(s), where θ_i = ⟨f, φ_i⟩_{L²(P_s)}. The reproducing kernel Hilbert space (RKHS) H attached to C_α is the space of all functions f ∈ L²(P_s) such that the H-norm satisfies ‖f‖²_H = ∑_{i=1}^∞ θ_i²/µ_i < ∞.
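The eigenvalues µ_i and tr(C_α) can be approximated empirically by the eigenvalues of a kernel matrix scaled by 1/n (a Nyström-style sketch, ours; for the exponential kernel C(s, s) = τ², so the eigenvalue sum equals τ² exactly):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
S = rng.uniform(size=(n, 2))                      # draws from P_s
D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
K = np.exp(-1.5 * D)                              # exponential kernel, tau2 = 1

# Eigenvalues of K/n approximate the mu_i of the kernel under P_s; their sum
# equals trace(K)/n = C(s, s) = 1, an empirical finite-trace check
mu_hat = np.linalg.eigvalsh(K / n)[::-1]          # sorted in decreasing order
```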
25 Bayes Bias-Variance Tradeoff More technical assumptions: Assumption 1: The true surface w₀ lies in the RKHS (reproducing kernel Hilbert space) of C_α. (This can be relaxed by using a sieve approximation.) Assumption 2: The covariance kernel C_α satisfies tr(C_α) < ∞. Assumption 3: There are positive constants ρ and r such that E_{P_s}[φ_i^{2r}(s)] ≤ ρ^{2r} for every i = 1, 2, …, and var[ε(s)] ≤ σ² < ∞ for all s ∈ D. We derive upper bounds for the three terms in the Bayes L₂ risk and show that (roughly speaking), for a given total sample size n, increasing the number of subsets k shrinks var_DISK while inflating bias² and var_mean, so the risk bound is governed by the tradeoff between bias² + var_mean and var_DISK.
26 Bayes Bias-Variance Tradeoff The upper bounds on bias², var_mean, and var_DISK each take the form of an infimum over d ∈ N of quantities built from n, m, k, σ², ‖w₀‖²_H, γ(σ²/n), tr(C_α), tr_d(C_α), and A b(m, d, r) ρ^{2r}; for example, the bias² bound has leading term (8σ²/n)‖w₀‖²_H, and the variance bounds involve γ(·) evaluated at arguments of order σ²/n. Here N is the set of all positive integers, A is a positive constant,
b(m, d, r) = max{ √max(r, log d), max(r, log d) / m^{1/2 − 1/r} },
γ(a) = ∑_{i=1}^∞ µ_i / (µ_i + a) for any a > 0, and tr_d(C_α) = ∑_{i=d+1}^∞ µ_i.
27 Examples of Covariance Kernels Implications of the previous results for some kernel classes (with r > 4):
Finite-rank kernels: if µ_{d*+1} = µ_{d*+2} = … = 0 for some integer d*, and k ≤ c n^{(r−4)/(r−2)} / (log n)^{2r/(r−2)} for some constant c > 0, then the Bayes L₂ risk is O(1/n).
Exponentially decaying kernels: if µ_i ≤ c_µ exp(−c′_µ i^κ) for some constants c_µ > 0, c′_µ > 0, κ > 0 and all i ∈ N, and k ≤ c n^{(r−4)/(r−2)} / (log n)^{(2rκ+κ−2)/(κ(r−2))} for some constant c > 0, then the Bayes L₂ risk is O((log n)^{1/κ} / n).
Polynomially decaying kernels: if µ_i ≤ c_µ i^{−2ν} for some constants c_µ > 0, ν > (r−2)/(r−4), and all i ∈ N, and k ≤ c n^{((r−4)ν−(r−2))/((r−2)ν)} / (log n)^{2r/(r−2)} for some constant c > 0, then the Bayes L₂ risk is O(n^{−(2ν−1)/(2ν)}).
28 Examples of Covariance Kernels Some comments: The rates for finite-rank kernels and exponentially decaying kernels are near minimax optimal, up to logarithmic factors. The rate for polynomially decaying kernels is suboptimal, but the difference is small: the minimax optimal rate is O(n^{−2ν/(2ν+1)}), and the gap between the two rates is n^{1/(2ν(2ν+1))}. The sub-optimality is due to our fixed stochastic approximation (the likelihood raised to the fixed power k). One can possibly attain the optimal rate by using an adaptive power in the subset likelihood.
29 Generalization to the Full Model A convenient generalization to the full model y(s) = x(s)ᵀβ + w(s) + ε(s). Extra assumption: x(·) is a deterministic function; the prior of β is normal N(µ_β, Σ_β) and is independent of the prior on w(·), which is GP(0, C_α(·,·)). Based on this assumption, we can do a change of kernel:
Č_α(s, s′) = Cov{ x(s)ᵀβ + w(s), x(s′)ᵀβ + w(s′) } = x(s)ᵀ Σ_β x(s′) + C_α(s, s′).
Our theoretical results hold for the mean function x(s)ᵀβ + w(s) in the full model, as long as all the previous conditions on C_α are satisfied by the new kernel Č_α. Since we only focus on the predictive loss for the mean function x(s)ᵀβ + w(s), we do not need any identification condition for β and w(·).
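The change of kernel is a rank-p update of the covariance matrix. A small sketch (illustrative sizes and predictors, ours) checking that Č_α(S, S) = X Σ_β Xᵀ + C_α(S, S) stays symmetric positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
S = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), S[:, 0]])        # x(s) = (1, s_1), an illustrative choice
Sigma_b = np.diag([1.0, 0.5])                     # prior covariance of beta
D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
K = np.exp(-2.0 * D)                              # C_alpha(S, S), exponential kernel

K_check = X @ Sigma_b @ X.T + K                   # the new kernel matrix
```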
30 Outline Background of GP D&C Bayes Method Theory Experiments Future Work
31 Simulation Model 1: Smooth Surface D = [−2, 2] × [−2, 2]. f(s) = e^{−(s−1)²} + e^{−0.8(s+1)²} − 0.05 sin{8(s + 0.1)}, for s ∈ [−2, 2]. The true surface is w₀(s) = f(s₁)f(s₂), s = (s₁, s₂) ∈ D. The response is y(s) = β₀ + w₀(s) + ε, ε ~ N(0, σ²), with β₀ = 1 and σ² = 0.1. We use the Matérn kernel with ν = 1/2: C(s, s′) = τ² exp(−φ‖s − s′‖) for s, s′ ∈ D. Priors: σ² ~ InvGamma(2, 0.1), τ² ~ InvGamma(2, 2), φ ~ Uniform(0.1, 10). In 2 simulation scenarios, we use n = 10⁴ and n = 10⁶, respectively. The training locations are sampled uniformly over D. We use l* = 10⁵ testing locations, also sampled uniformly over D. The Bayes L₂ risks (and other criteria) are estimated by averaging over the l* testing locations. All methods are replicated over repeated Monte Carlo simulations.
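The data-generating mechanism for Model 1 can be sketched as follows (the constants in f were re-inferred from the slide and should be treated as illustrative; a small n stands in for 10⁴):

```python
import numpy as np

def f(s):
    # 1-D test curve; constants as reconstructed above, treat as illustrative
    return (np.exp(-(s - 1.0) ** 2) + np.exp(-0.8 * (s + 1.0) ** 2)
            - 0.05 * np.sin(8.0 * (s + 0.1)))

def w0(s1, s2):
    return f(s1) * f(s2)                  # true surface on D = [-2, 2]^2

rng = np.random.default_rng(8)
n = 500                                   # small stand-in for n = 10^4
S = rng.uniform(-2.0, 2.0, size=(n, 2))   # training locations, uniform on D
y = 1.0 + w0(S[:, 0], S[:, 1]) + rng.normal(0.0, np.sqrt(0.1), n)   # beta0 = 1, sigma2 = 0.1
```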
32 Methods Compared LatticeKrig (Nychka et al. 2015) using the LatticeKrig package in R, with 3 resolutions. Locally approximated Gaussian process (laGP, Gramacy and Apley 2015) using the laGP package in R with method set to nn. Full-rank Gaussian process (GP) using the spBayes package in R with the full data. Modified predictive process (MPP, Finley et al. 2015) using the spBayes package in R with the full data. Nearest-neighbor Gaussian process (NNGP, Datta et al. 2015) using the spNNGP package in R with the number of nearest neighbors (NN) set to 5, 15, and 25. DISK + CMC (Scott et al. 2016) + (full GP or MPP). DISK + PIE (Li et al. 2017) + (full GP or MPP). The first two methods are frequentist; the others are all Bayesian.
33 Simulation Results When n = 10⁴ [Table: prediction MSE averaged over l* = 10⁵ locations, for laGP, LatticeKrig, full GP, MPP (two knot sizes), NNGP (three neighbor sizes), DISK-CMC(MPP), DISK-PIE(GP), and DISK-PIE(MPP), each DISK variant with k = 10, 20, 30, 40, 50 subsets; the numeric entries were lost in transcription.]
34 [Figure: the true surface w₀ and pointwise 2.5%, 50%, and 97.5% posterior quantile surfaces for the full GP, DISK (GP), MPP, and DISK (MPP) at various k and r.]
35 Bias-Variance Tradeoff [Figure: estimated squared bias and variance of the L₂ risk versus k for DISK (GP) and DISK (MPP) at two knot sizes, with the full GP and laGP as references.] When n = 10⁶, the full-data GP, LatticeKrig, MPP, and NNGP are not applicable; only laGP is left to compare with DISK. [Table: prediction MSE averaged over l* = 10⁵ locations for laGP and DISK(MPP) with r = 400, k = 50; the numeric entries were lost in transcription.]
36 Simulation Results When n = 10⁴ [Table: parametric inference on (β₀, τ², σ², φ) and 95% predictive interval coverage for laGP, LatticeKrig, GP, MPP (two knot sizes), NNGP (NN = 5, 15, 25), DISK+PIE (GP), and DISK+PIE (MPP); the numeric entries were lost in transcription.]
37 Simulation Results When n = 10⁶ [Table: parametric inference on (β₀, τ², σ², φ), MSPE, and 95% predictive interval coverage for laGP and DISK+PIE (MPP) with r = 400 and r = 600; most numeric entries were lost in transcription (laGP's 95% PI coverage is 0.945).] r is the number of knots; k is the number of subsets. Full GP, LatticeKrig, MPP, and NNGP fail to run for n = 10⁶.
38 Simulation Model 2: Rough Surface D = [0, 1] × [0, 1]. The true surface w₀(·) is a rough sample path from the Matérn kernel with ν = 1/2: C(s, s′) = τ² exp(−φ‖s − s′‖) for s, s′ ∈ D. We fit the model using the same kernel. The response is y(s) = β₀ + w₀(s) + ε, ε ~ N(0, σ²), with β₀ = 1, σ² = 0.1, τ² = 1, φ = 9. Priors: σ² ~ InvGamma(2, 0.1), τ² ~ InvGamma(2, 2), φ ~ Uniform(0.1, 30). Training sample size n = 10⁴. The training locations are sampled uniformly over D. Testing sample size l* = 10⁵, also sampled uniformly over D. The Bayes L₂ risks (and other criteria) are estimated by averaging over the l* testing locations. We only compare DISK(MPP) and laGP.
39 Results for Simulation 2 [Table: parametric 95% credible intervals for (β₀, σ², τ², φ) and MSPE, 95% PI coverage, and 95% PI length for laGP and DISK(MPP) at two knot sizes; most numeric entries were lost in transcription (e.g., one DISK(MPP) configuration gives a φ interval of (8.94, 9.68) and 95% PI coverage of 0.96).]
40 Sea Surface Temperature Data SST along the west coast of mainland U.S.A., Canada, and Alaska, between 40°–65° north latitude and 120°–180° west longitude. The dataset is obtained from the NODC World Ocean Database (wod.html). We choose 1,000,128 spatial observations in the selected domain; 10⁶ obs are used as training data, and 128 obs are used as testing data. Model: y(s) = β₀ + β₁·latitude + w(s) + ε(s), ε(s) ~ N(0, σ²). For DISK+CMC(MPP) and DISK+PIE(MPP), we show the results with r = 400 and k = 30.
41 SST Data: Predicted Surfaces [Figure: the observed SST surface ("truth"), laGP predictions, and 2.5%, 50%, and 97.5% quantile surfaces for CMC(MPP) and DISK(MPP).] The PMSEs of CMC(MPP) and DISK(MPP) are almost the same, and both are comparable to laGP (a frequentist GP method); DISK(MPP) has wider predictive intervals. Other Bayesian GP methods do not work, including MPP and NNGP.
42 SST Data: Numerical Results Parametric inference and prediction in SST data. DISK-CMC and DISK-PIE use MPP-based modeling with 400 knots on k = 30 subsets. [Table: 95% credible intervals for (β₀, β₁, σ², τ²) and MSPE, 95% PI coverage, and 95% PI length for laGP, DISK-CMC, and DISK-PIE; most numeric entries were lost in transcription (laGP and DISK-PIE both attain 95% PI coverage of 0.95).]
43 Outline Background of GP D&C Bayes Method Theory Experiments Future Work
44 Future Work Theory for the finite-dimensional parameters (α, σ²), and for the joint distribution of both the parametric and nonparametric parts. Theory for the frequentist coverage of DISK posterior credible intervals. The choice of partition distribution; the influence of the number of knots in MPP. Extension to spatio-temporal processes. Manuscript at arXiv:1712.09767.
45 Thank you! & Questions?
Manuscript: arXiv:1712.09767v1 [stat.ME], 28 Dec 2017.
More informationStudent-t Process as Alternative to Gaussian Processes Discussion
Student-t Process as Alternative to Gaussian Processes Discussion A. Shah, A. G. Wilson and Z. Gharamani Discussion by: R. Henao Duke University June 20, 2014 Contributions The paper is concerned about
More informationApproximating high-dimensional posteriors with nuisance parameters via integrated rotated Gaussian approximation (IRGA)
Approximating high-dimensional posteriors with nuisance parameters via integrated rotated Gaussian approximation (IRGA) Willem van den Boom Department of Statistics and Applied Probability National University
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationIntroduction to Geostatistics
Introduction to Geostatistics Abhi Datta 1, Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore,
More informationNonparameteric Regression:
Nonparameteric Regression: Nadaraya-Watson Kernel Regression & Gaussian Process Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro,
More informationBayesian Inference and MCMC
Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the
More informationA full scale, non stationary approach for the kriging of large spatio(-temporal) datasets
A full scale, non stationary approach for the kriging of large spatio(-temporal) datasets Thomas Romary, Nicolas Desassis & Francky Fouedjio Mines ParisTech Centre de Géosciences, Equipe Géostatistique
More informationGaussian predictive process models for large spatial data sets.
Gaussian predictive process models for large spatial data sets. Sudipto Banerjee, Alan E. Gelfand, Andrew O. Finley, and Huiyan Sang Presenters: Halley Brantley and Chris Krut September 28, 2015 Overview
More informationNonparametric Bayesian Methods - Lecture I
Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics
More informationSpatio-temporal prediction of site index based on forest inventories and climate change scenarios
Forest Research Institute Spatio-temporal prediction of site index based on forest inventories and climate change scenarios Arne Nothdurft 1, Thilo Wolf 1, Andre Ringeler 2, Jürgen Böhner 2, Joachim Saborowski
More informationLecture 3: Introduction to Complexity Regularization
ECE90 Spring 2007 Statistical Learning Theory Instructor: R. Nowak Lecture 3: Introduction to Complexity Regularization We ended the previous lecture with a brief discussion of overfitting. Recall that,
More informationBayesian spatial quantile regression
Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse
More informationThe Bayesian approach to inverse problems
The Bayesian approach to inverse problems Youssef Marzouk Department of Aeronautics and Astronautics Center for Computational Engineering Massachusetts Institute of Technology ymarz@mit.edu, http://uqgroup.mit.edu
More informationNonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University
Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University this presentation derived from that presented at the Pan-American Advanced
More informationDivide and Conquer Kernel Ridge Regression. A Distributed Algorithm with Minimax Optimal Rates
: A Distributed Algorithm with Minimax Optimal Rates Yuchen Zhang, John C. Duchi, Martin Wainwright (UC Berkeley;http://arxiv.org/pdf/1305.509; Apr 9, 014) Gatsby Unit, Tea Talk June 10, 014 Outline Motivation.
More informationGaussian Multiscale Spatio-temporal Models for Areal Data
Gaussian Multiscale Spatio-temporal Models for Areal Data (University of Missouri) Scott H. Holan (University of Missouri) Adelmo I. Bertolde (UFES) Outline Motivation Multiscale factorization The multiscale
More informationPractical Bayesian Optimization of Machine Learning. Learning Algorithms
Practical Bayesian Optimization of Machine Learning Algorithms CS 294 University of California, Berkeley Tuesday, April 20, 2016 Motivation Machine Learning Algorithms (MLA s) have hyperparameters that
More informationAfternoon Meeting on Bayesian Computation 2018 University of Reading
Gabriele Abbati 1, Alessra Tosi 2, Seth Flaxman 3, Michael A Osborne 1 1 University of Oxford, 2 Mind Foundry Ltd, 3 Imperial College London Afternoon Meeting on Bayesian Computation 2018 University of
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationMulti-resolution models for large data sets
Multi-resolution models for large data sets Douglas Nychka, National Center for Atmospheric Research National Science Foundation NORDSTAT, Umeå, June, 2012 Credits Steve Sain, NCAR Tia LeRud, UC Davis
More informationLECTURE 5 NOTES. n t. t Γ(a)Γ(b) pt+a 1 (1 p) n t+b 1. The marginal density of t is. Γ(t + a)γ(n t + b) Γ(n + a + b)
LECTURE 5 NOTES 1. Bayesian point estimators. In the conventional (frequentist) approach to statistical inference, the parameter θ Θ is considered a fixed quantity. In the Bayesian approach, it is considered
More informationStatistics for analyzing and modeling precipitation isotope ratios in IsoMAP
Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP The IsoMAP uses the multiple linear regression and geostatistical methods to analyze isotope data Suppose the response variable
More informationModel Selection for Gaussian Processes
Institute for Adaptive and Neural Computation School of Informatics,, UK December 26 Outline GP basics Model selection: covariance functions and parameterizations Criteria for model selection Marginal
More informationBayesian non-parametric model to longitudinally predict churn
Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics
More informationApproximate Bayesian computation for spatial extremes via open-faced sandwich adjustment
Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment Ben Shaby SAMSI August 3, 2010 Ben Shaby (SAMSI) OFS adjustment August 3, 2010 1 / 29 Outline 1 Introduction 2 Spatial
More informationStochastic optimization in Hilbert spaces
Stochastic optimization in Hilbert spaces Aymeric Dieuleveut Aymeric Dieuleveut Stochastic optimization Hilbert spaces 1 / 48 Outline Learning vs Statistics Aymeric Dieuleveut Stochastic optimization Hilbert
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationAnnouncements. Proposals graded
Announcements Proposals graded Kevin Jamieson 2018 1 Bayesian Methods Machine Learning CSE546 Kevin Jamieson University of Washington November 1, 2018 2018 Kevin Jamieson 2 MLE Recap - coin flips Data:
More informationBayesian Inference for DSGE Models. Lawrence J. Christiano
Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.
More informationBayesian Deep Learning
Bayesian Deep Learning Mohammad Emtiyaz Khan AIP (RIKEN), Tokyo http://emtiyaz.github.io emtiyaz.khan@riken.jp June 06, 2018 Mohammad Emtiyaz Khan 2018 1 What will you learn? Why is Bayesian inference
More informationA Framework for Daily Spatio-Temporal Stochastic Weather Simulation
A Framework for Daily Spatio-Temporal Stochastic Weather Simulation, Rick Katz, Balaji Rajagopalan Geophysical Statistics Project Institute for Mathematics Applied to Geosciences National Center for Atmospheric
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationGaussian Processes (10/16/13)
STA561: Probabilistic machine learning Gaussian Processes (10/16/13) Lecturer: Barbara Engelhardt Scribes: Changwei Hu, Di Jin, Mengdi Wang 1 Introduction In supervised learning, we observe some inputs
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationBayesian Modeling of Conditional Distributions
Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings
More informationSpatial Dynamic Factor Analysis
Spatial Dynamic Factor Analysis Esther Salazar Federal University of Rio de Janeiro Department of Statistical Methods Sixth Workshop on BAYESIAN INFERENCE IN STOCHASTIC PROCESSES Bressanone/Brixen, Italy
More information1.1 Basis of Statistical Decision Theory
ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 1: Introduction Lecturer: Yihong Wu Scribe: AmirEmad Ghassami, Jan 21, 2016 [Ed. Jan 31] Outline: Introduction of
More informationStatistical Inference
Statistical Inference Liu Yang Florida State University October 27, 2016 Liu Yang, Libo Wang (Florida State University) Statistical Inference October 27, 2016 1 / 27 Outline The Bayesian Lasso Trevor Park
More informationMaking rating curves - the Bayesian approach
Making rating curves - the Bayesian approach Rating curves what is wanted? A best estimate of the relationship between stage and discharge at a given place in a river. The relationship should be on the
More informationBayesian Linear Regression
Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee September 03 05, 2017 Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles Linear Regression Linear regression is,
More informationSome Curiosities Arising in Objective Bayesian Analysis
. Some Curiosities Arising in Objective Bayesian Analysis Jim Berger Duke University Statistical and Applied Mathematical Institute Yale University May 15, 2009 1 Three vignettes related to John s work
More informationDisease mapping with Gaussian processes
EUROHEIS2 Kuopio, Finland 17-18 August 2010 Aki Vehtari (former Helsinki University of Technology) Department of Biomedical Engineering and Computational Science (BECS) Acknowledgments Researchers - Jarno
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationMCMC algorithms for fitting Bayesian models
MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public
More informationSpatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields
Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields 1 Introduction Jo Eidsvik Department of Mathematical Sciences, NTNU, Norway. (joeid@math.ntnu.no) February
More informationIntegrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University
Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y
More informationAdvanced Introduction to Machine Learning
10-715 Advanced Introduction to Machine Learning Homework 3 Due Nov 12, 10.30 am Rules 1. Homework is due on the due date at 10.30 am. Please hand over your homework at the beginning of class. Please see
More informationOverfitting, Bias / Variance Analysis
Overfitting, Bias / Variance Analysis Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 8, 207 / 40 Outline Administration 2 Review of last lecture 3 Basic
More informationA Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles
A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles Jeremy Gaskins Department of Bioinformatics & Biostatistics University of Louisville Joint work with Claudio Fuentes
More information