Package bayeslm. June 18, 2018


Type: Package
Package: bayeslm
Title: Efficient Sampling for Gaussian Linear Regression with Arbitrary Priors
Version: 0.8.0
Date: 2018-6-17
Author: P. Richard Hahn, Jingyu He, Hedibert Lopes
Maintainer: Jingyu He <jingyu.he@chicagobooth.edu>
Description: Efficient sampling for Gaussian linear regression with arbitrary priors, Hahn, He and Lopes (2018) <arXiv:1806.05738>.
License: LGPL (>= 2)
Imports: Rcpp (>= 0.12.7), stats, graphics, grDevices, coda, RcppParallel
SystemRequirements: GNU make
Depends: R (>= 2.10)
URL: http://jingyuhe.com/software.html
NeedsCompilation: yes
LinkingTo: Rcpp, RcppArmadillo, RcppParallel
Repository: CRAN
Date/Publication: 2018-06-18 17:57:57 UTC

R topics documented:
    bayeslm-package
    bayeslm
    hs_gibbs
    plot.mcmc
    predict.bayeslm.fit
    summary.bayeslm.fit
    summary.mcmc
    Index

bayeslm-package        Efficient sampling for Gaussian linear regression with arbitrary priors

Description

    The elliptical slice sampler for Bayesian linear regression with shrinkage priors such as the horseshoe, Laplace, and ridge priors.

Author(s)

    P. Richard Hahn, Jingyu He and Hedibert Lopes
    Maintainer: Jingyu He <jingyu.he@chicagobooth.edu>

References

    Hahn, P. Richard, Jingyu He, and Hedibert Lopes. Efficient sampling for Gaussian linear regression with arbitrary priors (2017).

See Also

    bayeslm

bayeslm        Efficient sampling for Gaussian linear models with arbitrary priors

Description

    This package implements an efficient sampler for Gaussian Bayesian linear regression. It uses an elliptical slice sampler instead of a regular Gibbs sampler. The function has several built-in priors, and users can also provide their own prior function (written as an R function).

Usage

    ## Default S3 method:
    bayeslm(y, X = FALSE, prior = "horseshoe", penalize = NULL,
        block_vec = NULL, sigma = NULL, s2 = 1, kap2 = 1, N = 20000L,
        burnin = 0L, thinning = 1L, vglobal = 1, sampling_vglobal = TRUE,
        verb = FALSE, icept = TRUE, standardize = TRUE, singular = FALSE,
        scale_sigma_prior = TRUE, prior_mean = NULL, prob_vec = NULL,
        cc = NULL, ...)

    ## S3 method for class 'formula'
    bayeslm(formula, data = list(), Y = FALSE, X = FALSE,
        prior = "horseshoe", penalize = NULL, block_vec = NULL,
        sigma = NULL, s2 = 1, kap2 = 1, N = 20000L, burnin = 0L,
        thinning = 1L, vglobal = 1, sampling_vglobal = TRUE, verb = FALSE,
        standardize = TRUE, singular = FALSE, scale_sigma_prior = TRUE,
        prior_mean = NULL, prob_vec = NULL, cc = NULL, ...)

Arguments

    formula       formula of the model to fit.
    data          an optional data frame containing the variables in the model. By default the variables are taken from the environment from which bayeslm is called.
    Y             data.frame, matrix, or vector of inputs Y. Response variable.
    X             data.frame, matrix, or vector of inputs X. Regressors.
    prior         Indicates the shrinkage prior to use: "horseshoe" for the approximate horseshoe prior (default), "laplace" for the Laplace prior, "ridge" for the ridge prior, "sharkfin" for the "sharkfin" prior, and "nonlocal" for the nonlocal prior.
    block_vec     A vector indicating the number of regressors in each block. The sum of all entries should equal the number of regressors. The default value is block_vec = rep(1, p), which puts every regressor in its own block (slice-within-Gibbs sampler).
    penalize      A vector indicating whether to shrink each regressor. Its length should equal the number of regressors; 1 indicates shrinking the corresponding coefficient, 0 indicates no shrinkage. The default value is rep(1, p), shrink all coefficients.
    sigma         Initial value of the residual standard error. The default value is half the standard error of Y.
    s2, kap2      Parameters of the prior over sigma, an inverse gamma prior with rate s2 and shape kap2.
    N             Number of posterior samples (after burn-in).
    burnin        Number of burn-in samples. If burnin > 0, the function draws N + burnin samples and returns only the last N.
    thinning      Thinning interval. thinning = 1 means no thinning.
    vglobal       Initial value of the global shrinkage parameter. Default value is 1.
    sampling_vglobal
                  Bool; if TRUE, sample the global shrinkage parameter by random walk Metropolis-Hastings on the log scale, otherwise keep it fixed at the initial value vglobal.
    verb          Bool; if TRUE, print sampling progress.
    icept         Bool; if the inputs are matrices X and Y and icept = TRUE, the function estimates an intercept. Default value is TRUE. If the input is a formula Y ~ X, the icept option has no effect; control the intercept via the formula, Y ~ X or Y ~ X - 1.
    standardize   Bool; if TRUE, standardize X and Y before sampling.
    singular      Bool; if TRUE, treat the problem as rank-deficient, such as n < p or X'X singular. See section 2.3.2 of the paper for details.
    scale_sigma_prior
                  Bool; if TRUE, the prior of the regression coefficient β is scaled by the residual standard error σ.
    prior_mean    vector; specifies the prior mean of the nonlocal prior for each regressor. It should have length p (no intercept) or p + 1 (intercept). The default value is 1.5 for all regressors.
    prob_vec      vector; specifies the prior mean of the sharkfin prior for each regressor. It should have length p (no intercept) or p + 1 (intercept). The default value is 0.25 for all regressors.

    cc            Only used when singular == TRUE; precision parameter of the ridge adjustment. It should be a vector of length p. If NULL, it is set to rep(10, p).
    ...           optional parameters to be passed to the low level function bayeslm.default.

Details

    For details of the approach, please see Hahn, He and Lopes (2017).

Value

    loops           A vector of the number of elliptical slice sampler loops for each posterior sample.
    sigma           A vector of posterior samples of the residual standard error.
    vglobal         A vector of posterior samples of the global shrinkage parameter.
    beta            A matrix of posterior samples of the coefficients.
    fitted.values   Fitted values of the regression model, taking the posterior mean of the coefficients with 20% of samples as burn-in.
    residuals       Residuals of the regression model, equal to y - fitted.values.

Note

    horseshoe is essentially a call to bayeslm with prior = "horseshoe". The same holds for sharkfin, ridge, blasso, and nonlocal.

Author(s)

    Jingyu He <jingyu.he@chicagobooth.edu>

References

    Hahn, P. Richard, Jingyu He, and Hedibert Lopes. Efficient sampling for Gaussian linear regression with arbitrary priors (2017).

Examples

    p = 20
    n = 100
    kappa = 1.25
    beta_true = c(c(1, 2, 3), rnorm(p - 3, 0, 0.01))
    sig_true = kappa * sqrt(sum(beta_true^2))
    x = matrix(rnorm(p * n), n, p)
    y = x %*% beta_true + sig_true * rnorm(n)
    x = as.matrix(x)
    y = as.matrix(y)

    data = data.frame(x = x, y = y)

    # slice-within-Gibbs sampler, put every coefficient in its own block
    block_vec = rep(1, p)

    fitols = lm(y ~ x - 1)

    # call the function using formulas
    fita = bayeslm(y ~ x, prior = 'horseshoe', block_vec = block_vec,
                   N = 10000, burnin = 2000)
    # summarize the results
    summary(fita)
    summary(fita$beta)

    # put the first two coefficients in one elliptical sampling block
    block_vec2 = c(2, rep(1, p - 2))
    fitb = bayeslm(y ~ x, data = data, prior = 'horseshoe',
                   block_vec = block_vec2, N = 10000, burnin = 2000)

    # comparing several different priors
    fit1 = bayeslm(y, x, prior = 'horseshoe', icept = FALSE,
                   block_vec = block_vec, N = 10000, burnin = 2000)
    beta_est1 = colMeans(fit1$beta)
    fit2 = bayeslm(y, x, prior = 'laplace', icept = FALSE,
                   block_vec = block_vec, N = 10000, burnin = 2000)
    beta_est2 = colMeans(fit2$beta)
    fit3 = bayeslm(y, x, prior = 'ridge', icept = FALSE,
                   block_vec = block_vec, N = 10000, burnin = 2000)
    beta_est3 = colMeans(fit3$beta)
    fit4 = bayeslm(y, x, prior = 'sharkfin', icept = FALSE,
                   block_vec = block_vec, N = 10000, burnin = 2000)
    beta_est4 = colMeans(fit4$beta)
    fit5 = bayeslm(y, x, prior = 'nonlocal', icept = FALSE,
                   block_vec = block_vec, N = 10000, burnin = 2000)
    beta_est5 = colMeans(fit5$beta)

    plot(NULL, xlim = range(beta_true), ylim = range(beta_true),
         xlab = "beta true", ylab = "estimation")
    points(beta_true, beta_est1, pch = 20)
    points(beta_true, fitols$coef, col = 'red')
    points(beta_true, beta_est2, pch = 20, col = 'cyan')
    points(beta_true, beta_est3, pch = 20, col = 'orange')
    points(beta_true, beta_est4, pch = 20, col = 'pink')
    points(beta_true, beta_est5, pch = 20, col = 'lightgreen')
    legend("topleft", c("ols", "horseshoe", "laplace", "ridge", "sharkfin", "nonlocal"),
           col = c("red", "black", "cyan", "orange", "pink", "lightgreen"),
           pch = rep(1, 6))

    abline(0, 1, col = 'red')

    rmseols = sqrt(sum((fitols$coef - beta_true)^2))
    rmse1 = sqrt(sum((beta_est1 - beta_true)^2))
    rmse2 = sqrt(sum((beta_est2 - beta_true)^2))
    rmse3 = sqrt(sum((beta_est3 - beta_true)^2))
    rmse4 = sqrt(sum((beta_est4 - beta_true)^2))
    rmse5 = sqrt(sum((beta_est5 - beta_true)^2))
    print(cbind(ols = rmseols, hs = rmse1, laplace = rmse2,
                ridge = rmse3, sharkfin = rmse4, nonlocal = rmse5))

hs_gibbs        Gibbs sampler for horseshoe regression

Description

    Standard Gibbs sampler for horseshoe regression.

Usage

    hs_gibbs(Y, X, nsamps, a, b, scale_sigma_prior)

Arguments

    Y                  Response of the regression.
    X                  Matrix of regressors.
    nsamps             Number of posterior samples.
    a                  Parameter of the inverse Gamma prior on σ.
    b                  Parameter of the inverse Gamma prior on σ.
    scale_sigma_prior  Bool; if TRUE, use a prior scaled by σ.

Details

    This function implements a standard Gibbs sampler for horseshoe regression. The prior is

        y | β, σ², X ~ MVN(Xβ, σ²I)
        β_i | τ, λ_i, σ ~ N(0, λ_i² τ² σ²)
        σ² ~ IG(a, b)
        τ ~ C⁺(0, 1)
        λ_i ~ C⁺(0, 1)

Author(s)

    Jingyu He
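The hierarchy above can be illustrated by drawing from it forwards in base R. This is a sketch of the prior, not code from the package; half-Cauchy draws are obtained as absolute values of Cauchy draws:

```r
# Forward simulation from the horseshoe prior hierarchy described above.
set.seed(1)
p <- 10
a <- 1; b <- 1                                  # inverse Gamma hyperparameters
sigma2 <- 1 / rgamma(1, shape = a, rate = b)    # sigma^2 ~ IG(a, b)
tau    <- abs(rcauchy(1))                       # tau ~ C+(0, 1)
lambda <- abs(rcauchy(p))                       # lambda_i ~ C+(0, 1)
# beta_i | tau, lambda_i, sigma ~ N(0, lambda_i^2 tau^2 sigma^2)
beta   <- rnorm(p, mean = 0, sd = lambda * tau * sqrt(sigma2))
print(round(beta, 3))
```

The heavy Cauchy tails of λ_i are what let individual coefficients escape shrinkage while τ pulls the bulk toward zero.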

See Also

    summary.mcmc

Examples

    x = matrix(rnorm(1000), 100, 10)
    y = x %*% rnorm(10) + rnorm(100)
    fit = hs_gibbs(y, x, 1000, 1, 1, TRUE)
    summary(fit)

plot.mcmc        Plot posterior summary

Description

    plot.mcmc is an S3 method to plot the empirical distribution of posterior draws. The input is an MCMC matrix.

Usage

    ## S3 method for class 'MCMC'
    plot(x, names, burnin = trunc(.1 * nrow(x)), tvalues, TRACEPLOT = TRUE,
         DEN = TRUE, INT = TRUE, CHECK_NDRAWS = TRUE, ...)

Arguments

    x             An MCMC-class matrix of posterior draws, such as the $beta component of a bayeslm fit.
    names         an optional character vector of names for the columns of x.
    burnin        Number of draws to burn in (default value is 0.1 * nrow(x)).
    tvalues       vector of true values.
    TRACEPLOT     logical; TRUE provides sequence plots of draws and ACFs (default: TRUE).
    DEN           logical; TRUE uses the density scale on histograms (default: TRUE).
    INT           logical; TRUE puts various intervals and points on the graph (default: TRUE).
    CHECK_NDRAWS  logical; TRUE checks that there are at least 100 draws (default: TRUE).
    ...           optional arguments for the generic function.

Details

    This function is modified from the package bayesm by Peter Rossi. It plots a summary of posterior draws.

Author(s)

    Peter Rossi, Anderson School, UCLA, <perossichi@gmail.com>.
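The kind of display plot.mcmc produces can be approximated with base graphics. This is a rough sketch of the idea (trace plot, ACF, histogram per column), not the bayesm implementation:

```r
# Trace plot, autocorrelation, and histogram for one column of a draws matrix,
# roughly what plot.mcmc shows for each coefficient.
set.seed(1)
draws <- matrix(rnorm(3000), nrow = 1000, ncol = 3)  # stand-in for fit$beta
burnin <- trunc(0.1 * nrow(draws))                   # default burn-in rule
kept <- draws[-seq_len(burnin), 1]
op <- par(mfrow = c(3, 1))
plot(kept, type = "l", ylab = "draw", main = "trace")  # sequence plot of draws
acf(kept, main = "autocorrelation")
hist(kept, freq = FALSE, main = "histogram")           # density scale, as DEN = TRUE
par(op)
```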

See Also

    summary.bayeslm.fit

Examples

    x = matrix(rnorm(1000), 100, 10)
    y = x %*% rnorm(10) + rnorm(100)
    fit = bayeslm(y ~ x)
    plot(fit$beta)

predict.bayeslm.fit        Predict new data

Description

    predict.bayeslm.fit is an S3 method to predict the response on new data.

Usage

    ## S3 method for class 'bayeslm.fit'
    predict(object, data, burnin, X, ...)

Arguments

    object    output of the bayeslm function.
    data      A data frame or list of new data to predict.
    burnin    number of draws to burn in (default value is 0.1 * nrow(x)).
    X         If bayeslm was called with matrix inputs x and y rather than a formula, pass the new matrix to predict here. See the example for details.
    ...       optional arguments for the generic function.

Details

    Makes predictions on a new data set; users are allowed to adjust the number of burn-in samples.

Author(s)

    Jingyu He
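Conceptually, prediction from posterior draws amounts to discarding the burn-in rows of the coefficient matrix and averaging X_new %*% β over what remains. A minimal base-R sketch with a hypothetical draws matrix (not the actual predict.bayeslm.fit internals):

```r
# Posterior predictive mean from a matrix of coefficient draws:
# drop burn-in, then multiply new regressors by the posterior mean coefficients.
set.seed(1)
p <- 5
beta_draws <- matrix(rnorm(1000 * p, mean = 1), nrow = 1000, ncol = p)  # stand-in draws
X_new <- matrix(rnorm(20 * p), nrow = 20, ncol = p)                     # new data
burnin <- trunc(0.1 * nrow(beta_draws))
kept <- beta_draws[-seq_len(burnin), , drop = FALSE]
pred <- X_new %*% colMeans(kept)   # one predictive mean per new observation
```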

Examples

    x = matrix(rnorm(1000), 100, 10)
    y = x %*% rnorm(10) + rnorm(100)
    data = list(x = x, y = y)

    # Train the model with formula input
    fit1 = bayeslm(y ~ x, data = data)
    # predict
    pred1 = predict(fit1, data)

    # Train the model with matrix input
    fit2 = bayeslm(y = y, X = x)
    pred2 = predict(fit2, X = x)

summary.bayeslm.fit        Summarize a fitted object of bayeslm

Description

    summary.bayeslm.fit is an S3 method to summarize the object returned by the function bayeslm. The input should be a bayeslm.fit object.

Usage

    ## S3 method for class 'bayeslm.fit'
    summary(object, names, burnin = NULL, quantiles = FALSE, trailer = TRUE, ...)

Arguments

    object     a fitted object returned by the function bayeslm.
    names      an optional character vector of names for all the coefficients.
    burnin     number of draws to burn in (if NULL, the default is 0.2 * nrow(object$beta)).
    quantiles  logical for whether quantiles should be displayed (default: FALSE).
    trailer    logical for whether a trailer should be displayed (default: TRUE).
    ...        optional arguments for the generic function.

Details

    This function summarizes the object returned by the function bayeslm. It prints the mean, standard deviation, and effective sample size (computed by the function effectiveSize in the package coda) of the coefficients' posterior samples. If quantiles = TRUE, quantiles of the marginal distributions in the columns of X are displayed.

    The function also returns a significance level, defined by whether the symmetric posterior quantile-based credible interval excludes zero. For example, a regression coefficient with one * has 0.025 and 0.975 quantiles with the same sign. Similarly, *** denotes 0.0005 and 0.9995, ** denotes 0.005 and 0.995, * denotes 0.025 and 0.975, and . denotes 0.05 and 0.95 quantiles with the same sign.
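The star rule described above can be written out directly: for each column of draws, compute the symmetric quantile pair at each level, starting from the strictest, and report the first level whose interval excludes zero. An illustrative helper in base R, not the package's internal code:

```r
# Assign a significance marker to a vector of posterior draws, following
# the quantile-based rule described above.
signif_stars <- function(draws_col) {
  levels <- c("***" = 0.0005, "**" = 0.005, "*" = 0.025, "." = 0.05)
  for (s in names(levels)) {
    q <- quantile(draws_col, c(levels[[s]], 1 - levels[[s]]))
    if (prod(sign(q)) > 0) return(s)   # both quantiles share a sign
  }
  ""                                   # interval contains zero at every level
}

set.seed(1)
draws <- cbind(strong = rnorm(5000, 3, 0.5),   # clearly nonzero coefficient
               null   = rnorm(5000, 0, 1))     # coefficient centered at zero
print(apply(draws, 2, signif_stars))
```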

Author(s)

    Jingyu He

See Also

    summary.mcmc

Examples

    x = matrix(rnorm(1000), 100, 10)
    y = x %*% rnorm(10) + rnorm(100)
    fit = bayeslm(y ~ x)
    summary(fit)

summary.mcmc        Summarize posterior draws

Description

    summary.mcmc is an S3 method to summarize posterior draws of the model. The input should be a matrix of draws.

Usage

    ## S3 method for class 'MCMC'
    summary(object, names, burnin = trunc(.1 * nrow(x)), quantiles = FALSE, trailer = TRUE, ...)

Arguments

    object     a matrix of draws, usually an object of class MCMC. It is the same as x.
    names      an optional character vector of names for the columns of x.
    burnin     number of draws to burn in (default value is 0.1 * nrow(x)).
    quantiles  logical for whether quantiles should be displayed (default: FALSE).
    trailer    logical for whether a trailer should be displayed (default: TRUE).
    ...        optional arguments for the generic function.

Details

    This function is modified from the package bayesm by Peter Rossi. It summarizes an MCMC object. The mean, standard deviation, and effective sample size (computed by the function effectiveSize in the package coda) are displayed. If quantiles = TRUE, quantiles of the marginal distributions in the columns of x are displayed.

    The function also returns a significance level, defined by whether the symmetric posterior quantile-based credible interval excludes zero. For example, a regression coefficient with one * has 0.025 and 0.975 quantiles with the same sign. Similarly, *** denotes 0.0005 and 0.9995, ** denotes 0.005 and 0.995, * denotes 0.025 and 0.975, and . denotes 0.05 and 0.95 quantiles with the same sign.

Author(s)

    Peter Rossi, Anderson School, UCLA, <perossichi@gmail.com>.

See Also

    summary.bayeslm.fit

Examples

    x = matrix(rnorm(1000), 100, 10)
    y = x %*% rnorm(10) + rnorm(100)
    fit = bayeslm(y ~ x)
    summary(fit$beta)

Index

    Topic linear regression
        bayeslm
    Topic package
        bayeslm-package
    Topic summary
        plot.mcmc
        summary.bayeslm.fit
        summary.mcmc
    Topic univar
        hs_gibbs
        predict.bayeslm.fit

    bayeslm
    bayeslm-package
    hs_gibbs
    plot.mcmc
    predict.bayeslm.fit
    summary.bayeslm.fit
    summary.mcmc