Package basad. November 20, 2017

Similar documents
Package EMVS. April 24, 2018

Package horseshoe. November 8, 2016

Package factorqr. R topics documented: February 19, 2015

Package RZigZag. April 30, 2018

Package BNN. February 2, 2018

Package FDRreg. August 29, 2016

Package BayesNI. February 19, 2015

Package effectfusion

Package IGG. R topics documented: April 9, 2018

Package bayeslm. R topics documented: June 18, Type Package

Package flexcwm. R topics documented: July 11, Type Package. Title Flexible Cluster-Weighted Modeling. Version 1.0.

Package flexcwm. R topics documented: July 2, Type Package. Title Flexible Cluster-Weighted Modeling. Version 1.1.

Package sbgcop. May 29, 2018

Package clogitboost. R topics documented: December 21, 2015

Package IsingFit. September 7, 2016

Package Grace. R topics documented: April 9, Type Package

Package NlinTS. September 10, 2018

Package RI2by2. October 21, 2016

Package abundant. January 1, 2017

Package msir. R topics documented: April 7, Type Package Version Date Title Model-Based Sliced Inverse Regression

Package ShrinkCovMat

Package covsep. May 6, 2018

Package icmm. October 12, 2017

Package jmcm. November 25, 2017

Package leiv. R topics documented: February 20, Version Type Package

Package gma. September 19, 2017

Package dhsic. R topics documented: July 27, 2017

Package darts. February 19, 2015

Package RcppSMC. March 18, 2018

Package MACAU2. R topics documented: April 8, Type Package. Title MACAU 2.0: Efficient Mixed Model Analysis of Count Data. Version 1.

Package gtheory. October 30, 2016

Package zic. August 22, 2017

Package SpatPCA. R topics documented: February 20, Type Package

Package SimSCRPiecewise

Package BayesTreePrior

Package mixture. R topics documented: February 13, 2018

Package HIMA. November 8, 2017

Package bhrcr. November 12, 2018

Package noncompliance

Package MicroMacroMultilevel

Package EMMIXmcfa. October 18, 2017

Package dina. April 26, 2017

Package IsingSampler

Package RTransferEntropy

Package spatial.gev.bma

Package spcov. R topics documented: February 20, 2015

Package TruncatedNormal

Package rbvs. R topics documented: August 29, Type Package Title Ranking-Based Variable Selection Version 1.0.

Package metafuse. October 17, 2016

Package MonoPoly. December 20, 2017

Package netcoh. May 2, 2016

Package varband. November 7, 2016

Package NonpModelCheck

Package SpatialVS. R topics documented: November 10, Title Spatial Variable Selection Version 1.1

Package bigtime. November 9, 2017

Package gwqs. R topics documented: May 7, Type Package. Title Generalized Weighted Quantile Sum Regression. Version 1.1.0

Package MAVE. May 20, 2018

Package penmsm. R topics documented: February 20, Type Package

Package covtest. R topics documented:

Package IUPS. February 19, 2015

Package RootsExtremaInflections

Package beam. May 3, 2018

Package drgee. November 8, 2016

Package Blendstat. February 21, 2018

Package BGGE. August 10, 2018

Package BNSL. R topics documented: September 18, 2017

Package IndepTest. August 23, 2018

Package SpatialNP. June 5, 2018

Package elhmc. R topics documented: July 4, Type Package

Package SEMModComp. R topics documented: February 19, Type Package Title Model Comparisons for SEM Version 1.0 Date Author Roy Levy

Package ICBayes. September 24, 2017

Package depth.plot. December 20, 2015

Package ForecastCombinations

Package SimCorMultRes

Package scio. February 20, 2015

Package esaddle. R topics documented: January 9, 2017

Package EMMIXmfa. October 18, Index 15

The sbgcop Package. March 9, 2007

Package rnmf. February 20, 2015

Package generalhoslem

Package fwsim. October 5, 2018

Package CovSel. September 25, 2013

Package ebal. R topics documented: February 19, Type Package

Package milineage. October 20, 2017

Package CPE. R topics documented: February 19, 2015

Package EDR. September 14, 2016

Package ltsbase. R topics documented: February 20, 2015

Package mistwosources

Package SK. May 16, 2018

Package shrink. August 29, 2013

Package CoxRidge. February 27, 2015

Package TwoStepCLogit

Package bwgr. October 5, 2018

Package invgamma. May 7, 2017

Package BLR. February 19, Index 9. Pedigree info for the wheat dataset

Package blme. August 29, 2016

Package rgabriel. February 20, 2015

Package LCAextend. August 29, 2013

Package weightquant. R topics documented: December 30, Type Package

Package CausalFX. May 20, 2015

Transcription:

Package basad November 20, 2017 Type Package Title Bayesian Variable Selection with Shrinking and Diffusing Priors Version 0.2.0 Date 2017-11-15 Author Qingyan Xiang <qxiang@illinois.edu>, Naveen Narisetty <naveen@illinois.edu> Maintainer Qingyan Xiang <qxiang@illinois.edu> Description Provides a Bayesian variable selection approach using continuous spike and slab prior distributions. The prior choices here are motivated by the shrinking and diffusing priors studied in Narisetty & He (2014) <DOI:10.1214/14-AOS1207>. License GPL (>= 3) Imports Rcpp, rmutil LinkingTo Rcpp, RcppEigen NeedsCompilation yes Repository CRAN Date/Publication 2017-11-20 21:41:07 UTC R topics documented: basad............................................ 2 predict.basad........................................ 5 print.basad.......................................... 6 Index 8 1

2 basad basad Bayesian variable selection with shrinking and diffusing priors Description This function performs the Bayesian variable selection procedure with shrinking and diffusing priors via Gibbs sampling. Three different prior options placed on the coefficients are provided: Gaussian, Student s t, Laplace. The posterior estimates of coefficients are returned and the final model is selected either by using the "BIC" criterion or the median probability model. Usage basad( x = NULL, y = NULL, K = -1, df = 5, nburn = 1000, niter = 1000, alternative = FALSE, verbose = FALSE, nsplit = 20, tau0 = NULL, tau1 = NULL, prior.dist = "Gauss", select.cri = "median", BIC.maxsize = 20) Arguments x y K df nburn niter alternative verbose nsplit tau0 tau1 prior.dist select.cri BIC.maxsize The matrix or data frame of covariates. The response variables. An initial value for the numbers of active covariates in the model. This value is related to the prior probability that a covariate is nonzero. If K is not specified greater than 3, this prior probability will be estimated by a Beta prior using Gibbs sampling (see details below). The degrees of freedom of t prior when prior.dist == "t". The number of iterations for burn-in. The number of iterations for estimation. If TRUE, an alternative sampling scheme from Bhattacharya will be used which can accelerate the speed of the algorithm for very large p. However, when using block updating (by setting nsplit to be greater than 1) this alternative sampling will not be invoked. If TRUE, verbose output is sent to the terminal. Numbers of splits for the block updating scheme. The scale of the prior distribution for inactive coefficients (see details below). The scale of the prior distribution for active coefficients (see details below). Choice of the base distribution for spike and slab priors. If prior.dist="t", the algorithm will place Student s t prior for regression coefficients. If prior.dist="laplace", the algorithm will place Laplace prior. Otherwise, it will place the default Gaussian priors. Model selection criteria. If select.cri="median", the algorithm will use the median probability model to select the active variables. If select.cri="bic", the algorithm will use the BIC criteria to select the active variables. The amount of the variables that are chosen to apply BIC criteria based on the ranking of their marginal posterior probabilities. If the input sample size is less than the default value 20, all variables will be considered when applying BIC.

basad 3 Details In the package, the regression coefficients have following hierarchical structure: β (Z = 0, σ 2 ) = N(0, τ 2 0 σ 2 ), β (Z = 1, σ 2 ) = N(0, τ 2 1 σ 2 ) where the latent variable Z i of value 0 or 1 indicates whether ith variable is in the slab and spike part of the prior. The package provides different prior choices for the coefficients: Gaussian, Student s t, Laplace. Through setting the parameter prior.dist, the coefficients will have the corresponding prior densities as follows: 1. The Gaussian priors case: β (Z = k, σ 2 1 ) = 2πτ 2 k σ 2 e β 2 2τ 2 k σ2 Value 2. The Student s t prior case: β (Z = k, σ 2 ) = Where ν is the degrees of freedom 3. The Laplace prior case: Γ( ν+1 2 ) ( ) Γ( ν 2 ) 1 + 1 ( β 2 ) ν+1 2 πντ k σ ν τk 2σ2 β (Z = k, σ 2 ) = 1 ( 2τk 2 exp β ) σ2 τk 2σ2 The τ k is the scale for the prior distribution. If user did not set a specific value, the prior scales are specified as follows: τ0 2 = 1 ( n a τ, τ1 2 = max 100τ0 2 τ 0 p ) n,, (1 p n )ρ where ρ is the prior density evaluated at f p (b τ log (p n + 1)), f p is the density function for the corresponding prior distribution. The parameter a and b are a τ = 1 and b τ = 2.4 by default. The prior probability q n = P (Z i = 1) that a covariate is nonzero can be specified by value K. The K represents a prior belief of the upper bound of the true covariates in the model. When user specifies a value of K greater than 3, setting q n = c/p n, through the calculation(see details in Naveen (2014)): Φ((K c)/ c) = 1 α The prior probability on the models with sizes greater than K will be α, and this α is set to 0.1 in the package. An object of class basad with the following components: all.var select.var Summary object for all the variables. Summary object for the selected variables.

4 basad beta.names verbose Variable names for the coefficients. Verbose details (used for printing). posteriorz A vector of the marginal posterior probabilities for the latent vector Z. model.index modelz est.b allb allz x y Author(s) A vector containing the indices of selected variables. A binary vector Z indicating whether the coefficient is true in the selected model. Estimated coefficient values from the posterior distribution through Gibbs sampling. A matrix of all sampled coefficient values along the entire chain. Each row represents the sampled values under each iteration. A matrix of all sampled posterior probabilities for the latent variable Z along the entire chain. Each row represents the sampled values under each iteration. Standardized x-matrix. Standardized y vector. Qingyan Xiang (<qxiang@illinois.edu>) Naveen Narisetty (<naveen@illinois.edu>) References Narisetty, N. N., & He, X. (2014). Bayesian variable selection with shrinking and diffusing priors. The Annals of Statistics, 42(2), 789-817. Bhattacharya, A., Chakraborty, A., & Mallick, B. K. (2016). Fast sampling with Gaussian scale mixture priors in high-dimensional regression. Biometrika, 4(103), 985-991. Barbieri, M. M., & Berger, J. O. (2004). Optimal predictive model selection. The Annals of Statistics, 32(3), 870-897. Examples #Generate Data: The simulated high dimensional data n = 100; p = 499; nz = 5 rho1=0.25;rho2=0.25;rho3=0.25 ### correlations Bc = c( 0,seq(0.6,3,length.out=nz), array(0, p-nz)) covr1=(1- rho1)*diag(nz) + array(rho1,c(nz,nz)) covr3=(1- rho3)*diag(p-nz) + array(rho3,c(p-nz,p-nz)) covr2=array(rho2,c(nz,p-nz)) covr=rbind( cbind(covr1,covr2), cbind(t(covr2),covr3) )

predict.basad 5 cove = eigen(covr) covsq = cove$vectors %*% diag( sqrt(cove$values) ) %*% t(cove$vectors) Xs = matrix( rnorm(n*p), nrow = n); Xn = covsq %*% t(xs) X = cbind(array(1, n), t(xn)) Y = X %*% Bc + rnorm(n); X <- X[,2:ncol(X)] #Example 1: Run the default setting of the Guassian priors obj <- basad( x = X, y = Y) print( obj ) #Example 2: Use different priors and slection criteria obj <- basad( x = X, y = Y, prior.dist = "t", select.cri = "BIC") print( obj ) predict.basad Basad prediction Description Predict the response values of test data using basad. Usage ## S3 method for class 'basad' predict(object, testx = NULL,...) Arguments object An object of class basad. testx Data frame or x-matrix containing test data.... Further arguments passed to or from other methods. Value A vector of fitted values for estimated response values.

6 print.basad Author(s) Qingyan Xiang (<qxiang@illinois.edu>) Naveen Narisetty (<naveen@illinois.edu>) References Narisetty, N. N., & He, X. (2014). Bayesian variable selection with shrinking and diffusing priors. The Annals of Statistics, 42(2), 789-817. Examples #Generate Data: The simulated high dimensional data n = 100; p = 499; nz = 5 rho1=0.25;rho2=0.25;rho3=0.25 ### correlations Bc = c( 0,seq(0.6,3,length.out=nz), array(0, p-nz)) covr1=(1- rho1)*diag(nz) + array(rho1,c(nz,nz)) covr3=(1- rho3)*diag(p-nz) + array(rho3,c(p-nz,p-nz)) covr2=array(rho2,c(nz,p-nz)) covr=rbind( cbind(covr1,covr2), cbind(t(covr2),covr3) ) cove = eigen(covr) covsq = cove$vectors %*% diag(sqrt(cove$values)) %*% t(cove$vectors) Xs = matrix(rnorm(n*p), nrow = n); Xn = covsq %*% t(xs) X = cbind(array(1, n), t(xn)) Y = X %*% Bc + rnorm(n); X <- X[,2:ncol(X)] #Run the algorithm and then predict obj <- basad( x = X, y = Y) predict( obj, testx = X ) print.basad Print summary output of analysis Description Print summary output from basad analysis. Note that this is the default print method for the package.

print.basad 7 Usage ## S3 method for class 'basad' print(x,...) Arguments x An object of class basad.... Further arguments passed to or from other methods. Author(s) Qingyan Xiang (<qxiang@illinois.edu>) Naveen Narisetty (<naveen@illinois.edu>) References Narisetty, N. N., & He, X. (2014). Bayesian variable selection with shrinking and diffusing priors. The Annals of Statistics, 42(2), 789-817.

Index Topic print print.basad, 6 Topic regression basad, 2 predict.basad, 5 basad, 2 predict.basad, 5 print.basad, 6 8