Bayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives


Often interest focuses on comparing a null hypothesis of no difference between groups to an order-restricted alternative. For example, we may have a k-level ordered categorical predictor w_i, with

y_i = β_0 + Σ_{h=1}^{k-1} 1(w_i = h+1) β_h + ε_i,   where ε_i ~ N(0, σ²).

H_0: β_1 = ... = β_{k-1} = 0   (homogeneity, no association)
H_1: β_1 ≤ ... ≤ β_{k-1}   (simple increasing order)

How can we assess this from a Bayesian perspective?
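As a concrete sketch of the indicator parameterization (the predictor values and k = 4 below are hypothetical, chosen only for illustration), the implied design matrix can be built as:

```python
import numpy as np

# Hypothetical ordered categorical predictor with k = 4 levels
w = np.array([1, 2, 3, 4, 2])
k = 4

# Columns: intercept, then 1(w_i = h+1) for h = 1, ..., k-1
X = np.column_stack(
    [np.ones(len(w))] + [(w == h + 1).astype(float) for h in range(1, k)]
)
# The row for w_i = 1 is (1, 0, 0, 0): level 1 is the reference group,
# and beta_h is the mean difference between level h+1 and level 1
```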

Suppose we assume the conjugate prior density β = (β_0, ..., β_{k-1})' ~ N(μ_0, Σ_0) and σ² ~ IG(a_0, b_0). Under this prior density, we can easily calculate the posterior density. Posterior probabilities of β_h < 0 can be calculated, and we can also calculate Pr(β_1 ≤ ... ≤ β_{k-1} | data). How can we address H_0 vs H_1 using this posterior? Is there a better way?
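Given posterior draws of (β_1, ..., β_{k-1}), both quantities are simple Monte Carlo proportions. A minimal sketch, substituting a hypothetical multivariate normal with illustrative mean and covariance for the actual conjugate posterior:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for draws from the conjugate posterior of (beta_1, ..., beta_{k-1});
# the mean and covariance below are illustrative, not derived from data
post_mean = np.array([0.2, 0.5, 0.9])
post_cov = 0.05 * np.eye(3)
draws = rng.multivariate_normal(post_mean, post_cov, size=10_000)

# Pr(beta_h < 0 | data), elementwise over h
p_neg = (draws < 0).mean(axis=0)

# Pr(beta_1 <= ... <= beta_{k-1} | data): proportion of draws in order
p_order = np.all(np.diff(draws, axis=1) >= 0, axis=1).mean()
```

Note that under this continuous posterior, Pr(β_1 = ... = β_{k-1} = 0 | data) is exactly zero, which is the difficulty discussed next.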

The Bayes factor is a standard way of comparing two hypotheses, H_0 and H_1. To calculate the Bayes factor, we need the prior and posterior probabilities of each of the two hypotheses. What are these probabilities under the conjugate normal prior? Can we use Pr(H_0) = 1 − Pr(H_1) = 1 − Pr(β_1 ≤ ... ≤ β_{k-1}) as the prior? Why or why not?

The problem with this approach is that the typical normal conjugate prior assigns zero probability to the null hypothesis, so the above strategy doesn't make sense. Instead, we want to choose a prior density for β that allocates probability to H_0 and H_1, with these probabilities summing to one. Essentially, we need a prior with support on the restricted space Ω = {β : β_1 ≤ ... ≤ β_{k-1}}, with positive probability assigned to equalities.

We would also like a prior that is easy to elicit and results in easy computation. To place order restrictions on parameters in Bayesian models, Gelfand, Smith and Lee (1992) proposed priors of the form π(β) ∝ 1(β ∈ Ω) N(β; μ_0, Σ_0), which is a truncated Gaussian density. This prior allocates probability one to the restricted space Ω. In addition, the full conditional densities of the β_h's retain a conditionally conjugate (truncated) normal form. Is this approach good for comparing H_0 and H_1?
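A minimal sketch of drawing from such a truncated Gaussian by rejection, i.e. discarding draws that violate the order restriction (the function name and settings are illustrative; in practice one would instead Gibbs-sample each coordinate from its univariate truncated normal full conditional, which scales much better):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_ordered_normal(mu, cov, n, rng):
    """Draw n points from N(mu, cov) restricted to the order cone
    {beta : beta_1 <= ... <= beta_{k-1}} by simple rejection."""
    kept = []
    while len(kept) < n:
        cand = rng.multivariate_normal(mu, cov, size=n)
        in_order = np.all(np.diff(cand, axis=1) >= 0, axis=1)
        kept.extend(cand[in_order])
    return np.array(kept[:n])

samples = sample_ordered_normal(np.zeros(3), np.eye(3), 500, rng)
```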

Actually, we are still assigning zero prior probability to the null hypothesis H_0. By discarding draws from the multivariate normal density that are inconsistent with β_1 ≤ ... ≤ β_{k-1}, we ensure that strictly increasing order is satisfied; however, we never draw a value of β such that β_j = β_h. A generalization is to include point masses to accommodate equalities.

In particular, first reparameterize so that γ_1 = β_1 and γ_j = β_j − β_{j−1} for j = 2, ..., k−1. Then choose the following prior density:

π(β_0, γ) = N(β_0; μ_0, σ_0²) ∏_{h=1}^{k-1} { π_{0h} 1(γ_h = 0) + (1 − π_{0h}) 1(γ_h > 0) N(γ_h; μ_h, σ_h²) / ∫_0^∞ N(z; μ_h, σ_h²) dz }

The γ_h parameters are assigned prior densities consisting of mixtures of a point mass at zero (with probability π_{0h}) and a normal density truncated below by zero. The prior probability of equivalent means for individuals with w_i = j and w_i = j+1 is π_{0j}, for j = 1, ..., k−1. The prior probability of the overall null hypothesis H_0 is π_0 = ∏_{j=1}^{k-1} π_{0j}.
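The reparameterization and its inverse are just differencing and cumulative summing, and zeros in γ encode equalities among adjacent β's. A small check with hypothetical coefficient values:

```python
import numpy as np

# Hypothetical coefficients with beta_1 = beta_2 and beta_3 = beta_4 (k - 1 = 4)
beta = np.array([0.0, 0.0, 0.4, 0.4])

# gamma_1 = beta_1, and gamma_j = beta_j - beta_{j-1} for j = 2, ..., k-1
gamma = np.concatenate(([beta[0]], np.diff(beta)))

# Inverse map: beta_j = gamma_1 + ... + gamma_j
beta_back = np.cumsum(gamma)

# gamma = (0, 0, 0.4, 0): the zeros mark the tied adjacent group means
```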

Under this prior, Pr(H_0) = π_0 and Pr(H_1) = 1 − π_0, and the prior has support on the restricted space Ω. In addition, the prior density is conditionally conjugate, with the full conditional posterior of γ_h of the form

π̂_h 1(γ_h = 0) + (1 − π̂_h) 1(γ_h > 0) N(γ_h; μ̂_h, σ̂_h²) / ∫_0^∞ N(z; μ̂_h, σ̂_h²) dz,

where μ̂_h and σ̂_h² are the posterior mean and variance derived under an unrestricted N(μ_{0h}, σ_{0h}²) prior density for γ_h, and π̂_h is the posterior probability of γ_h = 0 given the data and the other parameters.

Due to the simplicity of this form, we can simply proceed by a Gibbs sampling algorithm:

1. Specify initial values for β_0, γ and σ².
2. Update σ² by sampling from its IG full conditional.
3. Update β_0 by sampling from its normal full conditional.
4. Update γ_h, for h = 1, ..., k−1, by sampling from the zero-inflated truncated normal full conditional:
   (a) Sample the point-mass indicator from Bernoulli(π̂_h).
   (b) If not in the point mass, sample from N(μ̂_h, σ̂_h²) truncated below by 0.
5. Repeat steps 2–4.
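Step 4 can be sketched as follows, treating the conditional quantities π̂_h, μ̂_h, σ̂_h as given (the numeric values passed in below are placeholders, not derived from data; the truncated normal is drawn by simple rejection, which is fine for a sketch but an inverse-CDF sampler is preferable when μ̂_h is far below zero):

```python
import numpy as np

rng = np.random.default_rng(2)

def update_gamma_h(pi_hat, mu_hat, sigma_hat, rng):
    """One draw of gamma_h from its zero-inflated truncated normal full
    conditional: 0 with probability pi_hat, otherwise N(mu_hat, sigma_hat^2)
    truncated below by 0."""
    if rng.random() < pi_hat:          # step 4(a): land in the point mass
        return 0.0
    while True:                        # step 4(b): rejection sampling of the
        x = rng.normal(mu_hat, sigma_hat)  # positive part of the normal
        if x > 0.0:
            return x

# Placeholder conditional values, for illustration only
draws = np.array([update_gamma_h(0.3, 0.5, 1.0, rng) for _ in range(2000)])
```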

Calculation of Bayes factors for hypothesis testing. From the Gibbs sampling output, we have samples from the posterior density of γ. The elements of γ that are equal to zero tell us which hypothesis holds for a given sample; for example, γ_1 = ... = γ_{k-1} = 0 implies H_0. Thus, we are effectively moving between different hypotheses in implementing the Gibbs sampler, in the same way that stochastic search algorithms move between models with different predictors. The posterior probability of a given hypothesis can be calculated as simply the proportion of samples for which that hypothesis holds.
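A sketch of this calculation on hypothetical Gibbs output (only four draws, so the numbers are purely illustrative):

```python
import numpy as np

# Hypothetical posterior draws of gamma = (gamma_1, ..., gamma_{k-1})
gamma_draws = np.array([
    [0.0, 0.0, 0.0],   # consistent with H0
    [0.0, 0.3, 0.0],
    [0.0, 0.0, 0.0],   # consistent with H0
    [0.2, 0.1, 0.4],
])

# Posterior probability of H0: proportion of draws with all gammas zero
p_h0 = np.mean(np.all(gamma_draws == 0.0, axis=1))
p_h1 = 1.0 - p_h0

# Bayes factor for H0 vs H1: posterior odds divided by prior odds,
# with pi0 = Pr(H0) a priori (set to 0.5 here for illustration)
pi0 = 0.5
bf_01 = (p_h0 / p_h1) / (pi0 / (1.0 - pi0))
```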

Discussion. This strategy is very useful for inference on effects of ordered categorical predictors. For binary and ordered categorical response data, the same approach can be applied by using a probit model for the ordinal response, with data augmentation (Albert and Chib, 1993) for computation. It can also be used for analysis of discrete-time survival data, using a continuation-ratio probit model to characterize the survival likelihood. For other GLMs similar approaches can be used, but the prior is no longer conjugate, so computation can be more intensive.

Midterm Review Problem Set

1. Suppose that 2500 pregnant women are enrolled in a study and the outcome is the occurrence of preterm birth. Possible predictors of preterm birth include the woman's age, smoking, socioeconomic status, body mass index, bleeding during pregnancy, serum level of DDE, and several dietary factors. Formulate the problem of selecting the important predictors of preterm birth in a generalized linear model (GLM) framework. Show the components of the GLM, including the link function and distribution (in exponential family form). Describe (briefly) how estimation and inference could proceed via a frequentist approach.

2. Women are enrolled in a study when they go off contraception with the intention of achieving a pregnancy. Suppose there are 350 women in the study who provide information on the number of menstrual cycles required to achieve a pregnancy, whether or not they smoke cigarettes, and their age at the beginning of the attempt. Describe a statistical model for addressing the question: Is cigarette smoking related to time to pregnancy? Formulate the statistical model within a Bayesian framework and outline the details of model fitting and inference (including the form of the posterior density, an outline of the algorithm for posterior computation, and the approach for addressing the scientific question based on the posterior).

3. A study is conducted examining the impact of alcohol intake during pregnancy on the occurrence of birth defects of 5 different types. Outcome data for a child consist of 5 binary indicators of the presence or absence of the different birth defects. A physician working with you on the study notes that certain children have several birth defects, possibly due to defects in important unmeasured genes, while most children have no defects. Describe a latent variable model for analyzing these data and outline (briefly) the details of a Bayesian analysis (including the form of the posterior density, an outline of the algorithm for posterior computation, and the approach for addressing the scientific question based on the posterior).

4. A toxicology study is conducted in which pregnant mice are exposed to different doses of a chemical. The outcome data consist of an ordinal ranking of the sickness of each pup in each litter, with 1 = healthy, 2 = low birth weight but otherwise healthy, 3 = malformed, and 4 = dead. The goal of the study is to see if dose is associated with the health of the pup. Describe a model and analytic strategy. What is the interpretation of the model parameters? What assumptions are being made, and can they be relaxed?