Variational Bayesian Inference for Parametric and Non-Parametric Regression with Missing Predictor Data

Similar documents
Variational Bayesian Inference for Parametric and Non-Parametric Regression with Missing Predictor Data

Variational Bayesian Inference for Parametric and Nonparametric Regression with Missing Data

Density Estimation. Seungjin Choi

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Integrated Non-Factorized Variational Inference

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

Bayesian linear regression

Bayesian Inference. Chapter 9. Linear models and regression

Sparse Bayesian Logistic Regression with Hierarchical Prior and Variational Inference

IEOR E4570: Machine Learning for OR&FE Spring 2015 c 2015 by Martin Haugh. The EM Algorithm

Bayesian Inference Course, WTCN, UCL, March 2013

Bayesian Methods for Machine Learning

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

GAUSSIAN PROCESS REGRESSION

Bayesian model selection: methodology, computation and applications

Some methods for handling missing values in outcome variables. Roderick J. Little

Markov Chain Monte Carlo methods

Implications of Missing Data Imputation for Agricultural Household Surveys: An Application to Technology Adoption

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Fractional Imputation in Survey Sampling: A Comparative Review

Mean Field Variational Bayes for Elaborate Distributions

Heriot-Watt University

Mean Field Variational Bayes for Elaborate Distributions

Accounting for Complex Sample Designs via Mixture Models

Bayesian Dropout. Tue Herlau, Morten Morup and Mikkel N. Schmidt. Feb 20, Discussed by: Yizhe Zhang

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

A general mixed model approach for spatio-temporal regression data

Statistical Machine Learning Lectures 4: Variational Bayes

Bayesian data analysis in practice: Three simple examples

Expectation Propagation Algorithm

The Bayesian approach to inverse problems

Foundations of Statistical Inference

Lecture 16 : Bayesian analysis of contingency tables. Bayesian linear regression. Jonathan Marchini (University of Oxford) BS2a MT / 15

Lecture 1b: Linear Models for Regression

Lecture 16: Mixtures of Generalized Linear Models

Monte Carlo in Bayesian Statistics

Bayes: All uncertainty is described using probability.

Uncertainty Quantification for Inverse Problems. November 7, 2011

Bayesian Inference for the Multivariate Normal

Bayesian Linear Regression

Introduction to Bayesian Computation

The linear model is the most fundamental of all serious statistical models encompassing:

Analysing geoadditive regression data: a mixed model approach

Modelling geoadditive survival data

Penalized Loss functions for Bayesian Model Choice

Bayesian Additive Regression Tree (BART) with application to controlled trail data analysis

Small Area Estimation for Skewed Georeferenced Data

Unsupervised Learning

A note on multiple imputation for general purpose estimation

A short introduction to INLA and R-INLA

Foundations of Statistical Inference

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Lecture : Probabilistic Machine Learning

Probabilistic Graphical Models

Nonparameteric Regression:

ICES REPORT Model Misspecification and Plausibility

Variational Inference for Count Response Semiparametric Regression

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Bayesian Inference. Chapter 1. Introduction and basic concepts

Closed-form Gibbs Sampling for Graphical Models with Algebraic constraints. Hadi Mohasel Afshar Scott Sanner Christfried Webers

CS Lecture 19. Exponential Families & Expectation Propagation

Bayesian Machine Learning

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation

Plausible Values for Latent Variables Using Mplus

Will Penny. SPM short course for M/EEG, London 2013

David Giles Bayesian Econometrics

A Geometric Variational Approach to Bayesian Inference

Bayesian Linear Models

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Variational Bayes for Hierarchical Mixture Models

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Including historical data in the analysis of clinical trials using the modified power priors: theoretical overview and sampling algorithms

Bagging During Markov Chain Monte Carlo for Smoother Predictions

Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother

Fully Bayesian inference under ignorable missingness in the presence of auxiliary covariates

Expectation propagation as a way of life

Permutation Test for Bayesian Variable Selection Method for Modelling Dose-Response Data Under Simple Order Restrictions

A Bayesian Treatment of Linear Gaussian Regression

Hypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33

Practical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK

Stat 5101 Lecture Notes

GWAS V: Gaussian processes

Multimodal Nested Sampling

Probabilistic Reasoning in Deep Learning

An Information Theoretic Interpretation of Variational Inference based on the MDL Principle and the Bits-Back Coding Scheme

Regression density estimation with variational methods and stochastic approximation

Variational Inference for Heteroscedastic Semiparametric Regression

MCMC algorithms for fitting Bayesian models

Monte Carlo Inference Methods

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Introduction into Bayesian statistics

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayes Factors, posterior predictives, short intro to RJMCMC. Thermodynamic Integration

Econ 673: Microeconometrics

Basics of Modern Missing Data Analysis

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Monte Carlo Methods in Bayesian Inference: Theory, Methods and Applications

Principles of Bayesian Inference

Nonparametric Bayes Uncertainty Quantification

Transcription:

for Parametric and Non-Parametric Regression with Missing Predictor Data August 23, 2010

Introduction Bayesian inference For parametric regression: long history (e.g. Box and Tiao, 1973; Gelman, Carlin, Stern and Rubin, 2004) For non-parametric regression: e.g. mixed model representations of penalized splines (e.g. Ruppert, Wand and Carroll, 2003) For dealing with missingness in data: allows incorporation of standard missing data models (e.g. Little and Rubin, 2004; Daniels and Hogan, 2008) Easy via MCMC, but can be costly in processing time

Introduction Variational Bayes inference Part of mainstream Computer Science methodology (e.g. Bishop, 2006) Recently, used in statistical problems (e.g. Teschendorff et al. 2005; McGrory & Titterington, 2007; Ormerod & Wand, 2010) Deterministic approach that yields approximate inference Involves approximation of posterior densities by other densities for which inference is more tractable Faes, Ormerod and Wand (2010): develop and investigate variational Bayes for regression analysis with missing data

Elements of Variational Bayes Bayesian inference is based on the posterior density function p(θ y) = p(y, θ) p(y) For an arbitrary density function q over Θ, the following inequality holds ( { } ) p(y, θ) p(y) p(y; q) = exp q(θ)log dθ q(θ) Variational Bayes relies on product density restrictions: M q(θ) = q i (θ i ) for some partition {θ 1,..., θ M } of θ i=1 The optimal densities (with minimum KL divergence) can be shown to satisfy q i (θ i ) exp {E θi log p(θ i rest)}

Simple Linear Regression with Missing Predictor Data Assume the model y i = β 0 + β 1 x i + ɛ i, ɛ i N(0, σ 2 ɛ) Cough this in Bayesian framework by taking β 0, β 1 N(0, σ 2 β ) and σ 2 ɛ IG(A ɛ, B ɛ ). Suppose that predictors are susceptible to missingness and assume x i N(µ x, σ 2 x) with hyperpriors µ x N(0, σ 2 µ x ) and σ 2 x IG(A x, B x ) Let R i be the missingness indicators and consider the missingness mechanisms: 1 P(R i = 1) = p: MCAR 2 P(R i = 1) = Φ(φ 0 + φ 1 y i ) for φ 0, φ 1 N(0, σφ 2 ): MAR 3 P(R i = 1) = Φ(φ 0 + φ 1 x i ) for φ 0, φ 1 N(0, σφ 2 ): MNAR Use auxiliary variables a i φ N((Y φ) i, 1) or a i φ N((X φ) i, 1) for the probit regression components

Approximate Inference via Variational Bayes We impose the product density restrictions: MCAR: q(β, σ 2 ɛ, x mis, µ x, σ 2 x ) = q(β, µ x )q(σ 2 ɛ, σ 2 x )q(x mis ) MAR: MNAR: q(β, σ 2 ɛ, x mis, µx, σ2 x, φ, a) = q(β, µx, φ)q(σ2 ɛ, σ2 x )q(x mis )q(a) q(β, σ 2 ɛ, x mis, µx, σ2 x, φ, a) = q(β, µx, φ)q(σ2 ɛ, σ2 x )q(x mis )q(a) For the MCAR, this leads to optimal densities of the form q (β) = Bivariate normal density q (µ x ) = Univariate normal density q (σ 2 ɛ) = Inverse Gamma density q (σ 2 x ) = Inverse Gamma density q (x mis ) = product of Univariate Normal densities For MAR and MNAR situation, derivations of optimal densities for φ and a have easy expressions as well Non-parametric regression give rise to non-standard forms and numerical integration is required (we use numerical integration via quadrature)

Simulation Simple Linear Regression with predictor MCAR Accuracy measure defined as accuracy(q ) = 1 (IAE(q )/sup q IAE(q)) = 1 1 2 IAE(q ) with IAE the integrated absolute error of q

Simulation Simple Linear Regression with predictor MNAR Accuracy drops when amount of missing data is large and when data are noisy Accuracy of missing covariates is high in all situations Poor performance for missing mechanism parameters (due to strong correlation between φ and a)

Nonparametric Regression with Missing Predictor Data Good agreement between variational Bayes and MCMC in fitted functions Time needed: 75 seconds for variational Bayes, 15.5 hours for MCMC

Nonparametric Regression with Missing Predictor Data Variational Bayes are able to handle the multimodality of posteriors of the x mis (coming from periodic nature of f ) Good to excellent performance for all parameters (except for missing mechanism parameters)

Conclusions Variational Bayes inference achieves good to excellent accuracy for main parameters of interest Poor accuracy is realized for the missing data mechanism parameters Better accuracy maybe achieved with a more elaborate variational scheme in situations where they are of interest Variational Bayes approximates multimodal posterior densities with high degree of accuracy Speed-up in the order of several hundreds

Contact Information Christel Faes I-BioStat, Center for Statistics Hasselt University Diepenbeek, Belgium link to paper: http://www.uow.edu.au/ mwand/papers.html