Flexible Regression Modeling using Bayesian Nonparametric Mixtures

Athanasios Kottas
Department of Applied Mathematics and Statistics, University of California, Santa Cruz

Department of Statistics, Brigham Young University
November 6, 2008

Outline

1. Introduction and motivation
2. Dirichlet process mixture models
3. Curve fitting using Dirichlet process mixtures
4. Bayesian nonparametric quantile regression
5. Modeling for stock-recruitment relationships
6. Current/future work

1. Introduction and motivation

Two dominant trends in the Bayesian regression literature:
- seek increasingly flexible regression function models
- accompany these models with more comprehensive uncertainty quantification

Typically, Bayesian nonparametric modeling focuses on either the regression function or the error distribution.

Bayesian nonparametric extension of implied conditional regression:
- use a flexible nonparametric mixture model for the joint distribution of response and covariates
- obtain full inference for the desired conditional distribution for response given covariates

Both the response distribution and, implicitly, the regression relationship are modeled nonparametrically, thus providing a flexible framework for the general regression problem.

The area of Bayesian nonparametrics provides the framework for such modeling:
- instead of specifying unknown functions and distributions up to a (small) number of parameters, treat them as the random model parameters
- nonparametric priors support the underlying spaces of random functions/distributions, resulting in flexible inferences and more reliable predictions

Modeling utilizes Dirichlet process mixtures, a flexible class of nonparametric mixture models.

2. Dirichlet process mixture models

The Dirichlet process (DP) (Ferguson, 1973) is a random probability measure, i.e., a prior on spaces of distributions, characterized by two parameters: a base distribution $G_0$ (the center of the process) and a (precision) parameter $\alpha > 0$.

DP constructive definition (Sethuraman, 1994):
- let $\{z_s, s = 1, 2, \ldots\}$ and $\{\phi_j, j = 1, 2, \ldots\}$ be independent sequences of random variables, with $z_s$ i.i.d. $\text{Beta}(1, \alpha)$ and $\phi_j$ i.i.d. $G_0$
- define $\omega_1 = z_1$ and $\omega_j = z_j \prod_{s=1}^{j-1} (1 - z_s)$, $j \geq 2$ (stick-breaking construction)
- then, a realization $G$ from $\text{DP}(\alpha, G_0)$ is (almost surely) of the form
  $$G(\cdot) = \sum_{j=1}^{\infty} \omega_j \delta_{\phi_j}(\cdot)$$
  i.e., a discrete distribution that can be represented as a countable mixture of point masses

[Figure: left panel, weights w against x; right panel, P(X<x) against x.]

DP with $G_0 = N(0, 1)$ and $\alpha = 20$. In the left panel, the spiked lines are located at 1000 sampled values of x drawn from $N(0, 1)$, with heights given by the weights $\omega_l$, calculated using the stick-breaking algorithm (a truncated version, so that the weights sum to 1). These spikes are then summed from left to right to generate one cdf sample path from the DP. The right panel shows 8 such sample paths, indicated by the lighter jagged lines. The heavy smooth line indicates the $N(0, 1)$ cdf.
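The procedure in the figure caption can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the original figure: the function names and the choice to set the last $z$ to 1 (one common way to make truncated weights sum to 1) are assumptions.

```python
import random


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights with z_s ~ Beta(1, alpha)."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for j in range(n_atoms):
        # Assumption: the final z is set to 1, so the last atom absorbs the
        # leftover mass and the truncated weights sum to 1.
        z = 1.0 if j == n_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(z * remaining)
        remaining *= 1.0 - z
    return weights


def sample_dp_cdf(alpha, n_atoms, rng):
    """One approximate DP(alpha, N(0, 1)) realization as a step-function cdf.

    Returns the sorted atom locations and the cdf value just after each atom.
    """
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    atoms = [rng.gauss(0.0, 1.0) for _ in range(n_atoms)]  # phi_j ~ G_0
    locations, cdf, total = [], [], 0.0
    # Sum the spikes from left to right, as in the figure description.
    for atom, weight in sorted(zip(atoms, weights)):
        total += weight
        locations.append(atom)
        cdf.append(total)
    return locations, cdf


rng = random.Random(0)
locations, cdf = sample_dp_cdf(alpha=20.0, n_atoms=1000, rng=rng)
```

Calling `sample_dp_cdf` repeatedly with the same `rng` gives further sample paths; larger values of `alpha` spread the mass over more atoms, so the paths track the $N(0, 1)$ cdf more closely.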

Dirichlet process mixture model: for a parametric family of distributions $K(\cdot; \theta)$, $\theta \in \Theta \subseteq R^q$, define
$$F(\cdot; G) = \int K(\cdot; \theta)\, dG(\theta), \quad G \sim \text{DP}(\alpha, G_0)$$

DP mixture prior can model both discrete and continuous distributions.

Hierarchical model: for $y_1, \ldots, y_n$ i.i.d., given $G$, from $F(\cdot; G)$,
$$y_i \mid \theta_i \overset{ind.}{\sim} K(\cdot; \theta_i), \quad i = 1, \ldots, n$$
$$\theta_i \mid G \overset{i.i.d.}{\sim} G, \quad i = 1, \ldots, n$$
$$G \sim \text{DP}(\alpha, G_0)$$
- typically, hyperpriors on $\alpha$ and/or the parameters $\psi$ of $G_0 \equiv G_0(\psi)$ are added
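To make the hierarchical model concrete, the sketch below simulates from it with $G$ integrated out. It uses the Blackwell-MacQueen Pólya urn representation of the DP, a standard result that the slides do not state: $\theta_i$ is a fresh draw from $G_0$ with probability $\alpha / (\alpha + i - 1)$, and otherwise a copy of one of the earlier $\theta$ values. The normal kernel, the choice $G_0 = N(0, 1)$, and all names are assumptions made for illustration.

```python
import random


def simulate_dp_mixture(n, alpha, rng, kernel_sd=0.3):
    """Simulate y_1, ..., y_n from a DP mixture with G marginalized out.

    Assumed for illustration: K(.; theta) = N(theta, kernel_sd^2) and
    G_0 = N(0, 1); neither choice comes from the slides.
    """
    thetas, ys = [], []
    for i in range(n):
        # Polya urn step: with i earlier draws, a new value from G_0 has
        # probability alpha / (alpha + i); otherwise copy an earlier theta
        # chosen uniformly at random.
        if rng.random() < alpha / (alpha + i):
            theta = rng.gauss(0.0, 1.0)
        else:
            theta = rng.choice(thetas)
        thetas.append(theta)
        ys.append(rng.gauss(theta, kernel_sd))  # y_i | theta_i ~ K(.; theta_i)
    return thetas, ys


rng = random.Random(1)
thetas, ys = simulate_dp_mixture(n=200, alpha=2.0, rng=rng)
n_distinct = len(set(thetas))  # number of distinct mixing parameters
```

Because realizations of $G$ are discrete, the $\theta_i$ contain ties, so `n_distinct` is typically much smaller than `n`. That clustering is what makes the model behave like a mixture with an unknown number of components.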

3. Curve fitting using Dirichlet process mixtures

Focus on univariate continuous response $y$ (though extensions are currently being studied for categorical and/or multivariate responses).

DP mixture model for the joint density $f(y, x)$ of the response $y$ and the vector of covariates $x$:
$$f(y, x) \equiv f(y, x; G) = \int k(y, x; \theta)\, dG(\theta), \quad G \sim \text{DP}(\alpha, G_0(\psi))$$

For the mixture kernel $k(y, x; \theta)$ use:
- multivariate normal for (real-valued) continuous response and covariates
- mixed continuous/discrete distribution to incorporate both categorical and continuous covariates
- kernel component for $y$ supported by $R^+$ for problems in survival/reliability analysis

Again, introduce latent mixing parameters $\theta = \{\theta_i : i = 1, \ldots, n\}$, one for each response/covariate observation $(y_i, x_i)$, $i = 1, \ldots, n$.

Full posterior:
$$p(G, \theta, \alpha, \psi \mid \text{data}) = p(G \mid \theta, \alpha, \psi)\, p(\theta, \alpha, \psi \mid \text{data})$$
- $p(\theta, \alpha, \psi \mid \text{data})$ is the posterior of the finite-dimensional parameter vector that results by marginalizing $G$ over its DP prior; MCMC posterior simulation is used to sample from this marginal posterior
- $p(G \mid \theta, \alpha, \psi)$ is a DP with precision parameter $\alpha + n$ and mean
  $$(\alpha + n)^{-1} \Big\{ \alpha G_0(\cdot; \psi) + \sum_{j=1}^{n^*} n_j \delta_{\theta_j^*}(\cdot) \Big\}$$
  where $n^*$ is the number of distinct $\theta_i$, $\theta_j^*$ is the $j$-th distinct value, and $n_j$ is the size of the $j$-th distinct component; sample using the DP stick-breaking definition with a truncation approximation

Alternatively, $G$ can be truncated from the outset, resulting in a finite mixture model that can be fitted with Gibbs sampling.
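The second step, drawing $G$ from its conditional DP given $\theta$, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the sampler used in the talk: $\theta$ is scalar, $G_0 = N(0, 1)$, $\alpha$ is held fixed, and the function name and truncation level are hypothetical.

```python
import random


def sample_posterior_G(thetas, alpha, n_atoms, rng):
    """Truncated stick-breaking draw of G given the latent thetas.

    Target: DP(alpha + n, (alpha * G_0 + sum_i delta_{theta_i}) / (alpha + n)).
    Assumed for illustration: scalar theta and G_0 = N(0, 1).
    """
    n = len(thetas)
    precision = alpha + n
    weights, atoms = [], []
    remaining = 1.0
    for j in range(n_atoms):
        # Stick-breaking with the updated precision; the last z is set to 1
        # so that the truncated weights sum to 1.
        z = 1.0 if j == n_atoms - 1 else rng.betavariate(1.0, precision)
        weights.append(z * remaining)
        remaining *= 1.0 - z
        # Atom from the updated mean: G_0 with probability alpha / (alpha + n),
        # otherwise one of the theta_i, so a distinct value theta*_j is picked
        # with probability n_j / (alpha + n).
        if rng.random() < alpha / precision:
            atoms.append(rng.gauss(0.0, 1.0))
        else:
            atoms.append(rng.choice(thetas))
    return weights, atoms


rng = random.Random(2)
thetas = [-1.0] * 20 + [0.5] * 30  # two distinct values: n_1 = 20, n_2 = 30
weights, atoms = sample_posterior_G(thetas, alpha=2.0, n_atoms=1000, rng=rng)
mass_on_data = sum(w for w, a in zip(weights, atoms) if a in (-1.0, 0.5))
```

Since the precision is $\alpha + n$, the weights decay more slowly than under the prior, so the truncation level has to grow with $n$ for the approximation to stay accurate.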

For any grid of values $(y_0, x_0)$, obtain posterior samples for:
- joint density $f(y_0, x_0; G)$, marginal density $f(x_0; G)$, and therefore conditional density $f(y_0 \mid x_0; G)$
- conditional expectation $E(y \mid x_0; G)$, which, estimated over a grid in $x$, provides inference for the regression relationship
- conditioning in $f(y_0 \mid x_0; G)$ and/or $E(y \mid x_0; G)$ may involve only a portion of the vector $x$

Key features of the modeling approach:
- full and exact nonparametric inference (no need for asymptotics)
- model for both non-linear regression curves and non-standard shapes for the conditional response density
- model does not rely on additive regression formulations; it can uncover interactions between covariates that might influence the regression relationship
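For a single (truncated) realization of $G$ with weights $\omega_j$ and bivariate normal atoms $(\mu_j, \Sigma_j)$, the conditional density and expectation above have closed forms: each component contributes its own normal regression of $y$ on $x$, weighted in proportion to $\omega_j N(x_0; \mu_{x,j}, \Sigma_{xx,j})$. The sketch below assumes one continuous covariate and a normal kernel; the function names and the tuple layout for means and covariances are hypothetical.

```python
from statistics import NormalDist


def _local_terms(x0, weights, means, covs):
    """Per-component (weight at x0, conditional mean, conditional variance).

    means[j] = (mu_y, mu_x); covs[j] = (s_yy, s_yx, s_xx).
    """
    terms = []
    for w, (mu_y, mu_x), (s_yy, s_yx, s_xx) in zip(weights, means, covs):
        # omega_j * N(x0; mu_x, s_xx): contribution to the marginal f(x0; G)
        wx = w * NormalDist(mu_x, s_xx ** 0.5).pdf(x0)
        local_mean = mu_y + (s_yx / s_xx) * (x0 - mu_x)
        local_var = s_yy - s_yx ** 2 / s_xx
        terms.append((wx, local_mean, local_var))
    return terms


def conditional_mean(x0, weights, means, covs):
    """E(y | x0; G) for a finite mixture of bivariate normals."""
    terms = _local_terms(x0, weights, means, covs)
    return sum(wx * m for wx, m, _ in terms) / sum(wx for wx, _, _ in terms)


def conditional_density(y0, x0, weights, means, covs):
    """f(y0 | x0; G) = f(y0, x0; G) / f(x0; G)."""
    terms = _local_terms(x0, weights, means, covs)
    numer = sum(wx * NormalDist(m, v ** 0.5).pdf(y0) for wx, m, v in terms)
    return numer / sum(wx for wx, _, _ in terms)


# Two-component example; evaluating over a grid in x traces the regression curve.
weights = [0.5, 0.5]
means = [(0.0, -1.0), (2.0, 1.0)]
covs = [(1.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
curve = [conditional_mean(x / 10.0, weights, means, covs) for x in range(-20, 21)]
```

Repeating this calculation for each posterior draw of $G$ gives posterior samples of the curve, from which point and interval estimates can be read off pointwise. Although every component is linear in $x$, the covariate-dependent weights make the overall curve non-linear.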

Data Example

Simulated data set with a continuous response $y$, one continuous covariate $x_c$, and one binary categorical covariate $x_d$:
- $x_{ci} \overset{ind.}{\sim} N(0, 1)$
- $x_{di} \mid x_{ci} \overset{ind.}{\sim} \text{Bernoulli}(\text{probit}(x_{ci}))$
- $y_i \mid x_{ci}, x_{di} \overset{ind.}{\sim} N(h(x_{ci}), \sigma_{x_{di}})$, with $\sigma_0 = 0.25$, $\sigma_1 = 0.5$, and $h(x_c) = 0.4 x_c \sin(2.7 x_c) + 1.1 (1 + x_c^2)^{-1}$
- two sample sizes: $n = 200$ and $n = 2000$

DP mixture model with a mixed normal/Bernoulli kernel:
$$f(y, x_c, x_d; G) = \int N_2(y, x_c; \mu, \Sigma)\, \pi^{x_d} (1 - \pi)^{1 - x_d}\, dG(\mu, \Sigma, \pi)$$
with $G \sim \text{DP}(\alpha, G_0)$ and $G_0(\mu, \Sigma, \pi) = N_2(\mu; m, V)\, \text{IWish}(\Sigma; \nu, S)\, \text{Beta}(\pi; a, b)$

[Figure: six panels plotting h(x) against x.]

Posterior point and 90% interval estimates (dashed and dotted lines) for the conditional response expectation $E(y \mid x_c, x_d = 0; G)$ (left panels), $E(y \mid x_c, x_d = 1; G)$ (middle panels), and $E(y \mid x_c; G)$ (right panels). The corresponding data are plotted in grey for the sample of size $n = 200$ (top panels) and $n = 2000$ (bottom panels). The solid line denotes the true curve.

4. Bayesian nonparametric quantile regression

In regression settings, the covariates may have an effect not only on the center of the response distribution but also on its shape.

Quantile regression quantifies the relationship between a set of quantiles of the response distribution and the covariates, and thus provides a more complete explanation of the response distribution in terms of available covariates.

Semiparametric additive quantile regression framework:
$$y_i = h(x_i) + \varepsilon_i$$
where the $\varepsilon_i$ are i.i.d. from a distribution with $p$-th quantile equal to 0
- earlier work on Bayesian semiparametric modeling with parametric quantile regression functions and nonparametric priors for unimodal error densities (Kottas & Krnjajić, 2008)

Alternative model-based nonparametric approach (Taddy & Kottas, 2007):
- model the joint density $f(y, x)$ of the response $y$ and the $M$-variate vector of (continuous) covariates $x$ with a DP mixture of normals:
  $$f(y, x; G) = \int N_{M+1}(y, x; \mu, \Sigma)\, dG(\mu, \Sigma), \quad G \sim \text{DP}(\alpha, G_0)$$
  with $G_0(\mu, \Sigma) = N_{M+1}(\mu; m, V)\, \text{IWish}(\Sigma; \nu, S)$

For any grid of values $(y_0, x_0)$, obtain posterior samples for:
- conditional density $f(y_0 \mid x_0; G)$ and conditional cdf $F(y_0 \mid x_0; G)$
- conditional quantile regression $q_p(x_0; G)$, for any $0 < p < 1$
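Given one realization of $G$, the conditional cdf is a weighted sum of normal cdfs, and the quantile $q_p(x_0; G)$ can be found by inverting it numerically. The sketch below is one way to do this, not necessarily the authors' implementation: it assumes a single covariate ($M = 1$), a finite set of atoms, and plain bisection for the inversion, and the function names are hypothetical.

```python
from statistics import NormalDist


def conditional_cdf(y0, x0, weights, means, covs):
    """F(y0 | x0; G) for a finite mixture of bivariate normals.

    means[j] = (mu_y, mu_x); covs[j] = (s_yy, s_yx, s_xx).
    """
    numer = 0.0
    denom = 0.0
    for w, (mu_y, mu_x), (s_yy, s_yx, s_xx) in zip(weights, means, covs):
        wx = w * NormalDist(mu_x, s_xx ** 0.5).pdf(x0)
        local_mean = mu_y + (s_yx / s_xx) * (x0 - mu_x)
        local_sd = (s_yy - s_yx ** 2 / s_xx) ** 0.5
        numer += wx * NormalDist(local_mean, local_sd).cdf(y0)
        denom += wx
    return numer / denom


def conditional_quantile(p, x0, weights, means, covs, tol=1e-8):
    """q_p(x0; G): solve F(q | x0; G) = p by bisection (assumed method)."""
    lo, hi = -1.0, 1.0
    # Widen the bracket until it contains the p-th quantile.
    while conditional_cdf(lo, x0, weights, means, covs) > p:
        lo *= 2.0
    while conditional_cdf(hi, x0, weights, means, covs) < p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if conditional_cdf(mid, x0, weights, means, covs) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


weights = [0.6, 0.4]
means = [(0.0, -1.0), (3.0, 1.0)]
covs = [(1.0, 0.3, 1.0), (0.5, -0.2, 1.0)]
median = conditional_quantile(0.5, 0.0, weights, means, covs)
upper = conditional_quantile(0.9, 0.0, weights, means, covs)
```

Since every quantile is computed from the same draw of $G$, the median and 90th percentile curves are automatically consistent with one another.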

Key features:
- modeling framework enables simultaneous inference for more than one quantile regression
- model allows flexible response distributions and non-linear quantile regression relationships

Extensions to modeling for partially observed responses (and/or covariates):
- fully nonparametric Tobit quantile regression for econometrics data

Data Example

Moral hazard data on the relationship between shareholder concentration and several indices for managerial moral hazard in the form of expenditure with scope for private benefit (Yafeh & Yosha, 2003):
- data set includes a variety of variables describing 185 Japanese industrial chemical firms listed on the Tokyo stock exchange
- response $y$: index MH5, consisting of general sales and administrative expenses deflated by sales
- four-dimensional covariate vector $x$: Leverage (ratio of debt to total assets); log(assets); Age of the firm; and TOPTEN (the percent of ownership held by the ten largest shareholders)

[Figure: "Marginal Average Medians with 90% CI"; four panels plotting Moral Hazard against TOPTEN, Leverage, Age, and Log(Assets).]

Posterior mean and 90% interval estimates for median regression for MH5 conditional on each individual covariate. Data scatterplots are shown in grey.

[Figure: "Marginal Average 90th Percentiles with 90% CI"; four panels plotting Moral Hazard against TOPTEN, Leverage, Age, and Log(Assets).]

Posterior mean and 90% interval estimates for 90th percentile regression for MH5 conditional on each individual covariate. Data scatterplots are shown in grey.

[Figure: surfaces plotted over Leverage and TOPTEN.]

Posterior estimates of median surfaces (left column) and 90th percentile surfaces (right column) for MH5 conditional on Leverage and TOPTEN. The posterior mean is shown on the top row and the posterior interquartile range on the bottom.

[Figure: conditional density for MH5.]

Posterior mean and 90% interval estimates for response densities $f(y \mid x_0; G)$ conditional on four combinations of values $x_0$ for the covariate vector (TOPTEN, Leverage, Age, log(assets)).

5. Modeling for stock-recruitment relationships

The relationship between the number of mature individuals of a species (stock biomass, $S$) and the production of offspring (recruitment, $R$) is fundamental to the behavior of any ecological system.

Special relevance in fisheries research, where the stock-recruitment (S-R) relationship applies directly to decision problems of fishery management.

A common way of writing this relationship is
$$\log(R/S) = g(S) + \varepsilon$$
where $g$ is the S-R function and $\varepsilon$ are additive (typically, normal) errors.

Work is part of an NSF project (joint with Steve Munch, Stony Brook University).

Standard ecological assumption: as stock abundance increases, successful recruitment per individual (reproductive success) decreases
- a wide variety of factors affect the S-R relationship, and there are many competing models for the influence of biological and physical mechanisms
- small amounts of noisy data are typically available to infer S-R relationships

Traditional (parametric) models may be too limited to extract the relevant information from the data, and to provide reliable predictions and/or temporal forecasts.

DP mixture modeling approach to capture the nature of recruitment dependence upon stock without making parametric assumptions for either the S-R function or the errors around it (Fronczyk, Kottas & Munch, 2008).

DP mixture of bivariate normals for the joint distribution of log-reproductive success, $y = \log(R/S)$, and stock biomass, $x = S$:
$$f(y, x; G) = \int N_2(y, x; \mu, \Sigma)\, dG(\mu, \Sigma), \quad G \sim \text{DP}(\alpha, G_0)$$

Various types of practically important inference:
- inference for the S-R relationship through the conditional expectation functional $E(y \mid x; G)$
- inference for log-reproductive success for any specified stock biomass value, $x_0 = S_0$, through the conditional density $f(y \mid x_0; G)$
- inference for biological reference points through the conditional density $f(x \mid y_0; G)$ for specific log-reproductive success values $y_0$

Cod data from six North Atlantic regions. For each region, posterior mean (blue) and 95% interval estimates (red) for the conditional mean log-reproductive success.

Cod data. For the NE Arctic (top panels) and West of Scotland (bottom panels) regions, posterior mean (blue) and 95% interval estimates (red) for the conditional density of log-reproductive success at four specified stock biomass values.

6. Current/future work

General framework with several potentially important applications:
- nonparametric switching regression modeling (Taddy & Kottas, 2008)
- modeling and inference for marked point processes over time or space (with Matt Taddy)
- fully nonparametric regression for censored survival data
- nonparametric regression models for multivariate ordinal responses (with Kassie Fronczyk)
- sensitivity analysis and inversion for computer model experiments (with Marian Farah)

Contact info: web: thanos UCSC Department of Applied Math and Statistics: Technical Reports series:

THANKS!!!


More information

Bayesian Statistics. Debdeep Pati Florida State University. April 3, 2017

Bayesian Statistics. Debdeep Pati Florida State University. April 3, 2017 Bayesian Statistics Debdeep Pati Florida State University April 3, 2017 Finite mixture model The finite mixture of normals can be equivalently expressed as y i N(µ Si ; τ 1 S i ), S i k π h δ h h=1 δ h

More information

Modeling for Dynamic Ordinal Regression Relationships: An. Application to Estimating Maturity of Rockfish in California

Modeling for Dynamic Ordinal Regression Relationships: An. Application to Estimating Maturity of Rockfish in California Modeling for Dynamic Ordinal Regression Relationships: An Application to Estimating Maturity of Rockfish in California Maria DeYoreo and Athanasios Kottas Abstract We develop a Bayesian nonparametric framework

More information

Bayesian semiparametric inference for the accelerated failure time model using hierarchical mixture modeling with N-IG priors

Bayesian semiparametric inference for the accelerated failure time model using hierarchical mixture modeling with N-IG priors Bayesian semiparametric inference for the accelerated failure time model using hierarchical mixture modeling with N-IG priors Raffaele Argiento 1, Alessandra Guglielmi 2, Antonio Pievatolo 1, Fabrizio

More information

Hybrid Dirichlet processes for functional data

Hybrid Dirichlet processes for functional data Hybrid Dirichlet processes for functional data Sonia Petrone Università Bocconi, Milano Joint work with Michele Guindani - U.T. MD Anderson Cancer Center, Houston and Alan Gelfand - Duke University, USA

More information

Bayesian model selection for computer model validation via mixture model estimation

Bayesian model selection for computer model validation via mixture model estimation Bayesian model selection for computer model validation via mixture model estimation Kaniav Kamary ATER, CNAM Joint work with É. Parent, P. Barbillon, M. Keller and N. Bousquet Outline Computer model validation

More information

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Nonparametric Bayesian Methods - Lecture I

Nonparametric Bayesian Methods - Lecture I Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Bayesian Nonparametrics

Bayesian Nonparametrics Bayesian Nonparametrics Lorenzo Rosasco 9.520 Class 18 April 11, 2011 About this class Goal To give an overview of some of the basic concepts in Bayesian Nonparametrics. In particular, to discuss Dirichelet

More information

Nonparametric Bayes Density Estimation and Regression with High Dimensional Data

Nonparametric Bayes Density Estimation and Regression with High Dimensional Data Nonparametric Bayes Density Estimation and Regression with High Dimensional Data Abhishek Bhattacharya, Garritt Page Department of Statistics, Duke University Joint work with Prof. D.Dunson September 2010

More information

Nonparametric Bayesian Modeling for Multivariate Ordinal. Data

Nonparametric Bayesian Modeling for Multivariate Ordinal. Data Nonparametric Bayesian Modeling for Multivariate Ordinal Data Athanasios Kottas, Peter Müller and Fernando Quintana August 18, 2004 Abstract We propose a probability model for k-dimensional ordinal outcomes,

More information

A comparative review of variable selection techniques for covariate dependent Dirichlet process mixture models

A comparative review of variable selection techniques for covariate dependent Dirichlet process mixture models A comparative review of variable selection techniques for covariate dependent Dirichlet process mixture models William Barcella 1, Maria De Iorio 1 and Gianluca Baio 1 1 Department of Statistical Science,

More information

On Simulations form the Two-Parameter. Poisson-Dirichlet Process and the Normalized. Inverse-Gaussian Process

On Simulations form the Two-Parameter. Poisson-Dirichlet Process and the Normalized. Inverse-Gaussian Process On Simulations form the Two-Parameter arxiv:1209.5359v1 [stat.co] 24 Sep 2012 Poisson-Dirichlet Process and the Normalized Inverse-Gaussian Process Luai Al Labadi and Mahmoud Zarepour May 8, 2018 ABSTRACT

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

Advanced Machine Learning

Advanced Machine Learning Advanced Machine Learning Nonparametric Bayesian Models --Learning/Reasoning in Open Possible Worlds Eric Xing Lecture 7, August 4, 2009 Reading: Eric Xing Eric Xing @ CMU, 2006-2009 Clustering Eric Xing

More information

A Nonparametric Model for Stationary Time Series

A Nonparametric Model for Stationary Time Series A Nonparametric Model for Stationary Time Series Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. isadora.antoniano@unibocconi.it Stephen G. Walker University of Texas at Austin, USA. s.g.walker@math.utexas.edu

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

Practical Bayesian Optimization of Machine Learning. Learning Algorithms

Practical Bayesian Optimization of Machine Learning. Learning Algorithms Practical Bayesian Optimization of Machine Learning Algorithms CS 294 University of California, Berkeley Tuesday, April 20, 2016 Motivation Machine Learning Algorithms (MLA s) have hyperparameters that

More information

A general mixed model approach for spatio-temporal regression data

A general mixed model approach for spatio-temporal regression data A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression

More information

Chapter 1 Introduction. What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes

Chapter 1 Introduction. What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes Chapter 1 Introduction What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes 1.1 What are longitudinal and panel data? With regression

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models

spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models Andrew O. Finley 1, Sudipto Banerjee 2, and Bradley P. Carlin 2 1 Michigan State University, Departments

More information

BAYESIAN NONPARAMETRIC MODELLING WITH THE DIRICHLET PROCESS REGRESSION SMOOTHER

BAYESIAN NONPARAMETRIC MODELLING WITH THE DIRICHLET PROCESS REGRESSION SMOOTHER Statistica Sinica 20 (2010), 1507-1527 BAYESIAN NONPARAMETRIC MODELLING WITH THE DIRICHLET PROCESS REGRESSION SMOOTHER J. E. Griffin and M. F. J. Steel University of Kent and University of Warwick Abstract:

More information

GAUSSIAN PROCESS REGRESSION

GAUSSIAN PROCESS REGRESSION GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

Bayesian estimation of bandwidths for a nonparametric regression model with a flexible error density

Bayesian estimation of bandwidths for a nonparametric regression model with a flexible error density ISSN 1440-771X Australia Department of Econometrics and Business Statistics http://www.buseco.monash.edu.au/depts/ebs/pubs/wpapers/ Bayesian estimation of bandwidths for a nonparametric regression model

More information

Normalized kernel-weighted random measures

Normalized kernel-weighted random measures Normalized kernel-weighted random measures Jim Griffin University of Kent 1 August 27 Outline 1 Introduction 2 Ornstein-Uhlenbeck DP 3 Generalisations Bayesian Density Regression We observe data (x 1,

More information

Modeling and Predicting Healthcare Claims

Modeling and Predicting Healthcare Claims Bayesian Nonparametric Regression Models for Modeling and Predicting Healthcare Claims Robert Richardson Department of Statistics, Brigham Young University Brian Hartman Department of Statistics, Brigham

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

Partial factor modeling: predictor-dependent shrinkage for linear regression

Partial factor modeling: predictor-dependent shrinkage for linear regression modeling: predictor-dependent shrinkage for linear Richard Hahn, Carlos Carvalho and Sayan Mukherjee JASA 2013 Review by Esther Salazar Duke University December, 2013 Factor framework The factor framework

More information

Bayesian mixture modeling for spectral density estimation

Bayesian mixture modeling for spectral density estimation Bayesian mixture modeling for spectral density estimation Annalisa Cadonna a,, Athanasios Kottas a, Raquel Prado a a Department of Applied Mathematics and Statistics, University of California at Santa

More information

Marginal Specifications and a Gaussian Copula Estimation

Marginal Specifications and a Gaussian Copula Estimation Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required

More information

Scaling up Bayesian Inference

Scaling up Bayesian Inference Scaling up Bayesian Inference David Dunson Departments of Statistical Science, Mathematics & ECE, Duke University May 1, 2017 Outline Motivation & background EP-MCMC amcmc Discussion Motivation & background

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother

Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother J. E. Griffin and M. F. J. Steel University of Warwick Bayesian Nonparametric Modelling with the Dirichlet Process Regression

More information

Bayesian Nonparametrics

Bayesian Nonparametrics Bayesian Nonparametrics Peter Orbanz Columbia University PARAMETERS AND PATTERNS Parameters P(X θ) = Probability[data pattern] 3 2 1 0 1 2 3 5 0 5 Inference idea data = underlying pattern + independent

More information

Bayesian Nonparametric Regression through Mixture Models

Bayesian Nonparametric Regression through Mixture Models Bayesian Nonparametric Regression through Mixture Models Sara Wade Bocconi University Advisor: Sonia Petrone October 7, 2013 Outline 1 Introduction 2 Enriched Dirichlet Process 3 EDP Mixtures for Regression

More information

Bayesian Estimation of log N log S

Bayesian Estimation of log N log S Bayesian Estimation of log N log S Paul D. Baines Department of Statistics University of California, Davis May 10th, 2013 Introduction Project Goals Develop a comprehensive method to infer (properties

More information

A Bayesian Nonparametric Hierarchical Framework for Uncertainty Quantification in Simulation

A Bayesian Nonparametric Hierarchical Framework for Uncertainty Quantification in Simulation Submitted to Operations Research manuscript Please, provide the manuscript number! Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the

More information

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation Biost 58 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 5: Review Purpose of Statistics Statistics is about science (Science in the broadest

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

Gibbs Sampling in Linear Models #2

Gibbs Sampling in Linear Models #2 Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling

More information

Extreme Value Analysis and Spatial Extremes

Extreme Value Analysis and Spatial Extremes Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models

More information