Nonparametric Bayesian modeling for dynamic ordinal regression relationships

Athanasios Kottas, Department of Applied Mathematics and Statistics, University of California, Santa Cruz
Joint work with Maria DeYoreo (Department of Statistical Science, Duke University)
10th Conference on Bayesian Nonparametrics, NCSU, Raleigh, NC, June 22-26, 2015

Motivation
- Regression modeling for one or more ordinal categorical responses recorded over discrete time
- Focus on applications, including problems in ecology and the environmental sciences, where it is natural or necessary to model the joint stochastic mechanism for the response(s) and covariates
- Motivating application: study of dynamically evolving natural selection surfaces in evolutionary biology
- Data example: maturity (recorded on an ordinal scale), length, and age for Chilipepper rockfish, collected over 15 years along the coast of California

Modeling through latent continuous responses
- Assume each ordinal response represents a discretized version of an underlying latent continuous response (e.g., Albert and Chib, 1993)
- $k$ ordinal variables $Y = (Y_1, \ldots, Y_k)$, with $Y_j \in \{1, \ldots, C_j\}$, and $p$ (continuous) covariates $X = (X_1, \ldots, X_p)$
- Assume $Y_j = l$ if and only if $\gamma_{j,l-1} < Z_j \le \gamma_{j,l}$, for $j = 1, \ldots, k$ and $l = 1, \ldots, C_j$ (with $\gamma_{j,0} = -\infty$ and $\gamma_{j,C_j} = \infty$)
- Multivariate normal distribution for $Z = (Z_1, \ldots, Z_k)$ $\Rightarrow$ multivariate ordinal probit model
  - a symmetric, unimodal latent response distribution with mean $x^T \beta$ implies restrictive effects of covariates on the probability response curves
  - computational challenges in estimating cut-off points
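
The discretization rule above can be sketched in a few lines for a single response. This is only an illustration: the function name `discretize` and the cut-off values are assumptions, not part of the talk.

```python
import numpy as np

def discretize(z, cutoffs):
    """Map latent values z to ordinal labels 1..C.

    `cutoffs` holds the interior cut-offs gamma_1 < ... < gamma_{C-1};
    gamma_0 = -inf and gamma_C = +inf are implicit, so that
    Y = l  iff  gamma_{l-1} < z <= gamma_l.
    """
    cutoffs = np.asarray(cutoffs, dtype=float)
    # right=True makes each bin half-open on the left: (gamma_{l-1}, gamma_l]
    return np.digitize(z, cutoffs, right=True) + 1

# Hypothetical latent values and cut-offs for C = 3 categories
z = np.array([-2.0, -1.0, 0.3, 1.0, 2.5])
y = discretize(z, cutoffs=[-1.0, 1.0])
print(y)
```

A latent value that falls exactly on a cut-off is assigned to the lower category, matching the closed right end of each interval.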

Objectives
- For univariate responses, more general methods have been explored, relaxing either the distributional or the linearity assumption (e.g., Newton et al., 1996; Mukhopadhyay and Gelfand, 1997; Denison et al., 2002; Chib and Greenberg, 2010)
- In the multivariate setting, complications arise from constrained covariance matrices and from inference for the cut-offs, and methods for general Bayesian inference are limited
- In contrast to semiparametric approaches, our aim is flexible modeling and inference for the ordinal regression relationships and for the response distribution

The nonparametric mixture model
- We use a version of implied conditional regression (Nadaraya, 1964; Watson, 1964; Müller et al., 1996), modeling the joint latent response-covariate distribution $f(z, x)$
- Inference for $f(z \mid x)$, and for $\Pr(Y \mid x)$, is implied through $f(z, x)$ and $f(x)$
- Dirichlet process (DP) mixture model for $f(z, x)$:
  $$f(z, x \mid G) = \int N(z, x \mid \mu, \Sigma)\, dG(\mu, \Sigma), \qquad G \mid \alpha, \psi \sim DP(\alpha, G_0(\cdot \mid \psi))$$
- DP constructive definition (Sethuraman, 1994):
  $$f(z, x \mid G) = \sum_{r=1}^{\infty} p_r\, N(z, x \mid \mu_r, \Sigma_r),$$
  where $\theta_r = (\mu_r, \Sigma_r) \overset{iid}{\sim} G_0$, and the weights $p_1, p_2, \ldots$ are determined through stick-breaking:
  - stick-breaking proportions $\beta_s \overset{iid}{\sim} \text{beta}(1, \alpha)$, $s = 1, 2, \ldots$
  - $p_1 = \beta_1$, and $p_r = \beta_r \prod_{m=1}^{r-1} (1 - \beta_m)$, for $r = 2, 3, \ldots$
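
As a rough sketch of the stick-breaking construction, the weights can be simulated under a finite truncation. The truncation level, the value of alpha, and the function name are assumptions made for illustration; the infinite sum on the slide is approximated by setting the last proportion to 1.

```python
import numpy as np

def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking: p_1 = b_1, p_r = b_r * prod_{m<r} (1 - b_m)."""
    beta = rng.beta(1.0, alpha, size=n_components)
    beta[-1] = 1.0  # truncation: the last piece takes all remaining mass
    # remaining[r] = prod_{m<r} (1 - beta_m), with an empty product equal to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - beta[:-1])))
    return beta * remaining

rng = np.random.default_rng(0)
p = stick_breaking_weights(alpha=2.0, n_components=25, rng=rng)
print(p.sum())  # the truncated weights sum to 1 up to rounding
```

Smaller values of alpha put most of the mass on the first few components, while larger values spread it over many.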

Ordinal regression functions
- Flexible model for $f(z, x)$ $\Rightarrow$ flexible inference for $\Pr(Y \mid x)$
- Implied regression functions provide a nonparametric extension of probit regression (with random covariates):
  $$\Pr(Y = (l_1, \ldots, l_k) \mid x; G) = \sum_{r=1}^{\infty} w_r(x) \int_{\gamma_{k,l_k-1}}^{\gamma_{k,l_k}} \cdots \int_{\gamma_{1,l_1-1}}^{\gamma_{1,l_1}} N(z \mid m_r(x), S_r)\, dz$$
  with covariate-dependent weights $w_r(x) \propto p_r\, N(x \mid \mu_r^x, \Sigma_r^{xx})$ and covariate-dependent probabilities, where
  $$m_r(x) = \mu_r^z + \Sigma_r^{zx} (\Sigma_r^{xx})^{-1} (x - \mu_r^x), \qquad S_r = \Sigma_r^{zz} - \Sigma_r^{zx} (\Sigma_r^{xx})^{-1} \Sigma_r^{xz}$$
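
For a single ordinal response ($k = 1$) and a finite number of mixture components, the regression function above reduces to a weighted sum of normal CDF differences. The sketch below evaluates it for given component parameters; the function names, the two-component parameter values, and the finite truncation are all illustrative assumptions.

```python
import math
import numpy as np

def norm_cdf(v):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def ordinal_probs(x, p, mu, Sigma, cutoffs):
    """Pr(Y = l | x), l = 1..C, under a finite normal mixture for (z, x).

    mu[r] is ordered as (mu_z, mu_x...) and Sigma[r] is the joint
    covariance with the latent response z in the first position.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    gam = np.concatenate(([-np.inf], cutoffs, [np.inf]))
    log_w, cond = [], []
    for p_r, m, S in zip(p, mu, Sigma):
        Sxx_inv = np.linalg.inv(S[1:, 1:])
        d = x - m[1:]
        # log of p_r * N(x | mu_x, Sigma_xx), dropping the constant shared by all r
        _, logdet = np.linalg.slogdet(S[1:, 1:])
        log_w.append(math.log(p_r) - 0.5 * logdet - 0.5 * d @ Sxx_inv @ d)
        # conditional mean m_r(x) and conditional standard deviation sqrt(S_r)
        m_r = m[0] + S[0, 1:] @ Sxx_inv @ d
        s_r = math.sqrt(S[0, 0] - S[0, 1:] @ Sxx_inv @ S[1:, 0])
        cond.append((m_r, s_r))
    w = np.exp(np.array(log_w) - max(log_w))
    w /= w.sum()  # normalized covariate-dependent weights w_r(x)
    probs = np.zeros(len(gam) - 1)
    for w_r, (m_r, s_r) in zip(w, cond):
        cdf = np.array([norm_cdf((g - m_r) / s_r) for g in gam])
        probs += w_r * np.diff(cdf)
    return probs

# Hypothetical two-component mixture with one covariate and C = 3 categories
p = [0.5, 0.5]
mu = [np.array([-1.0, 0.0]), np.array([1.0, 3.0])]
Sigma = [np.eye(2), np.array([[1.0, 0.5], [0.5, 1.0]])]
probs = ordinal_probs(1.5, p, mu, Sigma, cutoffs=[-0.5, 0.5])
print(probs)
```

In a posterior analysis this evaluation would be repeated for each posterior draw of the weights and component parameters over a grid of covariate values, which gives point and interval estimates of the probability curves.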

Model properties
- Provided $C_j > 2$, both $\mu$ and $\Sigma$ are identifiable in the induced mixture kernel for $(Y, X)$, under fixed cut-off points
- The prior model has large support: again under fixed cut-offs, it assigns positive probability to all Kullback-Leibler (KL) neighborhoods of a mixed ordinal-continuous distribution, $p_0(x, y)$, as well as to all KL neighborhoods of the implied conditional distribution, $p_0(y \mid x)$
- Identifiability result + KL property obtained under fixed cut-offs $\Rightarrow$ computational advantage over parametric models

Model properties
- Interactions and dependence between covariates are implicit in the joint modeling framework
- Inference for inverse relationships: covariate distribution across ordinal response values, $f(x \mid Y = y)$
- The model can directly accommodate continuous covariates as well as discrete covariates which have some ordering
  - modification of the methodology to handle nominal categorical covariates

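Hierarchical model for the data
  $y_{ij} = l$ iff $\gamma_{j,l-1} < z_{ij} \le \gamma_{j,l}$, $\quad i = 1, \ldots, n$, $j = 1, \ldots, k$
  $(z_i, x_i) \mid \{\mu_l, \Sigma_l\}, L_i \overset{ind.}{\sim} N(\mu_{L_i}, \Sigma_{L_i})$, $\quad i = 1, \ldots, n$
  $L_i \mid p \overset{iid}{\sim} \sum_{l=1}^{N} p_l\, \delta_l(L_i)$, $\quad i = 1, \ldots, n$
  $p \mid \alpha \sim GD((1, 1, \ldots, 1), (\alpha, \alpha, \ldots, \alpha))$
  $(\mu_l, \Sigma_l) \mid \psi \overset{iid}{\sim} N(\mu_l; m, V)\, IW(\Sigma_l; \nu, S)$, $\quad l = 1, \ldots, N$
and the full model is completed with conditionally conjugate priors on $\psi = (m, V, S)$ and $\alpha$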
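
To make the layers of this hierarchy concrete, here is a forward simulation from the truncated model for one response and one covariate. It is only a sketch of the generative structure, not of the posterior sampler: all numeric values (sample size, truncation level N, alpha, and the entries of psi) are made up, and the generalized Dirichlet prior on p is drawn through truncated stick-breaking, which is the standard way that prior arises.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, alpha = 200, 20, 1.0          # sample size, truncation level, DP precision (assumed)
m, V = np.zeros(2), np.eye(2)       # hypothetical values for psi = (m, V, S)
nu, S = 4, np.eye(2)
cutoffs = np.array([-1.0, 1.0])     # fixed cut-offs, C = 3 categories (assumed)

# p | alpha: truncated stick-breaking, i.e. generalized Dirichlet weights
beta = rng.beta(1.0, alpha, size=N)
beta[-1] = 1.0
p = beta * np.concatenate(([1.0], np.cumprod(1.0 - beta[:-1])))

# (mu_l, Sigma_l) | psi: normal for mu_l, inverse-Wishart for Sigma_l
mu = rng.multivariate_normal(m, V, size=N)

def draw_inverse_wishart():
    # Inverse of a Wishart(nu, S^{-1}) draw, built from nu normal vectors
    A = rng.multivariate_normal(np.zeros(2), np.linalg.inv(S), size=nu)
    return np.linalg.inv(A.T @ A)

Sigma = np.array([draw_inverse_wishart() for _ in range(N)])

# L_i | p, then (z_i, x_i) | L_i, then y_i from the cut-offs
L = rng.choice(N, size=n, p=p)
zx = np.array([rng.multivariate_normal(mu[l], Sigma[l]) for l in L])
z, x = zx[:, 0], zx[:, 1]
y = np.digitize(z, cutoffs, right=True) + 1
print(np.bincount(y, minlength=4)[1:])  # counts in categories 1, 2, 3
```

Only y and x would be observed in practice; z and L are latent and would be imputed within an MCMC algorithm.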

Ozone concentration data example
- Data set comprising 111 measurements of ozone concentration (ppb), wind speed (mph), radiation (langleys), and temperature (degrees Fahrenheit)
- Ozone concentration recorded on a continuous scale
- To construct an ordinal response: define high as above 100 ppb, medium as (50, 100] ppb, and low as less than 50 ppb
- Comparison of inferences from the model for $(Y, X)$ with those from a DP mixture of normals model for $(Z, X)$

Ozone data
Figure: Posterior mean (solid) and 95% interval estimates (dashed) for $\Pr(Y = l \mid x_m; G)$ (black) compared to $\Pr(\gamma_{l-1} < Z \le \gamma_l \mid x_m; G)$ (red). Panels show pr(low), pr(medium), and pr(high) against radiation, temperature, and wind speed.

Ozone data
Figure: Posterior mean estimates for $\Pr(Y = l \mid x_1, x_2; G)$, for $l = 1, 2, 3$, corresponding to low (left), medium (middle) and high (right), plotted over radiation and temperature. Red represents a value of 1, white represents 0.

Extension to dynamic ordinal regression modeling
- Focusing on a univariate ordinal response, we seek to extend to a model for $\Pr_t(Y \mid x)$, for $t \in T = \{1, 2, \ldots\}$
- Build on the earlier framework by extending to a prior model for $\{f(z, x \mid G_t) : t \in T\}$, and thus for $\{\Pr(Y \mid x; G_t) : t \in T\}$
- Motivating application: data from NMFS on female Chilipepper rockfish collected between 1993 and 2007 along the coast of California
  - three ordinal levels for maturity: immature (1), pre-spawning mature (2), and post-spawning mature (3)
  - length measured in millimeters
  - age recorded on an ordinal scale: age $j$ implies the fish was between $j$ and $j + 1$ years of age (data range 1 to 25)
  - incorporate age into the model in the same fashion as the maturity variable

Rockfish data
Figure: Bivariate plots of length versus age at each year of data (panels for 1993 to 2002, 2004, and 2007), with data points colored according to maturity level: red level 1; green level 2; blue level 3.

DDP model extension
- To retain model properties at each $t$, use a DDP prior for $\{G_t : t \in T\}$ (MacEachern, 2000; Taddy, 2010; Nieto-Barajas et al., 2012)
- Time-dependent weights and atoms:
  $$f(z, x \mid G_t) = \sum_{r=1}^{\infty} \Big\{ (1 - \beta_{r,t}) \prod_{m=1}^{r-1} \beta_{m,t} \Big\}\, N(z, x \mid \mu_{r,t}, \Sigma_r)$$
- Stochastic process with beta($\alpha$, 1) marginals for the $\{\beta_{r,t} : t \in T\}$:
  $$\mathcal{B} = \Big\{ \beta_t = \exp\Big( -\frac{\zeta^2 + \eta_t^2}{2\alpha} \Big) : t \in T \Big\}$$
  where $\zeta \sim N(0, 1)$ and, independently, $\{\eta_t : t \in T\}$ is an AR(1) process with $N(0, 1)$ marginals ($\eta_t \mid \eta_{t-1}, \phi \sim N(\phi \eta_{t-1}, 1 - \phi^2)$, with $|\phi| < 1$)
- Vector autoregressive model for the $\{\mu_{r,t} : t \in T\}$
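
The time-dependent weights can be simulated directly from this construction. The sketch below draws one path of the beta($\alpha$, 1) process per component and converts the paths into mixture weights at each time point. The function names, parameter values, and the finite truncation (last proportion set to 0 so that the weights sum to 1 at every time) are assumptions made for illustration.

```python
import numpy as np

def simulate_beta_path(alpha, phi, T, rng):
    """One path beta_t = exp(-(zeta^2 + eta_t^2) / (2 alpha)), t = 1..T."""
    zeta = rng.standard_normal()
    eta = np.empty(T)
    eta[0] = rng.standard_normal()  # stationary N(0, 1) start
    for t in range(1, T):
        # AR(1) step that preserves the N(0, 1) marginal
        eta[t] = phi * eta[t - 1] + np.sqrt(1.0 - phi**2) * rng.standard_normal()
    return np.exp(-(zeta**2 + eta**2) / (2.0 * alpha))

def ddp_weights(alpha, phi, T, N, rng):
    """Weights p_{r,t} = (1 - beta_{r,t}) * prod_{m<r} beta_{m,t}, shape (N, T)."""
    B = np.array([simulate_beta_path(alpha, phi, T, rng) for _ in range(N)])
    B[-1, :] = 0.0  # truncation: the last component takes the remaining mass
    lead = np.vstack([np.ones((1, T)), np.cumprod(B[:-1], axis=0)])
    return (1.0 - B) * lead

rng = np.random.default_rng(2)
W = ddp_weights(alpha=2.0, phi=0.8, T=15, N=10, rng=rng)
print(W.sum(axis=0))  # each column (time point) sums to 1
```

Because $(\zeta^2 + \eta_t^2)/2$ is a unit-rate exponential variable, each $\beta_t$ is a uniform variable raised to the power $1/\alpha$, which is why the marginals are beta($\alpha$, 1). Values of phi close to 1 make the weights change slowly over time.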

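Hierarchical model for the data
  $y_{t,i} = j \iff \gamma_{j-1} < z_{t,i} \le \gamma_j$, $\quad t \in s^c$, $i = 1, \ldots, n_t$
  $u_{t,i} = j \iff \log(j) < w_{t,i} \le \log(j + 1)$, $\quad t \in s^c$, $i = 1, \ldots, n_t$
  $\{(z_{t,i}, w_{t,i}, x_{t,i})\} \mid \{\mu_{l,t}\}, \{\Sigma_l\}, \{L_{t,i}\} \sim \prod_{t \in s^c} \prod_{i=1}^{n_t} N(\mu_{L_{t,i},t}, \Sigma_{L_{t,i}})$
  $\{L_{t,i}\} \mid \{\eta_{l,t}\}, \{\zeta_l\} \sim \prod_{t \in s^c} \prod_{i=1}^{n_t} \sum_{l=1}^{N} p_{l,t}\, \delta_l(L_{t,i})$
  $\{\zeta_l\}, \{\eta_{l,1}\} \overset{ind.}{\sim} N(0, 1)$, $\quad l = 1, \ldots, N - 1$
  $\eta_{l,t} \mid \eta_{l,t-1}, \phi \sim N(\phi \eta_{l,t-1}, 1 - \phi^2)$, $\quad l = 1, \ldots, N - 1$, $t = 2, \ldots, T$
  $\mu_{l,1} \mid m_0, V_0 \sim N(m_0, V_0)$, $\quad l = 1, \ldots, N$
  $\mu_{l,t} \mid \mu_{l,t-1}, \Theta, m, V \sim N(m + \Theta \mu_{l,t-1}, V)$, $\quad l = 1, \ldots, N$, $t = 2, \ldots, T$
  $\Sigma_l \mid \nu, D \overset{iid}{\sim} IW(\Sigma_l; \nu, D)$, $\quad l = 1, \ldots, N$
with priors on $\alpha$, $\phi$, $\Theta$, and $\psi = (m, V, D)$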
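
The second line of the hierarchy discretizes a latent variable $w$ on the log scale, so that the ordinal covariate $u$ (which, from the surrounding slides, appears to be age in years) equals $j$ when $\log(j) < w \le \log(j+1)$. A small sketch of that mapping follows; the function name and the upper bound `max_level` are assumptions.

```python
import numpy as np

def discretize_log_scale(w, max_level=25):
    """Return u with u = j  iff  log(j) < w <= log(j + 1), for j = 1..max_level."""
    # Bin edges log(1), log(2), ..., log(max_level + 1)
    edges = np.log(np.arange(1, max_level + 2))
    # With right=True, digitize returns i such that edges[i-1] < w <= edges[i],
    # and edges[i-1] = log(i), so the returned index is the level itself
    return np.digitize(w, edges, right=True)

w = np.array([0.5, 1.0, 2.0])   # hypothetical latent values
u = discretize_log_scale(w)
print(u)
```

Values of w at or below 0 map to level 0 and values above log(max_level + 1) map to max_level + 1; neither case arises when w is imputed inside the interval implied by an observed level.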

Rockfish data
Figure: Posterior mean estimates for the bivariate density of age and length across all years (one panel per year, 1993 to 2007).

Rockfish data
Figure: Posterior mean and 95% interval bands for the expected value of length over (continuous) age, across three years (1993, 2000, and 2002). Overlaid are the data (in blue) and the estimated von Bertalanffy growth curves (in red).

Rockfish data
Figure: Posterior mean and 95% interval bands for the ordinal probability curves $\Pr_t(y \mid x)$ associated with length, for each year from 1993 to 2004: immature (solid); pre-spawning mature (dashed); post-spawning mature (dotted).

Rockfish data
Figure: Posterior mean and 95% interval bands for the ordinal probability curves $\Pr_t(y \mid u^*)$ associated with age, for each year from 1993 to 2004: immature (solid); pre-spawning mature (dashed); post-spawning mature (dotted).

Rockfish data
Figure: Posterior mean and 90% intervals, by year, for the smallest value of age above 2 years at which the probability of maturity first exceeds 90% (left), and similar inference for length (right).
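
The quantity summarized in this figure can be computed from a probability-of-maturity curve evaluated on a grid. The sketch below finds the smallest grid value above a lower bound at which the curve exceeds 0.9. The function name, the grid, and the logistic curve standing in for one posterior draw are all hypothetical.

```python
import numpy as np

def first_crossing(grid, prob_mature, level=0.9, lower=2.0):
    """Smallest grid value above `lower` where prob_mature exceeds `level`."""
    grid = np.asarray(grid, dtype=float)
    prob = np.asarray(prob_mature, dtype=float)
    ok = (grid > lower) & (prob > level)
    # argmax returns the index of the first True entry
    return grid[np.argmax(ok)] if ok.any() else np.nan

# Hypothetical maturity curve on an age grid (stand-in for one posterior draw)
ages = np.arange(1.0, 11.0)
prob = 1.0 / (1.0 + np.exp(-(ages - 5.0)))
age_90 = first_crossing(ages, prob)
print(age_90)
```

Applying the function to every posterior draw of the curve, separately for each year, would give a posterior sample from which the mean and interval in the figure could be summarized.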

Conclusions
- Modeling framework for ordinal regression problems with a small to moderate number of covariates, and for settings where modeling the joint response-covariate distribution is appropriate (or necessary)
  - DeYoreo, M. & Kottas, A. (2015). "Bayesian nonparametric modeling for multivariate ordinal regression." (under review)
  - DeYoreo, M. & Kottas, A. (2015). "Modeling for dynamic ordinal regression relationships: An application to estimating maturity of rockfish in California." (submitted for publication)
- Binary responses require a different model due to identifiability constraints
  - DeYoreo, M. & Kottas, A. (2015). "A fully nonparametric modeling approach to binary regression." (revised)

MANY THANKS!!!