Order-q stochastic processes. Bayesian nonparametric applications


Order-q dependent stochastic processes in Bayesian nonparametric applications
Department of Statistics, ITAM, Mexico
BNP 2015, Raleigh, NC, USA
25 June 2015

Contents
Order-1 process
  Application in survival analysis
  Application in proteomics (DDP)
Order-q process
  Application in time series modeling
  Application in multiple time series (dependent Polya tree)
  Application in disease mapping
Extensions

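Order-1 process
[Diagram: chain $\theta_1 \to \eta_1 \to \theta_2 \to \eta_2 \to \cdots \to \theta_5$, with latents $\eta_1, \ldots, \eta_5$.]
Dependence among $\{\theta_k\}$ is induced through latent variables $\{\eta_k\}$.
Closed-form expressions are available when conjugate distributions are used.
The goal is to ensure a given marginal distribution.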

Order-1 process
Nieto-Barajas & Walker (2002):
Beta process: $\{\theta_k\} \sim \mathrm{BeP}_1(a, b, c)$
$\theta_1 \sim \mathrm{Be}(a, b)$, $\eta_k \mid \theta_k \sim \mathrm{Bin}(c_k, \theta_k)$, $\theta_{k+1} \mid \eta_k \sim \mathrm{Be}(a + \eta_k, b + c_k - \eta_k)$
$\theta_k \sim \mathrm{Be}(a, b)$ marginally
Gamma process: $\{\theta_k\} \sim \mathrm{GaP}_1(a, b, c)$
$\theta_1 \sim \mathrm{Ga}(a, b)$, $\eta_k \mid \theta_k \sim \mathrm{Po}(c_k \theta_k)$, $\theta_{k+1} \mid \eta_k \sim \mathrm{Ga}(a + \eta_k, b + c_k)$
$\theta_k \sim \mathrm{Ga}(a, b)$ marginally
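The order-1 beta recursion can be checked by simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_bep1`, the parameter values, and the choice of a constant $c_k$ are assumptions made for illustration. It uses only the Python standard library and draws the binomial latent as a sum of Bernoulli trials.

```python
import random

def simulate_bep1(a, b, c, n_steps, rng):
    """Draw one path theta_1, ..., theta_n of the order-1 beta process.

    c is a list of non-negative integers c_1, ..., c_{n-1} (assumed known).
    """
    theta = rng.betavariate(a, b)              # theta_1 ~ Be(a, b)
    path = [theta]
    for k in range(n_steps - 1):
        # eta_k | theta_k ~ Bin(c_k, theta_k), drawn as a sum of Bernoulli trials
        eta = sum(rng.random() < theta for _ in range(c[k]))
        # theta_{k+1} | eta_k ~ Be(a + eta_k, b + c_k - eta_k)
        theta = rng.betavariate(a + eta, b + c[k] - eta)
        path.append(theta)
    return path

# Illustrative values (assumptions, not taken from the slides).
rng = random.Random(0)
a, b = 2.0, 3.0
paths = [simulate_bep1(a, b, [10] * 4, 5, rng) for _ in range(5000)]
mean_last = sum(p[-1] for p in paths) / len(paths)
print("Monte Carlo mean of theta_5:", mean_last, "target a/(a+b):", a / (a + b))
```

If the marginal really is $\mathrm{Be}(a, b)$ at every step, the Monte Carlo mean of the last value should sit close to $a/(a+b)$. The gamma version follows the same pattern with a Poisson latent and a gamma update.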

Survival Analysis
Hazard rate modelling
If $T$ is a discrete r.v. with support on $\{\tau_k\}$, then
$h(t) = \sum_k \theta_k \, I(t = \tau_k)$ with $\{\theta_k\} \sim \mathrm{BeP}_1(a, b, c)$
If $T$ is a continuous r.v. and $\{\tau_k\}$ form a partition of $\mathbb{R}^+$, then
$h(t) = \sum_k \theta_k \, I(\tau_{k-1} < t \le \tau_k)$ with $\{\theta_k\} \sim \mathrm{GaP}_1(a, b, c)$
This is old material. What is new is that an R package, BGPhazard, implements these models.
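Both hazard forms map to a survival function through standard identities: a product of $(1 - \theta_j)$ in the discrete case and the exponential of minus the cumulative hazard in the continuous case. The sketch below illustrates this in Python. The function names are hypothetical, $\tau_0 = 0$ is assumed, and the hazard is taken to be zero beyond the last $\tau_k$. It is not the BGPhazard implementation.

```python
import math

def survival_discrete(theta):
    """S(tau_k) = P(T > tau_k) = prod_{j <= k} (1 - theta_j) for discrete hazards."""
    surv, out = 1.0, []
    for th in theta:
        surv *= 1.0 - th
        out.append(surv)
    return out

def survival_piecewise(t, tau, theta):
    """S(t) = exp(-H(t)) with h(t) = theta_k on (tau_{k-1}, tau_k], tau_0 = 0 assumed."""
    cum, left = 0.0, 0.0
    for right, th in zip(tau, theta):
        if t <= left:
            break
        # add the part of interval (left, right] that lies below t
        cum += th * (min(t, right) - left)
        left = right
    return math.exp(-cum)

print(survival_discrete([0.5, 0.5]))
print(survival_piecewise(1.5, [1.0, 2.0], [0.2, 0.4]))
```

Applying these maps to each posterior draw of $\{\theta_k\}$ would give draws of the survival curve, from which pointwise bands could be computed.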

Survival Analysis: Order-1 Beta process
[Figure: Estimate of hazard rates. x-axis: time; y-axis: hazard rate. Legend: hazard function, 95% confidence band, Nelson-Aalen based estimate.]

Survival Analysis: Order-1 Beta process
[Figure: Estimate of survival function. x-axis: times. Legend: model estimate, 95% confidence bound, Kaplan-Meier, KM 95% confidence bound.]

Survival Analysis: Order-1 Gamma process
[Figure: Estimate of survival function. x-axis: times. Legend: model estimate, 95% confidence bound, Kaplan-Meier, KM 95% confidence bound.]

Dependent Dirichlet Process (DDP)
Nieto-Barajas et al. (2012): use the order-1 beta process to define a DDP
Time series of random probability measures $F = \{F_1, F_2, \ldots\}$
Consider the stick-breaking representation
$F_t = \sum_{h=1}^{\infty} w_{th} \, \delta_{\mu_{th}}$, $w_{th} = \theta_{th} \prod_{j<h} (1 - \theta_{tj})$
Common locations across time: $\mu_{th} = \mu_h \overset{iid}{\sim} G$ for all $t$
Dependent (unnormalized) weights: for each $h$, $\{\theta_{th}\} \sim \mathrm{BeP}_1(1, b, c_h)$
We say that $F \sim \mathrm{DDP}(b, G, c)$, and marginally $F_t \sim \mathrm{DP}(b, G)$
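The stick-breaking map from $\theta_{th}$ to $w_{th}$ is easy to compute for a finite number of sticks. A minimal sketch follows. The function name is hypothetical, and truncating the infinite sum by setting the last $\theta$ to 1 is a common approximation assumed here, not something stated on the slides.

```python
def stick_breaking_weights(theta):
    """w_h = theta_h * prod_{j < h} (1 - theta_j) for a finite list of sticks."""
    weights, remaining = [], 1.0
    for th in theta:
        weights.append(th * remaining)   # break off a fraction theta_h of what is left
        remaining *= 1.0 - th            # stick length still unassigned
    return weights

# Truncation assumption: the last theta is set to 1 so the weights sum to one.
w = stick_breaking_weights([0.5, 0.5, 1.0])
print(w, sum(w))
```

For the DDP, one would apply this map separately at each time $t$ to the vector $(\theta_{t1}, \theta_{t2}, \ldots)$, where each sequence $\{\theta_{th}\}_t$ comes from its own order-1 beta process.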

Proteomics
Study: pathway inhibition experiment to study the effects of the drug Lapatinib on ovarian cancer cell lines
An ovarian cell line was initially treated with Lapatinib and then stimulated over time
$T = 8$ measurement time points: $t_d = 0, 5, 15, 30, 60, 90, 120$ and $240$ minutes
$n = 30$ proteins were probed using RPPA
Difference scores (post-treatment minus pre-treatment intensities) were recorded

Histograms of expression scores
[Figure: eight histograms of the scores Y, one per time point: t = 0, 5, 15, 30, 60, 90, 120, 240.]

Time series of expression scores
[Figure: x-axis: Time; y-axis: Difference scores.]

Proteomics
The full model for the data is a random effects model
$y_{ti} = x_{ti} + u_i + \epsilon_{ti}$
Temporal effect: $x_{ti} \overset{iid}{\sim} F_t$ and $(F_1, \ldots, F_T) \sim \mathrm{DDP}(b, G, c)$ with $c_{th} = c/\Delta t$ to account for the unequal spacing of the observations
Pathway effect: $(u_1, \ldots, u_n) \sim \mathrm{CAR}$ based on consensus interactions
Measurement error: $\epsilon_{ti} \overset{iid}{\sim} N(0, \tau_t)$

Plots of $\hat{F}_t = E(F_t \mid \text{data})$
[Figure: estimated c.d.f.s plotted against Time and X.]
Increasing suppression over $t_1 = 0$ through $t_5 = 60$. From $t_6 = 90$ the effect wears off.

Order-2 process
[Diagram: latents $\eta_1, \ldots, \eta_5$ and parameters $\theta_1, \ldots, \theta_5$, with additional arrows between them.]
Adding more arrows induces higher-order dependence.
There is then no way to obtain a given marginal distribution, say beta or gamma, unless we include an extra latent layer.

Order-2 process
[Diagram: common ancestor $\omega$ above latents $\eta_1, \ldots, \eta_5$ and parameters $\theta_1, \ldots, \theta_5$.]
With this common ancestor $\omega$ we can add more arrows and still ensure a given marginal.

Space and time process
This idea can be used to induce time and/or spatial dependence.
[Diagram: for each time $t = 1, 2, 3$, four nodes $\theta_{t,1}, \ldots, \theta_{t,4}$ with latents $\eta_{t,1}, \ldots, \eta_{t,4}$, all sharing the common ancestor $\omega$.]

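Order-q process
Jara et al. (2013):
Order-q (AR) beta process: $\{\theta_t\} \sim \mathrm{BeP}_q(a, b, c)$
$\omega \sim \mathrm{Be}(a, b)$
$\eta_t \mid \omega \overset{ind}{\sim} \mathrm{Bin}(c_t, \omega)$
$\theta_t \mid \eta \sim \mathrm{Be}\left(a + \sum_{j=0}^{q} \eta_{t-j},\; b + \sum_{j=0}^{q} (c_{t-j} - \eta_{t-j})\right)$
$\theta_t \sim \mathrm{Be}(a, b)$ marginally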
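The order-q construction can be simulated in the same spirit as the order-1 process. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, $c_t$ is taken constant, and latents $\eta_t$ are also drawn for the $q$ time points before the series starts so that every window of $q + 1$ latents exists.

```python
import random

def simulate_bepq(a, b, c, q, n_steps, rng):
    """Draw theta_1, ..., theta_n from an order-q beta process with constant c_t = c."""
    omega = rng.betavariate(a, b)                       # common ancestor
    # eta_t | omega ~ Bin(c, omega), for t = 1 - q, ..., n (assumed start-up latents)
    eta = [sum(rng.random() < omega for _ in range(c)) for _ in range(n_steps + q)]
    theta = []
    for t in range(q, n_steps + q):
        s = sum(eta[t - q : t + 1])                     # sum_{j=0}^{q} eta_{t-j}
        # theta_t | eta ~ Be(a + s, b + (q + 1) * c - s)
        theta.append(rng.betavariate(a + s, b + (q + 1) * c - s))
    return theta

# Illustrative values (assumptions, not taken from the slides).
rng = random.Random(0)
paths = [simulate_bepq(2.0, 2.0, 5, 2, 6, rng) for _ in range(5000)]
mean_last = sum(p[-1] for p in paths) / len(paths)
print("Monte Carlo mean of theta_6:", mean_last, "target a/(a+b):", 0.5)
```

Consecutive $\theta_t$ share $q$ of their $q + 1$ latents, which is what produces the moving-window dependence. The conjugate update given $\omega$'s binomial draws keeps the marginal at $\mathrm{Be}(a, b)$.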

Time series: $\theta_t$ = unemployment in Chile
[Figure: x-axis: Year (1980 to 2020). Legend: BeP, BDM.]

Dependent Polya Tree
Nieto-Barajas & Quintana (2015): use the order-q beta process to define a DPT
Time series of random probability measures $F = \{F_1, F_2, \ldots\}$
Consider Polya trees $F_t$ with nested partition $\Pi_t = \{B_{tmj}\}$ and branching probabilities $\Theta_t = \{\theta_{t,m,j}\}$, with $m$ = level and $j = 1, \ldots, 2^m$
Common nested partitions across time: $\Pi_t = \Pi$ for all $t$
Dependent branching probabilities: for each $m$ and $j$, $\{\theta_{t,m,j}\} \sim \mathrm{BeP}_q(a\rho(m), a\rho(m), c)$
We say that $F \sim \mathrm{DPT}_q(\Pi, a, \rho, c)$, and marginally $F_t \sim \mathrm{PT}$
$\rho(m) = m^{\delta}$ with $\delta \in \{1.1, 2\}$ (see Watson et al. 2015)
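For a single time point, the probability a finite Polya tree assigns to each set at the deepest level is the product of the branching probabilities along the path to that set. A minimal sketch, assuming that each parent set splits into a left child with probability $\theta$ and a right child with probability $1 - \theta$ (the function name and the input layout are hypothetical):

```python
def polya_tree_leaf_probs(left):
    """Probabilities of the 2^M sets at level M of a finite Polya tree.

    left[m] holds the left-branch probabilities at level m + 1,
    one per parent set, so len(left[m]) == 2 ** m.
    """
    probs = [1.0]                       # the whole sample space
    for level in left:
        nxt = []
        for p, th in zip(probs, level):
            nxt.append(p * th)          # left child
            nxt.append(p * (1.0 - th))  # right child
        probs = nxt
    return probs

print(polya_tree_leaf_probs([[0.5], [0.5, 0.25]]))
```

In the dependent version, each left-branch probability would be replaced by a time series drawn from an order-q beta process, and this map would be applied at every $t$.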

Multiple time series analysis
Study: bioeconomic activity indicators (ITAEE) for the 32 States of Mexico
Values are reported every 3 months from 2003
32 series of length 46 are available
Values are transformed to constant prices of 2008 and are deseasonalized
Second differences were taken to make the data stationary

Original and second differences time series
[Figure: two panels, original series and second differences; x-axis: Time.]

Multiple time series analysis
The model proposed for the data $X_i = \{X_{ti}, t \ge 1\}$, $i = 1, \ldots, n$, is an AR(p) process for each series:
$X_{ti} = \beta_{1i} X_{t-1,i} + \cdots + \beta_{pi} X_{t-p,i} + \varepsilon_{ti}$
$\varepsilon_{ti} \mid F_t \overset{iid}{\sim} F_t$, for $i = 1, \ldots, n$
$\{F_1, F_2, \ldots\} \mid \sigma \sim \mathrm{DPT}_q(\Pi_\sigma, a, \rho, c)$
$\sigma \sim f(\sigma)$
Note that there is a further mixture: the $B_{mj}$ are quantiles of $N(0, \sigma^2)$
Dependence in $\{F_t\}$ implies dependence in $\{\varepsilon_{ti}\}$, resembling an MA(q) process; however, the dependence is not necessarily exponentially decaying

Estimated $\{F_t\}$
[Figure: estimated densities; y-axis: Density.]

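Spatial process
Nieto-Barajas & Bandyopadhyay (2013):
Spatial gamma process: $\{\theta_i\} \sim \mathrm{SGaP}(a, b, c)$
$\omega \sim \mathrm{Ga}(a, b)$
$\eta_{ij} \mid \omega \overset{ind}{\sim} \mathrm{Ga}(c_{ij}, \omega)$
$\theta_i \mid \eta \sim \mathrm{Ga}\left(a + \sum_{j \in \partial_i} c_{ij},\; b + \sum_{j \in \partial_i} \eta_{ij}\right)$
$\partial_i$ is the set of neighbours of region $i$
$\theta_i \sim \mathrm{Ga}(a, b)$ marginally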
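A small simulation can illustrate the spatial gamma construction. The sketch below rests on several assumptions that are not stated on the slide: one latent $\eta_{ij} = \eta_{ji}$ is shared per pair of neighbouring regions, $c_{ij}$ is a constant $c$, and $\mathrm{Ga}(\cdot, \cdot)$ is parameterized by shape and rate. The function name and the toy neighbourhood graph are hypothetical.

```python
import random

def simulate_sgap(a, b, c, edges, n_regions, rng):
    """Draw (theta_1, ..., theta_n) given a list of neighbouring pairs (i, j)."""
    # random.gammavariate takes (shape, scale), so rate b becomes scale 1 / b
    omega = rng.gammavariate(a, 1.0 / b)
    shape = [a] * n_regions
    rate = [b] * n_regions
    for i, j in edges:
        eta = rng.gammavariate(c, 1.0 / omega)   # eta_ij | omega ~ Ga(c, rate = omega)
        for r in (i, j):                         # assumed shared by both neighbours
            shape[r] += c
            rate[r] += eta
    # theta_i | eta ~ Ga(a + sum of c_ij, b + sum of eta_ij) over neighbours j of i
    return [rng.gammavariate(s, 1.0 / r) for s, r in zip(shape, rate)]

# Toy example: four regions on a line, 0 - 1 - 2 - 3.
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3)]
draws = [simulate_sgap(2.0, 1.0, 1.0, edges, 4, rng) for _ in range(20000)]
mean_0 = sum(d[0] for d in draws) / len(draws)
print("Monte Carlo mean of theta_0:", mean_0, "target a/b:", 2.0)
```

Under these assumptions, neighbouring regions share a latent and so are more strongly tied than distant ones, while every $\theta_i$ keeps the $\mathrm{Ga}(a, b)$ marginal.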

Disease mapping
Study: mortality in pregnant women due to hypertensive disorder in Mexico in 2009. Areas are the States.
$Y_i$ = number of deaths in region $i$
$E_i$ = at risk: number of births (in thousands)
$\lambda_i$ = maternal mortality rate
Zero-inflated model:
$f(y_i) = \pi_i \, I(y_i = 0) + (1 - \pi_i) \, \mathrm{Po}(y_i \mid \lambda_i E_i)$, $\lambda_i = \theta_i \exp(\beta' x_i)$
$\beta$ is a vector of regression coefficients such that $\beta_k \sim N(0, \sigma_0^2)$
$\theta_i \sim \mathrm{SGaP}(a, a, c)$
Six explanatory variables
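The zero-inflated likelihood term can be written out directly. The following is a minimal sketch of the probability mass function only. The function name and the numeric values are hypothetical, and the code takes $\lambda_i$ as given rather than building it from $\theta_i$ and covariates.

```python
import math

def zip_pmf(y, pi, lam, E):
    """f(y) = pi * I(y = 0) + (1 - pi) * Po(y | lam * E) for a count y >= 0."""
    mu = lam * E                                        # Poisson mean
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    point_mass = 1.0 if y == 0 else 0.0
    return pi * point_mass + (1.0 - pi) * poisson

# Illustrative values (assumptions, not taken from the study).
print(zip_pmf(0, 0.3, 2.0, 1.0))
print(sum(zip_pmf(y, 0.3, 2.0, 1.0) for y in range(60)))
```

The extra mass $\pi_i$ at zero is what lets the model accommodate regions with no recorded deaths beyond what the Poisson term alone would predict.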

Estimated mortality rate $\lambda_i$
[Figure: map of the 32 States. Legend classes: [3.05, 6.33), [6.33, 6.67), [6.67, 7.38), [7.38, 8.73), [8.73, 21.07].]

Estimated zero-inflated probability $\pi_i$
[Figure: map of the 32 States. Legend classes: [0, 0.01), [0.01, 0.04), [0.04, 0.06), [0.06, 0.5), [0.5, 0.6].]

Extensions
Use the same ideas with stochastic processes instead of random variables
Dependent Dirichlet processes using multinomial processes as latents
Dependent gamma processes using Poisson processes as latents
These constructions are currently under study

References
Jara, A., Nieto-Barajas, L. E. & Quintana, F. (2013). A time series model for responses on the unit interval. Bayesian Analysis 8, 723-740.
Nieto-Barajas, L. E. & Bandyopadhyay, D. (2013). A zero-inflated spatial gamma process model with applications to disease mapping. Journal of Agricultural, Biological and Environmental Statistics 18, 137-158.
Nieto-Barajas, L. E., Müller, P., Ji, Y., Lu, Y. & Mills, G. (2012). A time series DDP for functional proteomics profiles. Biometrics 68, 859-868.
Nieto-Barajas, L. E. & Quintana, F. A. (2015). A Bayesian nonparametric dynamic AR model for multiple time series analysis. Preprint.
Nieto-Barajas, L. E. & Walker, S. G. (2002). Markov beta and gamma processes for modelling hazard rates. Scandinavian Journal of Statistics 29, 413-424.
Watson, J., Nieto-Barajas, L. E. & Holmes, C. (2015). Characterising variation of nonparametric random probability measures using the Kullback-Leibler divergence. Preprint.