Normalized kernel-weighted random measures

Jim Griffin, University of Kent. 1 August 2007.

Outline

1. Introduction
2. Ornstein-Uhlenbeck DP
3. Generalisations

Bayesian Density Regression

We observe data $(x_1, y_1), \dots, (x_n, y_n)$ and we assume that $y_i \sim F_{x_i}$. We want to estimate $F_x$ for $x \in \mathcal{X}$. We could build the hierarchical model
$$y_i \sim k(\cdot \mid \psi_i, \phi), \qquad \psi_i \sim G_{x_i}, \qquad G_x \stackrel{d}{=} \sum_{i=1}^{\infty} p_i(x)\,\delta_{\theta_i},$$
where $\theta_1, \theta_2, \theta_3, \dots$ are i.i.d. $H$. Then $F_x$ can be estimated by $\mathrm{E}_{G,\phi \mid y}\left[\int k(y \mid \psi, \phi)\,dG_x(\psi)\right]$.
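As a small illustration of that final display (our addition, not on the slides): given one posterior draw of the weights $p_i(x)$, atoms $\theta_i$, and kernel parameter $\phi$, the conditional density estimate is just a weighted mixture; averaging such evaluations over posterior draws approximates the expectation. A minimal sketch with a normal kernel, where all names and values are illustrative:

```python
import numpy as np
from scipy.stats import norm

def conditional_density(y, weights, atoms, phi):
    """Evaluate f_x(y) = sum_i p_i(x) * k(y | theta_i, phi) for one
    draw of (weights, atoms) with a normal kernel."""
    return np.sum(weights * norm.pdf(y, loc=atoms, scale=phi))

# toy usage with made-up values of p_i(x), theta_i and phi
w = np.array([0.5, 0.3, 0.2])
theta = np.array([-1.0, 0.0, 2.0])
print(conditional_density(0.5, w, theta, phi=1.0))
```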

Bayesian Density Regression

We would like $G_x$ to be stationary... and to have a way of controlling the dependence between $G_x$ and $G_y$.

Possible approaches

Usually we generalize a standard construction of priors for exchangeable sequences:

Dirichlet process: DDP (MacEachern, 1999)

Stick-breaking: πDDP (Griffin and Steel, 2006), kernel-weighted stick-breaking (Dunson and Park, 2006)

Pólya urn scheme (Caron et al., 2007)

Normalized random measures

We could extend the class of normalized random measures (Regazzini et al., 2003; James et al., 2005). Let $(J, \theta)$ follow a homogeneous Poisson process on $\mathbb{R}^{+} \times \Theta$ with intensity $\kappa(J)\,h(\theta)$ and define
$$G = \frac{\sum_{i=1}^{\infty} J_i\,\delta_{\theta_i}}{\sum_{i=1}^{\infty} J_i}.$$
Then $G$ follows a (homogeneous) NRM. Under suitable conditions on $\kappa$ we have a random probability measure (infinite activity), and $h$ is the density of the centring distribution.

Examples of NRMs

Dirichlet process (normalized gamma process): $\kappa(J) = M\,J^{-1}\exp\{-J\}$

Normalized generalized gamma (NGG) process: $\kappa(J) = \frac{\gamma}{\Gamma(1-\gamma)}\,J^{-1-\gamma}\exp\{-rJ\}$
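To make the construction concrete (our addition): jump sizes of the gamma process can be generated in decreasing order with the Ferguson-Klass method, where the $i$-th jump solves $M\,E_1(J_i) = e_i$, with $E_1$ the exponential integral (the tail mass of $\kappa$ above level $J_i$) and $e_1 < e_2 < \dots$ the arrival times of a unit-rate Poisson process. A minimal sketch, taking $H$ to be standard normal purely for illustration:

```python
import numpy as np
from scipy.special import exp1
from scipy.optimize import brentq

def gamma_process_jumps(M, n_jumps, rng):
    """Ferguson-Klass: jump sizes solve M * E1(J_i) = e_i, where the
    e_i are arrivals of a unit-rate Poisson process, so J_1 > J_2 > ..."""
    arrivals = np.cumsum(rng.exponential(size=n_jumps))
    return np.array([brentq(lambda j: M * exp1(j) - e, 1e-300, 50.0)
                     for e in arrivals])

rng = np.random.default_rng(0)
J = gamma_process_jumps(M=1.0, n_jumps=200, rng=rng)
theta = rng.normal(size=J.size)   # atoms from centring H = N(0, 1)
w = J / J.sum()                   # normalized weights: truncated DP draw
```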

Normalized kernel-weighted measures

Let $(\tau, J, \theta)$ follow a homogeneous Poisson process on $\mathcal{X} \times \mathbb{R}^{+} \times \Theta$ with intensity $\kappa(J)\,h(\theta)$ and define
$$G_x = \frac{\sum_{i=1}^{\infty} k(x, \tau_i)\,J_i\,\delta_{\theta_i}}{\sum_{i=1}^{\infty} k(x, \tau_i)\,J_i}$$
for some kernel function $k(x, \tau_i)$ centred at $\tau_i$. For modelling, we wish to control:

Dependence between $G_x$ and $G_y$. In these processes, for a measurable set $B$, we can measure correlation through $\mathrm{Corr}(G_x(B), G_y(B))$, which usually won't depend on $B$.

The marginal prior of $G_x$ for all $x$.

Normalized kernel-weighted measures

Dependence: the correlation of the unnormalized random measures is
$$\frac{\int k(x, \tau)\,k(y, \tau)\,d\tau}{\int k(x, \tau)^2\,d\tau}.$$
This correlation will typically carry over to the normalized version unless we have a marginal process that gives distributions with a few large jumps.

Stationarity: the form of $\kappa$ can be derived to give particular marginal processes.
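As a worked example (our addition, not on the slides), take the exponential kernel used for the OUDP below, $k(x, \tau) = \exp\{-\lambda(x-\tau)\}\,I(x > \tau)$. For $x \le y$,
$$\int k(x,\tau)\,k(y,\tau)\,d\tau = \int_{-\infty}^{x} e^{-\lambda(x-\tau)}\,e^{-\lambda(y-\tau)}\,d\tau = \frac{e^{-\lambda(y-x)}}{2\lambda}, \qquad \int k(x,\tau)^2\,d\tau = \frac{1}{2\lambda},$$
so the correlation of the unnormalized measures is $\exp\{-\lambda|x-y|\}$: dependence decays exponentially in the distance between covariate values.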

Ornstein-Uhlenbeck Dirichlet Process

With a 1D regressor, typically time, we fix the kernel function to be
$$k(x, \tau) = \exp\{-\lambda(x - \tau)\}\,I(x > \tau)$$
and assume a marginal Dirichlet process. The unnormalized process must be a gamma process. The ideas of Barndorff-Nielsen and Shephard are useful for defining this process: let $\phi_1, \phi_2, \phi_3, \dots$ be i.i.d. Exponential(1) and let $\tau_1, \tau_2, \tau_3, \dots$ follow a Poisson process with intensity $M\lambda$; then
$$\gamma_t = \sum_{i=1}^{\infty} I(\tau_i < t)\,\exp\{-\lambda(t - \tau_i)\}\,\phi_i$$
is Ga($M$, 1) distributed for all $t$.
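A quick Monte Carlo check of this claim (our addition; the window length and parameter values are arbitrary choices): truncating the Poisson process to a long window behind $t$ and simulating $\gamma_t$ repeatedly should reproduce the Ga($M$, 1) mean and variance, both equal to $M$.

```python
import numpy as np

def gamma_ou_draw(M, lam, t, rng, window=50.0):
    """Approximate gamma_t by truncating the jump times (rate M*lam)
    to (t - window, t); older jumps are damped by exp(-lam*window)
    and contribute negligibly for a long window."""
    n = rng.poisson(M * lam * window)
    tau = rng.uniform(t - window, t, size=n)
    phi = rng.exponential(size=n)
    return np.sum(np.exp(-lam * (t - tau)) * phi)

rng = np.random.default_rng(1)
draws = np.array([gamma_ou_draw(M=4.0, lam=0.5, t=0.0, rng=rng)
                  for _ in range(5000)])
print(draws.mean(), draws.var())   # both should be close to M = 4
```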

Definition of OUDP

This is a construction when the covariate $x$ is time. Define
$$G_x = \frac{\sum_{i=1}^{\infty} I(\tau_i < x)\,\exp\{-\lambda(x - \tau_i)\}\,J_i\,\delta_{\theta_i}}{\sum_{i=1}^{\infty} I(\tau_i < x)\,\exp\{-\lambda(x - \tau_i)\}\,J_i},$$
where $\tau$ follows a Poisson process with intensity $\lambda M$, $J_1, J_2, J_3, \dots$ are i.i.d. Ex(1), and $\theta_1, \theta_2, \theta_3, \dots$ are i.i.d. $H$. Equivalently, $(\tau, J, \theta)$ follows a Poisson process with intensity $\lambda M \exp\{-J\}\,h(\theta)$.

[Figure: a realisation of the jump times $\tau_i$ and sizes $J_i$ against $x$.]
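A minimal prior simulation of the OUDP weights over a time grid (our addition; all names and settings are illustrative, and $H$ is taken to be N(0, 1)):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, M, T0, T1 = 0.5, 4.0, -30.0, 20.0   # window starts well before 0
                                         # so G_x is ~stationary for x >= 0
# (tau, J, theta): times at rate lam*M, sizes i.i.d. Ex(1), atoms i.i.d. H
n = rng.poisson(lam * M * (T1 - T0))
tau = rng.uniform(T0, T1, size=n)
J = rng.exponential(size=n)
theta = rng.normal(size=n)

def oudp_weights(x):
    """Normalized weights that G_x places on the atoms theta."""
    w = np.exp(-lam * (x - tau)) * J * (tau < x)
    return w / w.sum()

# the mean of G_x drifts smoothly as old atoms decay and new ones arrive
for x in [0.0, 5.0, 10.0]:
    print(x, np.sum(oudp_weights(x) * theta))
```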

The autocorrelation at lag $k$ is approximately
$$\exp\{-\lambda k\}\left[1 + \frac{1}{M}\left(1 - \exp\{-\lambda k\}\right)\right].$$

[Figure: four panels comparing λ = 0.25 and λ = 1 with M = 1 and M = 4.]

Dynamics of moments

The dynamics of the mean are
$$\mu_t = w_t\,\mu_{t-1} + (1 - w_t)\,\mu_G.$$

[Figure: six panels comparing λ = 0.125, λ = 0.5 and λ = 2 with M = 1 and M = 16.]

Computation

The stationarity of the process makes inference possible using fairly standard methods:
$$G_t = \frac{\exp\{-\lambda t\}\,\gamma\,G + \sum_{i=1}^{m} \exp\{-\lambda(t - \tau_i)\}\,J_i\,\delta_{\theta_i}}{\exp\{-\lambda t\}\,\gamma + \sum_{i=1}^{m} \exp\{-\lambda(t - \tau_i)\}\,J_i},$$
where $G$ follows a Dirichlet process and $\gamma$ follows a gamma distribution with shape parameter $M$. Inference using:

Gibbs sampling

Particle filtering
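The decomposition also gives a direct way to simulate $G_t$ forward from time 0 (our sketch, using a truncated stick-breaking draw for $G$; the truncation level and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
M, lam, t, K = 4.0, 0.5, 2.0, 200

# G ~ DP(M, H) via truncated stick-breaking, gamma ~ Ga(M, 1)
v = rng.beta(1.0, M, size=K)
p = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
atoms0 = rng.normal(size=K)          # H = N(0, 1), illustrative
gamma0 = rng.gamma(M, 1.0)

# new jumps on (0, t]: times at rate lam*M, sizes Ex(1), atoms from H
m = rng.poisson(lam * M * t)
tau = rng.uniform(0.0, t, size=m)
J = rng.exponential(size=m)
atoms1 = rng.normal(size=m)

old = np.exp(-lam * t) * gamma0      # decayed mass of the initial draw
new = np.exp(-lam * (t - tau)) * J   # decayed masses of the new jumps
weights = np.concatenate((old * p, new)) / (old + new.sum())
atoms = np.concatenate((atoms0, atoms1))
# G_t puts weight `weights` on `atoms`; weights sum to 1 up to the
# stick-breaking truncation error in p
print(weights.sum(), np.sum(weights * atoms))
```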

Example: Brazilian stock index

[Figure: daily returns.]

We observe $r_1, r_2, \dots, r_T$, which are daily log returns, and let
$$r_t \mid \sigma^2_t \sim \mathrm{N}(0, \sigma^2_t), \qquad \sigma^2_t \sim F_t,$$
where $\{F_t\}_{t=1}^{T}$ follows an OUDP, centred on an inverse Gaussian distribution, whose parameters are estimated from the marginal distribution of the data.

Example: Brazilian stock index

[Figure: the data (left) and the smoothed predictive (right).]

Generalizing to other marginal processes

For other marginal processes, let $w(a)$ be the Lévy density of the unnormalized marginal process. The intensity of the Poisson process of the unnormalized process with the kernel will be $w^{\star}(J)\,h(\theta)$, where $w^{\star}(J) = \lambda J\,w(J)$. A marginal NGG process arises from assuming the intensity
$$\kappa(J) = \frac{\gamma\lambda}{\Gamma(1-\gamma)}\,J^{-\gamma}\exp\{-rJ\},$$
which is a finite activity Poisson process.
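Because this intensity is integrable in $J$, simulation is direct (our addition, under the intensity as reconstructed above): the total mass per unit length of $\tau$-space is $\int_0^{\infty} \frac{\gamma\lambda}{\Gamma(1-\gamma)}\,J^{-\gamma}e^{-rJ}\,dJ = \gamma\lambda\,r^{\gamma-1}$, and the normalized jump-size density is Ga($1-\gamma$, $r$). A sketch:

```python
import numpy as np

def ngg_kernel_jumps(gamma, lam, r, length, rng):
    """Simulate (tau_i, J_i) from the finite-activity intensity
    (gamma*lam/Gamma(1-gamma)) * J**(-gamma) * exp(-r*J) on
    [0, length] x R+: the jump count is Poisson with rate
    gamma*lam*r**(gamma-1) per unit length; sizes are Ga(1-gamma, r)."""
    rate = gamma * lam * r ** (gamma - 1.0)
    n = rng.poisson(rate * length)
    tau = rng.uniform(0.0, length, size=n)
    J = rng.gamma(shape=1.0 - gamma, scale=1.0 / r, size=n)
    return tau, J

rng = np.random.default_rng(4)
tau, J = ngg_kernel_jumps(gamma=0.5, lam=1.0, r=1.0, length=10.0, rng=rng)
print(len(J), J.sum())
```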

In general, if we define a kernel $K(x, \tau)$, then the two measures are linked by the integral equation
$$\int_a^{\infty} w(J)\,dJ = \int w^{\star}(J)\,\nu\!\left(\{\tau : K(\cdot, \tau) > a/J\}\right)dJ,$$
where $\nu$ is Lebesgue measure. This is a Volterra integral equation and can be solved using standard methods (in principle).

Generalizing to other kernels

In 2D, if the kernel is $k(x, \tau) = \exp\{-\lambda \lVert x - \tau \rVert^2\}$ and we want a marginal Dirichlet process, then the intensity function is
$$\frac{\lambda}{\pi}\exp\{-J\}\,h(\theta)$$
(which is proportional to the intensity function for the OUDP).

[Figure: realisations for M = 1 and M = 5.]

Discussion

Normalized kernel-weighted random measures offer a way to model dependent nonparametric processes:

Flexible kernels and marginal processes allow a large range of models to be defined.

Computation is helped by representations through finite activity Poisson processes for some elements.

They include continuous processes on the space of measures.