Exchangeability. Peter Orbanz. Columbia University


2 PARAMETERS AND PATTERNS

Parameters: $P(X \mid \theta) = \text{Probability}[\text{data} \mid \text{pattern}]$

Inference idea: data = underlying pattern + independent noise

3 TERMINOLOGY

Parametric model: number of parameters fixed (or constantly bounded) w.r.t. sample size.

Nonparametric model: number of parameters grows with sample size; $\infty$-dimensional parameter space.

Example: density estimation.
[Figure: two density-estimation panels labeled "Parametric" and "Nonparametric"; axis labels $x_1$, $x_2$, $p(x)$; marker $\mu$.]

4 NONPARAMETRIC BAYESIAN MODEL

Definition: A nonparametric Bayesian model is a Bayesian model on an $\infty$-dimensional parameter space.

Interpretation: Parameter space $T$ = set of possible patterns. Recall previous tutorials:

Model              | T                | Application
Gaussian process   | Smooth functions | Regression problems
DP mixtures        | Smooth densities | Density estimation
CRP, 2-param. CRP  | Partitions       | Clustering

Solution to Bayesian problem = posterior distribution on patterns. [Sch95]

5 DE FINETTI'S THEOREM

Infinite exchangeability: for all $\pi \in S_\infty$ (= infinite symmetric group):

$$P(X_1, X_2, \ldots) = P(X_{\pi(1)}, X_{\pi(2)}, \ldots) \quad \text{or} \quad \pi(P) = P$$

Theorem (de Finetti):

$$P \text{ exchangeable} \iff P(X_1, X_2, \ldots) = \int_{M(\mathcal{X})} \Big( \prod_{n=1}^{\infty} Q(X_n) \Big) \, d\nu(Q)$$

- $Q$ is a random measure
- $\nu$ is uniquely determined by $P$

6 FINITE EXCHANGEABILITY

Finite sequence $X_1, \ldots, X_n$: exchangeability of a finite sequence $\not\Rightarrow$ de Finetti representation.

Example: two exchangeable random bits

          | X_1 = 0 | X_1 = 1
X_2 = 0   | 0       | 1/2
X_2 = 1   | 1/2     | 0

Suppose de Finetti holds; then

$$P(X_1 = X_2 = 1) = \int_{[0,1]} p^2 \, d\nu(p) = 0 \;\Rightarrow\; \nu\{p = 0\} = 1$$

$$P(X_1 = X_2 = 0) = \int_{[0,1]} (1 - p)^2 \, d\nu(p) = 0 \;\Rightarrow\; \nu\{p = 1\} = 1$$

The two conclusions contradict each other, so no mixing measure $\nu$ exists for this example.

Intuition: Finite exchangeability does not eliminate sequential patterns. [DF80]
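The counterexample can also be checked numerically. The sketch below encodes the two-bit table, confirms that it is exchangeable, and compares its covariance with that of a mixture of i.i.d. Bernoulli pairs, for which $\mathrm{Cov}(X_1, X_2) = \mathrm{Var}(p) \geq 0$ under any mixing measure. All function names and the discrete measure `nu_example` are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction

# Joint distribution of the two exchangeable bits from the slide:
# P(X1=0, X2=1) = P(X1=1, X2=0) = 1/2, all other mass zero.
joint = {(0, 0): Fraction(0), (0, 1): Fraction(1, 2),
         (1, 0): Fraction(1, 2), (1, 1): Fraction(0)}


def is_exchangeable(table):
    """A two-variable table is exchangeable iff swapping coordinates leaves it unchanged."""
    return all(table[(a, b)] == table[(b, a)] for (a, b) in table)


def covariance(table):
    """Cov(X1, X2) for a joint table over {0, 1}^2."""
    e1 = sum(p * a for (a, b), p in table.items())
    e2 = sum(p * b for (a, b), p in table.items())
    e12 = sum(p * a * b for (a, b), p in table.items())
    return e12 - e1 * e2


def mixture_table(nu):
    """Joint table of two bits that are i.i.d. Bernoulli(p) given p, with p ~ nu.

    `nu` is a dict {p: weight} standing in for the mixing measure
    (a hypothetical discrete choice, used only for illustration).
    """
    table = {}
    for a in (0, 1):
        for b in (0, 1):
            table[(a, b)] = sum(w * (p if a else 1 - p) * (p if b else 1 - p)
                                for p, w in nu.items())
    return table


nu_example = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
cov_finite = covariance(joint)                       # negative for the slide's table
cov_mixture = covariance(mixture_table(nu_example))  # equals Var(p), hence non-negative
```

The negative covariance of the slide's table is what rules out a mixture representation: two bits can be perfectly anti-correlated and still exchangeable, but an infinite exchangeable sequence cannot.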

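7 SUPPORT OF PRIORS

[Figure: the model drawn as a subset of $M(\mathcal{X})$. A true distribution $P_0 = P_{\theta_0}$ lies inside the model; a $P_0$ outside the model means the model is misspecified.] [Gho10, KvdV06]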

8 SUPPORT OF NONPARAMETRIC PRIORS

Large support: The support of nonparametric priors is larger ($\infty$-dimensional) than that of parametric priors (finite-dimensional). However: no uniform prior (or even neutral improper prior) exists on $M(\mathcal{X})$.

Interpretation of nonparametric prior assumptions: Concentration of a nonparametric prior on a subset of $M(\mathcal{X})$ typically represents a structural prior assumption.

- GP regression with unknown bandwidth: any continuous function is possible; the prior can express e.g. "very smooth functions are more probable".
- Clustering: the expected number of clusters is
  - ...small: CRP prior
  - ...power law: two-parameter CRP

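9 PARAMETERIZED MODELS

Probability model: $X : \Omega \to \mathcal{X}$, $\omega \mapsto X(\omega)$, with $P(X) = X[\mathbb{P}]$.

Parameterized model: $P[X \mid \Theta]$ with $\Theta : \Omega \to T$, $\omega \mapsto \Theta(\omega)$, and $\mathcal{P} = \{ P[X \mid \theta] \mid \theta \in T \}$.

[Diagram: maps $X$ from $\Omega$ to $\mathcal{X}$, $F$ into $M(\mathcal{X})$, and $T$ from $\mathcal{P}$ to the parameter space $T$, with $\Theta$ as the composite map from $\Omega$ to $T$.]

- $F$: law of large numbers
- $T : P[\,\cdot \mid \Theta = \theta] \mapsto \theta$ bijection
- $\Theta := T \circ F \circ X$

[Sch95]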

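10 JUSTIFICATION: BY EXCHANGEABILITY

Again: de Finetti

$$P(X_1, X_2, \ldots) = \int_{M(\mathcal{X})} \Big( \prod_{n=1}^{\infty} Q(X_n) \Big) \, d\nu(Q) = \int_{T} \Big( \prod_{n=1}^{\infty} Q(X_n \mid \Theta = \theta) \Big) \, d\nu_T(\theta)$$

$\Theta$ is a random measure (since $\Theta(\omega) \in M(\mathcal{X})$).

Convergence results: The de Finetti theorem comes with a convergence result attached:

- Empirical measure: $F_n \to \theta$ weakly as $n \to \infty$
- Posterior $\Lambda_n(\Theta \mid X_1, \ldots, X_n) = \Lambda_n(\,\cdot\,, \omega)$ in $M(T)$ exists
- Posterior convergence: $\Lambda_n(\,\cdot\,, \omega) \to \delta_{\Theta(\omega)}$ as $n \to \infty$

[Kal01]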
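A small simulation can illustrate the convergence statements in the simplest binary case. The sketch below assumes a Beta prior on a Bernoulli parameter (a conjugate stand-in for $\nu_T$, chosen here only because the posterior is available in closed form); the function name `simulate` and its parameters are hypothetical.

```python
import random


def simulate(n, a=1.0, b=1.0, seed=0):
    """Draw theta ~ Beta(a, b), then n bits i.i.d. Bernoulli(theta) given theta.

    Returns the hidden theta, the empirical frequency of ones, and the mean
    and variance of the conjugate Beta posterior.
    """
    rng = random.Random(seed)
    theta = rng.betavariate(a, b)                      # plays the role of Theta(omega)
    xs = [1 if rng.random() < theta else 0 for _ in range(n)]
    k = sum(xs)
    emp = k / n                                        # empirical measure of {1}
    post_a, post_b = a + k, b + n - k                  # Beta posterior parameters
    total = post_a + post_b
    post_mean = post_a / total
    post_var = post_a * post_b / (total ** 2 * (total + 1))
    return theta, emp, post_mean, post_var


theta, emp, post_mean, post_var = simulate(20000)
```

With growing `n`, the empirical frequency approaches the hidden `theta` and the posterior variance shrinks toward zero, which is the finite-dimensional shadow of $\Lambda_n \to \delta_{\Theta(\omega)}$.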

11 SPECIAL TYPES OF EXCHANGEABLE DATA

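12 MODIFICATIONS

Pólya urns:

$$P(X_{n+1} \mid X_1 = x_1, \ldots, X_n = x_n) = \frac{1}{\alpha + n} \sum_{j=1}^{n} \delta_{x_j}(X_{n+1}) + \frac{\alpha}{\alpha + n} G_0(X_{n+1})$$

Exchangeable: $\nu$ is $\mathrm{DP}(\alpha, G_0)$, and

$$\prod_{n=1}^{\infty} Q(X_n \mid \theta) = \prod_{n=1}^{\infty} \theta(X_n) = \prod_{n=1}^{\infty} \Big( \sum_{j=1}^{\infty} c_j \delta_{t_j}(X_n) \Big)$$

Exchangeable increment processes (H. Bühlmann): a stationary, exchangeable increment process is a mixture of Lévy processes,

$$P((X_t)_{t \in \mathbb{R}_+}) = \int L_{\alpha,\gamma,\mu}((X_t)_{t \in \mathbb{R}_+}) \, d\nu(\alpha, \gamma, \mu)$$

where $L_{\alpha,\gamma,\mu}$ = Lévy process with jump measure $\mu$. [B 60, Kal01]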
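The Pólya urn predictive rule translates directly into a sampler. The following is a minimal sketch; the function name `polya_urn`, the `base_sampler` callback standing in for $G_0$, and the standard normal base measure are assumptions made for illustration.

```python
import random


def polya_urn(n, alpha, base_sampler, seed=0):
    """Sample X_1, ..., X_n sequentially from the Pólya urn predictive rule.

    `alpha` must be positive; `base_sampler(rng)` draws one value from G_0.
    """
    rng = random.Random(seed)
    xs = []
    for i in range(n):
        # With probability alpha / (alpha + i) draw a fresh value from G_0;
        # otherwise repeat one of the i previous values uniformly at random,
        # which gives each previous x_j weight 1 / (alpha + i).
        if rng.random() < alpha / (alpha + i):
            xs.append(base_sampler(rng))
        else:
            xs.append(rng.choice(xs))
    return xs


# Hypothetical base measure G_0: standard normal.
samples = polya_urn(200, alpha=2.0, base_sampler=lambda rng: rng.gauss(0.0, 1.0))
n_distinct = len(set(samples))
```

Because repeated values are copied from earlier draws, the number of distinct values grows much more slowly than `n`, which is the clustering behaviour associated with the Dirichlet process.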

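13 MODIFICATION 2: RANDOM PARTITIONS

Random partition of $\mathbb{N}$: $\Pi = \{B_1, B_2, \ldots\}$, e.g. $\{\{1, 3, 5, \ldots\}, \{2, 4\}, \{10\}, \ldots\}$

Paint-box distribution:
- Weights $s_1, s_2, \ldots \geq 0$ with $\sum_j s_j \leq 1$
- $U_1, U_2, \ldots \sim \text{Uniform}[0, 1]$

[Figure: the unit interval split into intervals of lengths $s_1, s_2, \ldots$ and a remainder of length $1 - \sum_j s_j$, with the points $U_1, U_2, U_3$ marked.]

Sampling $\Pi \sim \beta[\,\cdot \mid s]$:
- $i, j \in \mathbb{N}$ in same block $\iff$ $U_i, U_j$ in same interval
- $\{i\}$ separate block $\iff$ $U_i$ in interval $1 - \sum_j s_j$

Theorem (Kingman):

$$\Pi \text{ exchangeable} \iff P(\Pi \in \cdot\,) = \int \beta[\Pi \in \cdot \mid s] \, Q(ds)$$

[Kin78]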
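The paint-box sampling scheme can be sketched for a finite index set $\{1, \ldots, n\}$ and a finite weight list. The function name `paint_box` and the truncation to finitely many weights are assumptions made for illustration.

```python
import random
from itertools import accumulate


def paint_box(n, weights, seed=0):
    """Sample a partition of {1, ..., n} from the paint-box with the given weights."""
    assert all(s >= 0 for s in weights) and sum(weights) <= 1 + 1e-12
    rng = random.Random(seed)
    edges = list(accumulate(weights))      # right endpoints of the intervals
    blocks = {}
    for i in range(1, n + 1):
        u = rng.random()                   # U_i ~ Uniform[0, 1)
        # Index of the interval containing u, or None if u falls in the
        # leftover part of length 1 - sum(weights).
        j = next((idx for idx, e in enumerate(edges) if u < e), None)
        # Points in the same interval share a block; leftover points are singletons.
        key = ("interval", j) if j is not None else ("singleton", i)
        blocks.setdefault(key, []).append(i)
    return sorted(blocks.values())


partition = paint_box(100, [0.5, 0.3])
```

With weights `[0.5, 0.3]`, roughly half of the indices share one block, about 30% share a second, and the remaining indices form singleton blocks.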

14 ROTATION INVARIANCE

Rotatable sequence: $P_n(X_1, \ldots, X_n) = P_n(R_n(X_1, \ldots, X_n))$ for all $R_n \in O(n)$

Infinite case: $X_1, X_2, \ldots$ rotatable $:\iff$ $X_1, \ldots, X_n$ rotatable for all $n$

Theorem (Freedman): An infinite sequence is rotatable iff

$$P(X_1, X_2, \ldots) = \int_{\mathbb{R}_+} \Big( \prod_{n=1}^{\infty} N_\sigma(X_n) \Big) \, d\nu_{\mathbb{R}_+}(\sigma)$$

where $N_\sigma$ denotes the $(0, \sigma)$-Gaussian.

15 TWO INTERPRETATIONS

As special case of de Finetti:
- Rotatable $\Rightarrow$ exchangeable
- General de Finetti: parameter space $T = M(\mathcal{X})$
- Rotation invariance: $T$ shrinks to $\{N_\sigma \mid \sigma \in \mathbb{R}_+\}$

As invariance under different symmetry:
- Exchangeability = invariance of $P(X_1, X_2, \ldots)$ under group action
- Freedman: different group ($O(n)$ rather than $S_\infty$)
- In these cases: symmetry $\Rightarrow$ decomposition theorem

16 NON-EXCHANGEABLE DATA

17 EXCHANGEABILITY: RANDOM GRAPHS

Random graph with independent edges. Given:
- $\theta : [0, 1]^2 \to [0, 1]$ symmetric function
- $U_1, U_2, \ldots \sim \text{Uniform}[0, 1]$
- Edge $(i, j)$ present: $(i, j) \sim \text{Bernoulli}(\theta(U_i, U_j))$

Call this distribution $\Gamma(G \in \cdot \mid \theta)$.

Theorem (Aldous; Hoover): A random (dense) graph $G$ is exchangeable iff

$$P(G \in \cdot\,) = \int_T \Gamma(G \in \cdot \mid \theta) \, Q(d\theta)$$

[Ald81, Hoo79]

[Figure: the function $\theta$ on the unit square, with $U_1$ and $U_2$ marked on both axes; the value $\theta(U_1, U_2)$ is Pr{edge 1, 2}.]
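The sampling scheme for random graphs with independent edges can be sketched as follows for a finite number of vertices. The function name `sample_graph` and the example function `theta_product` are hypothetical; any symmetric function from $[0, 1]^2$ to $[0, 1]$ could be substituted.

```python
import random


def sample_graph(n, theta, seed=0):
    """Sample the adjacency matrix of an n-vertex graph from Gamma(. | theta)."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]   # U_1, ..., U_n ~ Uniform[0, 1)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge (i, j) ~ Bernoulli(theta(U_i, U_j)), independently given U.
            if rng.random() < theta(u[i], u[j]):
                adj[i][j] = adj[j][i] = 1
    return adj


def theta_product(x, y):
    """Hypothetical symmetric function on [0, 1]^2, chosen only for illustration."""
    return x * y


adj = sample_graph(30, theta_product)
```

Relabeling the vertices only permutes the i.i.d. variables $U_i$, so the distribution of the sampled graph is unchanged, which is the exchangeability property in the theorem.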

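19 DE FINETTI: GEOMETRY

Finite case:

$$P = \sum_{e_i \in E} \nu_i e_i$$

with $E = \{e_1, e_2, e_3\}$ and $(\nu_1, \nu_2, \nu_3)$ barycentric coordinates.

[Figure: triangle with vertices $e_1, e_2, e_3$ and a point $P$ with coordinates $\nu_1, \nu_2, \nu_3$.]

Infinite/continuous case:

$$P(\,\cdot\,) = \int_E e(\,\cdot\,) \, d\nu(e) = \int_T k(\theta, \,\cdot\,) \, d\nu_T(\theta)$$

- $k : T \to E \subset M(\mathcal{X})$ probability kernel (= conditional probability)
- $k$ is a random measure with values $k(\theta, \,\cdot\,) \in E$
- de Finetti: $k(\theta, \,\cdot\,) = \prod_{n \in \mathbb{N}} Q(\,\cdot \mid \theta)$ and $T = M(\mathcal{X})$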

20 DECOMPOSITION BY SYMMETRY

Theorem (Varadarajan):
- $G$ nice group on space $Y$
- Call a measure $\mu$ ergodic if $\mu(A) \in \{0, 1\}$ for all $G$-invariant sets $A$.
- $E := \{\text{ergodic probability measures}\}$

Then there is a Markov kernel $k : Y \to E$ s.t.:

$$P \in M(V) \ G\text{-invariant} \iff P(A) = \int_T k(\theta, A) \, d\nu(\theta)$$

de Finetti:
- $G = S_\infty$ and $Y = \mathcal{X}^\infty$
- $G$-invariant sets = exchangeable events
- $E$ = factorial distributions ("Hewitt-Savage 0-1 law")

[Var63]

21 SYMMETRY AND SUFFICIENCY

22 SUFFICIENT STATISTICS

Problem: Apparently no direct connection with standard models.

Sufficient statistic: Functions $S_n$ of the data are sufficient if:
- Intuitively: $S_n(X_1, \ldots, X_n)$ contains all information the sample provides on the parameter.
- Formally: $P_n(X_1, \ldots, X_n \mid \Theta, S_n) = P(X_1, \ldots, X_n \mid S_n)$ for all $n$.

Sufficiency and symmetry:
- $P$ exchangeable $\iff$ $S_n(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ sufficient
- $P$ rotatable $\iff$ $S_n(x_1, \ldots, x_n) = \sqrt{\sum_{i=1}^{n} x_i^2} = \|(x_1, \ldots, x_n)\|_2$ sufficient
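The link between each symmetry and its statistic can be made concrete: the empirical measure is unchanged by permutations of the sample, and the Euclidean norm is unchanged by rotations. The sketch below checks both invariances on arbitrary data; the helper names and the use of a single planar rotation as a representative element of $O(n)$ are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def empirical_measure(xs):
    """Empirical measure (1/n) * sum of point masses, as a dict value -> mass."""
    n = len(xs)
    return {x: c / n for x, c in Counter(xs).items()}


def euclidean_norm(xs):
    """Euclidean norm of the sample vector."""
    return math.sqrt(sum(x * x for x in xs))


def rotate_pair(xs, i, j, angle):
    """Apply a planar rotation in coordinates i and j; this is an element of O(n)."""
    ys = list(xs)
    c, s = math.cos(angle), math.sin(angle)
    ys[i] = c * xs[i] - s * xs[j]
    ys[j] = s * xs[i] + c * xs[j]
    return ys


rng = random.Random(0)

# Permuting the sample leaves the empirical measure unchanged.
data = [rng.choice("abc") for _ in range(20)]
shuffled = data[:]
rng.shuffle(shuffled)

# Rotating the sample leaves the Euclidean norm unchanged.
vec = [rng.gauss(0.0, 1.0) for _ in range(5)]
rotated = rotate_pair(vec, 0, 3, 0.7)
```

In both cases the statistic records exactly what the symmetry cannot change, which is why it carries all the information the sample provides about the parameter.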

23 DECOMPOSITION BY SUFFICIENCY

Theorem (Diaconis and Freedman; Lauritzen; several others). Given: a sufficient statistic $S_n$ for each $n$, and $k_n(\,\cdot\,, s_n)$ = conditional probability of $X_1, \ldots, X_n$ given $s_n$.

1. $k_n$ converges to a limit function: $k_n(\,\cdot\,, S_n(X_1(\omega), \ldots, X_n(\omega))) \to k_\infty(\,\cdot\,, \omega)$ as $n \to \infty$
2. $P(X_1, X_2, \ldots)$ has the decomposition $P(\,\cdot\,) = \int k_\infty(\,\cdot\,, \omega) \, d\nu(\omega)$
3. The model $\mathcal{P} \subset M(\mathcal{X})$ is a convex set with extreme points $k_\infty(\,\cdot\,, \omega)$
4. The measure $\nu$ is uniquely determined by $P$

(Theorem statement omits technical conditions.)

24 EXAMPLES

de Finetti's theorem: $P$ exchangeable $\iff$ $S_n(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ sufficient

Rotation invariance: $P$ rotatable $\iff$ $S_n(x_1, \ldots, x_n) = \|(x_1, \ldots, x_n)\|_2$ sufficient

Kingman's theorem: $\Pi$ exchangeable $\iff$ asymptotic block sizes are a sufficient statistic

Exponential families (Küchler and Lauritzen): Choose $\mathcal{X} = \mathbb{R}$. Under suitable regularity conditions, $S_n$ is additive, i.e.

$$S_n(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} S_0(x_i)$$

if and only if the ergodic measures are an exponential family. [KL89]

25 SUMMARY

Non-exchangeable data:
- Identify invariance principle and its ergodic measures
- Ergodic measures generalize i.i.d. distributions → likelihood
- Prior = distribution on ergodic measures

Random structure                                                    | Theorem of                 | Mixtures of...
Exchangeable sequences                                              | de Finetti; Hewitt & Savage | product distributions
Processes with exch. increments                                     | Bühlmann                   | Lévy processes
Exchangeable partitions                                             | Kingman                    | "paint-box distributions"
Exchangeable arrays                                                 | Aldous; Hoover; Kallenberg | sampling scheme on $[0, 1]^2$
Block-exchangeable sequences                                        | Diaconis & Freedman        | Markov chains
Exchangeable $\mathbb{R}^d$-sequences with additive sufficient statistics | Küchler & Lauritzen        | Exponential families

26 REFERENCES I

[Ald81] David J. Aldous. Representations for partially exchangeable arrays of random variables. J. Multivariate Anal., 11(4).
[B 60] H. Bühlmann. Austauschbare stochastische Variabeln und ihre Grenzwertsätze. PhD thesis, University of California Press.
[DF80] P. Diaconis and D. Freedman. Finite exchangeable sequences. The Annals of Probability, 8(4).
[Gho10] S. Ghosal. Dirichlet process, related priors and posterior asymptotics. In N. L. Hjort et al., editors, Bayesian Nonparametrics. Cambridge University Press.
[Hoo79] D. N. Hoover. Relations on probability spaces and arrays of random variables. Technical report, Institute of Advanced Study, Princeton.
[Kal01] O. Kallenberg. Foundations of Modern Probability. Springer, 2nd edition.
[Kin78] J. F. C. Kingman. The representation of partition structures. J. London Math. Soc., 2(18).
[KL89] U. Küchler and S. L. Lauritzen. Exponential families, extreme point models and minimal space-time invariant functions for stochastic processes with stationary and independent increments. Scand. J. Stat., 16.
[KvdV06] B. J. K. Kleijn and A. W. van der Vaart. Misspecification in infinite-dimensional Bayesian statistics. Annals of Statistics, 34(2).
[Sch95] M. J. Schervish. Theory of Statistics. Springer.
[Var63] V. S. Varadarajan. Groups of automorphisms of Borel spaces. Transactions of the American Mathematical Society, 109(2).


More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

Predictivist Bayes density estimation

Predictivist Bayes density estimation Predictivist Bayes density estimation P. Richard Hahn Abstract: This paper develops a novel computational approach for Bayesian density estimation, using a kernel density representation of the Bayesian

More information

BOOK REVIEW PERSI DIACONIS

BOOK REVIEW PERSI DIACONIS BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0273-0979(XX)0000-0 BOOK REVIEW PERSI DIACONIS Probabilistic Symmetries and Invariance Principles by Olav

More information

Abrahamse, A.F. (1970). A comparison between the Martin boundary theory and the theory of likelihood ratios. Ann. Math. Statist. ~ l

Abrahamse, A.F. (1970). A comparison between the Martin boundary theory and the theory of likelihood ratios. Ann. Math. Statist. ~ l LITERATURE Abrahamse, A.F. (1970). A comparison between the Martin boundary theory and the theory of likelihood ratios. Ann. Math. Statist. ~ l 1064-1067., Accardi, L. and Pistone, G. (1982). de Finetti's

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Dependent hierarchical processes for multi armed bandits

Dependent hierarchical processes for multi armed bandits Dependent hierarchical processes for multi armed bandits Federico Camerlenghi University of Bologna, BIDSA & Collegio Carlo Alberto First Italian meeting on Probability and Mathematical Statistics, Torino

More information

Gaussian with mean ( µ ) and standard deviation ( σ)

Gaussian with mean ( µ ) and standard deviation ( σ) Slide from Pieter Abbeel Gaussian with mean ( µ ) and standard deviation ( σ) 10/6/16 CSE-571: Robotics X ~ N( µ, σ ) Y ~ N( aµ + b, a σ ) Y = ax + b + + + + 1 1 1 1 1 1 1 1 1 1, ~ ) ( ) ( ), ( ~ ), (

More information

WXML Final Report: Chinese Restaurant Process

WXML Final Report: Chinese Restaurant Process WXML Final Report: Chinese Restaurant Process Dr. Noah Forman, Gerandy Brito, Alex Forney, Yiruey Chou, Chengning Li Spring 2017 1 Introduction The Chinese Restaurant Process (CRP) generates random partitions

More information

Priors for the frequentist, consistency beyond Schwartz

Priors for the frequentist, consistency beyond Schwartz Victoria University, Wellington, New Zealand, 11 January 2016 Priors for the frequentist, consistency beyond Schwartz Bas Kleijn, KdV Institute for Mathematics Part I Introduction Bayesian and Frequentist

More information

Statistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart

Statistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart Statistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart 1 Motivation and Problem In Lecture 1 we briefly saw how histograms

More information

Fast Non-Parametric Bayesian Inference on Infinite Trees

Fast Non-Parametric Bayesian Inference on Infinite Trees Marcus Hutter - 1 - Fast Bayesian Inference on Trees Fast Non-Parametric Bayesian Inference on Infinite Trees Marcus Hutter Istituto Dalle Molle di Studi sull Intelligenza Artificiale IDSIA, Galleria 2,

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

SAMPLING ALGORITHMS. In general. Inference in Bayesian models SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be

More information

Scalable Gaussian process models on matrices and tensors

Scalable Gaussian process models on matrices and tensors Scalable Gaussian process models on matrices and tensors Alan Qi CS & Statistics Purdue University Joint work with F. Yan, Z. Xu, S. Zhe, and IBM Research! Models for graph and multiway data Model Algorithm

More information

Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm Qiang Liu and Dilin Wang NIPS 2016 Discussion by Yunchen Pu March 17, 2017 March 17, 2017 1 / 8 Introduction Let x R d

More information

Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises

Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises Hongwei Long* Department of Mathematical Sciences, Florida Atlantic University, Boca Raton Florida 33431-991,

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Local-Mass Preserving Prior Distributions for Nonparametric Bayesian Models

Local-Mass Preserving Prior Distributions for Nonparametric Bayesian Models Bayesian Analysis (2014) 9, Number 2, pp. 307 330 Local-Mass Preserving Prior Distributions for Nonparametric Bayesian Models Juhee Lee Steven N. MacEachern Yiling Lu Gordon B. Mills Abstract. We address

More information

Kernel families of probability measures. Saskatoon, October 21, 2011

Kernel families of probability measures. Saskatoon, October 21, 2011 Kernel families of probability measures Saskatoon, October 21, 2011 Abstract The talk will compare two families of probability measures: exponential, and Cauchy-Stjelties families. The exponential families

More information

Bayesian Nonparametrics: Dirichlet Process

Bayesian Nonparametrics: Dirichlet Process Bayesian Nonparametrics: Dirichlet Process Yee Whye Teh Gatsby Computational Neuroscience Unit, UCL http://www.gatsby.ucl.ac.uk/~ywteh/teaching/npbayes2012 Dirichlet Process Cornerstone of modern Bayesian

More information

HOMOGENEOUS CUT-AND-PASTE PROCESSES

HOMOGENEOUS CUT-AND-PASTE PROCESSES HOMOGENEOUS CUT-AND-PASTE PROCESSES HARRY CRANE Abstract. We characterize the class of exchangeable Feller processes on the space of partitions with a bounded number of blocks. This characterization leads

More information

Bayesian Consistency for Markov Models

Bayesian Consistency for Markov Models Bayesian Consistency for Markov Models Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. Stephen G. Walker University of Texas at Austin, USA. Abstract We consider sufficient conditions for

More information

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. Lecture 2 1 Martingales We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. 1.1 Doob s inequality We have the following maximal

More information

Statistica Sinica Preprint No: SS R2

Statistica Sinica Preprint No: SS R2 Statistica Sinica Preprint No: SS-2017-0074.R2 Title The semi-parametric Bernstein-von Mises theorem for regression models with symmetric errors Manuscript ID SS-2017-0074.R2 URL http://www.stat.sinica.edu.tw/statistica/

More information

Hakone Seminar Recent Developments in Statistics

Hakone Seminar Recent Developments in Statistics Hakone Seminar Recent Developments in Statistics November 12-14, 2015 Hotel Green Plaza Hakone: http://www.hgp.co.jp/language/english/sp/ Organizer: Masanobu TANIGUCHI (Research Institute for Science &

More information

Semiparametric posterior limits

Semiparametric posterior limits Statistics Department, Seoul National University, Korea, 2012 Semiparametric posterior limits for regular and some irregular problems Bas Kleijn, KdV Institute, University of Amsterdam Based on collaborations

More information

The Essential Equivalence of Pairwise and Mutual Conditional Independence

The Essential Equivalence of Pairwise and Mutual Conditional Independence The Essential Equivalence of Pairwise and Mutual Conditional Independence Peter J. Hammond and Yeneng Sun Probability Theory and Related Fields, forthcoming Abstract For a large collection of random variables,

More information

Gentle Introduction to Infinite Gaussian Mixture Modeling

Gentle Introduction to Infinite Gaussian Mixture Modeling Gentle Introduction to Infinite Gaussian Mixture Modeling with an application in neuroscience By Frank Wood Rasmussen, NIPS 1999 Neuroscience Application: Spike Sorting Important in neuroscience and for

More information

Gaussian processes for inference in stochastic differential equations

Gaussian processes for inference in stochastic differential equations Gaussian processes for inference in stochastic differential equations Manfred Opper, AI group, TU Berlin November 6, 2017 Manfred Opper, AI group, TU Berlin (TU Berlin) inference in SDE November 6, 2017

More information