Exchangeability. Peter Orbanz. Columbia University
PARAMETERS AND PATTERNS

Parameters
    $P(X \mid \theta)$ = Probability[data | pattern]

Inference idea
    data = underlying pattern + independent noise

TERMINOLOGY

Parametric model
    Number of parameters fixed (or constantly bounded) w.r.t. sample size.

Nonparametric model
    Number of parameters grows with sample size.
    $\infty$-dimensional parameter space.

Example: Density estimation
    [Figure: two density estimates, labeled "Parametric" and "Nonparametric"; axes $x_1$, $x_2$ and $p(x)$; parameter $\mu$.]

NONPARAMETRIC BAYESIAN MODEL

Definition
    A nonparametric Bayesian model is a Bayesian model on an $\infty$-dimensional parameter space.

Interpretation
    Parameter space $\mathbf{T}$ = set of possible patterns. Recall previous tutorials:

    Model                   $\mathbf{T}$          Application
    Gaussian process        Smooth functions      Regression problems
    DP mixtures             Smooth densities      Density estimation
    CRP, 2-param. CRP       Partitions            Clustering

    Solution to Bayesian problem = posterior distribution on patterns

[Sch95]

DE FINETTI'S THEOREM

Infinite exchangeability
    For all $\pi \in S_\infty$ (= infinite symmetric group):
    \[ P(X_1, X_2, \ldots) = P(X_{\pi(1)}, X_{\pi(2)}, \ldots) \quad \text{or} \quad \pi(P) = P \]

Theorem (de Finetti)
    \[ P \text{ exchangeable} \iff P(X_1, X_2, \ldots) = \int_{M(\mathcal{X})} \Big( \prod_{n=1}^{\infty} Q(X_n) \Big) \, d\nu(Q) \]
    $Q$ is a random measure
    $\nu$ uniquely determined by $P$

FINITE EXCHANGEABILITY

Finite sequence $X_1, \ldots, X_n$
    Exchangeability of finite sequence $\not\Rightarrow$ de Finetti representation

Example: Two exchangeable random bits

                    $X_1 = 0$    $X_1 = 1$
    $X_2 = 0$           0           1/2
    $X_2 = 1$          1/2           0

    Suppose de Finetti holds; then
    \[ 0 = P(X_1 = X_2 = 1) = \int_{[0,1]} p^2 \, d\nu(p) \;\Rightarrow\; \nu\{p = 0\} = 1 \]
    \[ 0 = P(X_1 = X_2 = 0) = \int_{[0,1]} (1 - p)^2 \, d\nu(p) \;\Rightarrow\; \nu\{p = 1\} = 1 \]
    The two conclusions contradict each other, so no such $\nu$ exists.

Intuition
    Finite exchangeability does not eliminate sequential patterns.

[DF80]

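
The two-bit counterexample can be checked numerically. The following is a minimal sketch in plain Python; the names `joint`, `is_exchangeable` and `min_agreement_probability` are illustrative and do not appear in the slides. It confirms that the table is symmetric, and that i.i.d. Bernoulli($p$) bits agree with probability at least 1/2 for every $p$ on a grid, so any mixture over $p$ also puts mass at least 1/2 on $\{X_1 = X_2\}$, while the table puts mass 0 there.

```python
from fractions import Fraction

# Joint distribution of the two bits from the slide, indexed as joint[(x1, x2)].
joint = {
    (0, 0): Fraction(0), (0, 1): Fraction(1, 2),
    (1, 0): Fraction(1, 2), (1, 1): Fraction(0),
}

def is_exchangeable(table):
    """A two-variable table is exchangeable iff swapping the coordinates leaves it unchanged."""
    return all(table[(a, b)] == table[(b, a)] for (a, b) in table)

def agreement_probability_iid(p):
    """P(X1 = X2) when both bits are i.i.d. Bernoulli(p)."""
    return p * p + (1 - p) * (1 - p)

def min_agreement_probability(grid_size=1000):
    """Smallest value of P(X1 = X2) over a grid of p in [0, 1].

    A mixture over p averages these values, so it cannot fall below the minimum.
    """
    grid = [Fraction(i, grid_size) for i in range(grid_size + 1)]
    return min(agreement_probability_iid(p) for p in grid)

table_agreement = joint[(0, 0)] + joint[(1, 1)]
print(is_exchangeable(joint))        # the table is symmetric
print(table_agreement)               # probability that the two bits agree
print(min_agreement_probability())   # lower bound for any i.i.d. mixture
```

The grid search is only a numerical illustration; the bound $p^2 + (1-p)^2 \ge 1/2$ also follows directly by minimizing over $p$.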
SUPPORT OF PRIORS

[Figure: the model drawn as a subset of $M(\mathcal{X})$. A point $P_0 = P_{\theta_0}$ lies inside the model; a point $P_0$ outside the model is labeled "misspecified".]

[Gho10, KvdV06]

SUPPORT OF NONPARAMETRIC PRIORS

Large support
    Support of nonparametric priors is larger ($\infty$-dimensional) than of parametric priors (finite-dimensional).
    However: No uniform prior (or even "neutral" improper prior) exists on $M(\mathcal{X})$.

Interpretation of nonparametric prior assumptions
    Concentration of nonparametric prior on subset of $M(\mathcal{X})$ typically represents structural prior assumption.
    - GP regression with unknown bandwidth:
        - Any continuous function possible
        - Prior can express e.g. "very smooth functions are more probable"
    - Clustering: Expected number of clusters is...
        - ...small: CRP prior
        - ...power law: two-parameter CRP

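
The contrast between the two clustering priors can be seen in a small simulation. This is a sketch under stated assumptions: the function name `sample_crp_table_sizes`, the parameter names `concentration` and `discount`, and the chosen values are illustrative, and the seating rule is the standard two-parameter Chinese restaurant process (discount 0 recovers the ordinary CRP), which the slides name but do not spell out.

```python
import random

def sample_crp_table_sizes(n, concentration, discount=0.0, seed=0):
    """Seat n customers by the two-parameter Chinese restaurant process.

    Requires 0 <= discount < 1 and concentration > -discount.
    Returns the list of table (cluster) sizes.
    """
    rng = random.Random(seed)
    sizes = []
    for i in range(n):  # i customers are already seated
        k = len(sizes)
        # Unnormalized weights: (size - discount) per existing table and
        # (concentration + k * discount) for a new table; total is i + concentration.
        u = rng.random() * (i + concentration)
        acc = 0.0
        for t in range(k):
            acc += sizes[t] - discount
            if u < acc:
                sizes[t] += 1
                break
        else:
            # No existing table was selected: open a new one.
            sizes.append(1)
    return sizes

n = 2000
crp = sample_crp_table_sizes(n, concentration=1.0)
two_param = sample_crp_table_sizes(n, concentration=1.0, discount=0.5)
print(len(crp), len(two_param))  # number of clusters under each prior
```

With these settings the ordinary CRP typically produces a number of clusters on the order of $\log n$, while the two-parameter version grows roughly like a power of $n$.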
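PARAMETERIZED MODELS

Probability model
    $X : \Omega \to \mathcal{X}$, $\omega \mapsto X(\omega)$, with $P(X) = X[\mathbb{P}]$

Parameterized model
    $P[X \mid \Theta]$ with $\Theta : \Omega \to \mathbf{T}$, $\omega \mapsto \Theta(\omega)$
    $\mathcal{P} = \{ P[X \mid \theta] \mid \theta \in \mathbf{T} \}$
    $F$: law of large numbers
    $T : P[\,\cdot \mid \Theta = \theta] \mapsto \theta$ bijection
    $\Theta := T \circ F \circ X$

    [Figure: diagram relating $\Omega$, the sample space, $M(\mathcal{X})$, $\mathcal{P}$ and $\mathbf{T}$ through the maps $X$, $F$, $T$ and $\Theta$.]

[Sch95]
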
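JUSTIFICATION: BY EXCHANGEABILITY

Again: de Finetti
    \[ P(X_1, X_2, \ldots) = \int_{M(\mathcal{X})} \Big( \prod_{n=1}^{\infty} Q(X_n) \Big) \, d\nu(Q) = \int_{\mathbf{T}} \Big( \prod_{n=1}^{\infty} Q(X_n \mid \Theta = \theta) \Big) \, d\nu_{\mathbf{T}}(\theta) \]
    $\Theta$ random measure (since $\Theta(\omega) \in M(\mathcal{X})$)

Convergence results
    The de Finetti theorem comes with a convergence result attached:
    - Empirical measure: $F_n \to \theta$ weakly as $n \to \infty$
    - Posterior $\Lambda_n(\Theta \mid X_1, \ldots, X_n) = \Lambda_n(\,\cdot\,, \omega)$ in $M(\mathbf{T})$ exists
    - Posterior convergence: $\Lambda_n(\,\cdot\,, \omega) \to \delta_{\Theta(\omega)}$ as $n \to \infty$

[Kal01]
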
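
The convergence statements can be illustrated in the simplest special case, binary data, where the random measure is Bernoulli($\theta$) and the mixing measure is assumed to be a Beta distribution. The sketch below makes that assumption explicit; the function names and the Beta(1, 1) default are illustrative choices, not part of the slides.

```python
import random

def simulate_exchangeable_bits(n, a=1.0, b=1.0, seed=0):
    """Draw theta ~ Beta(a, b), then n bits that are i.i.d. Bernoulli(theta) given theta."""
    rng = random.Random(seed)
    theta = rng.betavariate(a, b)
    bits = [1 if rng.random() < theta else 0 for _ in range(n)]
    return theta, bits

def beta_posterior(bits, a=1.0, b=1.0):
    """Posterior mean and standard deviation of theta under the Beta(a, b) prior."""
    ones = sum(bits)
    a_post, b_post = a + ones, b + len(bits) - ones
    total = a_post + b_post
    mean = a_post / total
    var = a_post * b_post / (total ** 2 * (total + 1))
    return mean, var ** 0.5

theta, bits = simulate_exchangeable_bits(20000)
for n in (10, 100, 1000, 20000):
    freq = sum(bits[:n]) / n                  # empirical measure of {1}
    mean, sd = beta_posterior(bits[:n])       # posterior summary after n observations
    print(n, round(freq, 3), round(mean, 3), round(sd, 4))
print("theta =", round(theta, 3))
```

As $n$ grows, the empirical frequency approaches the sampled $\theta$ and the posterior standard deviation shrinks, which is the binary-case analogue of $\Lambda_n \to \delta_{\Theta(\omega)}$.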
SPECIAL TYPES OF EXCHANGEABLE DATA

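MODIFICATIONS

Pólya Urns
    \[ P(X_{n+1} \mid X_1 = x_1, \ldots, X_n = x_n) = \frac{1}{\alpha + n} \sum_{j=1}^{n} \delta_{x_j}(X_{n+1}) + \frac{\alpha}{\alpha + n} G_0(X_{n+1}) \]
    Exchangeable: $\nu$ is $\mathrm{DP}(\alpha, G_0)$
    \[ \prod_{n=1}^{\infty} Q(X_n \mid \theta) = \prod_{n=1}^{\infty} \theta(X_n) = \prod_{n=1}^{\infty} \Big( \sum_{j=1}^{\infty} c_j \delta_{t_j}(X_n) \Big) \]

Exchangeable increment processes (H. Bühlmann)
    Stationary, exchangeable increment process = mixture of Lévy processes
    \[ P((X_t)_{t \in \mathbb{R}_+}) = \int L_{\alpha,\gamma,\mu}((X_t)_{t \in \mathbb{R}_+}) \, d\nu(\alpha, \gamma, \mu) \]
    $L_{\alpha,\gamma,\mu}$ = Lévy process with jump measure $\mu$

[Bü60, Kal01]
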
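
The Pólya urn predictive rule translates directly into a sampler. The following is a minimal sketch; the function name `polya_urn_sequence`, the `base_sampler` callback standing in for $G_0$, and the standard normal base distribution are assumptions made for illustration.

```python
import random

def polya_urn_sequence(n, alpha, base_sampler, seed=0):
    """Sample X_1, ..., X_n from the Pólya urn predictive rule.

    After i draws, the next value is fresh from G0 with probability alpha / (alpha + i);
    otherwise it repeats one of the i previous values chosen uniformly at random.
    """
    rng = random.Random(seed)
    xs = []
    for i in range(n):
        if rng.random() * (alpha + i) < alpha:
            xs.append(base_sampler(rng))   # new value from the base distribution G0
        else:
            xs.append(rng.choice(xs))      # repeat an earlier value
    return xs

# Hypothetical choice of G0: a standard normal distribution.
draws = polya_urn_sequence(500, alpha=2.0, base_sampler=lambda rng: rng.gauss(0.0, 1.0))
print(len(draws), len(set(draws)))  # sample size and number of distinct values
```

Repeated values appear because each draw copies an earlier one with positive probability, which reflects the discreteness of the underlying random measure.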
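MODIFICATION 2: RANDOM PARTITIONS

Random partition of $\mathbb{N}$
    $\Pi = \{B_1, B_2, \ldots\}$, e.g. $\{\{1, 3, 5, \ldots\}, \{2, 4\}, \{10\}, \ldots\}$

Paint-box distribution
    Weights $s_1, s_2, \ldots \geq 0$ with $\sum_j s_j \leq 1$
    $U_1, U_2, \ldots \sim \mathrm{Uniform}[0, 1]$
    [Figure: the unit interval split into subintervals of lengths $s_1$, $s_2$, ... and a remainder of length $1 - \sum_j s_j$, with the points $U_1$, $U_2$, $U_3$ marked.]

    Sampling $\Pi \sim \beta[\,\cdot \mid s]$:
    - $i, j \in \mathbb{N}$ in same block $\Leftrightarrow$ $U_i, U_j$ in same interval
    - $\{i\}$ separate block $\Leftrightarrow$ $U_i$ in interval $1 - \sum_j s_j$

Theorem (Kingman)
    \[ \Pi \text{ exchangeable} \iff P(\Pi \in \,\cdot\,) = \int \beta[\Pi \in \,\cdot \mid s] \, Q(ds) \]

[Kin78]
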
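
The paint-box sampling scheme can be sketched for a finite weight vector and the first $n$ integers. The function name `sample_paintbox_partition` and the example weights are illustrative assumptions; the slides define the scheme for infinite weight sequences and partitions of all of $\mathbb{N}$.

```python
import random
from itertools import accumulate

def sample_paintbox_partition(n, weights, seed=0):
    """Sample a partition of {1, ..., n} from the paint-box distribution.

    weights: s_1, s_2, ... >= 0 with sum <= 1. Interval j is [c_{j-1}, c_j) with
    c_j = s_1 + ... + s_j. Points landing in the leftover mass 1 - sum(s)
    become singleton blocks.
    """
    if any(s < 0 for s in weights) or sum(weights) > 1 + 1e-12:
        raise ValueError("weights must be nonnegative and sum to at most 1")
    rng = random.Random(seed)
    cuts = list(accumulate(weights))
    blocks = {}        # interval index -> elements whose uniform fell in that interval
    singletons = []
    for i in range(1, n + 1):
        u = rng.random()
        j = next((idx for idx, c in enumerate(cuts) if u < c), None)
        if j is None:
            singletons.append([i])
        else:
            blocks.setdefault(j, []).append(i)
    return list(blocks.values()) + singletons

partition = sample_paintbox_partition(20, [0.5, 0.3])
print(partition)
```

By Kingman's theorem, randomizing the weights $s$ before running this sampler yields the general exchangeable partition.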
ROTATION INVARIANCE

Rotatable sequence
    \[ P_n(X_1, \ldots, X_n) = P_n(R_n(X_1, \ldots, X_n)) \quad \text{for all } R_n \in O(n) \]

Infinite case
    $X_1, X_2, \ldots$ rotatable $:\Leftrightarrow$ $X_1, \ldots, X_n$ rotatable for all $n$

Theorem (Freedman)
    Infinite sequence rotatable iff
    \[ P(X_1, X_2, \ldots) = \int_{\mathbb{R}_+} \Big( \prod_{n=1}^{\infty} N_\sigma(X_n) \Big) \, d\nu_{\mathbb{R}_+}(\sigma) \]
    $N_\sigma$ denotes $(0, \sigma)$-Gaussian

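
One direction of Freedman's theorem is easy to verify numerically: a scale mixture of centered Gaussians has a density that depends on $x$ only through its norm, so it is unchanged by rotations. The sketch below assumes a finite mixing measure and treats $\sigma$ as a standard deviation; both choices, along with the function names, are illustrative and not taken from the slides.

```python
import math

def gaussian_scale_mixture_density(x, sigmas, weights):
    """Density at x of a finite mixture of N(0, sigma^2 I) distributions."""
    n = len(x)
    sq_norm = sum(v * v for v in x)
    total = 0.0
    for sigma, w in zip(sigmas, weights):
        norm_const = (2.0 * math.pi * sigma ** 2) ** (-n / 2.0)
        total += w * norm_const * math.exp(-sq_norm / (2.0 * sigma ** 2))
    return total

def rotate_plane(x, i, j, angle):
    """Rotate x by `angle` in the (i, j) coordinate plane, an element of O(n)."""
    y = list(x)
    c, s = math.cos(angle), math.sin(angle)
    y[i] = c * x[i] - s * x[j]
    y[j] = s * x[i] + c * x[j]
    return y

x = [0.3, -1.2, 0.7, 2.0]
y = rotate_plane(rotate_plane(x, 0, 3, 1.1), 1, 2, -0.4)   # composition of two rotations
sigmas, weights = [0.5, 1.0, 3.0], [0.2, 0.5, 0.3]          # hypothetical mixing measure
print(gaussian_scale_mixture_density(x, sigmas, weights))
print(gaussian_scale_mixture_density(y, sigmas, weights))
```

The two printed values agree up to floating-point error. The converse direction, that every rotatable infinite sequence has this form, is the substantive content of the theorem and is not demonstrated by this check.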
TWO INTERPRETATIONS

As special case of de Finetti
    - Rotatable $\Rightarrow$ exchangeable
    - General de Finetti: Parameter space $\mathbf{T} = M(\mathcal{X})$
    - Rotation invariance: $\mathbf{T}$ shrinks to $\{N_\sigma \mid \sigma \in \mathbb{R}_+\}$

As invariance under different symmetry
    - Exchangeability = invariance of $P(X_1, X_2, \ldots)$ under group action
    - Freedman: Different group ($O(n)$ rather than $S_\infty$)
    - In these cases: symmetry $\Rightarrow$ decomposition theorem

NON-EXCHANGEABLE DATA

EXCHANGEABILITY: RANDOM GRAPHS

Random graph with independent edges
    Given: $\theta : [0, 1]^2 \to [0, 1]$ symmetric function
    - $U_1, U_2, \ldots \sim \mathrm{Uniform}[0, 1]$
    - Edge $(i, j)$ present: $(i, j) \sim \mathrm{Bernoulli}(\theta(U_i, U_j))$
    Call this distribution $\Gamma(G \in \,\cdot \mid \theta)$.

Theorem (Aldous; Hoover)
    A random (dense) graph $G$ is exchangeable iff
    \[ P(G \in \,\cdot\,) = \int_{\mathbf{T}} \Gamma(G \in \,\cdot \mid \theta) \, Q(d\theta) \]

[Ald81, Hoo79]

[Figure: the function $\theta$ drawn as an image on $[0, 1]^2$ with both axes running from 0 to 1; the value at $(U_1, U_2)$ gives Pr{edge 1, 2}.]

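
The sampling scheme for random graphs with independent edges can be sketched as follows. The function names and the particular symmetric function `product_graphon` are hypothetical choices for illustration; any symmetric function from $[0,1]^2$ to $[0,1]$ could be used in its place.

```python
import random

def sample_exchangeable_graph(n, theta, seed=0):
    """Sample an undirected graph on vertices 0..n-1 given a symmetric function theta.

    Each vertex i receives U_i ~ Uniform[0, 1]; edge (i, j) is then present
    independently with probability theta(U_i, U_j).
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < theta(u[i], u[j]):
                edges.add((i, j))
    return u, edges

def product_graphon(x, y):
    """Hypothetical symmetric function on [0, 1]^2, used only for illustration."""
    return x * y

u, edges = sample_exchangeable_graph(50, product_graphon)
print(len(edges))  # number of edges in the sampled graph
```

Relabeling the vertices only permutes the i.i.d. variables $U_i$, which is why the resulting graph distribution is exchangeable.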
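DE FINETTI: GEOMETRY

Finite case
    \[ P = \sum_{e_i \in E} \nu_i e_i \]
    $E = \{e_1, e_2, e_3\}$; $(\nu_1, \nu_2, \nu_3)$ barycentric coordinates
    [Figure: triangle with vertices $e_1$, $e_2$, $e_3$ and a point $P$ with coordinates $\nu_1$, $\nu_2$, $\nu_3$.]

Infinite/continuous case
    \[ P(\,\cdot\,) = \int_E e(\,\cdot\,) \, d\nu(e) = \int_{\mathbf{T}} k(\theta, \,\cdot\,) \, d\nu_{\mathbf{T}}(\theta) \]
    - $k : \mathbf{T} \to E \subset M(\mathcal{X})$ probability kernel (= conditional probability)
    - $k$ is random measure with values $k(\theta, \,\cdot\,) \in E$
    - de Finetti: $k(\theta, \,\cdot\,) = \prod_{n \in \mathbb{N}} Q(\,\cdot \mid \theta)$ and $\mathbf{T} = M(\mathcal{X})$
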
DECOMPOSITION BY SYMMETRY

Theorem (Varadarajan)
    - $G$ nice group on space $\mathcal{Y}$
    - Call measure $\mu$ ergodic if $\mu(A) \in \{0, 1\}$ for all $G$-invariant sets $A$.
    - $E$ := {ergodic probability measures}
    Then there is a Markov kernel $k : \mathcal{Y} \to E$ s.t.:
    \[ P \in M(\mathcal{Y}) \text{ $G$-invariant} \iff P(A) = \int_{\mathbf{T}} k(\theta, A) \, d\nu(\theta) \]

de Finetti
    - $G = S_\infty$ and $\mathcal{Y} = \mathcal{X}^\infty$
    - $G$-invariant sets = exchangeable events
    - $E$ = factorial distributions ("Hewitt-Savage 0-1 law")

[Var63]

SYMMETRY AND SUFFICIENCY

SUFFICIENT STATISTICS

Problem
    Apparently no direct connection with standard models

Sufficient Statistic
    Functions $S_n$ of data sufficient if:
    - Intuitively: $S_n(X_1, \ldots, X_n)$ contains all information sample provides on parameter
    - Formally:
      \[ P_n(X_1, \ldots, X_n \mid \Theta, S_n) = P_n(X_1, \ldots, X_n \mid S_n) \quad \text{for all } n \]

Sufficiency and symmetry
    - $P$ exchangeable $\Leftrightarrow$ $S_n(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ sufficient
    - $P$ rotatable $\Leftrightarrow$ $S_n(x_1, \ldots, x_n) = \sqrt{\sum_{i=1}^{n} x_i^2} = \|(x_1, \ldots, x_n)\|_2$ sufficient

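
For binary data the empirical measure is determined by the number of ones, so sufficiency of the empirical measure means that the probability of a sequence depends only on that count. The sketch below checks this for a Beta mixture of i.i.d. Bernoulli bits; the Beta mixing measure with integer parameters and the function name `sequence_probability` are assumptions chosen so the arithmetic stays exact.

```python
from fractions import Fraction
from itertools import permutations

def sequence_probability(bits, a=1, b=1):
    """Probability of a binary sequence under a Beta(a, b) mixture of i.i.d. Bernoulli bits.

    Computed by chaining the predictive probabilities
    P(next = 1 | past) = (a + #ones so far) / (a + b + #past observations).
    """
    prob = Fraction(1)
    ones = 0
    for t, bit in enumerate(bits):
        p_one = Fraction(a + ones, a + b + t)
        prob *= p_one if bit == 1 else 1 - p_one
        ones += bit
    return prob

bits = (1, 1, 0, 1, 0)
# Every reordering of the same bits has the same count of ones.
values = {sequence_probability(perm, a=2, b=3) for perm in permutations(bits)}
print(values)  # a single value: the probability depends only on the count
```

Conditional on the count, all orderings are therefore equally likely regardless of the mixing measure, which is the sufficiency property stated above in its simplest form.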
DECOMPOSITION BY SUFFICIENCY

Theorem (Diaconis and Freedman; Lauritzen; several others)
    Given:
    - Sufficient statistic $S_n$ for each $n$
    - $k_n(\,\cdot\,, s_n)$ = conditional probability of $X_1, \ldots, X_n$ given $s_n$

    1. $k_n$ converges to a limit function:
       \[ k_n(\,\cdot\,, S_n(X_1(\omega), \ldots, X_n(\omega))) \xrightarrow{\,n \to \infty\,} k_\infty(\,\cdot\,, \omega) \]
    2. $P(X_1, X_2, \ldots)$ has the decomposition
       \[ P(\,\cdot\,) = \int k_\infty(\,\cdot\,, \omega) \, d\nu(\omega) \]
    3. The model $\mathcal{P} \subset M(\mathcal{X})$ is a convex set with extreme points $k_\infty(\,\cdot\,, \omega)$
    4. The measure $\nu$ is uniquely determined by $P$

    (Theorem statement omits technical conditions.)

EXAMPLES

de Finetti's theorem
    $P$ exchangeable $\Leftrightarrow$ $S_n(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ sufficient

Rotation invariance
    $P$ rotatable $\Leftrightarrow$ $S_n(x_1, \ldots, x_n) = \|(x_1, \ldots, x_n)\|_2$ sufficient

Kingman's theorem
    $\Pi$ exchangeable $\Leftrightarrow$ asymptotic block sizes are sufficient statistic

Exponential families (Küchler and Lauritzen)
    Choose $\mathcal{X} = \mathbb{R}$. Under suitable regularity conditions: $S_n$ additive, i.e.
    \[ S_n(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} S_0(x_i) \]
    if and only if ergodic measures are exponential family.

[KL89]

SUMMARY

Non-exchangeable data
    - Identify invariance principle and its ergodic measures
    - Ergodic measures generalize i.i.d. distributions: likelihood
    - Prior = distribution on ergodic measures

    Random structure                                Theorem of                   Mixtures of...
    Exchangeable sequences                          de Finetti; Hewitt & Savage  product distributions
    Processes with exch. increments                 Bühlmann                     Lévy processes
    Exchangeable partitions                         Kingman                      "paint-box distributions"
    Exchangeable arrays                             Aldous; Hoover; Kallenberg   sampling scheme on $[0, 1]^2$
    Block-exchangeable sequences                    Diaconis & Freedman          Markov chains
    Exchangeable $\mathbb{R}^d$-sequences with      Küchler & Lauritzen          Exponential families
      additive sufficient statistics

REFERENCES

[Ald81] David J. Aldous. Representations for partially exchangeable arrays of random variables. J. Multivariate Anal., 11(4).
[Bü60] H. Bühlmann. Austauschbare stochastische Variabeln und ihre Grenzwertsätze. PhD thesis, University of California Press.
[DF80] P. Diaconis and D. Freedman. Finite exchangeable sequences. The Annals of Probability, 8(4).
[Gho10] S. Ghosal. Dirichlet process, related priors and posterior asymptotics. In N. L. Hjort et al., editors, Bayesian Nonparametrics. Cambridge University Press.
[Hoo79] D. N. Hoover. Relations on probability spaces and arrays of random variables. Technical report, Institute of Advanced Study, Princeton.
[Kal01] O. Kallenberg. Foundations of Modern Probability. Springer, 2nd edition.
[Kin78] J. F. C. Kingman. The representation of partition structures. J. London Math. Soc., 2(18).
[KL89] U. Küchler and S. L. Lauritzen. Exponential families, extreme point models and minimal space-time invariant functions for stochastic processes with stationary and independent increments. Scand. J. Stat., 16.
[KvdV06] B. J. K. Kleijn and A. W. van der Vaart. Misspecification in infinite-dimensional Bayesian statistics. Annals of Statistics, 34(2).
[Sch95] M. J. Schervish. Theory of Statistics. Springer.
[Var63] V. S. Varadarajan. Groups of automorphisms of Borel spaces. Transactions of the American Mathematical Society, 109(2).
More informationLeast Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises
Least Squares Estimators for Stochastic Differential Equations Driven by Small Lévy Noises Hongwei Long* Department of Mathematical Sciences, Florida Atlantic University, Boca Raton Florida 33431-991,
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationLocal-Mass Preserving Prior Distributions for Nonparametric Bayesian Models
Bayesian Analysis (2014) 9, Number 2, pp. 307 330 Local-Mass Preserving Prior Distributions for Nonparametric Bayesian Models Juhee Lee Steven N. MacEachern Yiling Lu Gordon B. Mills Abstract. We address
More informationKernel families of probability measures. Saskatoon, October 21, 2011
Kernel families of probability measures Saskatoon, October 21, 2011 Abstract The talk will compare two families of probability measures: exponential, and Cauchy-Stjelties families. The exponential families
More informationBayesian Nonparametrics: Dirichlet Process
Bayesian Nonparametrics: Dirichlet Process Yee Whye Teh Gatsby Computational Neuroscience Unit, UCL http://www.gatsby.ucl.ac.uk/~ywteh/teaching/npbayes2012 Dirichlet Process Cornerstone of modern Bayesian
More informationHOMOGENEOUS CUT-AND-PASTE PROCESSES
HOMOGENEOUS CUT-AND-PASTE PROCESSES HARRY CRANE Abstract. We characterize the class of exchangeable Feller processes on the space of partitions with a bounded number of blocks. This characterization leads
More informationBayesian Consistency for Markov Models
Bayesian Consistency for Markov Models Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. Stephen G. Walker University of Texas at Austin, USA. Abstract We consider sufficient conditions for
More informationLecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.
Lecture 2 1 Martingales We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. 1.1 Doob s inequality We have the following maximal
More informationStatistica Sinica Preprint No: SS R2
Statistica Sinica Preprint No: SS-2017-0074.R2 Title The semi-parametric Bernstein-von Mises theorem for regression models with symmetric errors Manuscript ID SS-2017-0074.R2 URL http://www.stat.sinica.edu.tw/statistica/
More informationHakone Seminar Recent Developments in Statistics
Hakone Seminar Recent Developments in Statistics November 12-14, 2015 Hotel Green Plaza Hakone: http://www.hgp.co.jp/language/english/sp/ Organizer: Masanobu TANIGUCHI (Research Institute for Science &
More informationSemiparametric posterior limits
Statistics Department, Seoul National University, Korea, 2012 Semiparametric posterior limits for regular and some irregular problems Bas Kleijn, KdV Institute, University of Amsterdam Based on collaborations
More informationThe Essential Equivalence of Pairwise and Mutual Conditional Independence
The Essential Equivalence of Pairwise and Mutual Conditional Independence Peter J. Hammond and Yeneng Sun Probability Theory and Related Fields, forthcoming Abstract For a large collection of random variables,
More informationGentle Introduction to Infinite Gaussian Mixture Modeling
Gentle Introduction to Infinite Gaussian Mixture Modeling with an application in neuroscience By Frank Wood Rasmussen, NIPS 1999 Neuroscience Application: Spike Sorting Important in neuroscience and for
More informationGaussian processes for inference in stochastic differential equations
Gaussian processes for inference in stochastic differential equations Manfred Opper, AI group, TU Berlin November 6, 2017 Manfred Opper, AI group, TU Berlin (TU Berlin) inference in SDE November 6, 2017
More information