Variational Inference with Copula Augmentation

Dustin Tran (1), David M. Blei (2), Edoardo M. Airoldi (1)
(1) Department of Statistics, Harvard University
(2) Department of Statistics & Computer Science, Columbia University

Presented by Shaobo Han, Duke University
September 4, 2015

Outline

1. Introduction
2. Background: Vine Pair Copulas
3. Copula Variational Inference
   - Sampling from the copula-augmented variational distribution
   - Calculating the gradients
4. Experiments
   - Mixture of Gaussians
   - Latent space model

D. Tran et al., 2015, Variational Inference with Copula Augmentation

Introduction

The authors aim to do scalable, generic Bayesian inference: $p(z \mid x) \approx q(z; \lambda)$.

- Mean-field VI is fast but highly biased, underestimates the variance, and is sensitive to local optima (and hyperparameters).
- Structured VI incorporates dependency but requires explicit knowledge of the model and is difficult to construct.
- The proposed approach automatically learns the dependency structure within a black-box framework and generalizes both approaches.

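Variational inference

Variational inference minimizes $\mathrm{KL}(q \,\|\, p)$ by maximizing the ELBO

$$\mathcal{L}(\lambda) = \underbrace{\mathbb{E}_q[\log p(x, z)]}_{\text{energy}} \; \underbrace{-\,\mathbb{E}_q[\log q(z; \lambda)]}_{\text{entropy}} \tag{1}$$

The joint density of any random variable $z = \{z_1, \dots, z_d\} \sim q$ can be factorized as

$$q(z) = \left[\prod_{i=1}^{d} q(z_i)\right] c(Q(z_1), \dots, Q(z_d)) \tag{2}$$

where $Q(z_i)$ is the marginal CDF of $z_i$ and $c$ is a joint density on the unit hypercube, known as the copula.

Bivariate Gaussian copula (distribution function): $C_{\text{Gaussian}}(u_1, u_2; \rho) = \Phi_\rho(\Phi^{-1}(u_1), \Phi^{-1}(u_2))$, where $\Phi^{-1}$ is the standard normal quantile function and $\Phi_\rho$ is the bivariate normal CDF with correlation $\rho$.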
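As a minimal sketch of the factorization in (2), the snippet below evaluates a two-dimensional joint density as the product of two marginals and a bivariate Gaussian copula density. The closed-form density is the standard one for the Gaussian family; the function names `gaussian_copula_density` and `copula_joint_density` and the example marginals are illustrative assumptions, not taken from the slides.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: supplies Phi, Phi^{-1}, and the pdf


def gaussian_copula_density(u1, u2, rho):
    """Density c(u1, u2; rho) of the bivariate Gaussian copula, for |rho| < 1."""
    x1 = STD_NORMAL.inv_cdf(u1)
    x2 = STD_NORMAL.inv_cdf(u2)
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x1 * x1 + x2 * x2) - 2.0 * rho * x1 * x2) / (2.0 * one_minus_r2)
    return math.exp(exponent) / math.sqrt(one_minus_r2)


def copula_joint_density(z1, z2, marginal1, marginal2, rho):
    """Equation (2) for d = 2: product of marginals times the copula density."""
    u1, u2 = marginal1.cdf(z1), marginal2.cdf(z2)
    return marginal1.pdf(z1) * marginal2.pdf(z2) * gaussian_copula_density(u1, u2, rho)


if __name__ == "__main__":
    q1, q2 = NormalDist(0.0, 1.0), NormalDist(2.0, 0.5)    # hypothetical marginals
    print(copula_joint_density(0.3, 1.8, q1, q2, rho=0.0))  # independent (mean-field) case
    print(copula_joint_density(0.3, 1.8, q1, q2, rho=0.6))  # with dependence
```

With `rho = 0` the copula density is identically 1, so the joint density reduces to the product of the marginals.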

Vine copulas

Limitations of standard multivariate copulas:
- They can be inflexible in high dimensions.
- They do not allow for different pairwise dependency structures.

Vine copulas for higher-dimensional data:
- Bivariate copulas are the building blocks, selected from a wide range of (parametric) families.
- The dependency structure is determined by the bivariate copulas and a nested set of trees.

Preliminaries: bivariate copulas

Key basic identities:

Sklar's theorem (1959) [1]:
$$F(x_1, x_2) = C(F_1(x_1), F_2(x_2)) \tag{3}$$

Joint density:
$$f(x_1, x_2) = c_{12}(F_1(x_1), F_2(x_2))\, f_1(x_1)\, f_2(x_2) \tag{4}$$

Conditional density:
$$f(x_2 \mid x_1) = c_{12}(F_1(x_1), F_2(x_2))\, f_2(x_2) \tag{5}$$

Conditional distribution function:
$$F(x_2 \mid x_1) = \frac{\partial C_{12}(F_1(x_1), F_2(x_2))}{\partial F_1(x_1)} \tag{6}$$

[1] A. Sklar, Fonctions de Répartition à n Dimensions et Leurs Marges, 1959
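For the Gaussian pair copula, the conditional distribution function in (6) has a well-known closed form, often called the h-function. The sketch below implements it together with its inverse. The names `h_gaussian` and `h_gaussian_inverse` are illustrative, and restricting to the Gaussian family is an assumption made for this example; other families have their own expressions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def h_gaussian(u2, u1, rho):
    """Conditional CDF F(u2 | u1) under a Gaussian pair copula, as in (6)."""
    x1, x2 = STD_NORMAL.inv_cdf(u1), STD_NORMAL.inv_cdf(u2)
    return STD_NORMAL.cdf((x2 - rho * x1) / math.sqrt(1.0 - rho * rho))


def h_gaussian_inverse(w, u1, rho):
    """Inverse of h_gaussian in its first argument: solves F(u2 | u1) = w for u2."""
    x1, xw = STD_NORMAL.inv_cdf(u1), STD_NORMAL.inv_cdf(w)
    return STD_NORMAL.cdf(xw * math.sqrt(1.0 - rho * rho) + rho * x1)


if __name__ == "__main__":
    w = h_gaussian(0.5, 0.9, rho=0.7)
    print(w)                                   # conditioning on a large u1 shifts mass upward
    print(h_gaussian_inverse(w, 0.9, rho=0.7))  # recovers u2 = 0.5 up to rounding
```

The inverse is the building block for sampling from a vine, since each conditional draw inverts a function of this kind.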

Pair-copula construction (PCC)

Represent a density $f(x_1, \dots, x_d)$ as a product of pair-copula densities and marginal densities.

Example [2]: $d = 3$ dimensions. One possible decomposition is

$$f(x_1, x_2, x_3) = f_1(x_1)\, f_2(x_2)\, f_3(x_3) \cdot c_{12}(F_1(x_1), F_2(x_2)) \cdot c_{23}(F_2(x_2), F_3(x_3)) \cdot c_{13|2}(F_{1|2}(x_1 \mid x_2), F_{3|2}(x_3 \mid x_2))$$

For high-dimensional distributions, the number of possible pair-copula constructions is very large.

[2] N. Krämer & U. Schepsmeier, Introduction to Vine Copulas, NIPS workshop, 2011
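The three-dimensional decomposition can be evaluated directly once a pair-copula family is fixed. The sketch below assumes Gaussian pair copulas and standard normal marginals (both choices are assumptions for illustration, as are the function names); in that special case the construction reproduces a trivariate normal density in which `rho13_2` plays the role of the partial correlation of $x_1$ and $x_3$ given $x_2$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pair_density(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho)."""
    x, y = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    r2 = rho * rho
    return math.exp(-(r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2))) / math.sqrt(1.0 - r2)


def h(u, v, rho):
    """Conditional CDF F(u | v) under a Gaussian pair copula."""
    x, y = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    return STD_NORMAL.cdf((x - rho * y) / math.sqrt(1.0 - rho * rho))


def pcc_density_3d(x1, x2, x3, rho12, rho23, rho13_2):
    """f(x1, x2, x3) = f1 f2 f3 * c12 * c23 * c13|2 with standard normal marginals."""
    u1, u2, u3 = (STD_NORMAL.cdf(x) for x in (x1, x2, x3))
    marginals = STD_NORMAL.pdf(x1) * STD_NORMAL.pdf(x2) * STD_NORMAL.pdf(x3)
    tree1 = pair_density(u1, u2, rho12) * pair_density(u2, u3, rho23)
    # The second tree acts on the conditional CDFs F_{1|2} and F_{3|2}.
    tree2 = pair_density(h(u1, u2, rho12), h(u3, u2, rho23), rho13_2)
    return marginals * tree1 * tree2


if __name__ == "__main__":
    print(pcc_density_3d(0.3, -0.5, 1.1, rho12=0.5, rho23=-0.3, rho13_2=0.4))
```

Swapping `pair_density` and `h` for another family changes the dependence shape without touching the marginals, which is the flexibility the construction is meant to provide.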

Regular vine structure

Bedford and Cooke (2001) [3] introduce graphical models called regular vines (R-vines) to help organize the possible pair-copula constructions.

Regular vine: a regular vine is a sequence of $d - 1$ linked trees where:
- Tree $T_1$ is a tree on nodes 1 to $d$.
- Tree $T_j$ has $d + 1 - j$ nodes and $d - j$ edges.
- Edges in tree $T_j$ become nodes in tree $T_{j+1}$.
- Proximity condition: two nodes in tree $T_{j+1}$ can be joined by an edge only if the corresponding edges in tree $T_j$ share a node.

[3] T. Bedford & R. Cooke, Probability density decomposition for conditionally dependent random variables modeled by vines, 2001
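The four conditions above can be checked mechanically. The following sketch validates a candidate tree sequence; the data layout (0-based node indices, with the nodes of tree $T_{j+1}$ identified by the position of each edge in tree $T_j$) and the function names are assumptions chosen for this example.

```python
def is_tree(num_nodes, edges):
    """True if `edges` (pairs of node indices) form a spanning tree on num_nodes nodes."""
    if len(edges) != num_nodes - 1:
        return False
    parent = list(range(num_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in edges:
        if not (0 <= a < num_nodes and 0 <= b < num_nodes):
            return False
        root_a, root_b = find(a), find(b)
        if root_a == root_b:      # joining already-connected nodes would close a cycle
            return False
        parent[root_a] = root_b
    return True


def is_regular_vine(d, trees):
    """Check the R-vine conditions for `trees`, a list of d - 1 edge lists."""
    if len(trees) != d - 1:
        return False
    num_nodes = d
    for level, edges in enumerate(trees):
        if not is_tree(num_nodes, edges):
            return False
        if level > 0:
            previous = trees[level - 1]
            for a, b in edges:
                # Proximity condition: the two edges of the previous tree share a node.
                if len(set(previous[a]) & set(previous[b])) != 1:
                    return False
        num_nodes = len(edges)    # edges become the nodes of the next tree
    return True


if __name__ == "__main__":
    d_vine = [[(0, 1), (1, 2), (2, 3)], [(0, 1), (1, 2)], [(0, 1)]]
    broken = [[(0, 1), (1, 2), (2, 3)], [(0, 2), (0, 1)], [(0, 1)]]
    print(is_regular_vine(4, d_vine))   # path-shaped trees satisfy every condition
    print(is_regular_vine(4, broken))   # edges (0,1) and (2,3) share no node
```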

Example [4]: density

$$f = f_1 f_2 f_3 f_4 f_5 \cdot c_{14}\, c_{15}\, c_{24}\, c_{34} \cdot c_{12|4}\, c_{13|4}\, c_{45|1} \cdot c\; c\; c$$

Multivariate copula as a product of pair copulas:

$$c(u_1, \dots, u_d; \eta) = \prod_{j=1}^{d-1} \prod_{e(i,k) \in E_j} c_{ik|D(e)} \tag{7}$$

[4] C. Czado & K. Was, Pair-copula constructions - even more flexible than copulas, 2013

Methodology

- $\lambda$: the original parameters (mean-field or structured)
- $\eta$: the augmented parameters (copula)

$$q(z; \lambda, \eta) = \underbrace{\left[\prod_{i=1}^{d} q(z_i; \lambda)\right]}_{\text{mean-field}} \; \underbrace{c(Q(z_1; \lambda), \dots, Q(z_d; \lambda); \eta)}_{\text{copula}} \tag{8}$$

Gradients as expectations:

$$\nabla_{\{\lambda,\eta\}} \mathcal{L} = \mathbb{E}_q\!\left[\nabla_{\{\lambda,\eta\}} \log q(z; \lambda, \eta)\, \big(\log p(x, z) - \log q(z; \lambda, \eta)\big)\right] \tag{9}$$
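The expectation in (9) can be approximated by Monte Carlo. The sketch below does this for a deliberately simple case, a one-dimensional $q(z; \lambda) = \mathcal{N}(\lambda, 1)$ with no copula term, so that the estimate can be compared with a known answer. The toy log joint $-\tfrac{1}{2}(z - 3)^2$ and the function name are assumptions for illustration only; for that toy target the exact gradient is $3 - \lambda$.

```python
import math
import random


def score_function_gradient(lam, log_joint, num_samples, rng):
    """Monte Carlo estimate of (9) for q(z; lam) = Normal(lam, 1), gradient w.r.t. lam."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(lam, 1.0)
        score = z - lam                                            # d/d lam of log q(z; lam)
        log_q = -0.5 * (z - lam) ** 2 - 0.5 * math.log(2.0 * math.pi)
        total += score * (log_joint(z) - log_q)
    return total / num_samples


if __name__ == "__main__":
    rng = random.Random(0)

    def toy_log_joint(z):
        # Hypothetical unnormalized log p(x, z), not a model from the slides.
        return -0.5 * (z - 3.0) ** 2

    print(score_function_gradient(0.0, toy_log_joint, 50_000, rng))  # exact value is 3 - lam
```

The estimator only needs samples from $q$ and evaluations of $\log p(x, z)$ and $\nabla \log q$, which is what makes the approach black box; its variance can be large, so many samples per step are typically used.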

Difficulties:
1. Sampling from $q$
2. Calculating the gradient $\nabla \log q$

Simulation from an R-vine copula model [5]

1. Generate $u = (u_1, \dots, u_d)$ where each $u_i \sim U(0, 1)$.
2. Calculate $v = (v_1, \dots, v_d)$, which follows a joint uniform distribution with dependencies given by the copula:

$$\begin{aligned} v_1 &= u_1 \\ v_2 &= Q^{-1}_{2|1}(u_2 \mid v_1) \\ v_3 &= Q^{-1}_{3|12}(u_3 \mid v_1, v_2) \\ &\;\;\vdots \\ v_d &= Q^{-1}_{d|12\cdots d-1}(u_d \mid v_1, v_2, \dots, v_{d-1}) \end{aligned}$$

3. Calculate $z = (Q_1^{-1}(v_1), \dots, Q_d^{-1}(v_d))$, which is a sample from the copula-augmented distribution $q(z; \lambda, \eta)$.

The conditional inverses are computed with a recursive approach; refer to [5] for more details.

[5] J. Dissmann, Statistical Inference for Regular Vines and Application, 2010
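The three steps can be sketched for the smallest case, $d = 2$, where only one conditional inverse is needed. The example assumes a Gaussian pair copula, for which $Q^{-1}_{2|1}$ has a closed form; the function names, the clipping constant, and the example marginals are illustrative assumptions. The general recursive algorithm for larger vines is the one described in [5] and is not reproduced here.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
EPS = 1e-12  # keeps uniforms strictly inside (0, 1) so the quantile functions are defined


def clip(u):
    return min(max(u, EPS), 1.0 - EPS)


def h_inverse(w, v1, rho):
    """Inverse conditional CDF Q^{-1}_{2|1}(w | v1) for a Gaussian pair copula."""
    x = STD_NORMAL.inv_cdf(w) * math.sqrt(1.0 - rho * rho) + rho * STD_NORMAL.inv_cdf(v1)
    return clip(STD_NORMAL.cdf(x))


def sample_copula_augmented(marginal1, marginal2, rho, rng):
    """One draw z = (z1, z2) following steps 1 to 3 for d = 2."""
    # Step 1: independent uniforms.
    u1, u2 = clip(rng.random()), clip(rng.random())
    # Step 2: impose the copula dependence.
    v1 = u1
    v2 = h_inverse(u2, v1, rho)
    # Step 3: map through the inverse marginal CDFs.
    return marginal1.inv_cdf(v1), marginal2.inv_cdf(v2)


if __name__ == "__main__":
    rng = random.Random(1)
    q1, q2 = NormalDist(0.0, 1.0), NormalDist(2.0, 0.5)   # hypothetical marginals
    print([sample_copula_augmented(q1, q2, 0.7, rng) for _ in range(3)])
```

With Gaussian marginals and a Gaussian copula the draws are bivariate normal with correlation `rho`, which gives a simple way to sanity-check the sampler.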

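Calculating the gradients

$$\nabla_{\{\lambda,\eta\}} \log q(z; \lambda, \eta) = \begin{bmatrix} \nabla_\lambda \sum_{i=1}^{d} \log q(z_i; \lambda_i) + \nabla_\lambda \log c(Q(z_1; \lambda), \dots, Q(z_d; \lambda); \eta) \\[4pt] \nabla_\eta \log c(Q(z_1; \lambda), \dots, Q(z_d; \lambda); \eta) \end{bmatrix} \tag{10}$$

For the original parameters, the chain rule gives

$$\begin{aligned} \nabla_{\lambda_i} \log q(z; \lambda, \eta) &= \nabla_{\lambda_i} \log q(z_i; \lambda_i) + \nabla_{Q(z_i; \lambda_i)} \log c(Q(z_1; \lambda), \dots, Q(z_d; \lambda); \eta)\; \nabla_{\lambda_i} Q(z_i; \lambda_i) \\ &= \nabla_{\lambda_i} \log q(z_i; \lambda_i) + \nabla_{\lambda_i} Q(z_i; \lambda_i) \sum_{j=1}^{d-1} \; \sum_{e(k,l) \in E_j :\, i \in C(e)} \nabla_{Q(z_i; \lambda_i)} \log c_{kl|D(e)} \end{aligned}$$

For the copula parameters,

$$\nabla_{\eta_i} \log c(Q(z_1; \lambda), \dots, Q(z_d; \lambda); \eta) = \sum_{j=1}^{d-1} \; \sum_{e(k,l) \in E_j :\, \eta_i \in \{C(e), D(e)\}} \nabla_{\eta_i} \log c_{kl|D(e)}$$

Here $e$ is an edge, $C(e)$ is its conditioned set, and $D(e)$ is its conditioning set.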
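Each summand $\nabla_{\eta_i} \log c_{kl|D(e)}$ is the derivative of a single pair-copula log density with respect to its own parameter. As a worked instance, the sketch below gives the hand-derived derivative for a Gaussian pair copula with parameter $\rho$ and compares it with a finite difference. The Gaussian family, the function names, and the evaluation point are assumptions for illustration; an automatic differentiation tool would produce the same quantity without the manual derivation.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_pair_density(u, v, rho):
    """log c(u, v; rho) for the bivariate Gaussian copula."""
    x, y = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    r2 = rho * rho
    return -0.5 * math.log(1.0 - r2) - (r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2))


def grad_rho_log_pair_density(u, v, rho):
    """Analytic derivative of log c(u, v; rho) with respect to rho."""
    x, y = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    r2 = rho * rho
    return rho / (1.0 - r2) + ((1.0 + r2) * x * y - rho * (x * x + y * y)) / (1.0 - r2) ** 2


if __name__ == "__main__":
    u, v, rho, step = 0.3, 0.8, 0.4, 1e-6
    numeric = (log_pair_density(u, v, rho + step) - log_pair_density(u, v, rho - step)) / (2.0 * step)
    print(grad_rho_log_pair_density(u, v, rho), numeric)  # the two values should agree closely
```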

Implementation

- Automatic differentiation tools [6]
- Variance reduction: number of samples m = 1024
- ADAM [7]: an adaptive learning-rate schedule that combines ideas from AdaGrad and RMSprop

[6] Stan Development Team, Stan: A C++ library for probability and sampling, 2014
[7] D. Kingma and J. Lei Ba, Adam: A method for stochastic optimization, ICLR, 2015
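For reference, the Adam update from [7] can be written in a few lines. The sketch below applies it as gradient ascent to a scalar parameter, using the default decay rates from the Adam paper; the step size, the number of steps, and the toy gradient $3 - \lambda$ are assumptions for illustration and are not settings reported on the slides.

```python
import math


def adam_ascent(grad, theta0, steps, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Maximize an objective with Adam, given a (possibly noisy) gradient function."""
    theta, m, v = theta0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g          # running mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g      # running mean of squared gradients
        m_hat = m / (1.0 - beta1 ** t)             # bias corrections for the zero init
        v_hat = v / (1.0 - beta2 ** t)
        theta += lr * m_hat / (math.sqrt(v_hat) + eps)   # ascent step with per-parameter scaling
    return theta


if __name__ == "__main__":
    # Toy objective -(lam - 3)^2 / 2, whose gradient is 3 - lam (hypothetical example).
    print(adam_ascent(lambda lam: 3.0 - lam, theta0=0.0, steps=2000))
```

In a stochastic setting `grad` would return a Monte Carlo gradient estimate rather than an exact derivative.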

Mixture of Gaussians

A classic example that stresses the difficulty of modeling dependency.

Latent space model

Dependency in the latent variables is crucial, and the mean-field approximation provides arbitrarily bad estimates.

$$z_n \sim \mathcal{N}(\mu, \Lambda^{-1}), \qquad \operatorname{logit}(p) = \theta - |z_i - z_j| \tag{11}$$
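To make the generative process concrete, the sketch below simulates a small network from a latent space model. It assumes the distance form $\operatorname{logit}(p) = \theta - |z_i - z_j|$ for the edge probability, scalar latent positions, and arbitrary parameter values; all three are assumptions for illustration, as is the function name.

```python
import math
import random


def simulate_latent_space_network(num_nodes, mu, precision, theta, rng):
    """Draw latent positions and a symmetric binary adjacency matrix."""
    sigma = 1.0 / math.sqrt(precision)               # Lambda^{-1} is the variance
    z = [rng.gauss(mu, sigma) for _ in range(num_nodes)]
    adjacency = [[0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            logit = theta - abs(z[i] - z[j])          # assumed distance form of the link
            p = 1.0 / (1.0 + math.exp(-logit))
            adjacency[i][j] = adjacency[j][i] = int(rng.random() < p)
    return z, adjacency


if __name__ == "__main__":
    rng = random.Random(0)
    positions, edges = simulate_latent_space_network(5, mu=0.0, precision=1.0, theta=1.0, rng=rng)
    print(positions)
    for row in edges:
        print(row)
```

Every edge depends on a pair of latent positions, so the posterior over positions is strongly coupled, which is the kind of dependence a fully factorized approximation cannot represent.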


More information

Copula based Probabilistic Measures of Uncertainty with Applications

Copula based Probabilistic Measures of Uncertainty with Applications Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS057) p.5292 Copula based Probabilistic Measures of Uncertainty with Applications Kumar, Pranesh University of Northern

More information

ICML Scalable Bayesian Inference on Point processes. with Gaussian Processes. Yves-Laurent Kom Samo & Stephen Roberts

ICML Scalable Bayesian Inference on Point processes. with Gaussian Processes. Yves-Laurent Kom Samo & Stephen Roberts ICML 2015 Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes Machine Learning Research Group and Oxford-Man Institute University of Oxford July 8, 2015 Point Processes

More information

Lecture 4: Probabilistic Learning

Lecture 4: Probabilistic Learning DD2431 Autumn, 2015 1 Maximum Likelihood Methods Maximum A Posteriori Methods Bayesian methods 2 Classification vs Clustering Heuristic Example: K-means Expectation Maximization 3 Maximum Likelihood Methods

More information

Pattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM

Pattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM Pattern Recognition and Machine Learning Chapter 9: Mixture Models and EM Thomas Mensink Jakob Verbeek October 11, 27 Le Menu 9.1 K-means clustering Getting the idea with a simple example 9.2 Mixtures

More information

Variational Inference. Sargur Srihari

Variational Inference. Sargur Srihari Variational Inference Sargur srihari@cedar.buffalo.edu 1 Plan of Discussion Functionals Calculus of Variations Maximizing a Functional Finding Approximation to a Posterior Minimizing K-L divergence Factorized

More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

STA414/2104. Lecture 11: Gaussian Processes. Department of Statistics

STA414/2104. Lecture 11: Gaussian Processes. Department of Statistics STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations

More information

Chapter 1. Bayesian Inference for D-vines: Estimation and Model Selection

Chapter 1. Bayesian Inference for D-vines: Estimation and Model Selection Chapter 1 Bayesian Inference for D-vines: Estimation and Model Selection Claudia Czado and Aleksey Min Technische Universität München, Zentrum Mathematik, Boltzmannstr. 3, 85747 Garching, Germany cczado@ma.tum.de

More information

Sparse Stochastic Inference for Latent Dirichlet Allocation

Sparse Stochastic Inference for Latent Dirichlet Allocation Sparse Stochastic Inference for Latent Dirichlet Allocation David Mimno 1, Matthew D. Hoffman 2, David M. Blei 1 1 Dept. of Computer Science, Princeton U. 2 Dept. of Statistics, Columbia U. Presentation

More information

Nonparameteric Regression:

Nonparameteric Regression: Nonparameteric Regression: Nadaraya-Watson Kernel Regression & Gaussian Process Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro,

More information

Probabilistic and Bayesian Machine Learning

Probabilistic and Bayesian Machine Learning Probabilistic and Bayesian Machine Learning Day 4: Expectation and Belief Propagation Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London http://www.gatsby.ucl.ac.uk/

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture 11 CRFs, Exponential Family CS/CNS/EE 155 Andreas Krause Announcements Homework 2 due today Project milestones due next Monday (Nov 9) About half the work should

More information

Undirected Graphical Models

Undirected Graphical Models Undirected Graphical Models 1 Conditional Independence Graphs Let G = (V, E) be an undirected graph with vertex set V and edge set E, and let A, B, and C be subsets of vertices. We say that C separates

More information

Bayesian Learning and Inference in Recurrent Switching Linear Dynamical Systems

Bayesian Learning and Inference in Recurrent Switching Linear Dynamical Systems Bayesian Learning and Inference in Recurrent Switching Linear Dynamical Systems Scott W. Linderman Matthew J. Johnson Andrew C. Miller Columbia University Harvard and Google Brain Harvard University Ryan

More information

arxiv: v1 [cs.ne] 19 Oct 2012

arxiv: v1 [cs.ne] 19 Oct 2012 Modeling with Copulas and Vines in Estimation of Distribution Algorithms arxiv:121.55v1 [cs.ne] 19 Oct 212 Marta Soto Institute of Cybernetics, Mathematics and Physics, Cuba. Email: mrosa@icimaf.cu Yasser

More information

Latent Variable Models

Latent Variable Models Latent Variable Models Stefano Ermon, Aditya Grover Stanford University Lecture 5 Stefano Ermon, Aditya Grover (AI Lab) Deep Generative Models Lecture 5 1 / 31 Recap of last lecture 1 Autoregressive models:

More information

Variational Principal Components

Variational Principal Components Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

Lecture 21: Spectral Learning for Graphical Models

Lecture 21: Spectral Learning for Graphical Models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 Lecture 21: Spectral Learning for Graphical Models Lecturer: Eric P. Xing Scribes: Maruan Al-Shedivat, Wei-Cheng Chang, Frederick Liu 1 Motivation

More information

Approximate Bayesian inference

Approximate Bayesian inference Approximate Bayesian inference Variational and Monte Carlo methods Christian A. Naesseth 1 Exchange rate data 0 20 40 60 80 100 120 Month Image data 2 1 Bayesian inference 2 Variational inference 3 Stochastic

More information

Variational Message Passing. By John Winn, Christopher M. Bishop Presented by Andy Miller

Variational Message Passing. By John Winn, Christopher M. Bishop Presented by Andy Miller Variational Message Passing By John Winn, Christopher M. Bishop Presented by Andy Miller Overview Background Variational Inference Conjugate-Exponential Models Variational Message Passing Messages Univariate

More information

Improved Bayesian Compression

Improved Bayesian Compression Improved Bayesian Compression Marco Federici University of Amsterdam marco.federici@student.uva.nl Karen Ullrich University of Amsterdam karen.ullrich@uva.nl Max Welling University of Amsterdam Canadian

More information