An Introduction to Reversible Jump MCMC for Bayesian Networks, with Application


CleverSet, Inc. STARMAP/DAMARS Conference

The research described in this presentation has been funded by the U.S. Environmental Protection Agency through the STAR Cooperative Agreement #CR82-9095 to Colorado State University. It has not been subjected to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

Introduction: Bayesian Networks
Bayesian Networks (BNs) are a method for representing complex multivariate probability distributions.
Graphical structure: a directed acyclic graph (DAG). Each node represents a variable, and an edge represents an association between two variables.
Parametric model: multivariate Gaussian. The data are independent and multivariate Gaussian, the parameters are given multivariate Gaussian priors, and the distribution of each node (variable) given its parents is a multiple regression.
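
To make the parametric model concrete, here is a minimal sketch of a linear-Gaussian BN in Python. The DAG, variable names, and coefficients are invented for illustration; they are not from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A linear-Gaussian Bayesian network: each node is a linear regression
# on its parents plus Gaussian noise.  Illustrative DAG:
#   A -> C, B -> C, C -> D
n = 1000
A = rng.normal(size=n)
B = rng.normal(size=n)
C = 0.8 * A - 0.5 * B + rng.normal(scale=0.5, size=n)   # C | A, B
D = 1.2 * C + rng.normal(scale=0.5, size=n)             # D | C

# The joint density factorizes along the DAG:
#   p(A, B, C, D) = p(A) p(B) p(C | A, B) p(D | C)
data = np.column_stack([A, B, C, D])
print(np.corrcoef(data, rowvar=False).round(2))
```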

Introduction: Bayesian Networks
[Figure: example DAG with four nodes A, B, C, and D.]
Important point: despite the arrows, BNs represent a probability distribution and do not imply causation, only association.

Introduction: Reversible Jump Markov chain Monte Carlo
MCMC is a method for drawing samples from a probability distribution that cannot be sampled directly. MCMC is often used for model selection, especially in very large model spaces.
RJMCMC is a type of MCMC whose moves are allowed to change the dimension of the probability distribution being simulated. RJMCMC can therefore be used for model selection in cases where dimensionality varies across models, such as ARIMA time series models and Gaussian mixtures. The accept/reject mechanism underlying it is sketched below.
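
RJMCMC is built on the ordinary Metropolis-Hastings step, so a fixed-dimension sampler is a useful reference point. A minimal sketch, with a standard-normal target and random-walk proposal chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_target(x):
    # Unnormalized log density we want to sample from
    # (a standard normal, purely for illustration).
    return -0.5 * x**2

x = 0.0
samples = []
for _ in range(10_000):
    x_new = x + rng.normal()                 # symmetric random-walk proposal
    log_alpha = log_target(x_new) - log_target(x)
    if np.log(rng.uniform()) < log_alpha:    # Metropolis accept/reject
        x = x_new
    samples.append(x)

print(np.mean(samples), np.std(samples))     # approximately 0 and 1
```

The reversible-jump extension adds dimension-changing proposals (for BNs, edge additions and deletions) together with a Jacobian correction term in the acceptance ratio.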

RJMCMC for BNs
RJMCMC is well suited to searching for BNs because:
- A change in the graphical structure of a BN changes the number of parameters.
- The number of possible BN structures grows super-exponentially in the number of variables, as the sketch below makes concrete.
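
The super-exponential growth can be computed exactly with Robinson's recurrence for the number of labeled DAGs on n nodes; a short sketch:

```python
from math import comb

def num_dags(n, _cache={0: 1}):
    # Robinson's recurrence for the number of labeled DAGs on n nodes:
    #   a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a(n-k)
    if n not in _cache:
        _cache[n] = sum((-1) ** (k + 1) * comb(n, k)
                        * 2 ** (k * (n - k)) * num_dags(n - k)
                        for k in range(1, n + 1))
    return _cache[n]

for n in range(1, 7):
    print(n, num_dags(n))   # 1, 3, 25, 543, 29281, 3781503
```

Already at six variables (the size of the example below) there are 3,781,503 possible structures, so exhaustive enumeration quickly becomes infeasible.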

RJMCMC for BNs
How it works: RJMCMC performs a random walk over the space of possible model structures, changing one edge at a time (structural learning). At each step of the walk, all of the model parameters are updated (parameter learning). At the end, you have the list of model structures visited at each step and their corresponding sets of parameters. A simplified sketch of the structural part of the walk follows.
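
The structural walk can be sketched as a Metropolis search over adjacency matrices. This is a simplified stand-in, not the presenters' implementation: the score below is a BIC approximation rather than the Bayesian marginal likelihood, and the full method would also interleave a parameter-resampling sweep and the reversible-jump correction terms.

```python
import numpy as np

rng = np.random.default_rng(2)

def is_dag(adj):
    # Kahn-style check: a graph is acyclic iff it can be fully peeled
    # by repeatedly deleting nodes that have no incoming edges.
    adj = adj.copy()
    while adj.any():
        sources = ~adj.any(axis=0) & adj.any(axis=1)
        if not sources.any():
            return False
        adj[sources, :] = False
    return True

def log_score(adj, data):
    # BIC stand-in for the log marginal likelihood of a structure:
    # each node is scored as a linear regression on its parents.
    n, d = data.shape
    total = 0.0
    for j in range(d):
        parents = np.flatnonzero(adj[:, j])
        X = np.column_stack([np.ones(n), data[:, parents]])
        beta = np.linalg.lstsq(X, data[:, j], rcond=None)[0]
        resid = data[:, j] - X @ beta
        total += -0.5 * n * np.log(resid @ resid / n) \
                 - 0.5 * X.shape[1] * np.log(n)
    return total

def structure_walk(data, n_steps=2000):
    d = data.shape[1]
    adj = np.zeros((d, d), dtype=bool)   # start from the empty graph
    visited = []
    for _ in range(n_steps):
        i, j = rng.choice(d, size=2, replace=False)
        prop = adj.copy()
        prop[i, j] = ~prop[i, j]         # structural move: flip one edge
        if is_dag(prop) and \
           np.log(rng.uniform()) < log_score(prop, data) - log_score(adj, data):
            adj = prop
        # (The full method would resample each node's regression
        # coefficients given its current parent set here.)
        visited.append(adj.copy())
    return visited
```

The list of visited structures is what the posterior-probability summaries on the following slides are computed from.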

Example: MAHA-MAIA
A subset of numerical data taken from the MAHA-MAIA study. Six variables were chosen with the help of a domain expert:
- Insect IBI: Insect Index of Biotic Integrity
- Sediment disturbance: a log-scale metric of excess grain-size diameter
- Environmental disturbance: combined percentages of disturbance (urban, agricultural, and mining)
- pH
- Natural logarithm of nutrients (the maximum of either N or P)
- Natural logarithm of the (slightly translated) Acid Neutralizing Capacity (ANC)

Example: MAHA-MAIA. Implementation notes
To avoid numerical issues, the variables were standardized. Three graphical structures are presented:
- the structure with the maximum posterior probability;
- the average model, which gives a sense of the likelihood of an association between each pair of variables (see the sketch below);
- a reference model obtained from the TETRAD IV package.
Plus and minus signs, derived from the learned parameters, denote the type of quantitative association between variables.
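
Both the maximum posterior probability structure and the average model can be read off the sampler output. A minimal sketch, assuming `visited` is a list of boolean adjacency matrices such as the hypothetical `structure_walk` above returns:

```python
from collections import Counter
import numpy as np

def edge_probabilities(visited, burn_in=500):
    # Posterior probability of each directed edge, estimated as the
    # fraction of post-burn-in samples in which that edge appears.
    chain = np.asarray(visited[burn_in:], dtype=float)
    return chain.mean(axis=0)

def map_structure(visited, burn_in=500):
    # The structure visited most often estimates the maximum posterior
    # probability structure; its visit frequency estimates that probability.
    kept = visited[burn_in:]
    counts = Counter(adj.tobytes() for adj in kept)
    key, n = counts.most_common(1)[0]
    return np.frombuffer(key, dtype=bool).reshape(kept[0].shape), n / len(kept)

# Usage sketch: probs = edge_probabilities(visited); edges with
# probs[i, j] < 0.3 would be omitted from the displayed average model.
```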

Example: MAHA-MAIA. Maximum posterior probability structure
[Figure: maximum posterior probability DAG over Disturbance, Nutrients, Sediment, Bug_IBI, ANC, and pH; edges labeled + or - by the sign of the association.]

Example: MAHA-MAIA. Posterior probabilities of all visited structures
[Figure: posterior probability of each of the roughly 500 visited structures; maximum posterior probability = 0.304.]

Example: MAHA-MAIA. Average structure
Note: edges with posterior probability < 0.3 are not shown.
[Figure: average structure over Disturbance, Nutrients, Sediment, Bug_IBI, pH, and ANC; the displayed edge posterior probabilities range from 0.715 to 1.]

Example: MAHA-MAIA. Structure discovered by TETRAD IV
Note: the undirected edge on the right can be oriented in either direction.
[Figure: TETRAD IV structure over Disturbance, Nutrients, Sediment, Bug_IBI, ANC, and pH.]

Example: MAHA-MAIA. Discussion
The plus and minus signs seem reasonable, and the structures agree, but the directions of the edges from Bug IBI seem to preclude a causal interpretation. Recall that BNs encode a joint probability distribution and don't necessarily suggest a causal relation. So we try again, removing from consideration any model that includes edges emanating from Bug IBI (one way to encode that restriction is sketched below).
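
One simple way to impose the restriction is to give zero structure-prior probability to any graph with an edge leaving Bug IBI, i.e., to reject such proposals outright. A hypothetical hook for the `structure_walk` sketch above:

```python
BUG = 3  # illustrative column index of Bug_IBI among the six variables

def allowed(adj):
    # True iff no edge emanates from Bug_IBI.
    return not adj[BUG, :].any()

# Inside the walk, the acceptance test becomes:
#   if is_dag(prop) and allowed(prop) and ...
```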

Example: MAHA-MAIA (2). Maximum posterior probability structure
[Figure: maximum posterior probability DAG after excluding edges out of Bug_IBI, over Nutrients, ANC, Disturbance, Sediment, pH, and Bug_IBI; edges labeled + or - by the sign of the association.]

Example: MAHA-MAIA (2). Posterior probabilities of all visited structures
[Figure: posterior probability of each of the roughly 1000 visited structures; maximum posterior probability = 0.042.]

Example: MAHA-MAIA (2). Average structure
Note: edges with posterior probability < 0.3 are not shown.
[Figure: average structure; each edge is annotated with its posterior probability and sign, with probabilities ranging from 0.337 to 1.]

Example: MAHA-MAIA (2). Structure discovered by TETRAD IV
[Figure: TETRAD IV structure over Disturbance, Sediment, Nutrients, Bug_IBI, ANC, and pH.]

Conclusion
Finding an optimal BN that fits the data is known to be a very hard problem (NP-hard, in fact), making heuristic algorithms a necessity. Though many of these algorithms are much faster than RJMCMC, increases in computing power and refinements to the implementation are making its lack of speed less of an issue. Furthermore, other methods lack the unique strengths of RJMCMC:
- posterior edge probabilities give a measure of the likelihood of each association;
- structure discovery and parameter learning are combined in a single procedure;
- it is a fully Bayesian solution.