Machine Learning in Simple Networks. Lars Kai Hansen


1 Machine Learning in Simple Networks. Lars Kai Hansen

2 Outline
Communities and link prediction
Modularity
Modularity as a combinatorial optimization problem
Gibbs sampling
Detection threshold: a phase transition?
Learning community parameters
The Hofman-Wiggins generative model
Is there a threshold for detection when you learn the parameters and complexity?

3 Muzeeker: Wikipedia-based common sense
Wikipedia is used as a proxy for the music user's mental model. Implementation: filter retrieval using Wikipedia's articles/categories (Muzeeker.com). Link prediction is used to complete the ontological quality of Wikipedia.

4 Network models
Nodes/vertices and links/edges
Directed / undirected
Weighted / unweighted
Link distributions: random, long tail, hubs and authorities
Link-induced correlations: the rich club, communities
Link prediction

5 Motivation for community detection
Community structure may mark a non-stationary link distribution with high- and low-density sub-networks; hence summarizing with a single model could be misleading.

6 Modularity can be predictive for dynamics
M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69, 026113 (2004).

7 Modularity objective function
The modularity is expressed as a sum over links, such that we penalize missing links in communities; "missing" is measured relative to a null distribution $P^0_{ij}$:
$$Q = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P^0_{ij} \right] \delta(c_i, c_j)$$
Here $c_i$ is the community assignment of node $i$, $2m = \sum_{ij} A_{ij}$, and $k_i = \sum_j A_{ij}$. The null is the baseline distribution $P^0_{ij} = k_i k_j / (2m)^2$. The value of the modularity lies in the range $[-1, 1]$. It is positive if the number of edges within groups exceeds the number expected on the basis of chance.
M.E.J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E, 69:026113, 2004, cond-mat/
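
To make the objective concrete, here is a minimal sketch that evaluates the modularity with the null $k_i k_j/(2m)^2$ on a toy graph. The graph (two triangles joined by one link) and the function name `modularity` are illustrative assumptions, not taken from the slides; plain Python lists keep the example dependency-free.

```python
def modularity(A, c):
    """Q = sum_ij [A_ij/(2m) - k_i k_j/(2m)^2] * delta(c_i, c_j)."""
    n = len(A)
    k = [sum(row) for row in A]   # degrees k_i = sum_j A_ij
    two_m = sum(k)                # 2m = sum_ij A_ij
    q = 0.0
    for i in range(n):
        for j in range(n):
            if c[i] == c[j]:      # delta(c_i, c_j)
                q += A[i][j] / two_m - k[i] * k[j] / two_m ** 2
    return q


# Hypothetical toy graph: two triangles joined by the single link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
print(q_split, q_single)
```

Splitting along the bridge gives a positive modularity, while lumping all nodes into one community gives zero, since the observed and expected link fractions then both sum to one.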

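8 Potts representation
Introduce 0/1 binary variables $S_{kj}$ coding the community assignment: $S_{kj} = 1$ if node $j$ is a member of community $k$, so that $\delta(c_i, c_j) = \sum_k S_{ki} S_{kj}$. With $B_{ij} = A_{ij} - k_i k_j / (2m)$:
$$Q = \sum_{ij} \left[ \frac{A_{ij}}{2m} - \frac{k_i k_j}{(2m)^2} \right] \delta(c_i, c_j) = \sum_{ij} \left[ \frac{A_{ij}}{2m} - \frac{k_i k_j}{(2m)^2} \right] \sum_k S_{ki} S_{kj} = \frac{1}{2m} \sum_k \sum_{ij} B_{ij} S_{ki} S_{kj} = \frac{\mathrm{Tr}(S B S')}{2m}$$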
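
The identity between the delta form and the trace form can be checked numerically. The sketch below assumes $S$ is a K x N one-hot matrix and $B_{ij} = A_{ij} - k_i k_j/(2m)$; the helper names and the toy graph are hypothetical.

```python
def one_hot(c, K):
    """K x N matrix S with S[k][j] = 1 if node j is in community k."""
    return [[1 if c[j] == k else 0 for j in range(len(c))] for k in range(K)]


def modularity_matrix(A):
    """B_ij = A_ij - k_i k_j / (2m), returned together with 2m."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    B = [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]
    return B, two_m


def trace_modularity(A, S):
    """Q = Tr(S B S') / (2m), summing S_ki B_ij S_kj over k, i, j."""
    B, two_m = modularity_matrix(A)
    n = len(A)
    total = 0.0
    for row in S:  # one community k at a time
        total += sum(row[i] * B[i][j] * row[j]
                     for i in range(n) for j in range(n))
    return total / two_m


def delta_modularity(A, c):
    """Q written with delta(c_i, c_j), for comparison."""
    B, two_m = modularity_matrix(A)
    n = len(A)
    return sum(B[i][j] for i in range(n) for j in range(n)
               if c[i] == c[j]) / two_m


# Hypothetical toy graph: two triangles joined by the link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

c = [0, 0, 0, 1, 1, 1]
q_trace = trace_modularity(A, one_hot(c, 2))
q_delta = delta_modularity(A, c)
print(q_trace, q_delta)
```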

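9 Spectral optimization
Newman relaxes the optimization problem to the simplex:
$$Q = \frac{1}{2m} \sum_k \sum_{ij} B_{ij} S_{ki} S_{kj} = \frac{\mathrm{Tr}(S B S')}{2m}$$
$$L = \frac{\mathrm{Tr}(S B S')}{2m} + \mathrm{Tr}(\Lambda S)$$
Stationarity of $L$ leads to an eigenvalue problem for $B$ (with $S'$ the $N \times K$ transpose of $S$):
$$B S' = S' \Lambda$$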
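
One common way to use the eigenvalue problem is the two-community case: take the leading eigenvector of $B$ and split nodes by the sign of their entries. The sketch below assumes that rounding rule and uses a shifted power iteration instead of a linear-algebra library; all function names and the toy graph are hypothetical, and this is not claimed to be the exact procedure from the talk.

```python
import math


def modularity_matrix(A):
    """B_ij = A_ij - k_i k_j / (2m)."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    return [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]


def leading_eigenvector(B, iters=500):
    """Power iteration on B + shift*I; the shift (a Gershgorin bound)
    makes every eigenvalue non-negative, so the iteration converges to
    the eigenvector of the largest eigenvalue of B."""
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [1.0] + [0.0] * (n - 1)  # fixed start vector, for reproducibility
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def spectral_bisection(A):
    """Assign community 0 or 1 by the sign of the leading eigenvector."""
    v = leading_eigenvector(modularity_matrix(A))
    return [0 if x >= 0 else 1 for x in v]


# Hypothetical toy graph: two triangles joined by the link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

labels = spectral_bisection(A)
print(labels)
```

The fixed start vector can in principle be orthogonal to the leading eigenvector on other graphs; a random start would be the safer choice in general.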

10 Combinatorial optimization
We can use a physics analogy: simulated annealing (Kirkpatrick et al. 1983).
$$P(S \mid A, T) \propto \exp\left( \frac{Q(S)}{T} \right) = \exp\left( \frac{\mathrm{Tr}(S B S')}{2mT} \right)$$
Gibbs sampling is a Monte Carlo realization of a Markov process in which each variable is randomly assigned according to its conditional distribution given all the other variables:
$$P(S_j \mid S_{\setminus j}, A, T) = \frac{P(S \mid A, T)}{\sum_{S_j} P(S \mid A, T)}$$
S. Geman, D. Geman, "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images". IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (6) (1984).

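11 Potts model, 1 node
Discrete probability distribution on states $k = 1, \ldots, K$:
$$P(S \mid A, T) \propto \exp\left( \frac{1}{T} \sum_{k=1}^{K} \varphi_k S_k \right), \qquad P(S_k = 1 \mid A, T) = r_k = \frac{\exp(\varphi_k / T)}{\sum_{k'=1}^{K} \exp(\varphi_{k'} / T)}$$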

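12 Gibbs sampling
$$\varphi_{ki} = \sum_j \frac{B_{ij}}{2m} S_{kj} = \sum_j \left[ \frac{A_{ij}}{2m} - \frac{k_i k_j}{(2m)^2} \right] S_{kj}$$
$$r_{ki} = \frac{\exp(\varphi_{ki} / T)}{\sum_{k'} \exp(\varphi_{k'i} / T)}$$
$$S_i = \mathrm{potts}(r_i)$$
That is, the community of node $i$ is drawn from the discrete distribution $r_i$.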
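
A single Gibbs sweep over all nodes could look like the following sketch. It follows the field $\varphi_{ki} = \sum_j (B_{ij}/2m) S_{kj}$ from the slide, with two assumptions of this example: the self term $j = i$ is left out (it is the same for every community once node $i$ is assigned, so it cancels in the softmax), and the slide's field differs from the exact conditional of $\exp(\mathrm{Tr}(SBS')/(2mT))$ by a factor of 2, which can be absorbed into $T$. Function and variable names are hypothetical.

```python
import math
import random


def gibbs_sweep(A, labels, K, T, rng):
    """Resample the community of every node once, in place.

    labels[i] is the community index of node i (the one-hot column S_i).
    """
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    for i in range(n):
        # phi[c] = sum_{j != i} (B_ij / 2m) * S_cj, with B_ij = A_ij - k_i k_j / 2m
        phi = [0.0] * K
        for j in range(n):
            if j != i:
                phi[labels[j]] += (A[i][j] - k[i] * k[j] / two_m) / two_m
        # Softmax r_c = exp(phi_c / T) / sum_c' exp(phi_c' / T); subtracting
        # the maximum avoids overflow at low temperature.
        top = max(phi)
        weights = [math.exp((p - top) / T) for p in phi]
        labels[i] = rng.choices(range(K), weights=weights)[0]
    return labels


# Hypothetical toy graph: two triangles joined by the link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

rng = random.Random(0)
labels = [rng.randrange(2) for _ in range(6)]
for _ in range(100):          # T = 0.05 is an arbitrary illustrative value
    gibbs_sweep(A, labels, K=2, T=0.05, rng=rng)
print(labels)
```

At a fixed temperature the sweep produces samples rather than a single optimum, so the printed assignment varies with the seed and with `T`.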

13 Deterministic annealing
Instead of drawing Gibbs samples according to the marginals we can average; this provides a set of self-consistent equations for the means (for 0/1 Bernoulli variables the mean is the probability, $\mu_i = P(S_i = 1)$):
$$r_{ki} = \frac{\exp(\varphi_{ki} / T)}{\sum_{k'} \exp(\varphi_{k'i} / T)}$$
$$\varphi_{ki} = \sum_j \frac{B_{ij}}{2m} r_{kj} = \sum_j \frac{A_{ij}}{2m} r_{kj} - \sum_j P^0_{ij} r_{kj}$$
S. Lehmann, L.K. Hansen: Deterministic modularity optimization, European Physical Journal B 60(1) (2007).
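
The self-consistent equations can be iterated while the temperature is lowered. The sketch below is one possible reading, under stated assumptions: sequential updates, the self term $j = i$ omitted, a geometric cooling schedule, and a start close to (but not at) a two-community split. The schedule parameters, function names, and toy graph are hypothetical and are not the settings of Lehmann and Hansen.

```python
import math


def mean_field_sweep(A, r, T):
    """Update r[c][i] = softmax_c(phi_ci / T) for every node i, in place,
    with phi_ci = sum_{j != i} (B_ij / 2m) * r[c][j]."""
    n, K = len(A), len(r)
    k = [sum(row) for row in A]
    two_m = sum(k)
    for i in range(n):
        phi = [sum((A[i][j] - k[i] * k[j] / two_m) / two_m * r[c][j]
                   for j in range(n) if j != i)
               for c in range(K)]
        top = max(phi)                      # stabilise the exponentials
        w = [math.exp((p - top) / T) for p in phi]
        z = sum(w)
        for c in range(K):
            r[c][i] = w[c] / z
    return r


def deterministic_annealing(A, r, T=0.05, cooling=0.9, steps=50):
    """Iterate the mean-field equations while lowering T geometrically."""
    for _ in range(steps):
        r = mean_field_sweep(A, r, T)
        T *= cooling
    return r


def hard_labels(r):
    """Most probable community for each node."""
    K, n = len(r), len(r[0])
    return [max(range(K), key=lambda c: r[c][i]) for i in range(n)]


# Hypothetical toy graph: two triangles joined by the link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

# Soft start that weakly favours one triangle per community.
r = [[0.6, 0.6, 0.6, 0.4, 0.4, 0.4],
     [0.4, 0.4, 0.4, 0.6, 0.6, 0.6]]
r = deterministic_annealing(A, r)
labels = hard_labels(r)
print(labels)
```

Unlike Gibbs sampling, the result is deterministic given the start, which is why the initial `r` must break the symmetry between communities.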

14 Experimental evaluation
Create a simple testbed with link probability and noise.
S. Lehmann, L.K. Hansen: Deterministic modularity optimization, European Physical Journal B 60(1) (2007).

15 S. Lehmann, L.K. Hansen: Deterministic modularity optimization European Physical Journal B 60(1) (2007).

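16 Generative community model (Hofman & Wiggins, 2008)
$$P(A \mid S, p, q) = p^{c} (1 - p)^{d} q^{e} (1 - q)^{f}$$
$$c = \sum_{j > i} A_{ij} \sum_k S_{ki} S_{kj}, \qquad d = \sum_{j > i} (1 - A_{ij}) \sum_k S_{ki} S_{kj}$$
$$e = \sum_{j > i} A_{ij} \left( 1 - \sum_k S_{ki} S_{kj} \right), \qquad f = \sum_{j > i} (1 - A_{ij}) \left( 1 - \sum_k S_{ki} S_{kj} \right)$$
Here $p$ is the link probability within a community and $q$ the link probability between communities; $c$ and $d$ count linked and unlinked node pairs within communities, $e$ and $f$ the same between communities.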
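
The four counts and the resulting log-likelihood are straightforward to compute from an adjacency matrix and a hard assignment. This is a minimal sketch under the assumption that $c, d$ count linked and unlinked pairs inside communities and $e, f$ the same between communities; the function names, the toy graph, and the values of `p` and `q` are hypothetical.

```python
import math


def block_counts(A, labels):
    """Count node pairs (i < j): c, d = linked / unlinked within a community,
    e, f = linked / unlinked between communities."""
    c = d = e = f = 0
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if same and A[i][j]:
                c += 1
            elif same:
                d += 1
            elif A[i][j]:
                e += 1
            else:
                f += 1
    return c, d, e, f


def log_likelihood(A, labels, p, q):
    """log P(A | S, p, q) = c log p + d log(1-p) + e log q + f log(1-q)."""
    c, d, e, f = block_counts(A, labels)
    return (c * math.log(p) + d * math.log(1 - p)
            + e * math.log(q) + f * math.log(1 - q))


# Hypothetical toy graph: two triangles joined by the link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

counts = block_counts(A, [0, 0, 0, 1, 1, 1])
ll_planted = log_likelihood(A, [0, 0, 0, 1, 1, 1], p=0.9, q=0.1)
ll_scrambled = log_likelihood(A, [0, 1, 0, 1, 0, 1], p=0.9, q=0.1)
print(counts, ll_planted, ll_scrambled)
```

With dense communities and a sparse background, the assignment that follows the triangles scores a much higher likelihood than a scrambled one.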

17 Learning parameters of the generative model
Hofman & Wiggins (2008): Variational Bayes. Dirichlet/beta prior and posterior distributions for the probabilities (very well determined, so overkill). Independent binomials for the assignment variables (misses correlation).
Here: Maximum likelihood for the parameters. Gibbs sampling for the assignments.
Jake M. Hofman and Chris H. Wiggins, Bayesian Approach to Network Modularity, Phys. Rev. Lett. 100 (2008).
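
As a worked step for the maximum-likelihood part: assuming the likelihood $P(A \mid S, p, q) = p^c (1-p)^d q^e (1-q)^f$ of the generative model, where $c, d$ count linked and unlinked node pairs within communities and $e, f$ the same between communities, setting the derivatives of the log-likelihood to zero gives the observed link frequencies. This is the standard Bernoulli estimate for fixed assignments, and not necessarily the exact update used in the talk.

```latex
\begin{align}
\log P(A \mid S, p, q) &= c \log p + d \log(1 - p) + e \log q + f \log(1 - q) \\
\frac{\partial \log P}{\partial p} = \frac{c}{p} - \frac{d}{1 - p} = 0
  \quad &\Rightarrow \quad \hat{p} = \frac{c}{c + d} \\
\frac{\partial \log P}{\partial q} = \frac{e}{q} - \frac{f}{1 - q} = 0
  \quad &\Rightarrow \quad \hat{q} = \frac{e}{e + f}
\end{align}
```

In an alternating scheme these estimates would be recomputed from the current assignments after each round of sampling.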

18 The community detection threshold
How many links are needed to detect the structure?
$$\mathrm{SNR} = \frac{P_{\mathrm{in}}}{P_{\mathrm{out}}} = \frac{p}{q (C - 1)}$$
Jörg Reichardt and Michele Leone, (Un)detectable Cluster Structure in Sparse Networks, Phys. Rev. Lett. 101 (2008).

19 Experimental design
Planted solution: N = 1000 nodes, C_true = 5
Quality: mutual information between the planted assignments and the best identified
Gibbs sampling: no annealing, burn-in 200 iterations, averaging 800 iterations
Parameter learning: Q = 10 iterations
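
The two ingredients of this design that do not depend on the sampler, the planted graph and the quality measure, can be sketched as follows. The assumptions are mine: equal-sized planted communities, independent links with probability `p` within and `q` between communities, and mutual information measured in nats between two hard labelings. The sizes below are deliberately small and all names are hypothetical.

```python
import math
import random
from collections import Counter


def planted_graph(n, C, p, q, rng):
    """Undirected graph with C equal-sized planted communities; links appear
    with probability p inside a community and q between communities."""
    labels = [i % C for i in range(n)]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1
    return A, labels


def mutual_information(x, y):
    """Mutual information (in nats) between two labelings of the same nodes."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(nxy / n * math.log(nxy * n / (px[a] * py[b]))
               for (a, b), nxy in joint.items())


rng = random.Random(0)
A, planted = planted_graph(n=20, C=5, p=0.8, q=0.1, rng=rng)

relabelled = [(c + 1) % 5 for c in planted]    # same partition, new names
mi_perfect = mutual_information(planted, relabelled)
mi_trivial = mutual_information(planted, [0] * 20)
print(mi_perfect, mi_trivial)
```

Mutual information is invariant to renaming the communities, which is why it suits comparing a planted assignment with a sampled one; a perfect match of C balanced communities gives log C, and a single lumped community gives zero.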

20 Community detection, fully informed on number of communities and probabilities
[Figure: four panels showing mutual information with the planted community against intra-community link probability (p), for (N = 1000, C = 10, SNR = 50), (N = 1000, C = 5, SNR = 50), (N = 1000, C = 5, SNR = 5), and (N = 1000, C = 5, SNR = 10).]

21 Now what happens to the phase transition if we learn the parameters with a too complex model (C > C_true = 5)?
[Figure: mutual information with the planted community against intra-community link probability (p), for (N = 1000, C = 10, SNR = 10) and (N = 1000, C = 10, SNR = 5); a further panel shows community memberships.]

22 Conclusions
Community detection can be formulated as an inference problem (Hofman & Wiggins, 2008).
The sampling process for fixed SNR has a phase-transition-like detection threshold (Reichardt & Leone, 2008).
The phase transition remains (sharpens?) if you learn the parameters of a generative model with unknown complexity.


Learning latent structure in complex networks

Learning latent structure in complex networks Learning latent structure in complex networks Lars Kai Hansen www.imm.dtu.dk/~lkh Current network research issues: Social Media Neuroinformatics Machine learning Joint work with Morten Mørup, Sune Lehmann

More information

Markov chain Monte Carlo methods for visual tracking

Markov chain Monte Carlo methods for visual tracking Markov chain Monte Carlo methods for visual tracking Ray Luo rluo@cory.eecs.berkeley.edu Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720

More information

Deterministic modularity optimization

Deterministic modularity optimization EPJ B proofs (will be inserted by the editor) Deterministic modularity optimization S. Lehmann 1,2,a and L.K. Hansen 1 1 Informatics and Mathematical Modeling, Technical University of Denmark 2 Center

More information

The non-backtracking operator

The non-backtracking operator The non-backtracking operator Florent Krzakala LPS, Ecole Normale Supérieure in collaboration with Paris: L. Zdeborova, A. Saade Rome: A. Decelle Würzburg: J. Reichardt Santa Fe: C. Moore, P. Zhang Berkeley:

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Session 3A: Markov chain Monte Carlo (MCMC)

Session 3A: Markov chain Monte Carlo (MCMC) Session 3A: Markov chain Monte Carlo (MCMC) John Geweke Bayesian Econometrics and its Applications August 15, 2012 ohn Geweke Bayesian Econometrics and its Session Applications 3A: Markov () chain Monte

More information

Nonparametric Bayesian Matrix Factorization for Assortative Networks

Nonparametric Bayesian Matrix Factorization for Assortative Networks Nonparametric Bayesian Matrix Factorization for Assortative Networks Mingyuan Zhou IROM Department, McCombs School of Business Department of Statistics and Data Sciences The University of Texas at Austin

More information

10-701/15-781, Machine Learning: Homework 4

10-701/15-781, Machine Learning: Homework 4 10-701/15-781, Machine Learning: Homewor 4 Aarti Singh Carnegie Mellon University ˆ The assignment is due at 10:30 am beginning of class on Mon, Nov 15, 2010. ˆ Separate you answers into five parts, one

More information

Latent Dirichlet Allocation (LDA)

Latent Dirichlet Allocation (LDA) Latent Dirichlet Allocation (LDA) A review of topic modeling and customer interactions application 3/11/2015 1 Agenda Agenda Items 1 What is topic modeling? Intro Text Mining & Pre-Processing Natural Language

More information

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model

More information

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy

More information

MONTE CARLO METHODS. Hedibert Freitas Lopes

MONTE CARLO METHODS. Hedibert Freitas Lopes MONTE CARLO METHODS Hedibert Freitas Lopes The University of Chicago Booth School of Business 5807 South Woodlawn Avenue, Chicago, IL 60637 http://faculty.chicagobooth.edu/hedibert.lopes hlopes@chicagobooth.edu

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Parameter Learning: Binary Variables

Parameter Learning: Binary Variables Parameter Learning: Binary Variables SS 008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} Reference Richard E.

More information

Review: Probabilistic Matrix Factorization. Probabilistic Matrix Factorization (PMF)

Review: Probabilistic Matrix Factorization. Probabilistic Matrix Factorization (PMF) Case Study 4: Collaborative Filtering Review: Probabilistic Matrix Factorization Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox February 2 th, 214 Emily Fox 214 1 Probabilistic

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

Bias-Variance Trade-Off in Hierarchical Probabilistic Models Using Higher-Order Feature Interactions

Bias-Variance Trade-Off in Hierarchical Probabilistic Models Using Higher-Order Feature Interactions - Trade-Off in Hierarchical Probabilistic Models Using Higher-Order Feature Interactions Simon Luo The University of Sydney Data61, CSIRO simon.luo@data61.csiro.au Mahito Sugiyama National Institute of

More information

an introduction to bayesian inference

an introduction to bayesian inference with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena

More information

Intelligent Systems:

Intelligent Systems: Intelligent Systems: Undirected Graphical models (Factor Graphs) (2 lectures) Carsten Rother 15/01/2015 Intelligent Systems: Probabilistic Inference in DGM and UGM Roadmap for next two lectures Definition

More information

Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling

Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling Monte Carlo Methods Appl, Vol 6, No 3 (2000), pp 205 210 c VSP 2000 Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling Daniel B Rowe H & SS, 228-77 California Institute of

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

F denotes cumulative density. denotes probability density function; (.)

F denotes cumulative density. denotes probability density function; (.) BAYESIAN ANALYSIS: FOREWORDS Notation. System means the real thing and a model is an assumed mathematical form for the system.. he probability model class M contains the set of the all admissible models

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 11 Oct, 3, 2016 CPSC 422, Lecture 11 Slide 1 422 big picture: Where are we? Query Planning Deterministic Logics First Order Logics Ontologies

More information

Part 1: Expectation Propagation

Part 1: Expectation Propagation Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 1: Expectation Propagation Tom Heskes Machine Learning Group, Institute for Computing and Information Sciences Radboud

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Machine Learning for Data Science (CS4786) Lecture 24

Machine Learning for Data Science (CS4786) Lecture 24 Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each

More information

Bayesian Social Learning with Random Decision Making in Sequential Systems

Bayesian Social Learning with Random Decision Making in Sequential Systems Bayesian Social Learning with Random Decision Making in Sequential Systems Yunlong Wang supervised by Petar M. Djurić Department of Electrical and Computer Engineering Stony Brook University Stony Brook,

More information

Advanced Machine Learning

Advanced Machine Learning Advanced Machine Learning Nonparametric Bayesian Models --Learning/Reasoning in Open Possible Worlds Eric Xing Lecture 7, August 4, 2009 Reading: Eric Xing Eric Xing @ CMU, 2006-2009 Clustering Eric Xing

More information

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms François Caron Department of Statistics, Oxford STATLEARN 2014, Paris April 7, 2014 Joint work with Adrien Todeschini,

More information

Lecture 7 and 8: Markov Chain Monte Carlo

Lecture 7 and 8: Markov Chain Monte Carlo Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani

More information

Parallel Particle Filter in Julia

Parallel Particle Filter in Julia Parallel Particle Filter in Julia Gustavo Goretkin December 12, 2011 1 / 27 First a disclaimer The project in a sentence. workings 2 / 27 First a disclaimer First a disclaimer The project in a sentence.

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 23, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Adrien Todeschini Inria Bordeaux JdS 2014, Rennes Aug. 2014 Joint work with François Caron (Univ. Oxford), Marie

More information

Replicated Softmax: an Undirected Topic Model. Stephen Turner

Replicated Softmax: an Undirected Topic Model. Stephen Turner Replicated Softmax: an Undirected Topic Model Stephen Turner 1. Introduction 2. Replicated Softmax: A Generative Model of Word Counts 3. Evaluating Replicated Softmax as a Generative Model 4. Experimental

More information

Monte Carlo methods for sampling-based Stochastic Optimization

Monte Carlo methods for sampling-based Stochastic Optimization Monte Carlo methods for sampling-based Stochastic Optimization Gersende FORT LTCI CNRS & Telecom ParisTech Paris, France Joint works with B. Jourdain, T. Lelièvre, G. Stoltz from ENPC and E. Kuhn from

More information

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016 A Random Dot Product Model for Weighted Networks arxiv:1611.02530v1 [stat.ap] 8 Nov 2016 Daryl R. DeFord 1 Daniel N. Rockmore 1,2,3 1 Department of Mathematics, Dartmouth College, Hanover, NH, USA 03755

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 24, 2016 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,

More information

Application of new Monte Carlo method for inversion of prestack seismic data. Yang Xue Advisor: Dr. Mrinal K. Sen

Application of new Monte Carlo method for inversion of prestack seismic data. Yang Xue Advisor: Dr. Mrinal K. Sen Application of new Monte Carlo method for inversion of prestack seismic data Yang Xue Advisor: Dr. Mrinal K. Sen Overview Motivation Introduction Bayes theorem Stochastic inference methods Methodology

More information

Gaussian Mixture Model

Gaussian Mixture Model Case Study : Document Retrieval MAP EM, Latent Dirichlet Allocation, Gibbs Sampling Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 5 th,

More information

Introduction to Probability and Statistics (Continued)

Introduction to Probability and Statistics (Continued) Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:

More information

Learning Conditional Probabilities from Incomplete Data: An Experimental Comparison Marco Ramoni Knowledge Media Institute Paola Sebastiani Statistics

Learning Conditional Probabilities from Incomplete Data: An Experimental Comparison Marco Ramoni Knowledge Media Institute Paola Sebastiani Statistics Learning Conditional Probabilities from Incomplete Data: An Experimental Comparison Marco Ramoni Knowledge Media Institute Paola Sebastiani Statistics Department Abstract This paper compares three methods

More information

Lecture 16 Deep Neural Generative Models

Lecture 16 Deep Neural Generative Models Lecture 16 Deep Neural Generative Models CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 22, 2017 Approach so far: We have considered simple models and then constructed

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Graphical Models for Collaborative Filtering

Graphical Models for Collaborative Filtering Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

MCMC Sampling for Bayesian Inference using L1-type Priors

MCMC Sampling for Bayesian Inference using L1-type Priors MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

Learning Bayesian Networks

Learning Bayesian Networks Learning Bayesian Networks Probabilistic Models, Spring 2011 Petri Myllymäki, University of Helsinki V-1 Aspects in learning Learning the parameters of a Bayesian network Marginalizing over all all parameters

More information

Probability Theory for Machine Learning. Chris Cremer September 2015

Probability Theory for Machine Learning. Chris Cremer September 2015 Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares

More information

Using Belief Propagation to Counter Correlated Reports in Cooperative Spectrum Sensing

Using Belief Propagation to Counter Correlated Reports in Cooperative Spectrum Sensing Using Belief Propagation to Counter Correlated Reports in Cooperative Spectrum Sensing Mihir Laghate and Danijela Cabric Department of Electrical Engineering, University of California, Los Angeles Emails:

More information

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018 Graphical Models Markov Chain Monte Carlo Inference Siamak Ravanbakhsh Winter 2018 Learning objectives Markov chains the idea behind Markov Chain Monte Carlo (MCMC) two important examples: Gibbs sampling

More information

Combine Monte Carlo with Exhaustive Search: Effective Variational Inference and Policy Gradient Reinforcement Learning

Combine Monte Carlo with Exhaustive Search: Effective Variational Inference and Policy Gradient Reinforcement Learning Combine Monte Carlo with Exhaustive Search: Effective Variational Inference and Policy Gradient Reinforcement Learning Michalis K. Titsias Department of Informatics Athens University of Economics and Business

More information

COURSE INTRODUCTION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

COURSE INTRODUCTION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception COURSE INTRODUCTION COMPUTATIONAL MODELING OF VISUAL PERCEPTION 2 The goal of this course is to provide a framework and computational tools for modeling visual inference, motivated by interesting examples

More information

General Construction of Irreversible Kernel in Markov Chain Monte Carlo

General Construction of Irreversible Kernel in Markov Chain Monte Carlo General Construction of Irreversible Kernel in Markov Chain Monte Carlo Metropolis heat bath Suwa Todo Department of Applied Physics, The University of Tokyo Department of Physics, Boston University (from

More information

Stochastic Proximal Gradient Algorithm

Stochastic Proximal Gradient Algorithm Stochastic Institut Mines-Télécom / Telecom ParisTech / Laboratoire Traitement et Communication de l Information Joint work with: Y. Atchade, Ann Arbor, USA, G. Fort LTCI/Télécom Paristech and the kind

More information

Link Prediction. Eman Badr Mohammed Saquib Akmal Khan

Link Prediction. Eman Badr Mohammed Saquib Akmal Khan Link Prediction Eman Badr Mohammed Saquib Akmal Khan 11-06-2013 Link Prediction Which pair of nodes should be connected? Applications Facebook friend suggestion Recommendation systems Monitoring and controlling

More information

Down by the Bayes, where the Watermelons Grow

Down by the Bayes, where the Watermelons Grow Down by the Bayes, where the Watermelons Grow A Bayesian example using SAS SUAVe: Victoria SAS User Group Meeting November 21, 2017 Peter K. Ott, M.Sc., P.Stat. Strategic Analysis 1 Outline 1. Motivating

More information

Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991). (v ji (1 x i ) + (1 v ji )x i )

Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991). (v ji (1 x i ) + (1 v ji )x i ) Symmetric Networks Hertz, Krogh, Palmer: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company (1991). How can we model an associative memory? Let M = {v 1,..., v m } be a

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Prof. Tapio Elomaa tapio.elomaa@tut.fi Course Basics A new 4 credit unit course Part of Theoretical Computer Science courses at the Department of Mathematics There will be 4 hours

More information

Bayesian Approach 2. CSC412 Probabilistic Learning & Reasoning

Bayesian Approach 2. CSC412 Probabilistic Learning & Reasoning CSC412 Probabilistic Learning & Reasoning Lecture 12: Bayesian Parameter Estimation February 27, 2006 Sam Roweis Bayesian Approach 2 The Bayesian programme (after Rev. Thomas Bayes) treats all unnown quantities

More information

Machine Learning using Bayesian Approaches

Machine Learning using Bayesian Approaches Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes

More information

Statistical techniques for data analysis in Cosmology

Statistical techniques for data analysis in Cosmology Statistical techniques for data analysis in Cosmology arxiv:0712.3028; arxiv:0911.3105 Numerical recipes (the bible ) Licia Verde ICREA & ICC UB-IEEC http://icc.ub.edu/~liciaverde outline Lecture 1: Introduction

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos Contents Markov Chain Monte Carlo Methods Sampling Rejection Importance Hastings-Metropolis Gibbs Markov Chains

More information

Theory of Stochastic Processes 8. Markov chain Monte Carlo

Theory of Stochastic Processes 8. Markov chain Monte Carlo Theory of Stochastic Processes 8. Markov chain Monte Carlo Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo June 8, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html

More information

Learning in Bayesian Networks

Learning in Bayesian Networks Learning in Bayesian Networks Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Berlin: 20.06.2002 1 Overview 1. Bayesian Networks Stochastic Networks

More information

Chapter 16. Structured Probabilistic Models for Deep Learning

Chapter 16. Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 1 Chapter 16 Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 2 Structured Probabilistic Models way of using graphs to describe

More information

Part IV: Monte Carlo and nonparametric Bayes

Part IV: Monte Carlo and nonparametric Bayes Part IV: Monte Carlo and nonparametric Bayes Outline Monte Carlo methods Nonparametric Bayesian models Outline Monte Carlo methods Nonparametric Bayesian models The Monte Carlo principle The expectation

More information

CS Lecture 18. Expectation Maximization

CS Lecture 18. Expectation Maximization CS 6347 Lecture 18 Expectation Maximization Unobserved Variables Latent or hidden variables in the model are never observed We may or may not be interested in their values, but their existence is crucial

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

The Origin of Deep Learning. Lili Mou Jan, 2015

The Origin of Deep Learning. Lili Mou Jan, 2015 The Origin of Deep Learning Lili Mou Jan, 2015 Acknowledgment Most of the materials come from G. E. Hinton s online course. Outline Introduction Preliminary Boltzmann Machines and RBMs Deep Belief Nets

More information

Expectation propagation for signal detection in flat-fading channels

Expectation propagation for signal detection in flat-fading channels Expectation propagation for signal detection in flat-fading channels Yuan Qi MIT Media Lab Cambridge, MA, 02139 USA yuanqi@media.mit.edu Thomas Minka CMU Statistics Department Pittsburgh, PA 15213 USA

More information

PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers PAC-Bayes Ris Bounds for Sample-Compressed Gibbs Classifiers François Laviolette Francois.Laviolette@ift.ulaval.ca Mario Marchand Mario.Marchand@ift.ulaval.ca Département d informatique et de génie logiciel,

More information

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find

More information

Learning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling

BMI/CS 776, www.biostat.wisc.edu/bmi776/, Spring 009, Mark Craven, craven@biostat.wisc.edu. Sequence Motifs: what is a sequence

Probabilistic Graphical Models

Brown University CSCI 2950-P, Spring 2013, Prof. Erik Sudderth. Lecture 13: Learning in Gaussian Graphical Models, Non-Gaussian Inference, Monte Carlo Methods. Some figures

Bayesian Inference and MCMC

Aryan Arbabi. Partly based on MCMC slides from CSC412, Fall 2018. Bayesian Inference - Motivation: Consider we have a data set $D = \{x_1, \ldots, x_n\}$. E.g. each $x_i$ can be the

Or How to select variables Using Bayesian LASSO

On Bayesian Variable Selection

An Introduction to Bayesian Machine Learning

José Miguel Hernández-Lobato, Department of Engineering, Cambridge University, April 8, 2013. What is Machine Learning? The design of computational systems

LECTURE 15 Markov chain Monte Carlo

There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte

Probability and Information Theory. Sargur N. Srihari

Sargur N. Srihari, srihari@cedar.buffalo.edu. Topics in Probability and Information Theory. Overview: 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

Learning the hyper-parameters. Luca Martino

Luca Martino, 2017. Parameters and hyper-parameters: 1. All the described methods depend on some choice of hyper-parameters... 2. For instance, do you recall λ (bandwidth

Novel spectrum sensing schemes for Cognitive Radio Networks

Cantabria University, Santander, May, 2015. Supélec, SCEE, Rennes, France. The Advanced Signal Processing Group, http://gtas.unican.es. The Advanced

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Introduction. So far we were mostly concerned with supervised learning: we predicted one or several

Markov Random Fields (A Rough Guide)

Sigmedia, Electronic Engineering Dept., Trinity College, Dublin. Anil C. Kokaram, anil.kokaram@tcd.ie, Electrical and Electronic Engineering Dept., University of Dublin,

(1) Introduction to Bayesian statistics

Spring, 2018. A motivating example: Student 1 will write down a number and then flip a coin. If the flip is heads, they will honestly tell student 2 if the number is even or odd. If the flip is tails, they

Pattern Recognition and Machine Learning

Christopher M. Bishop. Springer. Contents: Preface; Mathematical notation; Contents; 1 Introduction; 1.1 Example: Polynomial Curve Fitting; 1.2 Probability

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture 9: Variational Inference Relaxations Volkan Cevher, Matthias Seeger Ecole Polytechnique Fédérale de Lausanne 24/10/2011 (EPFL) Graphical Models 24/10/2011 1 / 15

Notes on Machine Learning for 16.410 and 16.413

(Notes adapted from Tom Mitchell and Andrew Moore.) Choosing Hypotheses: Generally want the most probable hypothesis given the training data. Maximum a posteriori

Clustering K-means. Clustering images. Machine Learning CSE546 Carlos Guestrin University of Washington. November 4, 2014.

Clustering images: Set of Images [Goldberger et al.]. K-means: Randomly initialize k centers $\mu^{(0)}$

Sampling Algorithms for Probabilistic Graphical models

Vibhav Gogate, University of Washington. References: Chapter 12 of Probabilistic Graphical Models: Principles and Techniques by Daphne Koller and Nir

The Expectation-Maximization Algorithm

Francisco S. Melo. In these notes, we provide a brief overview of the formal aspects concerning k-means, EM and their relation. We closely follow the presentation in

Nonparametric Bayesian Methods - Lecture I

Harry van Zanten, Korteweg-de Vries Institute for Mathematics. CRiSM Masterclass, April 4-6, 2016. Overview of the lectures: I. Intro to nonparametric Bayesian statistics

Monte Carlo Methods. Leon Gu CSD, CMU

Approximate Inference. EM: $y$ observed variables; $x$ hidden variables; $\theta$ parameters. E-step: $q(x) = p(x \mid y, \theta^{t-1})$. M-step: $\theta^{t} = \arg\max_{\theta} \mathbb{E}_{q(x)}[\log p(y, x \mid \theta)]$. Monte

Bayesian networks: approximate inference

Machine Intelligence, Thomas D. Nielsen, September 2008. Motivation: Because of the (worst-case) intractability of exact

Computer Vision Group Prof. Daniel Cremers. 14. Clustering

Motivation: Supervised learning is good for interaction with humans, but labels from a supervisor are hard to obtain. Clustering is unsupervised learning, i.e. it

Collaborative Place Models Supplement 2

Berk Kapicioglu, Foursquare Labs, berapicioglu@gmail.com; Robert E. Schapire, Princeton University, schapire@cs.princeton.edu; David S. Rosenberg, YP Mobile Labs, daviddavidr@gmail.com

ADVANCED MACHINE LEARNING. Non-linear regression techniques Part - II

Regression Algorithms in this Course: Support Vector Machine; Relevance Vector Machine; Support vector regression; Boosting random projections; Relevance vector

Lecture 8: Bayesian Estimation of Parameters in State Space Models

March 30, 2016. Contents: 1 Bayesian estimation of parameters in state space models; 2 Computational methods for parameter estimation; 3 Practical parameter estimation in state space
