Consensus and Distributed Inference Rates Using Network Divergence


Anand D., Department of Electrical and Computer Engineering, The State University of New Jersey. DIMACS, August 23, 2017. Joint work with Tara Javidi and Anusha Lalitha (UCSD). Work sponsored by NSF under award CCF.

The model: finite hypothesis testing
A second simple model: estimate a global parameter θ. Each agent takes observations over time conditioned on θ. Agents can do local updates followed by communication with their neighbors. Main focus: a simple rule and its rate of convergence.

Model
Set of n nodes. Set of hypotheses Θ = {θ_1, θ_2, ..., θ_M}. Observations X_i^(t) are i.i.d. Each node i has a fixed, known family of distributions {f_i(·; θ_1), f_i(·; θ_2), ..., f_i(·; θ_M)}. θ* ∈ Θ is a fixed, global, unknown parameter, and X_i^(t) ~ f_i(·; θ*).

GOAL: Parametric inference of the unknown θ*.

Hypothesis Testing
If θ* is globally identifiable, then collecting all observations X^(t) = {X_1^(t), X_2^(t), ..., X_n^(t)} at a central location yields a centralized hypothesis testing problem, with exponentially fast convergence to the true hypothesis. Can this be achieved locally with low-dimensional observations?

Example: Low-dimensional Observations
If all observations are not collected centrally, node 1 individually cannot learn θ*, so the nodes must communicate.

Distributed Hypothesis Testing
Define Θ_i = {θ ∈ Θ : f_i(·; θ) = f_i(·; θ*)}. If θ ∈ Θ_i, then θ and θ* are observationally equivalent for node i. Suppose {θ*} = Θ_1 ∩ Θ_2 ∩ ... ∩ Θ_n, i.e., the true parameter is globally identifiable even though no single node may be able to identify it on its own.

GOAL: Parametric inference of the unknown θ*.

Learning Rule
At t = 0, node i begins with an initial estimate vector q_i^(0) > 0, where the components of q_i^(t) form a probability distribution on Θ.

At each t > 0, node i draws X_i^(t) and computes a belief vector b_i^(t) via the Bayesian update

    b_i^(t)(θ) = f_i(X_i^(t); θ) q_i^(t-1)(θ) / ∑_{θ' ∈ Θ} f_i(X_i^(t); θ') q_i^(t-1)(θ').

It sends the message Y_i^(t) = b_i^(t) and receives messages from its neighbors at the same time. It then updates q_i^(t) via averaging of log-beliefs,

    q_i^(t)(θ) = exp( ∑_{j=1}^n W_ij log b_j^(t)(θ) ) / ∑_{θ' ∈ Θ} exp( ∑_{j=1}^n W_ij log b_j^(t)(θ') ),

where the weight W_ij denotes the influence of node j on the estimate of node i. Set t = t + 1 and repeat.

In a picture
q_i(·, t) → draw local observation X_i^(t) ~ f_i(·; θ*) → Bayesian update to b_i(·, t) → exchange messages {b_j(·, t)} with neighbors → average log-beliefs → q_i(·, t + 1).
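
To make the rule concrete, here is a minimal Python sketch of the loop above on a toy problem; the Bernoulli likelihoods, uniform weight matrix, number of nodes, and number of iterations are illustrative assumptions, not the setup used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

n, M = 3, 4                       # number of nodes and hypotheses (illustrative)
theta_star = 0                    # index of the true hypothesis
# f[i, m]: success probability of node i's Bernoulli likelihood under hypothesis m.
f = rng.uniform(0.1, 0.9, size=(n, M))

# Row-stochastic weight matrix W; here every node weighs all nodes equally.
W = np.full((n, n), 1.0 / n)

q = np.full((n, M), 1.0 / M)      # initial estimates q_i^(0) > 0

for t in range(50):
    x = rng.binomial(1, f[:, theta_star])           # X_i^(t) ~ f_i(.; theta*)
    lik = np.where(x[:, None] == 1, f, 1.0 - f)     # f_i(X_i^(t); theta_m)
    b = lik * q                                     # local Bayesian update (unnormalized)
    b /= b.sum(axis=1, keepdims=True)
    log_pool = W @ np.log(b)                        # average the log-beliefs
    q = np.exp(log_pool - log_pool.max(axis=1, keepdims=True))
    q /= q.sum(axis=1, keepdims=True)

print(q.round(3))   # each row should concentrate most of its mass on theta_star
```

The social step is a weighted geometric average of the neighbors' beliefs, i.e. an arithmetic average in log space, matching the update formula above.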

An example
When connected in a network and using the proposed learning rule, node 1 learns θ*.

Assumptions
Assumption 1: For every pair θ ≠ θ*, f_i(·; θ*) ≠ f_i(·; θ) for at least one node i, i.e. the KL divergence D(f_i(·; θ*) ‖ f_i(·; θ)) > 0.
Assumption 2: The stochastic matrix W is irreducible.
Assumption 3: For all i ∈ [n], the initial estimate q_i^(0)(θ) > 0 for every θ ∈ Θ.

Network Divergence
The eigenvector centrality v = [v_1, v_2, ..., v_n] is the left eigenvector of W corresponding to eigenvalue 1. The central quantity of interest is what we call the network divergence

    K(θ*, θ) = ∑_{j=1}^n v_j D( f_j(·; θ*) ‖ f_j(·; θ) ).
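
As a rough numerical illustration of these two ingredients, the sketch below computes the eigenvector centrality of a row-stochastic W and the resulting network divergence; the 3-node weight matrix and Bernoulli likelihoods are made-up examples, not taken from the talk.

```python
import numpy as np

def eigenvector_centrality(W):
    """Left eigenvector of the row-stochastic matrix W for eigenvalue 1,
    normalized to sum to 1 (the stationary distribution of W)."""
    vals, vecs = np.linalg.eig(W.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

def bernoulli_kl(p, q):
    """KL divergence D(Bern(p) || Bern(q))."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def network_divergence(v, f, theta_star, theta):
    """K(theta*, theta) = sum_j v_j D(f_j(.; theta*) || f_j(.; theta))."""
    return sum(v[j] * bernoulli_kl(f[j, theta_star], f[j, theta])
               for j in range(len(v)))

# Illustrative 3-node chain with Bernoulli likelihoods f[j, m].
W = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
f = np.array([[0.2, 0.2, 0.8],     # node 1 cannot tell theta_1 from theta_2
              [0.3, 0.7, 0.3],     # node 2 cannot tell theta_1 from theta_3
              [0.5, 0.5, 0.5]])    # node 3 is uninformative
v = eigenvector_centrality(W)
print(v, network_divergence(v, f, theta_star=0, theta=1))
```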

Convergence Results
Let θ* be the unknown fixed parameter and suppose Assumptions 1-3 hold.

Theorem (rate of rejecting θ ≠ θ*): Every node i's estimate of each θ ≠ θ* almost surely converges to 0 exponentially fast. Mathematically,

    lim_{t → ∞} -(1/t) log q_i^(t)(θ) = K(θ*, θ)   P-a.s.,

where K(θ*, θ) = ∑_{j=1}^n v_j D( f_j(·; θ*) ‖ f_j(·; θ) ).
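
One way to sanity-check the theorem is to simulate the learning rule and compare the empirical slope of -log q_i^(t)(θ)/t with K(θ*, θ). The sketch below does this for a hypothetical two-node, two-hypothesis Bernoulli example with uniform doubly stochastic weights (so v = [1/2, 1/2]); all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 2 nodes, 2 hypotheses, Bernoulli likelihoods.
p = np.array([[0.3, 0.7],         # node 1: f_1(.; theta_1), f_1(.; theta_2)
              [0.6, 0.4]])        # node 2: f_2(.; theta_1), f_2(.; theta_2)
theta_star, theta = 0, 1
W = np.full((2, 2), 0.5)          # doubly stochastic, so v = [1/2, 1/2]
v = np.array([0.5, 0.5])

def bern_kl(a, b):
    return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))

K = sum(v[j] * bern_kl(p[j, theta_star], p[j, theta]) for j in range(2))

T = 1000
q = np.full((2, 2), 0.5)
for t in range(T):
    x = rng.binomial(1, p[:, theta_star])            # X_i^(t)
    lik = np.where(x[:, None] == 1, p, 1.0 - p)      # f_i(X_i^(t); theta_m)
    b = lik * q
    b /= b.sum(axis=1, keepdims=True)
    lp = W @ np.log(b)                               # average log-beliefs
    q = np.exp(lp - lp.max(axis=1, keepdims=True))
    q /= q.sum(axis=1, keepdims=True)

empirical_rate = -np.log(q[0, theta]) / T            # slope of -log q_1^(t)(theta) / t
print(f"K(theta*, theta) = {K:.3f}, empirical rate at node 1 = {empirical_rate:.3f}")
```

For large T the two printed numbers should be close, reflecting the almost-sure convergence of the normalized log-estimate to the network divergence.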

Example: Network-wide Learning
Θ = {θ_1, θ_2, θ_3, θ_4} and θ* = θ_1. If i and j are connected, W_ij = 1/(degree of node i); otherwise W_ij = 0. The resulting eigenvector centrality is v = [1/12, 1/8, 1/12, 1/8, 1/6, 1/8, 1/12, 1/8, 1/12].
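
Because W_ij = 1/deg(i) for neighbors, v is the stationary distribution of a random walk on the graph, i.e. proportional to the node degrees. The listed centralities are consistent with a 3×3 grid on 9 nodes; the check below assumes that topology, which the transcript does not state explicitly.

```python
import numpy as np

# Edges of a 3x3 grid graph, nodes numbered 0..8 row by row (assumed topology).
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (6, 7), (7, 8),   # horizontal
         (0, 3), (3, 6), (1, 4), (4, 7), (2, 5), (5, 8)]   # vertical

n = 9
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

deg = A.sum(axis=1)
W = A / deg[:, None]              # W_ij = 1/deg(i) if i~j, else 0 (row-stochastic)

# Stationary distribution of a random walk on an undirected graph: deg / (2|E|).
v = deg / deg.sum()
print(v)                          # [1/12, 1/8, 1/12, 1/8, 1/6, 1/8, 1/12, 1/8, 1/12]
print(np.allclose(v @ W, v))      # v is indeed a left eigenvector for eigenvalue 1
```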

Example
Left: Θ_1 = {θ*} and Θ_i = Θ for all i ≠ 1 (only node 1 is informative). Right: Θ_5 = {θ*} and Θ_i = Θ for all i ≠ 5 (only node 5, the most central node, is informative).

Corollaries
Recall the theorem: for every θ ≠ θ*, lim_{t → ∞} -(1/t) log q_i^(t)(θ) = K(θ*, θ) P-a.s.

Lower bound on rate of convergence to θ*: For every node i, the rate at which the error in the estimate of θ* goes to zero satisfies

    lim_{t → ∞} -(1/t) log( 1 - q_i^(t)(θ*) ) = min_{θ ≠ θ*} K(θ*, θ)   P-a.s.

Lower bound on rate of learning: The rate of learning λ across the network can be lower bounded as

    λ ≥ min_{θ* ∈ Θ} min_{θ ≠ θ*} K(θ*, θ)   P-a.s.,

where λ = liminf_{t → ∞} -(1/t) log e_t and

    e_t = (1/2) ∑_{i=1}^n ‖ q_i^(t)(·) - 1_{θ*}(·) ‖_1 = ∑_{i=1}^n ∑_{θ ≠ θ*} q_i^(t)(θ).

Example: Periodicity
(Figure: which hypotheses node 1 and node 2 can each distinguish.) Θ = {θ_1, θ_2, θ_3, θ_4} and θ* = θ_1. The underlying graph is periodic, with

    W = [ 0 1
          1 0 ].

Example: Networks with Large Mixing Times
(Figure: which hypotheses node 1 and node 2 can each distinguish.) Θ = {θ_1, θ_2, θ_3, θ_4} and θ* = θ_1. The underlying graph is aperiodic but has a large mixing time.

Concentration Result
Assumption 4: For every k ∈ [n], every x ∈ X_k, and every pair θ_i, θ_j ∈ Θ with θ_i ≠ θ_j, the log-likelihood ratio |log( f_k(x; θ_i) / f_k(x; θ_j) )| is bounded by a finite constant L.

Theorem: Under Assumptions 1-4, for every ε > 0 there exists a T such that for all t ≥ T, every θ ≠ θ*, and every i ∈ [n],

    Pr( log q_i^(t)(θ) ≥ -(K(θ*, θ) - ε) t ) ≤ γ(ε, L, t)   and
    Pr( log q_i^(t)(θ) ≤ -(K(θ*, θ) + ε) t ) ≤ γ(ε, L, t),

where γ(ε, L, t) = 2 exp( -ε² t / (2 L² d) ).

Related Work and Contribution
Jadbabaie et al. use a local Bayesian update of beliefs followed by averaging the beliefs. They show exponential convergence with no closed form for the convergence rate ['12] and provide an upper bound on the learning rate ['13]. We average the log-beliefs instead and provide a lower bound on the learning rate λ. Our lower bound on the learning rate is greater than their upper bound, so our learning rule converges faster.

Shahrampour and Jadbabaie ['13] formulated a stochastic optimization learning problem and obtained a dual-based learning rule for doubly stochastic W, with a closed-form lower bound on the rate of identifying θ*. Using our rule we achieve the same lower bound (from Corollary 1):

    min_{θ ≠ θ*} (1/n) ∑_{j=1}^n D( f_j(·; θ*) ‖ f_j(·; θ) ).

Related Work and Contribution (continued)
An update rule similar to ours was used by Rahnama Rad and Tahbaz-Salehi (2010) to show that each node's belief converges in probability to the true parameter, albeit under certain analytic assumptions. For general models and discrete parameter spaces we show almost-sure, exponentially fast convergence.

Shahrampour et al. and Nedić et al. (independently) showed that our learning rule coincides with a learning rule based on distributed stochastic optimization (for W irreducible and aperiodic).

Summing up: hypothesis testing and semi-Bayes
Pipeline: q_i(t) → get local data X_i(t) → local (Bayesian) update → send message → social update using neighbor messages {Y_j(t)} → q_i(t + 1).
Combination of local Bayesian updates and averaging. Network divergence: an intuitive measure for the rate of convergence. Posterior consistency gives a Bayesio-frequentist analysis.

Looking forward
Continuous distributions and parameters. Applications to distributed optimization. Further limiting messages via coordinate descent (and Javidi '15). Time-varying parameters and distributed stochastic filtering.

Thank You!
