Maximization of the information divergence from the multinomial distributions


Jozef Juríček
Charles University in Prague, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics
Supervisor: Ing. František Matúš, CSc., Academy of Sciences of the Czech Republic, Institute of Information Theory and Automation, Department of Decision-Making Theory

Abstract. The explicit solution of the problem of maximization of the information divergence from the family of multinomial distributions is presented, using a result of N. Ay and A. Knauf for the problem of maximization of multi-information, which is a special case of maximization of the information divergence from hierarchical models. The problem studied in this paper is a generalization of the binomial case, which was solved in [3]. The problem of maximization of the information divergence from an exponential family has emerged in probabilistic models for evolution and learning in neural networks that are based on infomax principles. The maximizers admit an interpretation as stochastic systems with high complexity w.r.t. the exponential family.

1 Introduction

Let $\mu, \nu$ be nonzero measures on a finite set $Z$ and let $f : Z \to \mathbb{R}^d$. Let $\mathcal{E} = \mathcal{E}_{\mu,f} = \{Q_{\mu,f,\vartheta} : \vartheta \in \mathbb{R}^d\}$ be the (full) exponential family determined by the reference measure $\mu$ and the directional statistic $f$, where $Q_{\mu,f,\vartheta}$ is the probability measure (pm) given by

$$Q_{\mu,f,\vartheta}(z) = e^{\langle \vartheta, f(z)\rangle - \Lambda_{\mu,f}(\vartheta)}\, \mu(z), \qquad z \in Z,$$

where $\langle\cdot,\cdot\rangle$ denotes the scalar product and

$$\Lambda_{\mu,f}(\vartheta) = \ln \sum_{z \in Z} e^{\langle \vartheta, f(z)\rangle}\, \mu(z).$$

The information divergence (relative entropy; Kullback-Leibler divergence) of a pm $P$ (on $Z$) from $\nu$ is

$$D(P\|\nu) = \begin{cases} \sum_{z \in s(P)} P(z) \ln \frac{P(z)}{\nu(z)}, & s(P) \subseteq s(\nu),\\ +\infty, & \text{otherwise}, \end{cases}$$

where $s(\cdot)$ denotes the support, i.e. $s(\nu) = \{z \in Z : \nu(z) > 0\}$. The information divergence of a pm $P$ (on $Z$) from the exponential family $\mathcal{E}$ is defined by

$$D(P\|\mathcal{E}) = \inf_{Q \in \mathcal{E}} D(P\|Q).$$

This work studies the maximization of the function $P \mapsto D(P\|\mathcal{M})$, where $\mathcal{M}$ is a family of multinomial distributions (which is the closure of an exponential family). This problem is a generalization of the binomial case, which was solved in [3].

Problem 1.1 (Maximization of divergence from the multinomial family). Let $N$ be the number of identical and independent trials and $n$ the number of possible outcomes in each trial. Let $p_j$ be the probability of realization of the $j$-th outcome in each trial ($\sum_{j=1}^n p_j = 1$, $p_1,\dots,p_n \in [0,1]$). The multinomial distribution (with parameters $N, n, p_1,\dots,p_n$) is then the joint distribution of the numbers of realizations of the outcomes in all $N$ trials. Let $Z := \{z = (z_1,\dots,z_n) \in \{0,1,\dots,N\}^n : \sum_{j=1}^n z_j = N\}$ be the state space of (random variables with) multinomial distributions (with $N, n$ fixed). Let $\mathcal{P}$ be the set of all pm's on $Z$ and $\mathcal{M}$ the set of all multinomial distributions (with $N, n$ fixed). Finally, let $\mathcal{P}^\circ$ be the set of all strictly positive pm's on $Z$ and $\mathcal{M}^\circ := \mathcal{M} \cap \mathcal{P}^\circ$. The problem is to calculate $\sup_{P \in \mathcal{P}} D(P\|\mathcal{M})$ and to find every $P_{\sup} \in \mathcal{P}$ and $P^*_{\sup} \in \mathcal{M}$ such that $D(P_{\sup}\|P^*_{\sup}) = D(P_{\sup}\|\mathcal{M}) = \sup_{P \in \mathcal{P}} D(P\|\mathcal{M})$.

AMS 2000 Math. Subject Classification. Primary 94A17. Secondary 62B10, 60A10, 52A10.
Keywords and phrases. Kullback-Leibler divergence, relative entropy, exponential family, hierarchical models, multinomial distribution, information projection, log-Laplace transform, cumulant generating function.
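To make the objects in Problem 1.1 concrete, here is a minimal numerical sketch (my own illustration, not code from the paper) that approximates $D(P\|\mathcal{M})$ in the binomial case $n = 2$ by a grid search over the trial probability $p$. The function names and the grid-search approach are assumptions of this sketch, not anything prescribed by the paper.

```python
import numpy as np
from math import comb

# The state z is identified with the count of outcome 1, z = 0, ..., N.

def binomial_pmf(N, p):
    return np.array([comb(N, z) * p**z * (1 - p)**(N - z) for z in range(N + 1)])

def divergence(P, Q):
    # Kullback-Leibler divergence with the convention 0 ln 0 = 0; +inf if s(P) is not in s(Q)
    mask = P > 0
    if np.any(Q[mask] == 0):
        return np.inf
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

def divergence_from_binomial_family(P, N, grid=10001):
    # crude approximation of inf_p D(P || Binomial(N, p)) over a grid of p values
    ps = np.linspace(1e-9, 1 - 1e-9, grid)
    return min(divergence(P, binomial_pmf(N, p)) for p in ps)

if __name__ == "__main__":
    N = 2
    P = np.array([0.5, 0.0, 0.5])                    # (1/2)(delta_{(2,0)} + delta_{(0,2)})
    print(divergence_from_binomial_family(P, N))     # close to ln 2 ~ 0.6931
```

For $P = \tfrac12(\delta_{(2,0)} + \delta_{(0,2)})$ the printed value is close to $\ln 2$, which is the supremum obtained later in Example 3.4.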

Example 1.2 ($N = 2$, $n = 2$).
$Z = \{z_{20} = (2,0),\ z_{11} = (1,1),\ z_{02} = (0,2)\}$,
$\mathcal{P} = \{(P(z_{20}), P(z_{11}), P(z_{02})) = (p_{20}, p_{11}, p_{02}) : (p_{20}, p_{11}, p_{02}) \in [0,1]^3,\ p_{20} + p_{11} + p_{02} = 1\}$,
$\mathcal{M} = \{(P(z_{20}), P(z_{11}), P(z_{02})) = (p^2,\ 2p(1-p),\ (1-p)^2) : p \in [0,1]\}$.
The situation is illustrated in Figure 1.

[Figure 1: The simplex $\mathcal{P}$ and the family $\mathcal{M}$ for $N = 2$, $n = 2$; the vertices are $\delta_{20}$, $\delta_{11}$, $\delta_{02}$.]

Example 1.3 ($N = 3$, $n = 2$).
$Z = \{z_{30} = (3,0),\ z_{21} = (2,1),\ z_{12} = (1,2),\ z_{03} = (0,3)\}$,
$\mathcal{P} = \{(P(z_{30}), P(z_{21}), P(z_{12}), P(z_{03})) = (p_{30}, p_{21}, p_{12}, p_{03}) : (p_{30}, p_{21}, p_{12}, p_{03}) \in [0,1]^4,\ p_{30} + p_{21} + p_{12} + p_{03} = 1\}$,
$\mathcal{M} = \{(P(z_{30}), P(z_{21}), P(z_{12}), P(z_{03})) = (p^3,\ 3p^2(1-p),\ 3p(1-p)^2,\ (1-p)^3) : p \in [0,1]\}$.
The situation is illustrated in Figure 2.

[Figure 2: The simplex $\mathcal{P}$ and the family $\mathcal{M}$ for $N = 3$, $n = 2$; the vertices are $\delta_{30}$, $\delta_{21}$, $\delta_{12}$, $\delta_{03}$.]

The general problem of maximization of the information divergence from an exponential family has emerged in probabilistic models for evolution and learning in neural networks based on infomax principles. Maximizers of $D(\cdot\|\mathcal{E})$ admit an interpretation as stochastic systems with high complexity w.r.t. the exponential family $\mathcal{E}$ [1].
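As a quick sanity check on Example 1.3 (my own sketch, not part of the paper), the curve parametrizing $\mathcal{M}$ is exactly the distribution of the count vector of $N = 3$ i.i.d. trials, obtained by aggregating sequence probabilities; the variable names below are illustrative assumptions.

```python
import itertools
import numpy as np
from math import comb

# For N = 3, n = 2 and trial probability p, aggregate i.i.d. sequence probabilities
# by the count of outcome 1 and compare with (p^3, 3p^2(1-p), 3p(1-p)^2, (1-p)^3).

N, p = 3, 0.3
probs = {1: p, 2: 1 - p}

counts = {}  # key: number of outcomes equal to 1
for x in itertools.product([1, 2], repeat=N):
    z1 = sum(1 for xi in x if xi == 1)
    counts[z1] = counts.get(z1, 0.0) + float(np.prod([probs[xi] for xi in x]))

curve = [comb(N, z1) * p**z1 * (1 - p)**(N - z1) for z1 in range(N + 1)]
assert all(abs(counts[z1] - curve[z1]) < 1e-12 for z1 in range(N + 1))
```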

2 Preliminaries

This section reviews some facts about exponential families and information projections. Let $\mathrm{Lin}(A)$ denote the linear span of a set $A \subseteq \mathbb{R}^d$.

Lemma 2.1. Let $\mu, \nu$ be strictly positive measures on a finite set $Z$ and let $f : Z \to \mathbb{R}^{d_f}$, $g : Z \to \mathbb{R}^{d_g}$ determine two exponential families with $\mathcal{E}_{\mu,f} \supseteq \mathcal{E}_{\nu,g}$. Then $\mathcal{E}_{\mu,f} = \mathcal{E}_{\nu,f}$.

Proof. Notice that $\frac{\nu}{\nu(Z)} = Q_{\nu,g,0} \in \mathcal{E}_{\nu,g} \subseteq \mathcal{E}_{\mu,f}$. Then there exists $\vartheta_0 \in \mathbb{R}^{d_f}$ such that

$$\frac{\nu(z)}{\nu(Z)} = e^{\langle \vartheta_0, f(z)\rangle - \Lambda(\vartheta_0)}\,\mu(z) = Q_{\mu,f,\vartheta_0}(z).$$

Now $\mu$ can be expressed as $\mu(z) = \frac{\nu(z)}{\nu(Z)}\, e^{\Lambda(\vartheta_0) - \langle \vartheta_0, f(z)\rangle}$. It can be seen that for every $\vartheta \in \mathbb{R}^{d_f}$

$$Q_{\mu,f,\vartheta}(z) = \frac{e^{\langle \vartheta, f(z)\rangle}\,\mu(z)}{\sum_{z' \in Z} e^{\langle \vartheta, f(z')\rangle}\,\mu(z')} = \frac{e^{\langle \vartheta - \vartheta_0, f(z)\rangle}\,\nu(z)}{\sum_{z' \in Z} e^{\langle \vartheta - \vartheta_0, f(z')\rangle}\,\nu(z')} = Q_{\nu,f,\vartheta - \vartheta_0}(z).$$

This proves $\mathcal{E}_{\mu,f} \subseteq \mathcal{E}_{\nu,f}$, and the equality follows by symmetry.

Lemma 2.2. Let $\nu$ be a nonzero measure, $f = (f_1,\dots,f_{d_f})$, $f_i : Z \to \mathbb{R}$, $i = 1,\dots,d_f$, and $g = (g_1,\dots,g_{d_g})$, $g_j : Z \to \mathbb{R}$, $j = 1,\dots,d_g$. Then
$$\mathcal{E}_{\nu,g} \subseteq \mathcal{E}_{\nu,f} \iff \mathrm{Lin}\{1, g_1,\dots,g_{d_g}\} \subseteq \mathrm{Lin}\{1, f_1,\dots,f_{d_f}\},$$
$$\mathcal{E}_{\nu,g} = \mathcal{E}_{\nu,f} \iff \mathrm{Lin}\{1, g_1,\dots,g_{d_g}\} = \mathrm{Lin}\{1, f_1,\dots,f_{d_f}\}.$$

Proof. It is easy to see that $\mathcal{E}_{\nu,f} = \mathcal{E}_{\nu,(1,f)}$. The rest follows from the fact that the exponential function is injective.

Corollary 2.3. Using the notation of Lemma 2.2 and $D_f := \dim \mathrm{Lin}\{1, f_1,\dots,f_{d_f}\} - 1$, there exists $h = (h_1,\dots,h_{D_f})$, $h_i : Z \to \mathbb{R}$, $i = 1,\dots,D_f$, such that $\mathcal{E}_{\nu,f} = \mathcal{E}_{\nu,h}$ and the functions $h_i$, $i = 1,\dots,D_f$, are linearly independent and linearly independent of the constant function $1$ (on $Z$). Moreover, if $\mathcal{E}_{\nu,g} \subseteq \mathcal{E}_{\nu,f}$, then $\dim \mathrm{Lin}\{1, g_1,\dots,g_{d_g}\} - 1 =: D_g \le D_f$, and for $h_g := (h_1,\dots,h_{D_g})$ it holds $\mathcal{E}_{\nu,g} = \mathcal{E}_{\nu,h_g}$.

Proof. By Lemma 2.2 and Steinitz's exchange theorem.

The nonnegative integer $D_f$ is the dimension of the exponential family $\mathcal{E}_{\nu,f}$.

Theorem 2.4 (Uniqueness of the generalized rI-projection). For every pm $P$ (on $Z$) and exponential family $\mathcal{E} = \mathcal{E}_{\nu,f}$ with $s(\nu) = Z$, there exists a unique pm $P^{\mathcal{E}} \in \mathrm{cl}\,\mathcal{E}$ (the generalized reverse information projection; generalized rI-projection) such that $D(P\|P^{\mathcal{E}}) = D(P\|\mathcal{E})$ holds.

Proof. For details, see [2].

3 Multinomial family

For $n, N \in \mathbb{N}$ denote $[0:N] := \{0,\dots,N\}$, $[1:n] := \{1,\dots,n\}$,
$$Z := \Big\{z = (z_1,\dots,z_n) \in [0:N]^n : \sum_{j=1}^n z_j = N\Big\},$$
and for $z \in Z$ denote $\binom{N}{z} := \frac{N!}{\prod_{j=1}^n z_j!}$. The set of all pm's on $Z$ will be denoted $\mathcal{P} := \{P = (P(z))_{z \in Z} \in [0,1]^Z : \sum_{z \in Z} P(z) = 1\}$, the set of all strictly positive pm's $\mathcal{P}^\circ := \{P = (P(z))_{z \in Z} \in (0,1)^Z : \sum_{z \in Z} P(z) = 1\}$. The family of multinomial distributions (multinomial family) is the set of pm's

$$\mathcal{M} := \Big\{ Q \in \mathcal{P} : Q(z) = \binom{N}{z} \prod_{j=1}^n p_j^{z_j},\ z \in Z;\ (p_j)_{j=1}^n = \big(p(j)\big)_{j \in [1:n]} = p \in \mathcal{P}([1:n]) \Big\}.$$

Denote

$$\mathcal{M}^\circ := \mathcal{M} \cap \mathcal{P}^\circ = \Big\{ Q : Q(z) = \binom{N}{z} \prod_{j=1}^n p_j^{z_j},\ z \in Z;\ p \in \mathcal{P}^\circ([1:n]) \Big\}.$$
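The definition of $\mathcal{M}$ above translates directly into a short computation. The following sketch (my own illustration; the helper names state_space, multinomial_coef and multinomial_pm are assumptions, not from the paper) enumerates $Z$ and evaluates $Q_p(z) = \binom{N}{z}\prod_j p_j^{z_j}$ for a chosen parameter $p$.

```python
import itertools
from math import factorial
import numpy as np

def state_space(N, n):
    # Z = {z in {0,...,N}^n : sum z_j = N}
    return [z for z in itertools.product(range(N + 1), repeat=n) if sum(z) == N]

def multinomial_coef(z):
    # (N choose z) = N! / prod_j z_j!
    c = factorial(sum(z))
    for zj in z:
        c //= factorial(zj)
    return c

def multinomial_pm(N, p):
    Z = state_space(N, len(p))
    Q = np.array([multinomial_coef(z) * np.prod([pj**zj for pj, zj in zip(p, z)]) for z in Z])
    return Z, Q

Z, Q = multinomial_pm(3, [0.2, 0.5, 0.3])
assert abs(Q.sum() - 1.0) < 1e-12     # Q_p is indeed a pm on Z
```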

It is easy to see that the multinomial family is the closure of an exponential family, $\mathcal{M} = \mathrm{cl}\,\mathcal{E}_{\mu,f}$ and $\mathcal{M}^\circ = \mathcal{E}_{\mu,f}$, with $\mu(z) = \binom{N}{z}$ and $f(z) = z$. Its dimension is equal to $n - 1$, and for $\vartheta \in \mathbb{R}^n$, $Q_{\mu,f,\vartheta} =: Q_p \in \mathcal{E}_{\mu,f}$, one has

$$p_j = \frac{e^{\vartheta_j}}{\sum_{k=1}^n e^{\vartheta_k}}.$$

Let $(X_1,\dots,X_N)$ be a random vector with identical marginal distributions $X_k \sim p \in \mathcal{P}([1:n])$, $k = 1,\dots,N$. Denote $V_j := |\{i \in [1:N] : X_i = j\}|$, $j = 1,\dots,n$. Then $V = (V_1,\dots,V_n) \sim Q_p$ if and only if $X_1,\dots,X_N$ are mutually independent.

Now, the problem of maximization of $D(P\|\mathcal{M}) = D(P\|\mathcal{E}_{\mu,f})$ can be formulated in a different, equivalent way. Denote by $X = [1:n]^N$ the state space of the random vector $(X_1,\dots,X_N)$. For $x = (x_1,\dots,x_N) \in X$ and a permutation $\pi : [1:N] \to [1:N]$ let $x_\pi = (x_{\pi(1)},\dots,x_{\pi(N)})$. The set of all permutations $\pi$ on $[1:N]$ will be denoted $[1:N]!$. Denote

$$E := \{P \in \mathcal{P}(X) : P(x) = P(x_\pi),\ x \in X,\ \pi \in [1:N]!\},$$
$$F := \Big\{Q \in \mathcal{P}(X) : Q(x) = \prod_{i=1}^N Q_i(x_i),\ x \in X\Big\}, \quad\text{where } Q_i(x_i) = \sum_{x' \in X : x'_i = x_i} Q(x'),\ i = 1,\dots,N.$$

Finally, $E^\circ := \mathcal{P}^\circ(X) \cap E$, $F^\circ := \mathcal{P}^\circ(X) \cap F$.

Lemma 3.1. Using the previous notation and $X_z := \{x \in X : \forall j \in [1:n] :\ |\{i \in [1:N] : x_i = j\}| = z_j\}$, it holds:

(i) The mapping $h : \mathcal{P} \to E$ with $h(P) = P'$, $P'(x) = P(z)\big/\binom{N}{z}$ for the $z \in Z$ such that $x \in X_z$, is a bijection, $h(\mathcal{M}) = E \cap F$, and the inverse $h^{-1} : E \to \mathcal{P}$ is given by $h^{-1}(P') = P$ with $P(z) = \binom{N}{z} P'(x)$ for any $x \in X_z$.

(ii) For any $P, Q \in \mathcal{P}$, it holds $D(P\|Q) = D\big(h(P)\,\|\,h(Q)\big)$.

(iii) For any $P \in E$ and $Q \in F \setminus (E \cap F)$, there exists $\pi \in [1:N]!$ such that for $Q_\pi$, $Q_\pi(x) = Q(x_\pi)$, it holds $Q_\pi \neq Q$ and $D(P\|Q) = D(P\|Q_\pi)$.

(iv) For any $P \in E$: $D(P\|F) = \inf_{Q \in E \cap F} D(P\|Q)$ and $\arg\inf_{Q \in E \cap F} D(P\|Q) = P^F \in E \cap F$.

(v) $\sup_{P \in \mathcal{P}} D(P\|\mathcal{M}) = \sup_{P \in E} D(P\|E \cap F) = \sup_{P \in E} D(P\|F) \le \sup_{P \in \mathcal{P}(X)} D(P\|F)$ and $\arg\sup_{P \in \mathcal{P}} D(P\|\mathcal{M}) = h^{-1}\big(\arg\sup_{P \in E} D(P\|E \cap F)\big) = h^{-1}\big(\arg\sup_{P \in E} D(P\|F)\big)$.

Proof. Due to the uniqueness of the rI-projection (Theorem 2.4), (iii) implies (iv). The other propositions are straightforward.

It is well known that for $P \in \mathcal{P}(X)$, $D(P\|F) = I(P)$, the multi-information. The problem of maximizing the multi-information over $\mathcal{P}(X)$ has an explicit solution and was solved in [1].

Theorem 3.2 (Maximizers of $D(\cdot\|F) = I(\cdot)$). The set of maximizers of $D(\cdot\|F) = I(\cdot)$ is equal to

$$\arg\sup_{P \in \mathcal{P}(X)} D(P\|F) = \Big\{ P_\Pi = \tfrac{1}{n} \sum_{j=1}^n \delta_{(j,\pi_2(j),\dots,\pi_N(j))} : \Pi = (\pi_2,\dots,\pi_N) \in \big([1:n]!\big)^{N-1} \Big\},$$

with $D(P_\Pi\|F) = (N-1)\ln(n)$ and $P_\Pi^F = U_X = \frac{1}{n^N}\sum_{x \in X}\delta_x$ for every $\Pi \in ([1:n]!)^{N-1}$.

Proof. For details, see [1], Theorem 4.3 and Corollary 4.10.

Denote, for $j, k, l \in [1:n]$, $k < l$,
$$e_j = (0,\dots,0,\underbrace{1}_{j},0,\dots,0), \qquad e_{j,j} = 2e_j, \qquad e_{k,l} = e_k + e_l = (0,\dots,0,\underbrace{1}_{k},0,\dots,0,\underbrace{1}_{l},0,\dots,0),$$
$$\varepsilon_{j,j} = \delta_{e_{j,j}}, \qquad \varepsilon_{k,l} = \delta_{e_{k,l}}.$$
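Theorem 3.2 can be checked numerically on small instances. The sketch below (my own verification, not from the paper; the function multi_information and the chosen permutations are assumptions) computes $I(P) = D(P\,\|\,\text{product of the marginals of }P)$ on $X = [1:n]^N$ and confirms that a distribution of the form $P_\Pi$ attains $(N-1)\ln n$.

```python
import itertools
import numpy as np

def multi_information(P, n, N):
    # D(P || product of its own marginals), i.e. the multi-information of P on [1:n]^N
    X = list(itertools.product(range(1, n + 1), repeat=N))
    marginals = [np.zeros(n) for _ in range(N)]
    for x, px in zip(X, P):
        for i, xi in enumerate(x):
            marginals[i][xi - 1] += px
    mi = 0.0
    for x, px in zip(X, P):
        if px > 0:
            q = np.prod([marginals[i][xi - 1] for i, xi in enumerate(x)])
            mi += px * np.log(px / q)
    return mi

n, N = 3, 3
pi2 = {1: 2, 2: 3, 3: 1}          # an arbitrary permutation of [1:3]
pi3 = {1: 1, 2: 2, 3: 3}          # the identity
X = list(itertools.product(range(1, n + 1), repeat=N))
P = np.array([(1.0 / n) if (x[1] == pi2[x[0]] and x[2] == pi3[x[0]]) else 0.0 for x in X])
print(multi_information(P, n, N), (N - 1) * np.log(n))   # both ~ 2 ln 3
```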

Corollary 3.3 (The set of maximizers of $D(\cdot\|\mathcal{M})$). Using the notation of Lemma 3.1, it holds

$$\arg\sup_{P \in \mathcal{P}} D(P\|\mathcal{M}) = h^{-1}\Big(E \cap \arg\sup_{P \in \mathcal{P}(X)} D(P\|F)\Big).$$

For $N = 2$,

$$\arg\sup_{P \in \mathcal{P}} D(P\|\mathcal{M}) = \Big\{ P_\pi = \tfrac{1}{n}\sum_{j \in [1:n]} \varepsilon_{j,\pi(j)} : \pi \in [1:n]! \text{ with } [\pi(j) = k] \Leftrightarrow [\pi(k) = j],\ j,k \in [1:n] \Big\}$$

(here $\varepsilon_{j,\pi(j)} := \varepsilon_{\pi(j),j}$ when $\pi(j) < j$). For $N > 2$, the only maximizer is

$$P_{\mathrm{Id}} = \tfrac{1}{n}\sum_{j=1}^n \delta_{N e_j}.$$

In both cases $\sup_{P \in \mathcal{P}} D(P\|\mathcal{M}) = (N-1)\ln(n)$, and for every maximizer $P_{\sup}$ it holds $P^*_{\sup}(z) = \binom{N}{z}\big/ n^N$, $z \in Z$.

Proof. To avoid trivial cases, let $n, N \ge 2$. By Lemma 3.1 and Theorem 3.2, $\sup_{P \in \mathcal{P}} D(P\|\mathcal{M}) = \sup_{P \in E} D(P\|F) \le \sup_{P \in \mathcal{P}(X)} D(P\|F) = (N-1)\ln(n)$. It is easy to see that $P_0 = P_{(\mathrm{Id},\dots,\mathrm{Id})}$ is a maximizer (on $\mathcal{P}(X)$) and even $P_0 \in E$ ($\mathrm{Id}$ is the identity mapping on $[1:n]$). In order to find the remaining maximizers (on $\mathcal{P}(X)$) which also belong to $E$, take another maximizer $P \in E$, $P_0 \neq P = P_{(\pi_2,\dots,\pi_N)}$; then $\pi_i \neq \mathrm{Id}$ and $\pi_i(j) \neq j$ for some $i \in [2:N]$ and $j \in [1:n]$. Thus $(j,\dots,\pi_i(j),\dots) \in s(P)$ and (from the fact that $P \in E$) also $(\pi_i(j),\dots,j,\dots) \in s(P)$. If $N > 2$, then $(j,\dots,\pi_i(j),\dots,k,\dots) \in s(P)$ and also $(\pi_i(j),\dots,j,\dots,k,\dots) \in s(P)$ for some $k \in [1:n]$. Hence, for some $l \in [2:N]$, $\pi_l$ is not injective, although $\pi_l$ is a permutation, which is a contradiction. The rest follows easily.

In the binomial case ($n = 2$), the application of the rI-projection theorem (Theorem 2.4), the result (in [1]) of N. Ay and A. Knauf (Theorem 3.2) and Lemma 3.1, prop. (v), substantially simplifies the proof of the result given in [3] (see the proof of the corresponding proposition there).

Example 3.4 (Ad Example 1.2; $N = 2$, $n = 2$).
$\arg\sup_{\mathcal{P}(X)} D(P\|F) = \{\tfrac12(\delta_{11} + \delta_{22}),\ \tfrac12(\delta_{12} + \delta_{21})\}$,
$\arg\sup_{E} D(P\|E \cap F) = E \cap \arg\sup_{\mathcal{P}(X)} D(P\|F) = \{\tfrac12(\delta_{11} + \delta_{22}),\ \tfrac12(\delta_{12} + \delta_{21})\}$,
$\arg\sup_{\mathcal{P}} D(P\|\mathcal{M}) = h^{-1}\big(E \cap \arg\sup_{\mathcal{P}(X)} D(P\|F)\big) = \{\tfrac12(\delta_{20} + \delta_{02}),\ \delta_{11}\}$,
$\sup_{\mathcal{P}} D(P\|\mathcal{M}) = \ln 2$.

Figure 3 illustrates how the maximization of the information divergence from the multinomial family is related to the maximization of multi-information and to Lemma 3.1, prop. (iii). Correspondingly, the situation in the simplex $\mathcal{P}$ is depicted in Figure 4(a).

Example 3.5 (Ad Example 1.3; $N = 3$, $n = 2$).
$\arg\sup_{\mathcal{P}(X)} D(P\|F) = \{\tfrac12(\delta_{111} + \delta_{222}),\ \tfrac12(\delta_{112} + \delta_{221}),\ \tfrac12(\delta_{121} + \delta_{212}),\ \tfrac12(\delta_{122} + \delta_{211})\}$,
$\arg\sup_{E} D(P\|E \cap F) = \{\tfrac12(\delta_{111} + \delta_{222})\}$,
$\arg\sup_{\mathcal{P}} D(P\|\mathcal{M}) = \{\tfrac12(\delta_{30} + \delta_{03})\}$,
$\sup_{\mathcal{P}} D(P\|\mathcal{M}) = 2\ln 2$.

The maximization problem in the simplex is illustrated in Figure 4(b).

Example 3.6 ($N = 2$, $n = 3$).
$\arg\sup_{\mathcal{P}(X)} D(P\|F) = \{\tfrac13(\delta_{11} + \delta_{22} + \delta_{33}),\ \tfrac13(\delta_{11} + \delta_{23} + \delta_{32}),\ \tfrac13(\delta_{13} + \delta_{22} + \delta_{31}),\ \tfrac13(\delta_{12} + \delta_{21} + \delta_{33}),\ \tfrac13(\delta_{12} + \delta_{23} + \delta_{31}),\ \tfrac13(\delta_{13} + \delta_{21} + \delta_{32})\}$,
$\arg\sup_{E} D(P\|E \cap F) = \{\tfrac13(\delta_{11} + \delta_{22} + \delta_{33}),\ \tfrac13(\delta_{11} + \delta_{23} + \delta_{32}),\ \tfrac13(\delta_{13} + \delta_{22} + \delta_{31}),\ \tfrac13(\delta_{12} + \delta_{21} + \delta_{33})\}$,
$\arg\sup_{\mathcal{P}} D(P\|\mathcal{M}) = \{\tfrac13(\delta_{200} + \delta_{020} + \delta_{002}),\ \tfrac13\delta_{200} + \tfrac23\delta_{011},\ \tfrac13\delta_{020} + \tfrac23\delta_{101},\ \tfrac13\delta_{002} + \tfrac23\delta_{110}\}$,
$\sup_{\mathcal{P}} D(P\|\mathcal{M}) = \ln 3$.
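Corollary 3.3 can also be probed numerically. The sketch below (my own check, not part of the paper; the helper names and the Dirichlet random search are assumptions of this illustration) verifies, for $n = 3$, $N = 2$, that $P_{\mathrm{Id}} = \tfrac1n\sum_j \delta_{N e_j}$ attains divergence $(N-1)\ln n$ at the claimed projection $P^*_{\sup}(z) = \binom{N}{z}/n^N$, and that a random search over multinomial parameters does not find a smaller divergence.

```python
import itertools
import numpy as np
from math import factorial

def state_space(N, n):
    return [z for z in itertools.product(range(N + 1), repeat=n) if sum(z) == N]

def coef(z):
    c = factorial(sum(z))
    for zj in z:
        c //= factorial(zj)
    return c

def multinomial_pm(Z, p):
    return np.array([coef(z) * np.prod([pj**zj for pj, zj in zip(p, z)]) for z in Z])

def kl(P, Q):
    m = P > 0
    return np.inf if np.any(Q[m] == 0) else float(np.sum(P[m] * np.log(P[m] / Q[m])))

n, N = 3, 2
Z = state_space(N, n)
P_id = np.array([1.0 / n if max(z) == N else 0.0 for z in Z])   # mass 1/n on each N*e_j

proj = multinomial_pm(Z, [1.0 / n] * n)                          # (N choose z) / n^N
print(kl(P_id, proj), (N - 1) * np.log(n))                       # both ~ ln 3

rng = np.random.default_rng(0)
best = min(kl(P_id, multinomial_pm(Z, p)) for p in rng.dirichlet(np.ones(n), 2000))
print(best)                                                       # not below ln 3, up to search error
```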

[Figure 3: Relation between the maximization of the information divergence and the maximization of multi-information for $N = 2$, $n = 2$. Panel (a): the simplex $\mathcal{P}(X)$, with $E$, $F$, $E \cap F$ and $U_X$ indicated; panel (b): the factorizable pm's $F$.]

[Figure 4: Ad Figures 1 and 2: the maximization in $\mathcal{P}$, with the maximizers $P_{\sup}$ and the projection $P^*_{\sup} = h^{-1}(U_X)$ marked. Panel (a): $N = 2$, $n = 2$; panel (b): $N = 3$, $n = 2$.]

References

[1] Ay, N., Knauf, A. (2006). Maximizing multi-information. Kybernetika.
[2] Csiszár, I., Matúš, F. (2003). Information projections revisited. IEEE Transactions on Information Theory.
[3] Matúš, F. (2004). Maximization of information divergences from binary i.i.d. sequences. Proceedings IPMU 2004, Perugia, Italy.
