Economics of Networks Social Learning


Economics of Networks: Social Learning
Evan Sadler, Massachusetts Institute of Technology
Evan Sadler Social Learning 1/38

Agenda
Recap of rational herding
Observational learning in a network
DeGroot learning
Reading: Golub and Sadler (2016), Learning in Social Networks
Supplement: Acemoglu et al. (2011), Bayesian Learning in Social Networks; Golub and Jackson (2010), Naïve Learning in Social Networks and the Wisdom of Crowds

The Classic Herding Model
Two equally likely states of the world θ ∈ {0, 1}
Agents n = 1, 2, ... sequentially make binary decisions x_n ∈ {0, 1}
Earn payoff 1 for matching the state, payoff 0 otherwise
Each agent receives a binary signal s_n ∈ {0, 1} and observes the history of actions
Signals are i.i.d. conditional on the state: P(s_n = 0 | θ = 0) = P(s_n = 1 | θ = 1) = g > 1/2
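The herding dynamics above can be simulated directly. This is a minimal illustrative sketch, not from the slides: it assumes the standard reduction of the public history to a net count of revealed signals, and a tie-breaking rule under which an indifferent agent follows her own signal.

```python
import random

def simulate_herding(theta, g, n_agents, seed=0):
    """Sketch of the classic herding model (illustrative assumptions).

    State theta in {0, 1}; each agent receives a conditionally i.i.d.
    binary signal matching theta with probability g > 1/2, observes all
    previous actions, and takes the posterior-optimal action.
    """
    rng = random.Random(seed)
    # The public history reduces to the net count of revealed 1-signals
    # minus revealed 0-signals (each worth log(g/(1-g)) of evidence).
    net = 0
    actions = []
    for _ in range(n_agents):
        s = 1 if (rng.random() < g) == (theta == 1) else 0
        total = net + (1 if s == 1 else -1)  # public + private evidence
        if total > 0:
            x = 1
        elif total < 0:
            x = 0
        else:
            x = s  # indifferent: follow own signal (tie-break assumption)
        actions.append(x)
        # Actions reveal signals only before a cascade; once |net| >= 2,
        # agents imitate regardless of s and the history stops updating.
        if abs(net) <= 1:
            net += 1 if x == 1 else -1
    return actions
```

Running this with g close to 1 typically produces a herd on the correct action after a handful of agents, but for any fixed g, some seeds produce a herd on the wrong action, as the slides note.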

Rational Herding
Last time we showed that in any PBE of the social learning game, we get herd behavior
All agents after some time t choose the same action
With positive probability, agents herd on the wrong action
Inefficiency reflects an informational externality
Agents fail to internalize the value of their information to others

Observational Learning: A Modern Perspective
[Image: Foursquare screenshot] © Foursquare. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/

Observational Learning: A Modern Perspective
We observe more of what other people do...
...but observing the entire history is less reasonable
How does the observation structure affect learning?

A Souped-up Model
Two states of the world θ ∈ {0, 1}, common prior q_0 = P(θ = 1)
Agents n = 1, 2, ... sequentially make binary decisions x_n ∈ {0, 1}
Earn payoff u(x_n, θ), an arbitrary function satisfying u(1, 1) > u(0, 1) and u(0, 0) > u(1, 0)
Each agent receives a signal s_n ∈ S in an arbitrary metric space
Signals are conditionally i.i.d. with distributions F_θ

The Observation Structure
Agent n has a neighborhood B(n) ⊆ {1, 2, ..., n − 1} and observes x_k for k ∈ B(n)
Information set I_n = {s_n, B(n), x_k for all k ∈ B(n)}
Neighborhoods are drawn from a joint distribution Q that we call the network topology
Q is common knowledge
For this class, assume the {B(n)}_{n∈ℕ} are mutually independent
Study perfect Bayesian equilibria of the learning game:
x_n ∈ arg max_x E[u(x, θ) | I_n]

A Complex Inference Problem
[Network diagram: five numbered agents; agent 2 chose x_2 = 0 and agent 3 chose x_3 = 1, while x_1 and x_4 are to be inferred]

Learning Principles
We cannot fully characterize decisions, so we focus on asymptotic outcomes
Two learning principles:
The improvement principle
The large-sample principle
Corresponding learning metrics: diffusion vs. aggregation

Private and Social Beliefs
Define the private belief p_n = P(θ = 1 | s_n), with distribution G_θ
Social belief q_n = P(θ = 1 | B(n), x_k, k ∈ B(n))
Support of private beliefs [β, β̄]:
β = inf{r ∈ [0, 1] : P(p_1 ≤ r) > 0}
β̄ = sup{r ∈ [0, 1] : P(p_1 ≤ r) < 1}
The expert signal s̄: binary, with P(θ = 1 | s̄ = 0) = β and P(θ = 1 | s̄ = 1) = β̄

Learning Metrics
Information diffuses if lim inf_{n→∞} E[u(x_n, θ)] ≥ E[u(x_{s̄}, θ)] ≡ ū, the expected payoff from acting on the expert signal
Information aggregates if lim_{n→∞} P(x_n = θ) = 1
A network topology Q diffuses (aggregates) information if diffusion (aggregation) occurs for every signal structure and every equilibrium strategy profile

Diffusion vs. Aggregation
If β = 0 and β̄ = 1, the two metrics coincide; we say private beliefs are unbounded
If β > 0 and β̄ < 1, private beliefs are bounded
Diffusion is a weaker condition than aggregation
In the complete network, aggregation holds iff private beliefs are unbounded (Smith and Sorensen, 2000)
Our definition emphasizes the role of the network
The complete network diffuses, but does not aggregate, information

Necessary Conditions for Learning
Basic requirement: sufficient connectivity
An agent's personal subnetwork B̂(n) includes all m < n with a directed path to n
Theorem. If Q diffuses information, we must have expanding subnetworks: for all K ∈ ℕ,
lim_{n→∞} P(|B̂(n)| < K) = 0

The Improvement Principle
Intuition: I can always pick a neighbor to copy
Whom do I imitate? Can I improve?
A heuristic approach: look at the neighbor with the largest index in B(n)
If we have expanding subnetworks, then P(max B(n) < K) → 0 as n → ∞ for any fixed K
Key idea: imitate this neighbor if my signal is weak, follow my signal if it is strong
A suboptimal rule, but it gives a lower bound on performance
Rational agents must do (weakly) better

Two Lemmas
Lemma. Suppose Q has expanding subnetworks, and there exists a continuous increasing Z such that Z(u) > u for all u < ū and
E[u(x_n, θ)] ≥ Z(E[u(x_{B(n)}, θ)]).
Then Q diffuses information.
Lemma. There exists a continuous increasing Z with Z(u) > u for all u < ū such that
E[u(x_n, θ)] ≥ Z(E[u(x_m, θ)])
for any m ∈ B(n).

A Key Assumption: Independent Neighborhoods
Our two lemmas imply that information diffuses in any sufficiently connected network
This relies on the independence of neighborhoods
If neighborhoods are correlated, the fact that I observe someone is related to how informative their choice is

Failure to Aggregate
Proposition (Acemoglu et al., 2011, Theorem 3). The topology Q fails to aggregate information if any of the following conditions hold:
B(n) = {1, 2, ..., n − 1} for all n
|B(n)| ≤ 1 for all n
|B(n)| ≤ M for all n and some M ∈ ℕ, and lim_{n→∞} max_{m∈B(n)} m = ∞ almost surely

The Large-Sample Principle
Intuition: I can always learn from many independent observations
Limiting connectivity can create sacrificial lambs: agents with B(m) = ∅ who must act on their own signals
Proposition. Suppose there exists a subsequence {m_i} such that
Σ_{i∈ℕ} P(B(m_i) = ∅) = ∞,
and lim_{n→∞} P(m_i ∈ B(n)) = 1 for all i. Then Q aggregates information.
Follows from a martingale convergence argument

Heterogeneous Preferences
Key limitation so far: everyone has the same preferences
Give each agent n a type t_n ∈ (0, 1)
Payoffs:
u(x, θ, t) = 1 − θ + t if x = 0
u(x, θ, t) = θ + 1 − t if x = 1
The type t parameterizes the relative cost of error in each state: agent n prefers x = 1 exactly when her belief that θ = 1 exceeds t_n

Failure of the Improvement Principle
Copying a neighbor no longer guarantees the same utility
Copying works better when the neighbor's preferences are close to one's own
Assume B(n) = {n − 1} for all n
Odd agents have type 1/5, even agents have type 4/5
G_0(r) = 2r − r² and G_1(r) = r²
Can show inductively that all odd (even) agents err in state 0 (state 1) with probability at least 1/4 (homework problem)

Robust Large-Sample Principle
With full support in the preference distribution, preferences can counterbalance social information
Some agents will act on their signals
No need for sacrificial lambs
Proposition. Suppose preference types are i.i.d. with full support on (0, 1), and there exists an infinite sequence {m_i} such that lim_{n→∞} P(m_i ∈ B(n)) = 1 for all i. Then information aggregates.

Remarks on the SSLM
Clear understanding of learning mechanisms
Improvement vs. large samples
Different effects of preference heterogeneity
Rationality is a very strong assumption...
...but the proofs are based on heuristic benchmarks
Can't say much about the rate of learning or influence

A Different Approach
Look at a model of heuristic learning based on DeGroot (1974)
Finite set N of agents; time is discrete
At time t, agent i has a belief or opinion x_i(t) ∈ [0, 1]
How likely is it the state is 1? How good is politician X?
A simple update rule:
x_i(t) = Σ_{j∈N} W_ij x_j(t − 1)
Think of W as a weighted graph

DeGroot Updating
Assumptions:
The x_i(0) are given exogenously
The matrix W is an n × n matrix with non-negative entries
For each i we have Σ_{j∈N} W_ij = 1
Take a weighted average of friends' opinions
Simple example: consider an unweighted graph G, where agent i has degree d_i
W_ij = 1/d_i for each neighbor j of i, and W_ij = 0 for each non-neighbor
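The update rule and the unweighted-graph construction above fit in a few lines of NumPy. A minimal sketch, where the four-node example graph (a triangle with a pendant node) is my own choice, not from the slides:

```python
import numpy as np

def degroot_weights(adj):
    """Row-normalize an adjacency matrix: W_ij = 1/d_i for each neighbor j of i."""
    adj = np.asarray(adj, dtype=float)
    return adj / adj.sum(axis=1, keepdims=True)

def degroot_step(W, x):
    """One round of updating: everyone takes a weighted average of neighbors' opinions."""
    return W @ x

# Example graph: a triangle on nodes {0, 1, 2} plus a pendant node 3 attached to 0.
adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]])
W = degroot_weights(adj)

x = np.array([1.0, 0.0, 0.0, 0.0])  # initial opinions x(0)
for _ in range(100):
    x = degroot_step(W, x)
```

After many rounds the opinions collapse to a consensus; in this example the consensus is 0.375, i.e. agent 0's initial opinion weighted by its influence, which previews the eigenvector-centrality result later in the lecture.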

Matrix Powers and Markov Chains
Can rewrite the update rule as
x(t) = W x(t − 1) ⟹ x(t) = W^t x(0)
Reduction to the dynamics of matrix powers
Entries in each row sum to 1, so W is a row-stochastic matrix
These correspond to transition probabilities for an n-state Markov chain
How to think about (W^t)_{ij}:
∂x_i(t) / ∂x_j(0) = (W^t)_{ij}, the influence of j on i's time-t opinion
(W^t)_{ij} sums over all paths of indirect influence

The Long-Run Limit
Does each individual's estimate settle down to a long-run limit? Does lim_{t→∞} x_i(t) exist?
Do agents reach a consensus? If so, what does it look like?
How do long-run beliefs depend on W and the initial estimates x(0)?
Start with strongly connected networks
The network W is strongly connected if there is a directed path from i to j for every i, j ∈ N
Call W primitive if there exists q such that every entry of W^q is strictly positive
Equivalent to aperiodicity in the network

The Long-Run Limit
Theorem. Suppose W is strongly connected and aperiodic. The limit lim_{t→∞} x_i(t) exists and is the same for each i.
Proof sketch:
The sequence max_i x_i(t) is monotonically decreasing
The sequence min_i x_i(t) is monotonically increasing
Primitivity ensures the two extreme agents put at least weight w > 0 on each other after q steps
The distance between max and min decreases by a factor of at least 1 − w after every q steps

Influence on the Consensus
lim_{t→∞} x(t) = lim_{t→∞} W^t x(0)
The matrix powers must converge
Moreover, since agents reach consensus, all rows of W^t must converge to the same vector π:
x(∞) = π^T x(0) = Σ_{i∈N} π_i x_i(0)
The coefficient π_i gives the influence of agent i on the consensus
It depends only on the network W, not on the initial estimates x(0)
The vector π must satisfy π^T W = π^T
A left eigenvector with eigenvalue 1

Influence on the Consensus
Theorem. If W is strongly connected and primitive, then for all i,
lim_{t→∞} x_i(t) = Σ_{j∈N} π_j x_j(0),
where π_j is the left eigenvector centrality of j in W
Note the vector π is also the unique stationary distribution of the Markov chain with transition probabilities given by W
This can also be seen as a consequence of the Perron–Frobenius theorem from linear algebra
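The influence weights can be computed numerically as the left unit eigenvector of W. An illustrative check, reusing the small triangle-plus-pendant example (my own choice of W, not from the slides):

```python
import numpy as np

# Row-stochastic W for a triangle {0,1,2} with pendant node 3 attached to 0.
W = np.array([[0.0, 1/3, 1/3, 1/3],
              [0.5, 0.0, 0.5, 0.0],
              [0.5, 0.5, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0]])

# Left eigenvector for eigenvalue 1: solve pi^T W = pi^T, normalized to sum to 1.
vals, vecs = np.linalg.eig(W.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi = pi / pi.sum()

# Every row of W^t converges to pi^T, so x(inf) = pi . x(0).
Wt = np.linalg.matrix_power(W, 200)
```

For this W (a random walk on an undirected graph), π is proportional to the degrees, (3, 2, 2, 1)/8, so the highest-degree agent has the most influence on the consensus, consistent with the stationary-distribution interpretation above.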

Beyond Strong Connectedness
If the network is not strongly connected, we can decompose it into strongly connected subgraphs
Equivalent to the reduction of a Markov chain to closed communicating classes
Analyze each subgraph separately using the earlier result
Agents i and j are in the same communicating class if there is a directed path from i to j and vice versa
Consensus is no longer guaranteed
Consensus within communicating classes, not necessarily across them
A small amount of communication across classes makes a large (discontinuous) difference in asymptotic outcomes

When is Consensus Correct?
Are large populations able to aggregate information?
Suppose there is some true state µ ∈ [0, 1], and agents begin with noisy estimates of µ
Suppose the x_i(0) are i.i.d. random variables with mean µ and variance σ²
Consider an infinite sequence of networks {W^(n)}_{n=1}^∞, with the population getting larger
If x^(n)(∞) is the consensus estimate in network n, do these estimates converge to µ as n → ∞?

When is Consensus Correct?
Theorem (Golub and Jackson, 2010). The consensus beliefs x^(n)(∞) converge in probability to µ if and only if
lim_{n→∞} max_i π_i^(n) = 0.
The influence of the most central agent in the network must converge to zero
Proof sketch:
We have Var(x^(n)(∞) − µ) = σ² Σ_{i=1}^n (π_i^(n))²
This converges to zero if and only if max_i π_i^(n) → 0
If not, there is no convergence in probability
If it does, Chebyshev's inequality implies convergence in probability
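The variance formula in the proof sketch is easy to see numerically. A sketch under my own illustrative assumptions (normal initial noise and two hand-picked influence vectors, neither from the slides): the consensus π · x(0) concentrates on µ when influence is spread evenly, but not when one agent retains a fixed share of influence as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 0.5, 0.2, 400, 2000

# Balanced influence: pi_i = 1/n, so max influence -> 0 and Var = sigma^2/n.
pi_balanced = np.full(n, 1.0 / n)

# One dominant agent keeps influence weight 1/2 no matter how large n is.
pi_star = np.full(n, 0.5 / (n - 1))
pi_star[0] = 0.5

# Noisy initial estimates of mu, i.i.d. with mean mu and variance sigma^2.
x0 = rng.normal(mu, sigma, size=(reps, n))

# Mean squared error of the consensus pi . x(0) across repetitions.
err_balanced = np.mean((x0 @ pi_balanced - mu) ** 2)
err_star = np.mean((x0 @ pi_star - mu) ** 2)
```

With these numbers `err_balanced` is about σ²/n = 10⁻⁴, while `err_star` stays near σ²/4 ≈ 0.01 no matter how large the population is: the dominant agent's noise never washes out.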

Speed of Convergence
Consensus might be irrelevant if it takes too long to get there
How long does it take for differences to get small?
What network properties lead to fast or slow convergence?
Note the first question depends both on the network and on the initial estimates
If we start at consensus, we stay there
Focus on worst-case convergence time to highlight the role of the network

A Spectral Decomposition
Lemma. For generic W, we may write
W^t = Σ_{l=1}^n λ_l^t P_l,
where
λ_1 = 1, λ_2, ..., λ_n are the n distinct eigenvalues of W
P_l is a projection onto the eigenspace of λ_l
P_1 = 1π^T (with 1 the vector of all ones), so P_1 x(0) = x(∞)
P_l 1 = 0 for all l > 1
All other eigenvalues are strictly smaller in absolute value than λ_1 = 1

Speed of Convergence
Theorem. For generic W,
(1/2)|λ_2|^t − (n − 2)|λ_3|^t ≤ sup_{x(0)∈[0,1]^n} ‖x(t) − x(∞)‖_∞ ≤ (n − 1)|λ_2|^t.
Note ‖·‖_∞ denotes the supremum norm, the largest deviation from consensus among all agents
Clear answer to the first question: the rate of convergence depends on the second-largest eigenvalue
Larger |λ_2| (i.e., a smaller spectral gap) implies slower convergence

Segregation and Slow Convergence
What network features correspond to large |λ_2|?
On an intuitive level, we get slow convergence in highly segregated networks
Define the bottleneck ratio
Φ(W) = min_{M⊆N : π(M) ≤ 1/2} (Σ_{i∈M, j∉M} π_i W_ij) / (Σ_{i∈M} π_i)
It is small when some influential group pays little attention to those outside itself
Can be used to bound the size of |λ_2|
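For small networks the bottleneck ratio can be computed by brute force over all candidate sets M. A minimal sketch, where the two example networks (two triangles joined by one edge, and the complete graph on six nodes) are my own illustrations, not from the slides:

```python
import numpy as np
from itertools import combinations

def stationary(W):
    """Left unit eigenvector of W (influence weights pi), summing to 1."""
    vals, vecs = np.linalg.eig(W.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def bottleneck_ratio(W):
    """Brute-force min over sets M with pi(M) <= 1/2 (fine for small n)."""
    n = W.shape[0]
    pi = stationary(W)
    best = np.inf
    for size in range(1, n):
        for M in combinations(range(n), size):
            M = list(M)
            mass = pi[M].sum()
            if mass > 0.5 + 1e-12:
                continue
            out = [j for j in range(n) if j not in M]
            # Total pi-weighted attention flowing from M to its complement.
            flow = (pi[M][:, None] * W[np.ix_(M, out)]).sum()
            best = min(best, flow / mass)
    return best

def row_stochastic(adj):
    adj = np.asarray(adj, dtype=float)
    return adj / adj.sum(axis=1, keepdims=True)

# Segregated: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
A = np.zeros((6, 6))
A[:3, :3] = 1
A[3:, 3:] = 1
np.fill_diagonal(A, 0)
A[2, 3] = A[3, 2] = 1
phi_seg = bottleneck_ratio(row_stochastic(A))

# Well-mixed: the complete graph on the same 6 nodes.
B = np.ones((6, 6))
np.fill_diagonal(B, 0)
phi_full = bottleneck_ratio(row_stochastic(B))
```

The segregated network's minimizing M is one triangle, giving Φ = 1/7, much smaller than the complete graph's Φ = 3/5, matching the intuition that a small bottleneck ratio flags segregation and slow convergence.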

Wrap Up
Limited ability to learn through observation
The information externality creates inefficiency
Heterogeneity may help or hurt depending on network properties
The naïve learning model gives measures of influence and learning rate
Next time: moving on to models of diffusion, a different influence mechanism

MIT OpenCourseWare
https://ocw.mit.edu
14.15J/6.207J Networks, Spring 2018
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.