Finite Markov Information-Exchange processes


Finite Markov Information-Exchange processes. 1. Overview

David Aldous. February 2, 2011. Course web site: Google "Aldous STAT 260".

Style of course: Big Picture. Thousands of papers from different disciplines (statistical physics and interacting particle systems; epidemic theory; broadcast algorithms on graphs; ad hoc networks; social learning theory) use stochastic models to study questions involving information flow through networks. We will study some of the mathematics of such processes, working within a particular framework (FMIE processes), described in this lecture, which provides a (mathematically) convenient abstraction of many such models.

In the first half of the course we work explicitly in the FMIE setting, seeing what results can be derived easily from standard background: the pre-existing theories of Markov chains, interacting particle processes, and general modern mathematical probability. Here our theme is "Don't try to be clever": often we just quote background. Regarding proofs, a guideline is: give only arguments that are useful in more than one context. In the second half of the course (partly as student projects) we look at recent papers, some explicitly in the FMIE setting, some which could be re-formulated in that setting.

What do I need to do to get a grade? Find something that interests you and is moderately related to course material. Most common option ("paper project"): read a recent paper and give a 20-minute talk during the last 2 weeks of the semester. Alternatives: theory/simulation/literature-search projects. Suggestions are on the web site and will be edited as the course progresses. Only the first half of the course is pre-planned, so additional suggestions are welcome (sooner better than later).

Prerequisites: undergraduate probability (E[X] and all that); basic Markov chain notions; basic algorithms-on-graphs notions.

The next few lectures go a little way beyond the basics of finite reversible Markov chains, emphasising mixing and hitting times, and the standard examples of random walks on the complete graph, the d-dimensional grid, and on random graphs with prescribed degree distributions. This topic is treated in much more detail in Levin-Peres-Wilmer, Markov Chains and Mixing Times, and Aldous-Fill, Reversible Markov Chains and Random Walks on Graphs, accessible from the class web page.

Recall the notion of a rate-λ Poisson process of times of events 0 < ξ_1 < ξ_2 < .... The mean number of events in a time interval of duration t equals λt, and this particular process formalizes the idea that the times are completely random.

What (mathematically) is a social network? Usually formalized as a graph, whose vertices are individual people and where an edge indicates presence of a specified kind of relationship.
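The rate-λ Poisson process recalled above is easy to sample, since inter-arrival gaps are i.i.d. Exponential(λ). A minimal sketch (function name is hypothetical, not from the course):

```python
import random

def poisson_event_times(rate, horizon, rng):
    """Sample the event times 0 < xi_1 < xi_2 < ... <= horizon of a
    rate-`rate` Poisson process: inter-arrival gaps are i.i.d.
    Exponential(rate)."""
    times, t = [], 0.0
    while (t := t + rng.expovariate(rate)) <= horizon:
        times.append(t)
    return times

# Mean number of events in a duration-t window is rate * t; here 2.0 * 10.0 = 20.
avg = sum(len(poisson_event_times(2.0, 10.0, random.Random(s)))
          for s in range(500)) / 500
```

Averaging over many runs, `avg` should be close to 20, matching the "mean number of events equals λt" property.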

In many contexts it would be more natural to allow different strengths of relationship (close friends, friends, acquaintances) and formalize as a weighted graph. The interpretation of weight is context-dependent. In some contexts (scientific collaboration; corporate directorships) there is a natural quantitative measure, but not so in "friendship"-like contexts. Our particular viewpoint is to identify strength of relationship with frequency of meeting, where "meeting" carries the implication of opportunity to exchange information. Because we don't want to consider only social networks, we will use the neutral word "agents" for the n people/vertices. Write ν_ij for the weight on edge (i, j), the strength of relationship between agents i and j.

Here is the model for agents meeting (i.e. opportunities to exchange information). Each pair (i, j) of agents with ν_ij > 0 meets at random times, more precisely at the times of a rate-ν_ij Poisson process. Call this the meeting model. It is parametrized by the symmetric matrix N = (ν_ij) without diagonal entries. A natural geometric model is to visualize agents having positions in 2-dimensional space, and take ν_ij as a decreasing function of Euclidean distance. This is hard to study analytically. Most analytic work implicitly takes N as the adjacency matrix of an unweighted graph.
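The meeting model is straightforward to simulate: superposing the independent rate-ν_ij Poisson processes gives a single Poisson process of total rate Σ ν_ij, with each event an (i, j) pair chosen with probability proportional to ν_ij. A minimal sketch (helper name is hypothetical):

```python
import random

def simulate_meetings(nu, horizon, rng):
    """Generate the (time, i, j) meeting events up to `horizon`.
    nu: {(i, j): rate} listing each unordered pair once.  The superposition
    of the independent rate-nu_ij Poisson processes is one Poisson process
    of total rate sum(nu.values()); each event is a pair drawn with
    probability proportional to nu_ij."""
    pairs, rates = zip(*nu.items())
    total = sum(rates)
    t, events = 0.0, []
    while (t := t + rng.expovariate(total)) <= horizon:
        events.append((t,) + rng.choices(pairs, weights=rates)[0])
    return events
```

Running it on a 3-agent path with unit rates over a duration-100 window should give about 200 events, since the total meeting rate is 2.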

[Schematic: the meeting model on the 8-cycle, agents labelled 0 to 7 (with 0 = 8).]

What is a FMIE process? Such a process has two levels.
1. Start with a meeting model as above, specified by the symmetric matrix N = (ν_ij) without diagonal entries.
2. Each agent i has some information (or "state") X_i(t) at time t. When two agents i, j meet at time t, they update their information according to some rule (deterministic or random). That is, the updated information X_i(t+), X_j(t+) depends only on the pre-meeting information X_i(t−), X_j(t−) and added randomness.
We distinguish the two levels as geometric substructure and informational superstructure. Different FMIE models correspond to different rules for the informational superstructure.

Model 0. A token-passing model. [Hot potato; white elephant; mathom.] There is one token. When the agent i holding the token meets another agent j, the token is passed to j. The natural aspect to study is Z(t) = the agent holding the token at time t. This Z(t) is the continuous-time Markov chain with transition rates (ν_ij). (Continuous-time chains, reviewed in the next lecture, are very similar to discrete-time ones.) As we shall see, for some FMIE models interesting aspects of their behavior can be related fairly directly to behavior of this associated MC, while for others any relation is not so visible.

Model 1. The simple epidemic model. Initially one or more agents are infected. Whenever an infected agent meets another agent, the other agent becomes infected. If initially the single agent i is infected, natural objects of study are:
T^epi_ij = time until agent j is infected
T^epi_i = time until all agents are infected
F^epi_i(t) = proportion of agents infected at time t.
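The simple epidemic is as easy to simulate as the meeting model itself, since infection spreads deterministically given the meeting times. A rough sketch recording the infection times T^epi_{start,j} (helper name hypothetical):

```python
import random

def epidemic_times(nu, start, horizon, rng):
    """Simulate the simple epidemic started from `start`; return the
    infection time T^epi_{start,j} of every agent infected before `horizon`.
    nu: {(i, j): rate} listing each unordered pair once."""
    pairs, rates = zip(*nu.items())
    total = sum(rates)
    infected, t = {start: 0.0}, 0.0
    while (t := t + rng.expovariate(total)) <= horizon:
        i, j = rng.choices(pairs, weights=rates)[0]
        if (i in infected) != (j in infected):  # infection crosses the edge
            infected[j if i in infected else i] = t
    return infected

# Path a-b-c with nu_ab = nu_bc = 1: starting from a, the infection must
# pass through b before reaching c.
times = epidemic_times({("a", "b"): 1.0, ("b", "c"): 1.0}, "a", 100.0,
                       random.Random(1))
```

On this path geometry the ordering `times["b"] < times["c"]` always holds, which is the structural fact used in the time-reversal discussion later in the lecture.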

The Markov chain and the simple epidemic can be viewed as the two fundamental FMIE processes. The Markov chain because it is mathematically simplest. The epidemic because, roughly speaking, in any FMIE information cannot spread faster than in the epidemic process. For our purposes, MC theory is well enough understood, both in general and in many specific geometries. In contrast, the simple epidemic has been studied in various specific geometries (a 2001 paper "Epidemic spreading in scale-free networks" has acquired 1312 citations!) but scarcely studied in general. As an illustration of why the two processes are fundamental, consider the following 3 FMIE models.

Model 2. Averaging model. When agents i and j meet, each replaces their (real-valued) information by the average (x_i(t) + x_j(t))/2.

Model 3. Random consensus-seeking model. Each agent initially has some different opinion. When two agents meet, they adopt the same opinion, randomly chosen from their two previous opinions.

Model 4. Ordered consensus-seeking model. Each agent initially has some different opinion, represented as random Uniform(0, 1) numbers. When two agents meet, they adopt the same opinion, the smaller of the two numbers.

You should immediately see how Model 4 relates to the epidemic process. We will see later (not so obvious) that Models 2 and 3 have explicit connections with the associated Markov chain. Note that 4 (which 4?) out of these 5 models have deterministic information-update rules.
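Model 2 is a natural first simulation project. A minimal sketch of the averaging process on the complete graph K_4 (all names hypothetical); note that every update conserves the sum of the values, so the values converge to the global average:

```python
import random

def run_averaging(x0, nu, horizon, rng):
    """Averaging process (Model 2): when i and j meet, both replace their
    value by (x_i + x_j)/2.  x0: {agent: initial value};
    nu: {(i, j): rate} listing each unordered pair once."""
    pairs, rates = zip(*nu.items())
    total = sum(rates)
    x, t = dict(x0), 0.0
    while (t := t + rng.expovariate(total)) <= horizon:
        i, j = rng.choices(pairs, weights=rates)[0]
        x[i] = x[j] = (x[i] + x[j]) / 2  # sum is conserved
    return x

init = {0: 4.0, 1: 0.0, 2: 0.0, 3: 0.0}
pairs = {(i, j): 1.0 for i in range(4) for j in range(i + 1, 4)}  # K_4
final = run_averaging(init, pairs, 50.0, random.Random(0))
```

After a long run, every entry of `final` should be close to the average 1.0, with the total still 4.0 (up to floating-point rounding).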

Our notion of information flow is different from communication, in the sense of a particular message being conveyed from source to destination. A standard high-level abstraction of a communication network is what I'll call the "graph model": a vertex can transmit to any neighbor in unit time slots. Within the graph model, there is a trivial algorithm ("epidemic routing") that broadcasts a message from one originating vertex to all n vertices in minimal time: after receiving the message for the first time at t, a vertex transmits to all other neighbors at time t + 1. Moreover the total number of transmits is at most d·n, where d is the maximal degree, so for this to be a good algorithm we need only know that d is small (and that the graph is connected).

Moving away from directed communication within networks of machines toward e.g. spread of ideas between people, the graph model seems less appealing and the FMIE setting seems more appealing. One general program: take algorithmic problems such as broadcast/coordination problems, previously studied in the graph model, and reconsider them in the FMIE setting.

Designing a broadcast algorithm: in the FMIE setting, what is a good rule to use to convey a message from one initial agent i to all agents? Could just simulate the epidemic process itself: after receiving the message, copy to each new agent you meet. This takes time T^epi_i, and no algorithm could do better. But this isn't an honest algorithm (one needs a rule telling each vertex when to stop copying), and it's not clear how to bound the number of transmits in terms of the matrix (ν_ij). So, Vague Big Problem: can one design a broadcast algorithm that succeeds w.h.p. in time O(T^epi_i) with O(n) transmits, provided the matrix (ν_ij), unknown in detail to the algorithm, satisfies some side condition?
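In the synchronous graph model, epidemic routing is just breadth-first search: the round at which a vertex first receives the message equals its BFS distance from the source. A small sketch (function name hypothetical):

```python
from collections import deque

def broadcast_rounds(adj, source):
    """Epidemic routing in the synchronous graph model: a vertex that first
    receives the message at round t transmits to all its neighbours at
    round t + 1.  Returns the receipt round of each vertex, which is its
    BFS distance from the source, the minimum possible."""
    dist, q = {source: 0}, deque([source])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                q.append(w)
    return dist

# 8-cycle: the message reaches the antipodal vertex 4 in 4 rounds.
cycle = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
```

Each vertex transmits to at most d neighbours once, giving the d·n bound on total transmits mentioned above.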

Viewing a FMIE as an algorithm designed for a specific purpose (e.g. a broadcast algorithm or coordination algorithm), knowing some features of the rates (ν_ij) would presumably help. So, Vague Big Problem: what features of an (unknown in detail) rate matrix (ν_ij) can be learned from running some FMIE designed to learn the feature? Note that FMIEs are quite different from MCs in this respect. One cannot (except in very special settings) estimate the mixing time τ_mix of a MC by running the MC for O(τ_mix) steps. But parameters of the form τ^epi := ave_{i,j} E[T^epi_ij] can be estimated in time O(τ^epi) by simply running the epidemic process from random starts. Note: we take the total meeting rate Σ_j ν_ij of each agent i to be O(1); very roughly, we consider algorithms doing O(1) computation per unit time with small storage.

Minor theory/literature project. For unknown rates λ_j, j ∈ J, from an unknown index set J, we observe the non-zero values and j-values of independent Poisson(λ_j) RVs. What can we infer about the rates?

Simulation project. Cute animation of the meeting process?

Simulation project. Fix some 50-agent geometry. Do simulations of each of the main models on this geometry.

Overview of course:
- 3 brief generalities about FMIE processes
- Finite reversible Markov chains, emphasising mixing and hitting times, and the standard examples of random walks on the complete graph, the d-dimensional grid, and on random graphs with prescribed degree distributions
- Mathematics of simple information-exchange models (like Models 1-4), relating their behavior to the rate matrix (ν_ij)
- Algorithmic contexts
- Game-theoretic contexts
The style is very variable: some vague high-level discussion, some math details.

3 observations about general FMIEs.

1. Irreducibility and convergence. Let n = number of agents. If an agent's information is limited to k states, then the whole FMIE process itself is a k^n-state continuous-time MC. So qualitative time-asymptotics are determined by the strongly connected components of the transition graph of the whole process. [Discussion on board, leading to...] For a given informational-level model, in order that an n-agent case be irreducible it is sufficient, but not necessary, that the 2-agent case is irreducible.

Literature-search project. Find relevant work on asynchronous cellular automata.

Theory project. One can quantify mixing (convergence) times in the reducible setting. Consider deterministic rules (i, j) → (f(i, j), f(j, i)). Fix the geometry as the n-path or K_n. For given (n, k), what can you say about the worst-case (over rules f) mixing time? [More discussion on board.]

2. Bottleneck parameters. Given a geometry with rate matrix N = (ν_ij), the quantity

ν(A, A^c) = n^{-1} Σ_{i ∈ A, j ∈ A^c} ν_ij

has the interpretation, in terms of the associated continuous-time Markov chain Z(t) at stationarity, as the flow rate from A to A^c:

P(Z(0) ∈ A, Z(dt) ∈ A^c) = ν(A, A^c) dt.

So if for some m the quantity

φ(m) = min{ν(A, A^c) : |A| = m},  1 ≤ m ≤ n − 1

is small, it indicates a possible bottleneck subset of size m. For many models one can obtain upper bounds (on the expected time until something desirable happens) in terms of the parameters (φ(m), 1 ≤ m ≤ n/2). Such bounds are always worth having, but their significance is sometimes overstated:
- φ(m) is not readily computable, or simulate-able;
- the bounds are often rather crude for a specific geometry;
- it is more elegant to combine the family (φ(m), 1 ≤ m ≤ n/2) into a single parameter, but the way to do this is model-dependent.

3. Time-reversal duality. Fix a time t. Regard meeting times during [0, t] as arbitrary, and consider an event A determined by the meeting times. We can reflect or time-reverse meeting times via the map s ↦ t − s. Let A* be the event determined by these reflected times. In our probability model, reflection does not change the distribution of the Poisson processes, so we have a time-reversal duality principle: P(A*) = P(A). This principle is useful in several models. Let's see carefully what this says for the simple epidemic model. Write T^epi_ij = time until agent j is infected, if initially only i is infected. What symmetry properties does the matrix (T^epi_ij) have?
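Although φ(m) is not readily computable in general, on small examples it can be evaluated by brute force over all size-m subsets. A sketch (function name hypothetical), checked against the 8-cycle where every size-m bottleneck is a contiguous arc with exactly 2 boundary edges, so φ(m) = 2/8:

```python
from itertools import combinations

def phi(nu, n, m):
    """Brute-force phi(m) = min over subsets A with |A| = m of
    nu(A, A^c) = (1/n) * sum_{i in A, j in A^c} nu_ij.
    Exponential in n, so small examples only."""
    best = float("inf")
    for A in combinations(range(n), m):
        A = set(A)
        flow = sum(rate for (i, j), rate in nu.items()
                   if (i in A) != (j in A))  # edges crossing the cut
        best = min(best, flow / n)
    return best

# 8-cycle with unit rates.
nu = {(i, (i + 1) % 8): 1.0 for i in range(8)}
```

This is only feasible for toy geometries; the point of the remark above is exactly that no efficient general method is available.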

For fixed t, the matrix of events {T^epi_ij ≤ t} has symmetric distribution:

({T^epi_ij ≤ t}) =_d ({T^epi_ji ≤ t}).   (1)

In particular, for a single entry we have P(T^epi_ij ≤ t) = P(T^epi_ji ≤ t), and since this is true for all t,

T^epi_ij =_d T^epi_ji.   (2)

One might guess this distributional symmetry holds for the whole matrix jointly, but that is wrong. Consider the 3-agent case ν_ab = ν_bc = 1, ν_ac = 0. Then P(T^epi_ac = T^epi_bc) > 0 but P(T^epi_ca = T^epi_cb) = 0, implying (T^epi_ac, T^epi_bc) ≠_d (T^epi_ca, T^epi_cb). However, there are results for maxima: (1) implies that for fixed i and t,

P(max_j T^epi_ij ≤ t) = P(max_j T^epi_ji ≤ t)

and since this is true for all t,

max_j T^epi_ij =_d max_j T^epi_ji.
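The 3-agent counterexample is easy to check by simulation: drive the epidemics from each starting agent by the same realization of the meeting process, so the whole matrix (T^epi_ij) is defined jointly. A rough sketch for the path a-b-c with unit rates (helper names hypothetical):

```python
import random

def infection_times(events, start):
    """Simple epidemic driven by a fixed list of (t, i, j) meeting events."""
    infected = {start: 0.0}
    for t, i, j in events:
        if (i in infected) != (j in infected):
            infected[i if j in infected else j] = t
    return infected

def path_meetings(horizon, rng):
    """Meetings on the path a-b-c with nu_ab = nu_bc = 1 (total rate 2)."""
    t, events = 0.0, []
    while (t := t + rng.expovariate(2.0)) <= horizon:
        events.append((t,) + rng.choice([("a", "b"), ("b", "c")]))
    return events

equal_ac_bc = 0
for seed in range(1000):
    ev = path_meetings(50.0, random.Random(seed))
    T_a, T_b, T_c = (infection_times(ev, s) for s in "abc")
    assert T_c["b"] < T_c["a"]           # T^epi_ca = T^epi_cb never happens
    equal_ac_bc += T_a["c"] == T_b["c"]  # but T^epi_ac = T^epi_bc can
```

Starting from c the infection must pass through b, so T^epi_cb < T^epi_ca always; while T^epi_ac = T^epi_bc happens whenever the first b-c meeting falls after the first a-b meeting, so `equal_ac_bc` should be strictly between 0 and 1000.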

Recall that in the token-passing model (as a FMIE), for the agent Z(t) holding the token at time t, the process (Z(t), t ≥ 0) evolves as the associated Markov chain. The next lectures treat Markov chains as objects in their own right. First, a few comments on the connection.

The FMIE setting provides more structure than the associated MC alone. For instance one can ask "how long until agent k receives the token or meets someone who previously held the token?"; this is a meaningful question for the token-passing model but not for a plain MC.

The FMIE setting provides a particular coupling of the Markov chains Z_i(t) starting from different i. (Such a coupling is not automatically specified in the plain MC setup. Ours is essentially what is called the independent coupling.) The time-reversal duality argument above gives P(Z_i(t) = j) = P(Z_j(t) = i). Note that this symmetry does not extend to hitting times: T^hit_ij ≠_d T^hit_ji in general.

Sometimes we do martingale-like calculations. In that context, F(t) is the filtration of events in the underlying meeting model, together with the particular information-level process under consideration. [Board: the k'th hand process interpolates between the MC and the epidemic process.]