Markov chains (week 6) Solutions


1 Ranking of nodes in graphs

A Markov chain model. The stochastic process of agent visits A_n is a Markov chain (MC). Explain.

The stochastic process of agent visits A_n is a Markov chain because it is memoryless (the next node depends only on the current node, not on the history of visited nodes) and time-invariant (the probability of moving from node i to node j does not change with time).

Give conditions for the following statements to be true:

State i, meaning a visit of the agent to node i, of this MC is transient.

Since the MC is finite, the only way a state i can be transient is that some state j can be reached from i through a path of nonzero transition probabilities, but i cannot be reached back from j. This could happen if node i has no nodes in its incoming neighborhood, or if the nodes in its incoming neighborhood have no nodes in their incoming neighborhoods, and so on. However, here the transitions are reciprocal: if j can be reached from i through a path (of potentially multiple transitions), then by traversing the same path in the reverse direction, i can also be reached from j. Thus none of the states here is transient.

All states of this MC are transient.

In a finite MC it is impossible for all of the states to be transient. Because the MC is finite, the outside agent must return to at least some of the nodes if it keeps moving for a sufficiently large number of time steps.

All the states of this MC are recurrent.

We argued that none of the states is transient; thus all of the states are recurrent, by definition. This means that as n → ∞ the outside agent returns to its starting node (and to every node in that node's class) over and over again.

All states of this MC are aperiodic.

Note that periodicity (and the period itself) is a class property. Because transitions are reciprocal, one can always return to the starting state in any even number of steps (step to a neighbor and back). So if there is at least one loop of odd length L within a class, that class is aperiodic: the possible return times then include both 2 and L, and gcd(2, L) = 1, so the period is 1. Note that we have assumed there are no self-loops (unless the class consists of only one state).

All states are positive recurrent.

Note again that positive recurrence is a class property. (Positive recurrence of all of the states should not be confused with positive recurrence of the whole MC, which additionally requires irreducibility.) In a finite MC, every recurrent state is positive recurrent. We argued that all of the states are recurrent; thus all of the states are positive recurrent.

All states are ergodic.

Note that ergodicity is a class property. (Similarly, ergodicity of all of the states should not be confused with ergodicity of the whole MC, which requires irreducibility.) A state is ergodic iff it is aperiodic and positive recurrent. We argued that all of the states are positive recurrent; thus, if every class either consists of a single state or contains an odd-length loop, all of the states are ergodic.

The MC is irreducible.

By definition, every state must communicate with every other state, i.e., there must be a path of nonzero transition probabilities between any two states. Thus there should be no singleton states (loners), nor any disjoint friendship classes.
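As a numerical sanity check for the class-structure conditions above, one can test irreducibility and aperiodicity directly from the adjacency matrix. The following MATLAB sketch is our own addition (not part of the assignment) and assumes graph is the symmetric 0/1 connectivity matrix used in part B. It uses two facts: the chain is irreducible iff every state is reachable from every other in at most J-1 steps, and a chain with reciprocal links has period dividing 2, so it is aperiodic iff the graph contains a closed walk of odd length (for a reducible graph, apply the second test per class).

% Sanity checks on a symmetric 0/1 adjacency matrix "graph" (J-by-J).
J = min(size(graph));
A = graph > 0;                      % logical adjacency
R = logical(eye(J)) | A;            % states reachable in <= 1 step
for k = 2:J-1
    R = R | (double(R) * double(A) > 0);   % reachable in <= k steps
end
is_irreducible = all(R(:));

% Aperiodicity: with reciprocal links the period divides 2, so the
% chain is aperiodic iff some closed walk has odd length, i.e.
% trace(A^m) > 0 for some odd m <= J.
is_aperiodic = false;
for m = 1:2:J
    if trace(double(A)^m) > 0
        is_aperiodic = true;
        break
    end
end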

B Implement random walk. First, we provide a function which calculates the ranks for a given legitimate connectivity graph. The starting point is i = 1:

function ranks = ranks_by_random_walk(graph, N, initial_state)
J = min(size(graph));            % J : total number of states
nr_neighbors = sum(graph);       % number of neighbours of each state
% Here we construct a matrix which contains the indices of the states
% that each state is linked to. This matrix, called "Neighbours", will
% serve as a lookup table to find out to which state exactly the random
% walker should go. Specifically, Neighbours(m,n) = f, with f nonzero,
% means that state m is linked to state f.
k = max(nr_neighbors);
Neighbours = zeros(J, k);
for i = 1:J
    temp = find(graph(i,:));
    Neighbours(i, 1:length(temp)) = temp;
end
i = initial_state;               % starting state
nr_visits = zeros(J, 1);
for n = 1:N
    nr_visits(i) = nr_visits(i) + 1;
    i = Neighbours(i, randi(nr_neighbors(i)));  % jump to a uniformly chosen neighbour
end
ranks = nr_visits / N;
end

Now we call the function for the non-augmented graph, which is reducible; that is, there are disjoint friendship classes among the nodes. Results are depicted in fig. 1.

clc; close all; clear all;
graph = [ ... ];          % the given adjacency matrix (entries omitted)
% The next three lines make "graph" a legitimate connectivity graph
graph = graph + graph';   % to ensure "graph" is symmetric
graph = graph > 0;        % to ensure "graph" is all zeros and ones
graph = graph * 1;        % to ensure entries are numeric, not logical
J = min(size(graph));
figure
for j = 1:3
    N = 10^(4+j);
    tic
    ranks = ranks_by_random_walk(graph, N, 1);
    t = toc
    subplot(3,1,j)
    bar(1:J, ranks, 'r')
    title(['N=10^', num2str(j+4), ', Computation time is ', ...
           num2str(t), ' sec'], 'FontSize', 12)
end
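For completeness, the reason this time average converges to the ranks is the ergodic theorem for Markov chains; we state it here as a formula (a standard fact, not from the solution itself). For the reducible graph it applies within the communication class of the starting state:

\[
  \frac{1}{N}\sum_{n=1}^{N} \mathbf{1}\{A_n = i\}
  \;\xrightarrow[\;N\to\infty\;]{\text{a.s.}}\; \pi_i ,
\]

where π is the stationary distribution of the (class of the) chain.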

We can investigate the disjoint classes by looking at the resulting ranks in fig. 1. We observe that some of the states receive zero rank (states 6, 11, 16, 28, 33); these are states that do not belong to the class of state 1. Let us change the initial state to each one of these. Setting the starting state to 6 yields fig. 2, which shows that states 6, 16 and 33 constitute a separate class, in which state 6 is in the middle. Changing the starting state to 11 and then to 28 shows that each of them is a singleton state (a loner) and hence a separate class.

In order to add the professor (the fully connected node) to the set, we just need to augment the graph with a new node, as in the following (the only changes are the 2nd and 3rd lines, which add an extra row and column consisting of all ones). The resulting figure is fig. 3.

clc; close all; clear all;
graph = [ ... ];          % the given adjacency matrix (entries omitted)
J = min(size(graph));
graph = [graph ones(J,1); ones(1,J) 0];   % augment with the fully connected node
% The next three lines make "graph" a legitimate connectivity graph
graph = graph + graph';   % to ensure "graph" is symmetric
graph = graph > 0;        % to ensure "graph" is all zeros and ones
graph = graph * 1;        % to ensure entries are numeric, not logical
J = min(size(graph));
figure
for j = 1:3
    N = 10^(4+j);
    tic
    ranks = ranks_by_random_walk(graph, N, 1);
    t = toc
    subplot(3,1,j)
    bar(1:J, ranks, 'r')
    title(['N=10^', num2str(j+4), ', Computation time is ', ...
           num2str(t), ' sec'], 'FontSize', 12)
end

C Probability update. Following slides 19 and 20 of "ranking nodes in graphs", the probability update rule is simply

p(n+1) = P^T p(n),

where P is the transition probability matrix and p(n) is the vector of probabilities of being in each of the states at time n.
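To make the update rule concrete, here is a toy two-state example with made-up numbers (our own illustration, not from the assignment): take P(1,:) = [0, 1] and P(2,:) = [1/2, 1/2], and start the walker at state 1 with certainty, p(0) = [1; 0]. One update gives

\[
  p(1) = P^{T} p(0)
       = \begin{pmatrix} 0 & \tfrac12 \\ 1 & \tfrac12 \end{pmatrix}
         \begin{pmatrix} 1 \\ 0 \end{pmatrix}
       = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]

i.e., after one step the walker is at state 2 with certainty, as expected since state 1 always moves to state 2.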

D Find ranks using the probability update. The limiting probabilities lim_{n→∞} p_i(n) exist as long as the MC is not periodic (when it is periodic, the probabilities oscillate). Since the MC here is aperiodic, these limits always exist (slides 77 to 82 of "markov chains"). However, for these limits to be independent of the initial distribution, the MC must be irreducible, positive recurrent, and aperiodic, hence an ergodic MC (refer to slide 6 of "markov chains"). By adding the fully connected node, the MC becomes irreducible and (since it is aperiodic) ergodic. If p_A(0) = 1, that is, the outside agent begins at node A with certainty, then the rank is

r_i(A) = lim_{n→∞} p_i(n),

because r = π, where π depends on the starting node. The modified MATLAB code for computing the ranks using r_i = lim_{n→∞} p_i(n) follows.

function ranks = ranks_by_probability_update(graph, N, initial_distribution)
J = min(size(graph));               % J : total number of states
nr_neighbors = sum(graph);          % number of neighbours of each state
transition_probabilities = graph;
for k = 1:J
    if nr_neighbors(k) > 0
        transition_probabilities(k,:) = graph(k,:) / nr_neighbors(k);
    else
        transition_probabilities(k,k) = 1;   % absorbing if no neighbours
    end
end
pi_d = initial_distribution;        % given initial distribution
for n = 1:N
    pi_d = transition_probabilities' * pi_d;
end
ranks = pi_d;
end

The results are found in figs. 4 and 5. The calculation times are reported in the figures as well; they are much smaller than the calculation times of the random walk approach of part B. And if you, like me, are wondering why the elapsed time decreases for the two larger values of N as compared with the smallest, let me know what you figure out!

E Recast as a system of linear equations. Focusing on the modified, irreducible graph, and following slide 64 of "markov chains", we can compute the ranks by solving a system of linear equations. We know that for a MC

π = P^T π,  π^T 1 = 1,   i.e.,   (I − P^T) π = 0,  π^T 1 = 1.

This means that π is an eigenvector of (I − P^T) with eigenvalue 0; in other words, π spans the null space of (I − P^T). In our ergodic MC, r = π, so we can solve the above system of equations for π to find r. The code follows; the resulting figure is fig. 6.

figure
tic
pi = null(eye(J) - transition_probabilities');
ranks = pi / sum(pi);
t = toc
bar(1:J, ranks, 'r')
title(['Recasting as Linear system problem, Computation time is ', ...
       num2str(t), ' sec'], 'FontSize', 12)
xlabel('states', 'FontSize', 12)
ylabel('ranks', 'FontSize', 12)
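As an aside, the same system can be solved without the null-space routine by stacking the normalization constraint under (I − P^T)π = 0 and using MATLAB's backslash operator; the stacked system is consistent, so the least-squares solution is exactly π. A minimal sketch (our own variant, assuming transition_probabilities and J are defined as above):

% Stack (I - P') pi = 0 with sum(pi) = 1 and solve in one shot.
A_sys = [eye(J) - transition_probabilities'; ones(1, J)];
b_sys = [zeros(J, 1); 1];
ranks_ls = A_sys \ b_sys;     % least-squares solution = stationary pi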

F Recast as an eigenvalue problem. From the equations in part E, we can equivalently compute the eigenvector π associated with eigenvalue 1 of P^T (refer to slide 18 of "ranking nodes in graphs"). The code follows; the result is depicted in fig. 7. Note that eig does not guarantee any particular ordering of the eigenvalues, so we explicitly pick the eigenvalue closest to 1.

tic
[V, D] = eig(transition_probabilities');
[~, idx] = min(abs(diag(D) - 1));   % eigenvalue closest to 1
t = toc
ranks = V(:, idx) / sum(V(:, idx));

G Discuss advantages of each method. (Refer to slides 18 and 22-28 of "ranking nodes in graphs".)

Random walk. The random walk approach is preferable in that it is secure: information is not shared between nodes, and for a node to find its rank it only needs to know how many neighbors it has and how many time steps have passed. This also means that the implementation can be distributed; the information does not have to be compiled and assessed in a central location, but each node can determine its rank individually.

Probability update (using Markov chain ideas). The probability update, which relies on probability propagation, is similar to the random walk in that the implementation can be distributed, and it is fairly secure (each node only has information from each of its neighbors about their neighbors). The main advantage is that it converges to the true ranks significantly faster than the random walk implementation: at each iteration the ranks of all of the states are updated, unlike the random walk approach, where each iteration updates the rank of only one state. Also, if the MC is very large and the program is halted at some iteration, it still provides an approximation for the ranks of all of the states, unlike the linear-system or eigenvector approaches, where no solution, not even an approximate one, is available before the final answer. We also observed that the algorithm reduces to a simple matrix-times-vector multiplication, for which fast algorithms have been investigated (e.g., this is the specialty of MATLAB, which is designed to carry out such matrix computations exceptionally fast and can handle large inputs). (The advantages of a distributed implementation of the probability update algorithm are discussed in slide 28.)

System of linear equations. Using a system of linear equations eliminates the problem of slow convergence because it does not iterate. The problem is that it compromises security, because the ranks are all computed in one place, so information from every node must be collected. Also, for large networks it is very costly, because the matrices become too large to handle directly, and there is no way to get an approximate answer along the way.

Eigenvectors/eigenvalues. The eigenvector approach is very similar to the system-of-linear-equations approach, with the one advantage that it is computationally simpler: rather than finding the null space of (I − P^T), we directly determine the eigenvector of P^T associated with eigenvalue 1.
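One refinement worth noting for the probability update of part D: instead of fixing the number of iterations N in advance, iterate until successive distributions stop changing. A minimal sketch of this stopping rule (our own addition; the tolerance tol is a hypothetical parameter, and transition_probabilities is assumed to be the row-stochastic matrix built above):

% Power iteration with a convergence test instead of a fixed N.
tol = 1e-10;                     % hypothetical tolerance
pi_d = ones(J, 1) / J;           % start from the uniform distribution
while true
    pi_next = transition_probabilities' * pi_d;
    if norm(pi_next - pi_d, 1) < tol    % L1 change between iterates
        break
    end
    pi_d = pi_next;
end
ranks = pi_next;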

Fig. 1. Part B (1), reducible graph: ranks calculated by random walk, taking the time average of the number of visits to each state. Computation times (roughly 0.5 s, 2.3 s, and 22 s) are reported in the panel titles for N = 10^5, 10^6, 10^7, where N is the total duration of the experiment. Starting state: 1.

Fig. 2. Part B (2), reducible graph: ranks calculated by random walk, taking the time average of the number of visits to each state. Computation times (roughly 0.2 s, 2.2 s, and 22 s) are reported in the panel titles for N = 10^5, 10^6, 10^7, where N is the total duration of the experiment. Starting state: 6.

Fig. 3. Part B, irreducible graph: ranks calculated by random walk, taking the time average of the number of visits to each state. Computation times (roughly 0.5 s, 2.2 s, and 22 s) are reported in the panel titles for N = 10^5, 10^6, 10^7, where N is the total duration of the experiment.

Fig. 4. Part D, reducible graph: ranks calculated by updating the probabilities. Computation times (all well under a second) are reported in the panel titles for N = 5, 10, 15, where N is the total number of iterations. The initial distribution is [1; 0; ...; 0].

Fig. 5. Part D: ranks calculated by updating the probabilities. Computation times (all well under a second) are reported in the panel titles for N = 5, 10, 15, where N is the total number of iterations. The initial distribution is [1/J; 1/J; ...; 1/J].

Fig. 6. Part E, irreducible graph: ranks (y-axis) over states (x-axis), calculated by recasting the problem as a system of linear equations.

Fig. 7. Part F: ranks (y-axis) over states (x-axis), calculated by recasting the problem as finding the eigenvector of the transpose of the transition probability matrix corresponding to eigenvalue 1 (the largest eigenvalue) and then normalizing.