Graphs and Networks Lecture 5. PageRank. Lecturer: Daniel A. Spielman September 20, 2007


5.1 Intro to PageRank

PageRank, the algorithm reportedly used by Google, assigns a numerical rank to every web page. More important pages get higher rankings. The more in-links a page has, the higher its ranking should be. But, more importantly, a page has a higher rank if it is pointed to by high-rank pages. Low-rank pages influence the rank of a page less. If one page points to many others, it will have less influence on their rankings than if it just points to a few.

To algebraicize this intuitively appealing idea, PageRank treats the web as a directed graph, with web pages as vertices and links as directed edges. The rank of vertex $v$ is denoted $r(v)$, and is supposed to satisfy the formula

$$r(v) = \sum_{u : (u,v) \in E} \frac{r(u)}{d^+(u)}, \tag{5.1}$$

where $d^+(u)$ is the number of edges going out of $u$. Note that this sum is over edges going in to $v$.

To express this in matrix form, we let $D_+$ be the diagonal matrix whose $u$th diagonal entry is $d^+(u)$. We then let $A$ be the directed adjacency matrix of the graph, where $A(v,u) = 1$ if there is an edge from $u$ to $v$. Yes, I know that this looks backwards. But, it is what I have to do if I want to make $r$ be a column vector. We then find that $r$ must satisfy the equation

$$r = A D_+^{-1} r. \tag{5.2}$$

That is to say that $r$ is an eigenvector of eigenvalue 1 of the matrix $A D_+^{-1}$. However, $A D_+^{-1}$ is not a symmetric matrix, and is not in any way similar to a symmetric matrix. So, some of the eigenvalues of this matrix can be complex, it might not have $n$ eigenvectors, and the eigenvectors it does have can have complex entries. Nevertheless, in this lecture we will show that

1. If the graph has no vertices of out-degree 0, then 1 is an eigenvalue.
2. If the graph is strongly connected, then the eigenvalue 1 has multiplicity 1.
3. If the graph is strongly connected, then the solution to (5.2), which is unique up to scaling, is strictly positive.
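To make equations (5.1) and (5.2) concrete, here is a minimal sketch in plain Python. The three-vertex graph, the function names `walk_matrix`, `mat_vec` and `stationary`, and the use of repeated multiplication to find $r$ are all illustrative assumptions, not something taken from the lecture. Repeated multiplication happens to converge on this example because the graph is strongly connected and aperiodic; it need not converge on every strongly connected graph.

```python
def walk_matrix(n, edges):
    """Build M = A D_+^{-1} as a list of rows: M[v][u] = 1/d^+(u) if (u, v) is an edge."""
    out_deg = [0] * n
    for u, _ in edges:
        out_deg[u] += 1
    if any(d == 0 for d in out_deg):
        raise ValueError("every vertex needs out-degree at least 1")
    M = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        M[v][u] += 1.0 / out_deg[u]
    return M


def mat_vec(M, x):
    """Multiply the matrix M (list of rows) by the column vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def stationary(M, steps=200):
    """Approximate a solution of r = M r by repeatedly applying M to the uniform vector."""
    n = len(M)
    r = [1.0 / n] * n
    for _ in range(steps):
        r = mat_vec(M, r)
    return r


# Hypothetical example graph: edges (u, v) mean "u links to v".
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
M = walk_matrix(3, edges)
r = stationary(M)
```

Solving (5.1) by hand for this graph gives $r(0) = r(2)$ and $r(1) = r(0)/2$, so after normalizing the entries to sum to 1 one expects $r = (0.4, 0.2, 0.4)$, which is strictly positive as claimed in item 3.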

Before I go further, I would like to point out that this measure of importance was first suggested in the social network community in the paper by Phillip Bonacich, "Factoring and weighting approaches to status scores and clique identification", Journal of Mathematical Sociology, 1972. I should also point out that $r$ can be understood as the stable distribution of the directed random walk on the graph $G$. But, random walks on directed graphs are more complicated than on undirected graphs.

5.2 Eigenvalue 1

Set $M \stackrel{\mathrm{def}}{=} A D_+^{-1}$.

Lemma 5.2.1. If $G$ has no vertices of out-degree 0, then 1 is an eigenvalue of $M$.

Proof. If $G$ has no vertices of out-degree 0, then every column of $A$ has at least one non-zero entry. In fact, the $u$th column of $A$ has $d^+(u)$ non-zero entries, so the $u$th column of $A D_+^{-1}$ has sum 1. This implies that $\mathbf{1}^T M = \mathbf{1}^T$, and so $M$ has an eigenvector of eigenvalue 1.

This is similar to the undirected case: in both cases the walk matrix has the vector $\mathbf{1}$ as a left-eigenvector. However, it differs in that we do not know any simple expression for the corresponding right-eigenvector, $r$.

Lemma 5.2.2. If $G$ is strongly connected, then the eigenvalue 1 has multiplicity 1. In particular, if $v^T M = v^T$, then we must have $v = c\mathbf{1}$ for some constant $c$.

The proof of this is similar to the proof in the undirected case, so we will skip it.

5.3 r is positive

Lemma 5.3.1. If $G$ is strongly connected and if the solution of (5.2) is non-negative, then it is positive.

Proof. First, note that the solution to (5.2) cannot be the all-zero vector. So, if it is non-negative, it must have at least one positive coordinate. So, assume that $r(z) > 0$. Now, let $v$ be any node that $z$ points to. Equation (5.1) tells us that

$$r(v) = \sum_{u : (u,v) \in E} \frac{r(u)}{d^+(u)} \geq \frac{r(z)}{d^+(z)} > 0.$$

In general, for every node $z$ for which $r(z) > 0$, every node $v$ that $z$ points to must have $r(v) > 0$. Since the graph is strongly connected, we can apply induction to show that $r(v) > 0$ for all $v \in V$.

Now, we must show that $r$ is non-negative. To do this, we will consider the matrix

$$\overline{M} \stackrel{\mathrm{def}}{=} \frac{1}{n} \sum_{i=0}^{n-1} M^i.$$

We need to establish a few properties of $\overline{M}$.

Claim 5.3.2. If $M r = r$, then $\overline{M} r = r$. Similarly, $\mathbf{1}^T \overline{M} = \mathbf{1}^T$.

Claim 5.3.3. The matrix $\overline{M}$ has no negative or zero entries.

Proof. As $M$ is non-negative, it follows immediately that $\overline{M}$ is non-negative. To show that $\overline{M}$ has no zero entries, note that $M^t(b,a)$ is equal to the probability that a random walk starting at $a$ hits $b$ in exactly $t$ time steps. As the graph is strongly connected, for every pair of vertices $a$ and $b$, there is some $t$ less than $n$ for which this probability is non-zero (you may prove this in the same way you proved Lemma 5.3.1). As $\overline{M}(b,a)$ is the average of these probabilities for $t$ from $0$ to $n-1$, it is non-zero as well.

Theorem 5.3.4. The equation $M r = r$ has a non-negative solution.

Proof. We will show that it has a solution in which all the signs are the same, which implies that it has a non-negative solution (flip all signs if necessary). But, we will work with the matrix $\overline{M}$, which we proved also satisfies

$$\overline{M} r = r. \tag{5.3}$$

Assume by way of contradiction that $r$ is not sign-uniform. That is, that $r$ has both positive and negative entries. We will use the fact that if $x$ is some vector with both positive and negative entries, then

$$\Big| \sum_u x(u) \Big| < \sum_u |x(u)|.$$

From equation (5.3), we have that for all $v$,

$$r(v) = \sum_u \overline{M}(v,u)\, r(u),$$

and so

$$|r(v)| = \Big| \sum_u \overline{M}(v,u)\, r(u) \Big|.$$

As we have assumed that $r$ is not sign-uniform, and $\overline{M}(v,u)$ is always positive, we have the inequality

$$\Big| \sum_u \overline{M}(v,u)\, r(u) \Big| < \sum_u \overline{M}(v,u)\, |r(u)|,$$

which implies

$$|r(v)| < \sum_u \overline{M}(v,u)\, |r(u)|.$$

If we now sum over all $v$, we get

$$\sum_v |r(v)| < \sum_v \sum_u \overline{M}(v,u)\, |r(u)| = \sum_u \sum_v \overline{M}(v,u)\, |r(u)| = \sum_u |r(u)| \sum_v \overline{M}(v,u) = \sum_u |r(u)|,$$

as $\mathbf{1}^T \overline{M} = \mathbf{1}^T$ is equivalent to

$$\sum_v \overline{M}(v,u) = 1 \quad \text{for all } u.$$

But the two ends of this chain are the same quantity, so we have derived a contradiction.

5.4 Closer to PageRank

Brin and Page tell us that they don't actually take $A$ to be the original web graph. Rather, they consider a random surfer who actually jumps to a random web page with some fixed probability at each time step. We can model this by including an edge between all pairs of vertices, giving that edge low weight. Since we haven't discussed weighting edges yet, let me instead say that this is equivalent to forcing $r$ to satisfy the equation

$$\big( (1-\alpha) M + \alpha J / n \big)\, r = r, \tag{5.4}$$

where $\alpha$ is the probability of jumping to a random web page at any moment, and $J$ is the all-1s matrix. This equation is actually much nicer than the original. First of all, it gives us an all-positive matrix. So, we know that the solution will be all positive. It also eliminates the issue of nodes with no out-edges.
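The matrix in equation (5.4) can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the lecture's own code: the 3-by-3 matrix `M` below is a hypothetical walk matrix (its columns sum to 1), and the names `jump_matrix` and `power_iterate` are made up for this example. The value `alpha = 0.15` is the one Brin and Page are reported to suggest.

```python
def jump_matrix(M, alpha):
    """Return (1 - alpha) * M + alpha * J / n, where J is the all-1s matrix."""
    n = len(M)
    return [[(1 - alpha) * M[v][u] + alpha / n for u in range(n)]
            for v in range(n)]


def power_iterate(G, steps=100):
    """Approximate a solution of G r = r by repeatedly applying G to the uniform vector."""
    n = len(G)
    r = [1.0 / n] * n
    for _ in range(steps):
        r = [sum(G[v][u] * r[u] for u in range(n)) for v in range(n)]
    return r


# Hypothetical walk matrix M = A D_+^{-1} for the edges 0->1, 0->2, 1->2, 2->0.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
alpha = 0.15
G = jump_matrix(M, alpha)
r = power_iterate(G)
```

Every entry of `G` is at least `alpha / n`, so the matrix is all-positive even though `M` has zeros, and its columns still sum to 1 because both `M` and `J / n` have that property.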

If we decide that we are going to normalize $r$ so that $\mathbf{1}^T r = 1$, then we have that

$$J r = \mathbf{1},$$

so equation (5.4) becomes

$$(1-\alpha) M r + (\alpha/n) \mathbf{1} = r,$$

which is equivalent to

$$(\alpha/n) \mathbf{1} = \big( I - (1-\alpha) M \big)\, r,$$

and

$$\big( I - (1-\alpha) M \big)^{-1} (\alpha/n) \mathbf{1} = r.$$

That is, $r$ is now given by the solution to a system of linear equations. Even better, we can solve these equations quickly. We have that

$$\big( I - (1-\alpha) M \big)^{-1} = \sum_{t=0}^{\infty} \big( (1-\alpha) M \big)^t.$$

(This is just like the formula you learned for $1/(1-x)$, but for matrices. It is true as long as the sum converges.) Moreover, this sum converges very quickly. Brin and Page suggest using $\alpha = 0.15$. We know that every entry of $M^t$ is at most 1, so every entry of $((1-\alpha) M)^t$ is at most $0.85^t$, which becomes small very quickly as we increase $t$. So, we can quickly approximate $r$ by using the first few terms from this series.
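The truncated series can be sketched as follows in plain Python. The function name `pagerank_series`, the `terms` parameter, and the 3-by-3 example matrix are assumptions made for illustration. Each term is obtained from the previous one by a single multiplication by $(1-\alpha) M$, so no matrix powers are formed explicitly.

```python
def pagerank_series(M, alpha=0.15, terms=200):
    """Approximate r = sum_{t=0}^{terms} ((1 - alpha) M)^t (alpha / n) 1."""
    n = len(M)
    term = [alpha / n] * n      # the t = 0 term: (alpha / n) * 1
    r = term[:]
    for _ in range(terms):
        # Next term: multiply the previous term by (1 - alpha) * M.
        term = [(1 - alpha) * sum(M[v][u] * term[u] for u in range(n))
                for v in range(n)]
        r = [ri + ti for ri, ti in zip(r, term)]
    return r


# Hypothetical walk matrix for the edges 0->1, 0->2, 1->2, 2->0 (columns sum to 1).
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
alpha = 0.15
r = pagerank_series(M, alpha)
r_short = pagerank_series(M, alpha, terms=10)
```

Because the columns of `M` sum to 1, the entries of the $t$th term sum to $\alpha \cdot 0.85^t$, so stopping after the term with index $T$ leaves out a total mass of exactly $0.85^{T+1}$. With `terms=10` that is about 0.17, which gives a rough sense of how many terms "the first few" needs to be for a given accuracy.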