Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier

Size: px

Start display at page:

Download "Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier"

Jerome Baker
5 years ago
Views:

1 Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier W. Xie D. Bindel A. Demers J. Gehrke 12 Aug 2015 W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

2 PageRank Model Unweighted Node weighted Edge weighted Random surfer model: x (t+1) = αpx (t) + (1 α)v where P = AD 1 Stationary distribution: Mx = b where M = (I αp), b = (1 α)v W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

3 Edge Weight vs Node Weight Personalization v i = v i (w) γ ij = γ ij (w) w R d Introduce personalization parameters w R d in two ways: Node weights: M x(w) = b(w) Edge weights: M(w) x(w) = b W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

4 Edge Weight vs Node Weight Personalization Node weight personalization is well-studied Topic-sensitive PageRank: fast methods based on linearity Localized PageRank: fast methods based on sparsity Some work on edge weight personalization ObjectRank/ScaleRank: personalize weights for different edge types But lots of work incorporates edge weights without personalization Our goal: General, fast methods for edge weight personalization W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

5 Model Reduction Expensive full model (Mx = b) = U Reduced basis Approximation ansatz Reduced model ( My = b) = Model reduction procedure from physical simulation world: Offline: Construct reduced basis U R n k Offline: Choose k equations to pick approximation ˆx = Uy Online: Solve for y(w) given w and reconstruct ˆx W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

6 Reduced Basis Construction: SVD (aka POD/PCA/KL) Snapshot matrix Σ V T x 1 x 2... x r U w 1 w 2 w r Sample points W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

7 Approximation Ansatz Want r = MUy b 0. Consider two approximation conditions: Method Ansatz Properties Bubnov-Galerkin U T r = 0 Good accuracy empirically Fast for P(w) linear DEIM min r I Fast even for nonlinear P(w) Complex cost/accuracy tradeoff Similar error analysis framework for both (see paper): Consistency + Stability = Accuracy Consistency: Does the subspace contain good approximants? Stability: Is the approximation subproblem far from singular? W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

8 Bubnov-Galerkin Method U T y M U b = 0. Linear case: w i = probability of transition with edge type i ( ) ( M(w) = I α w i P (i), M(w) = I α i i w i P(i) where we can precompute P (i) = U T P (i) U Nonlinear: Cost to form M(w) comparable to cost of PageRank! ) W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

9 Discrete Empirical Interpolation Method (DEIM) y Equations in I M U b = 0. I Ansatz: Minimize r I for chosen indices I Only need a few rows of M (and associated rows of U) Difference from physics applications: high-degree nodes! W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

10 Interpolation Costs Consider subgraph relevant to one interpolation equation: i I 1/3 1/50... Incoming neighbors of i Really care about weights of edges incident on I Need more edges to normalize (unless A(w) is linear) High in/out degree are expensive but informative Key question: how to choose I to balance cost vs accuracy? W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

11 Interpolation Accuracy Key: keep M I,: far from singular. If I = k, this is a subset selection over rows of MU. Have standard techniques (e.g. pivoted QR) Want to pick I once, so look at rows of Z = [ M(w 1 )U M(w 2 )U... ] for sample parameters w (i). Helps to explicitly enforce i ˆx i = 1 Several heuristics for cost/accuracy tradeoff (see paper) W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

12 Online Costs If l = # PR components needed, online costs are: Form M O(dk 2 ) for B-G More complex for DEIM Factor M O(k 3 ) Solve for y O(k 2 ) Form Uy O(kl) Online costs do not depend on graph size! (unless you want the whole PR vector) W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

13 Example Networks DBLP (citation network) 3.5M nodes / 18.5M edges Seven edge types = seven parameters P(w) linear Competition: ScaleRank Weibo (micro-blogging) 1.9M nodes / 50.7M edges Weight edges by topical similarity of posts Number of parameters = number of topics (5, 10, 20) (Studied global and local PageRank see paper for latter.) W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

14 Singular Value Decay Value DBLP-L Weibo-S5 Weibo-S10 Weibo-S i th Largest Singular Value r = 1000 samples, k = 100 W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

15 DBLP Accuracy Kendall@100 Normalized L1 Galerkin DEIM-100 DEIM-120 DEIM-200 ScaleRank W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

16 DBLP Running Times (All Nodes) Running time (s) Coefficients Construction Galerkin DEIM-100 DEIM-120 DEIM-200 ScaleRank W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

17 Weibo Accuracy 10-1 Normalized L # Parameters W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

18 Weibo Running Times (All Nodes) Running time (s) Coefficients Construction # Parameters W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

19 Application: Learning to Rank Goal: Given T = {(i q, j q )} T q=1, find w that mostly ranks i q over j 1. (c.f. Backstrom and Leskovec, WSDM 2011) Standard: Gradient descent on full problem One PR computation for objective One PR computation for each gradient component Costs d + 1 PR computations per step With model reduction Rephrase objective in reduced coordinate space Use factorization to solve PR for objective Re-use same factorization for gradient W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

20 DBLP Learning Task Objective Function Value Standard Galerkin DEIM Iteration (8 papers for training + 7 params) W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

21 The Punchline Test case: DBLP, 3.5M nodes, 18.5M edges, 7 params Cost per Iteration: Method Standard Bubnov-Galerkin DEIM-200 Time(sec) W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

22 Roads Not Taken In the paper (but not the talk) Selecting interpolation equations for DEIM Localized PageRank experiments (Weibo and DBLP) Comparison to BCA for localized PageRank Quasi-optimality framework for error analysis Room for future work! Analysis, applications, systems,... W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

23 Questions? Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier Wenlei Xie, David Bindel, Johannes Gehrke, and Al Demers KDD 2015, paper 117 Sponsors: NSF (IIS and IIS ) iad Project from the National Research Council of Norway W. Xie, D. Bindel, A. Demers, J. Gehrke KDD Aug / 1

Model Reduction for Edge-Weighted Personalized PageRank

Model Reduction for Edge-Weighted Personalized PageRank D. Bindel 11 Apr 2016 D. Bindel Purdue 11 Apr 2016 1 / 33 The Computational Science & Engineering Picture Application Analysis Computation MEMS Smart