Computing approximate PageRank vectors by Basis Pursuit Denoising
Michael Saunders, Systems Optimization Laboratory, Stanford University
Joint work with Holly Jin, LinkedIn Corp
SIAM Annual Meeting, San Diego, July 7-11, 2008
SIAM AN08 1/36
Outline
1 Introduction
2 PageRank
3 Basis Pursuit Denoising
4 Sparse PageRank (Neumann)
5 Sparse PageRank (LPdual)
6 Preconditioning
7 Summary
Abstract
The PageRank eigenvector problem involves a square system Ax = b in which x is naturally nonnegative and somewhat sparse (depending on b). We seek an approximate x that is nonnegative and extremely sparse. We experiment with an active-set optimization method designed for the dual of the BPDN problem, and find that it tends to extract the important elements of x in a greedy fashion.
Supported by the Office of Naval Research
Introduction
Many applications (statistics, signal analysis, imaging) seek sparse solutions to square or rectangular systems

  Ax ≈ b,  x sparse

Basis Pursuit Denoising (BPDN) seeks sparsity by solving

  min (1/2)||Ax - b||^2 + λ||x||_1

Personalized PageRank eigenvectors involve square systems

  (I - αH^T)x = v,  v sparse

where 0 < α < 1, He ≤ e, and x ≥ 0 is sparse.

Perhaps BPDN can find a very sparse approximate x.
PageRank
PageRank  (Langville and Meyer 2006)

H = ( 0    1/3  1/3  1/3 )    Hyperlink matrix, He ≤ e
    ( 1/2   0   1/2   0  )
    ( 0     0    0    0  )    (zero row: dangling node)
    ( 0     0   1/2  1/2 )

v = (1/n)e or e_i             Teleportation (personalization) vector, e^T v = 1

S = H + a v^T                 Stochastic matrix, Se = e  (a flags the zero rows of H)

G = αS + (1 - α)e v^T         Google matrix, α = 0.85 say, Ge = e

eigenvector G^T x = x  <=>  linear system (I - αH^T)x = v

In both cases, x <- x / e^T x  (and x ≥ 0)
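The construction above, and the equivalence between the eigenvector and linear-system formulations, can be sanity-checked numerically. A minimal sketch on a hypothetical 4-node H (the placement of its zeros is an assumption, not taken from the slides):

```python
import numpy as np

# Hypothetical 4-node hyperlink matrix; row 3 is a dangling node (all zeros).
H = np.array([[0.0, 1/3, 1/3, 1/3],
              [1/2, 0.0, 1/2, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1/2, 1/2]])
n = H.shape[0]
alpha = 0.85
v = np.full(n, 1.0/n)                       # teleportation vector, e^T v = 1
a = (H.sum(axis=1) == 0).astype(float)      # indicator of dangling nodes
S = H + np.outer(a, v)                      # stochastic matrix: S e = e
G = alpha*S + (1 - alpha)*np.outer(np.ones(n), v)   # Google matrix: G e = e

# Route 1: dominant eigenvector of G^T (eigenvalue 1).
w, V = np.linalg.eig(G.T)
x_eig = np.real(V[:, np.argmax(np.real(w))])
x_eig /= x_eig.sum()

# Route 2: solve (I - alpha*H^T) x = v, then normalize x <- x / e^T x.
x_lin = np.linalg.solve(np.eye(n) - alpha*H.T, v)
x_lin /= x_lin.sum()

print(np.allclose(x_eig, x_lin))            # True: both routes agree
```

Both routes return the same nonnegative vector after the x <- x/e^T x normalization, since G^T x = x reduces to (I - αH^T)x = cv for a positive scalar c.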
Web matrices H

Name               n (pages)  nnz (links)  i (v = e_i)
csstanford         9914       36854        4 (cs.stanford.edu)
Stanford           275689     1623817      23036 (cs.stanford.edu), 6753 (calendus.stanford.edu)
Stanford-Berkeley  683446     7583376      6753 (unknown)

All data collected around 2001 (Cleve Moler, Sep Kamvar, David Gleich)

(I - αH^T)x = v
Power method on G^T x = x
Stanford, n = 275000, v = e_6753 (calendus.stanford.edu)
[Figure: time (secs) vs α = 0.1:0.01:0.99; runtime grows steeply toward α = 1, reaching about 60 secs]
Power method
Stanford-Berkeley, n = 683000, v = e_6753
[Figure: time (secs) vs α = 0.1:0.01:0.99; runtime reaches about 180 secs as α -> 1]
Exact x(α)
csstanford, n = 9914, v = e_4 (cs.stanford.edu), α = 0.85
[Figure: x_j(α), j = 1:n, computed as xtrue = A\v; a few spikes up to about 0.25, most entries near zero]
Exact x(α)
Stanford, n = 275000, v = e_23036 (cs.stanford.edu), α = 0.85
[Figure: x_j(α), j = 1:n]
Sparsity of x(α)
Stanford, n = 275000, v = e_23036 (cs.stanford.edu), α = 0.1:0.01:0.99 (90 values)
[Spy plots of the entries exceeding various thresholds, one column per α:
  x > 250/n:  nz = 4638      x > 500/n:  nz = 3076
  x > 1000/n: nz = 2305      x > 2000/n: nz = 1909]
Basis Pursuit Denoising
Lasso(ν)  (Tibshirani 1996)
  min (1/2)||Ax - b||^2  s.t.  ||x||_1 ≤ ν

Basis Pursuit  (Chen, Donoho & S 1998)
  min ||x||_1  s.t.  Ax = b
  A = fast operator (Ax, A^T y are efficient)

BPDN(λ)  (same paper)
  min λ||x||_1 + (1/2)||Ax - b||^2
  A = fast operator, any shape
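The BPDN objective is easy to experiment with directly. A minimal sketch using iterative soft-thresholding (ISTA), a simple proximal-gradient method rather than the active-set approach of these slides; the random A and planted sparse solution are illustrative only:

```python
import numpy as np

def bpdn_ista(A, b, lam, iters=2000):
    """Minimize lam*||x||_1 + 0.5*||Ax - b||^2 by soft-thresholded gradient steps."""
    L = np.linalg.norm(A, 2)**2             # Lipschitz constant of the smooth term
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - A.T @ (A @ x - b) / L       # gradient step on 0.5*||Ax - b||^2
        x = np.sign(z) * np.maximum(np.abs(z) - lam/L, 0.0)   # prox of lam*||.||_1
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x0 = np.zeros(100)
x0[[3, 30, 70]] = [2.0, -1.5, 1.0]          # planted sparse solution
b = A @ x0
x = bpdn_ista(A, b, lam=0.05)
print(np.flatnonzero(np.abs(x) > 0.1))      # indices of the large entries
```

With a small λ the minimizer concentrates on the planted support, illustrating why the λ||x||_1 term is a sparsity-inducing penalty.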
BP and BPDN algorithms    min λ||x||_1 + (1/2)||Ax - b||^2

OMP            Davis, Mallat et al 1997            Greedy
BPDN-interior  Chen, Donoho & S 1998, 2001         Interior, CG
PDSCO, PDCO    Saunders 1997, 2002                 Interior, LSQR
BCR            Sardy, Bruce & Tseng 2000           Orthogonal blocks
Homotopy       Osborne et al 2000                  Active-set, all λ
LARS           Efron, Hastie, Tibshirani 2004      Active-set, all λ
STOMP          Donoho, Tsaig, et al 2006           Double greedy
l1_ls          Kim, Koh, Lustig, Boyd et al 2007   Primal barrier, PCG
GPSR           Figueiredo, Nowak & Wright 2007     Gradient projection
BCLS           Friedlander 2006                    Projection, LSQR
SPGL1          van den Berg & Friedlander 2007     Spectral GP, all λ
BPdual         Friedlander & S 2007                Active-set on dual
LPdual         Friedlander & S 2007                For x ≥ 0
||x||_1 = e^T x when x ≥ 0. This suggests regularized LP problems:

LPprimal(λ)   min_{x,y}  e^T x + (λ/2)||y||^2    s.t.  Ax + λy = b,  x ≥ 0

LPdual(λ)     min_y  -b^T y + (λ/2)||y||^2       s.t.  A^T y ≤ e

Friedlander and S 2007: new active-set Matlab codes
LPdual solver for Ax ≈ b, x ≥ 0

  min_y  -b^T y + (λ/2)||y||^2   s.t.  A^T y ≤ e

A few active constraints: B^T y = e
Initially B is empty, y = 0
Select columns of B in an almost greedy manner

Main work per iteration:
  Solve min ||Bx - g||      (update QB = (R; 0) without storing Q)
  Form dy = (g - Bx)/λ
  Form dz = A^T dy
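The loop above can be sketched in a few lines. This is a simplified stand-in for LPdual, not the Friedlander-Saunders code: it starts from the unconstrained minimizer rather than y = 0, omits constraint dropping, and re-solves each subproblem by normal equations instead of updating a QR factorization:

```python
import numpy as np

def lpdual_greedy(A, b, lam, max_iter=50):
    """Greedy active-set sketch for  min_y -b'y + (lam/2)||y||^2  s.t.  A'y <= e.
    The multipliers x on the active constraints estimate the sparse primal x >= 0."""
    m, n = A.shape
    active = []                             # columns of A forming B
    x = np.zeros(0)
    y = b / lam                             # unconstrained minimizer
    for _ in range(max_iter):
        z = A.T @ y
        j = int(np.argmax(z))
        if z[j] <= 1.0 + 1e-10:             # A'y <= e holds: done
            break
        active.append(j)                    # add most violated constraint to B
        B = A[:, active]
        # Subproblem with B'y = e: stationarity gives y = (b - Bx)/lam,
        # and the constraint then gives B'B x = B'b - lam*e.
        x = np.linalg.solve(B.T @ B, B.T @ b - lam*np.ones(len(active)))
        y = (b - B @ x) / lam
    xfull = np.zeros(n)
    xfull[active] = x
    return xfull, y

# Tiny PageRank-style example: A = I - alpha*H^T, b = v = e_1 (hypothetical H).
H = np.array([[0.0, 1/3, 1/3, 1/3],
              [1/2, 0.0, 1/2, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1/2, 1/2]])
alpha = 0.85
A = np.eye(4) - alpha*H.T
xfull, y = lpdual_greedy(A, np.array([1.0, 0, 0, 0]), lam=0.1)
print(np.flatnonzero(xfull))                # the greedily selected pages
```

Each pass adds one constraint (one candidate page), which is the greedy behavior the slides exploit: stopping early yields a very sparse nonnegative x.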
Sparse PageRank with Neumann Series
Neumann series

  (I - αH^T)x = v    (*)

  x = (I - αH^T)^{-1} v = (I + αH^T + (αH^T)^2 + ...) v

A sum of nonnegative terms
Equivalent to Jacobi's method on (*)   (Thanks David!)
||x - x_k|| decreases monotonically
||r_k||/(1+α) ≤ ||x - x_k|| ≤ ||r_k||/(1-α)   Error bounded by residual
Eventually small (x_k)_j <- 0, update remainder
See Gleich and Polito (2006) for sparse Power Method
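A minimal sketch of the truncated series on a hypothetical 4-node H; zeroing tiny entries of each term is a crude stand-in for the remainder updating described above:

```python
import numpy as np

def pagerank_neumann(H, v, alpha, tol=1e-6, drop=1e-8):
    """Approximate x = (I + alpha*H^T + (alpha*H^T)^2 + ...) v,
    zeroing entries of each term below `drop` to keep the iterate sparse."""
    x = v.copy()
    term = v.copy()
    while np.linalg.norm(term, 1) > tol:    # terms shrink by a factor <= alpha
        term = alpha * (H.T @ term)
        term[np.abs(term) < drop] = 0.0
        x += term
    return x

H = np.array([[0.0, 1/3, 1/3, 1/3],
              [1/2, 0.0, 1/2, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1/2, 1/2]])
alpha = 0.85
v = np.array([1.0, 0.0, 0.0, 0.0])
x = pagerank_neumann(H, v, alpha)
xtrue = np.linalg.solve(np.eye(4) - alpha*H.T, v)
print(np.linalg.norm(x - xtrue, 1))         # small: error bounded by the residual
```

Since ||αH^T||_1 ≤ α < 1, each term is at most α times the previous one in the 1-norm, so the loop terminates and the truncation error is bounded as on the slide.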
Sparse PageRank with BPDN (LPdual)
Sparse PageRank

Regard (I - αH^T)x = v  as  Ax ≈ v,  v sparse  (v = e_i say)

Apply active-set solver LPdual to the dual of

  min_{x,r}  λe^T x + (1/2)||r||^2   s.t.  Ax + r = v,  x ≥ 0
[Figure: "Stanford alpha=0.950 v=e(4) x(active)" — nonzero x_j in the order chosen, about 20 entries, values up to about 0.08]
Stanford web, n = 275000, α = 0.85, v = e_23036 (cs.stanford.edu)
Direct solve: xtrue = A\v
BPDN with λ = 2e-3: 76 nonzero x_j
Sparsity of xbp(α)
Stanford, n = 275000, v = e_23036 (cs.stanford.edu), α = 0.1:0.01:0.99 (90 values)
[Spy plots: xpower > 1000/n (nz = 2305) vs xbp > 1/n (nz = 2382)]
xbp computed with λ = 4e-3; by contrast, xpower > 1/n has 6000 rows, 155000 nonzeros
Power method vs BP
Stanford, n = 275000, v = e_23036 (cs.stanford.edu), λ = 4e-3
[Figure: time (secs) for each α = 0.1:0.01:0.99 for BP, Direct, and Power; y-axis 0-70]
Power method vs BP
Stanford-Berkeley, n = 683000, v = e_6753 (unknown), λ = 5e-3
[Figure: time (secs) for each α = 0.1:0.01:0.99 for BP, Direct, and Power; y-axis 0-180]
Varying α and λ

  min_{x,r}  λ||x||_1 + (1/2)||r||^2   s.t.  (I - αH^T)x + r = v

H = 9914 x 9914 csstanford web,  v = e_4 (cs.stanford.edu)
α = 0.75, 0.85, 0.95
λ = 1e-3, 2e-3, 4e-3, 6e-3
Plot nonzero x_j in the order they are chosen
csstanford, v = e_4, min λe^T x + (1/2)||r||^2
[Figures for α = 0.75, 0.85, 0.95, each with panels for λ = 1e-3, 2e-3, 4e-3, 6e-3: nonzero x_j in the order chosen. Larger λ selects fewer entries (roughly 70-100 at λ = 1e-3 down to roughly 20-25 at λ = 6e-3); larger α spreads the weight over smaller values (peaks near 0.25 at α = 0.75, near 0.08 at α = 0.95)]
SPAI: Sparse Approximate Inverse
SPAI preconditioning: a plausible future application

Find sparse X such that AX ≈ I
Equivalently, find sparse x_j such that Ax_j ≈ e_j for j = 1:n
Embarrassingly parallel use of sparse BP solutions
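A column-by-column sketch of the idea, with a plain soft-thresholding loop standing in for a sparse BP solver (the test matrix and λ are illustrative; the n solves are independent and could run in parallel):

```python
import numpy as np

def spai_bpdn(A, lam=1e-2, iters=1000):
    """Sparse approximate inverse: for each j, approximately minimize
    lam*||x||_1 + 0.5*||A x - e_j||^2 and take x as column j of X."""
    n = A.shape[1]
    L = np.linalg.norm(A, 2)**2             # step-size bound for gradient steps
    X = np.zeros((n, n))
    for j in range(n):                      # embarrassingly parallel over j
        x = np.zeros(n)
        ej = np.eye(n)[:, j]
        for _ in range(iters):
            z = x - A.T @ (A @ x - ej) / L
            x = np.sign(z) * np.maximum(np.abs(z) - lam/L, 0.0)
        X[:, j] = x
    return X

A = np.eye(5) + 0.1*np.diag(np.ones(4), 1)  # illustrative well-conditioned matrix
X = spai_bpdn(A)
print(np.linalg.norm(A @ X - np.eye(5)))    # AX is close to I
print(np.count_nonzero(X))                  # and X is sparse
```

The λ penalty trades accuracy of AX ≈ I against sparsity of X, exactly the trade-off a preconditioner wants.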
Summary
(I - αH^T)x ≈ v

Comments
- Sparse v sometimes gives sparse x
- LPprimal and LPdual work in a greedy manner
- Greedy solvers guarantee sparse x (unlike the power method)
- Not sensitive to α
- Sensitive to λ, but we can limit iterations

Questions
- Run much bigger examples
- Which properties of I - αH^T lead to the success of greedy?
Thanks
Shaobing Chen and David Donoho, Michael Friedlander (BP solvers)
Sou-Cheng Choi and Lek-Heng Lim (ICIAM 07)
Amy Langville and Carl Meyer (splendid book)
David Gleich
Holly Jin