Approximating a single component of the solution to a linear system

Approximating a single component of the solution to a linear system
Christina E. Lee, Asuman Ozdaglar, Devavrat Shah
celee@mit.edu, asuman@mit.edu, devavrat@mit.edu
MIT LIDS

How do I compare to my competitors? PageRank, Bonacich Centrality, Rank Centrality.

Q: Do I need to compute the entire solution vector in order to get an estimate of my centrality and the centrality of my competitors? That would be far too expensive! These centrality measures are either the stationary distribution of a Markov chain (PageRank, Rank Centrality) or the solution to a linear system (Bonacich Centrality).

Q: How do we efficiently compute the solution for a single coordinate within a large network?
(1) Stationary distribution of a Markov chain: G = P (i.e. a stochastic matrix), z = 0. Sample short random walks; this relies on a sharp characterization of π_i.
(2) Solution to a linear system of equations: if A is positive definite, then we can choose G, z such that spectral radius(G) < 1 and x = Gx + z is an equivalent system. Expand the computation in a local neighborhood, using the sparsity of G to bound the size of the neighborhood.
We would like a local method, i.e. one cheaper than computing the full vector.

Overview of Literature: Probabilistic Monte Carlo methods. For the stationary distribution of a Markov chain: MCMC methods. For x = Gx + z with spectral radius(G) < 1: Ulam-von Neumann. These sample long random walks; they work well if G has good spectral properties; the challenge is to control the variance of the estimator; they are naturally parallelizable.

Overview of Literature: Iterative Algebraic methods. For the stationary distribution: the power method. For x = Gx + z: linear iterative updates, Jacobi, Gauss-Seidel. These perform iterative matrix-vector multiplication; they work well if G is sparse and has good spectral properties; they are distributed, but synchronous; each multiplication is O(n^2) in the dense case.

Overview of Literature: other algebraic methods include the SVD and Gaussian elimination. None of these are designed for single-component computation; they require global computation. What if we specifically thought about solving for a single component?

Contributions (Probabilistic Monte Carlo): a new Monte Carlo method which samples local truncated returning random walks; its convergence is a function of the mixing time; the estimate is always guaranteed to be an upper bound; we suggest and analyze specific termination criteria which give multiplicative error bounds for high π_i and threshold low π_i. [L., Ozdaglar, Shah, NIPS 2013]

Contributions (Iterative Algebraic): a local linear iterative update which can be implemented asynchronously, maintains sparse intermediate vectors, and provides invariants which track the progress of the method. [L., Ozdaglar, Shah, arXiv:1411.2647, 2014]

Both methods are inspired by previous algorithms designed for computing PageRank: the Monte Carlo line [JW03, H03, FRCS05, ALNO07, BCG10, BBCT12] and the iterative algebraic line [ABCHMT07].

Computing Stationary Probability. Consider a Markov chain with a finite state space and a transition matrix that is irreducible and positive recurrent. Goal: compute the stationary probability π_i, focusing on nodes with higher values.

Node-Centric Monte Carlo. The standard Monte Carlo method samples long random walks until the chain converges to stationarity (justified by the ergodic theorem). Our method is instead based on the return-time property π_i = 1/E_i[T_i], where T_i is the time for a walk started at i to return to i.
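As an illustrative sketch (my own toy code, not the authors' implementation), the return-time property can be turned directly into an estimator by simulating return walks on a small hypothetical chain:

```python
import random

# A hypothetical 3-state chain for illustration (rows sum to 1).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

def estimate_pi(i, num_samples=20000, seed=0):
    """Estimate pi_i as 1 / (empirical mean return time to state i)."""
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(num_samples):
        state, steps = i, 0
        while True:
            # Advance the chain one step.
            state = rng.choices(range(len(P)), weights=P[state])[0]
            steps += 1
            if state == i:
                break  # returned to i
        total_steps += steps
    return num_samples / total_steps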

Intuition. Estimate π_i by the reciprocal of the empirical mean return time to i; truncating the sampled walks can only shorten the return times, which is why the estimate is always an upper bound.
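The missing estimate can plausibly be reconstructed as follows (a hedged reconstruction; $T_i^{(k)}$ for the $k$-th sampled return time and $\theta$ for the truncation level are notation of my choosing):

```latex
\hat{\pi}_i
  = \left( \frac{1}{N} \sum_{k=1}^{N} \min\!\left(T_i^{(k)},\, \theta\right) \right)^{-1}
```

Since $\min(T,\theta) \le T$, the empirical mean underestimates $\mathbb{E}_i[T_i]$, so its reciprocal overestimates $\pi_i = 1/\mathbb{E}_i[T_i]$, matching the upper-bound guarantee.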

Algorithm: GatherSamples(i, N, θ). Sample N return paths to i, truncating any path whose length exceeds θ; record the fraction of samples that were truncated. [Figure: from node i, a walk either returns to i or keeps walking until the path length exceeds θ.] Q: How do we choose appropriate N, θ?

Algorithm (outer loop): for k = 1, 2, ...: run GatherSamples(i, N^(k), θ^(k)); terminate if satisfied; otherwise compute θ^(k+1) and N^(k+1) and increment k. Double θ: θ^(k+1) = 2 θ^(k). Increase N so that, by Chernoff's bound, the empirical estimate concentrates.
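The GatherSamples primitive plus the doubling outer loop can be sketched as follows (an illustrative toy implementation; the thresholds eps and delta and the N schedule are my own stand-ins for the paper's Chernoff-based choices):

```python
import random

def gather_samples(step, i, N, theta, rng):
    """Sample N return paths to i, truncated at length theta.
    `step(state, rng)` advances the chain one transition.
    Returns (estimate of pi_i, fraction of samples truncated)."""
    total_length, truncated = 0, 0
    for _ in range(N):
        state, length = i, 0
        while length < theta:
            state = step(state, rng)
            length += 1
            if state == i:
                break               # returned to i
        else:
            truncated += 1          # path length exceeded theta
        total_length += length
    # Truncation only shortens paths, so the estimate upper-bounds pi_i.
    return N / total_length, truncated / N

def local_estimate(step, i, eps=0.05, delta=0.1, seed=0):
    """Outer loop: double theta each round, growing N as well."""
    rng = random.Random(seed)
    theta, N = 2, 1000
    while True:
        est, frac_trunc = gather_samples(step, i, N, theta, rng)
        # Terminate if (a) the estimate is already small, or
        # (b) few samples were truncated (the bias is small).
        if est < delta or frac_trunc < eps:
            return est
        theta, N = 2 * theta, 2 * N
```

Any fixed stochastic matrix can be plugged in through the `step` callback, so the routine never needs the full transition matrix in memory.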

Convergence. Q: Can we design smart termination criteria?

Algorithm (with termination): for k = 1, 2, ...: run GatherSamples(i, N^(k), θ^(k)); terminate, outputting the current estimate, if Satisfied(Δ), i.e. if (a) the estimated stationary probability is small, or (b) the fraction of truncated samples is small; otherwise compute θ^(k+1) and N^(k+1) and increment k.

Results for Termination Criteria. Termination condition (a) guarantees that π_i is small; termination condition (b) guarantees a multiplicative error bound. Both guarantees depend on the algorithm parameters, not on the input Markov chain. The results extend to countable state space Markov chains.

Simulation (PageRank). [Plot: stationary probability vs. nodes sorted by stationary probability.] Random graph generated using the configuration model with a power law degree distribution.

Simulation (PageRank). We obtain close estimates for the important nodes.

Simulation (PageRank). Scaling by the fraction of samples that were not truncated corrects for the bias!

Recap: compute stationary probability (PageRank, Rank Centrality) by sampling local random walks.

Q: Can we extend this to solving a linear system? The classical approach is the Ulam-von Neumann algorithm: when the spectral radius of G is less than 1, use importance sampling, i.e. sample random walks and reweight them to obtain an unbiased estimator of x_i. However, for some G there may not exist a sampling matrix for which the variance is finite, and the method does not exploit sparsity. (PageRank, Bonacich Centrality)
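A minimal sketch of the Ulam-von Neumann idea (my own illustrative choices: uniform sampling over nonzero entries of each row and a fixed stopping probability; not the authors' code):

```python
import random

def ulam_von_neumann(G, z, i, num_walks=20000, p_stop=0.3, seed=0):
    """Unbiased estimate of x_i where x = G x + z, via reweighted walks.
    Each walk accumulates w_k * z[s_k], where w_k is the product of the
    importance weights G[s][t] / P[s][t] along the path so far."""
    rng = random.Random(seed)
    n = len(G)
    total = 0.0
    for _ in range(num_walks):
        s, w, acc = i, 1.0, 0.0
        while True:
            acc += w * z[s]
            if rng.random() < p_stop:
                break                    # walk terminates
            nbrs = [t for t in range(n) if G[s][t] != 0.0]
            if not nbrs:
                break                    # row of zeros: nothing to sample
            t = rng.choice(nbrs)
            # Sampling probability of this transition:
            p = (1.0 - p_stop) / len(nbrs)
            w *= G[s][t] / p             # reweight to stay unbiased
            s = t
        total += acc
    return total / num_walks
```

Note the weakness mentioned above: if the ratios G[s][t]/p exceed 1 in magnitude, the weights w can blow up along long walks, and the variance of the estimator may be infinite even though its mean is correct.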

Contributions (recap). Probabilistic Monte Carlo: MCMC methods [JW03, H03, FRCS05, ALNO07, BCG10, BBCT12] for the stationary distribution; Ulam-von Neumann for x = Gx + z with spectral radius(G) < 1. Iterative Algebraic: the power method; linear iterative updates, Jacobi, Gauss-Seidel [ABCHMT07]; other: SVD, Gaussian elimination.

Synchronous Algorithm. The standard stationary linear iterative method keeps iterates x^(t) that are as dense as z; it relies on the Neumann series representation of the solution. When solving for a single coordinate i, the intermediate vectors can instead be kept local: their sparsity pattern is the k-hop neighborhood of i.
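The Neumann series representation referred to here is the standard one (valid when ρ(G) < 1); writing it out for coordinate i shows why only the k-hop neighborhood of i matters:

```latex
x = \sum_{k=0}^{\infty} G^k z,
\qquad
x_i = e_i^\top x = \sum_{k=0}^{\infty} \big((G^\top)^k e_i\big)^\top z
```

The vector $(G^\top)^k e_i$ is supported on the set of nodes within $k$ hops of $i$ in the graph of $G$, so each term of the series touches only a local neighborhood.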

Synchronous Algorithm (initialize, update step, termination test, output). The sparsity of the intermediate vector p^(t) grows as a breadth-first traversal from i.
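A hedged sketch of a synchronous single-coordinate solver consistent with the description above (the variable name p and the L1 termination test are my own assumptions, not necessarily the slide's exact formulas):

```python
def solve_coordinate_sync(G, z, i, eps=1e-6, max_iters=1000):
    """Estimate x_i for x = G x + z by expanding a local neighborhood.
    The sparse dict p holds (G^T)^t e_i; the estimate accumulates z . p."""
    p = {i: 1.0}                    # sparse: indicator of coordinate i
    estimate = 0.0
    for _ in range(max_iters):
        estimate += sum(w * z[u] for u, w in p.items())
        # Synchronous step p <- G^T p: the support grows like a BFS from i.
        nxt = {}
        for u, w in p.items():
            for v, g in enumerate(G[u]):
                if g != 0.0:
                    nxt[v] = nxt.get(v, 0.0) + g * w
        p = nxt
        if sum(abs(w) for w in p.values()) < eps:
            break
    return estimate
```

Each iteration touches only the current support of p, so for a sparse G the cost depends on neighborhood sizes rather than on n.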

Synchronous Guarantee. Under the stated conditions, the algorithm terminates with (a) an estimate satisfying the target error bound, and (b) a total number of multiplications bounded by a quantity that is independent of the size of the matrix.

Synchronous to Asynchronous. Initialize; update step: update a single coordinate u of the residual r^(t); terminate when the residual is small; output the estimate. What if we implemented the updates asynchronously instead?

Asynchronous Algorithm. Initialize; update step: choose a coordinate u and update it; terminate when the residual is small; output the estimate. Q: Does this converge? At what rate? Yes: if ρ(G) < 1 and every coordinate u is chosen infinitely often, the iterates converge; a bound holds for all t and for any chosen sequence of u.
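A hedged sketch of the asynchronous variant (the largest-residual selection rule is one valid choice; the algorithm only requires that every residual coordinate be chosen infinitely often). It maintains the invariant x_i = estimate + Σ_u r[u]·x_u:

```python
def solve_coordinate_async(G, z, i, eps=1e-6, max_updates=100000):
    """Asynchronous single-coordinate solver for x = G x + z.
    Invariant: x_i = estimate + sum_u r[u] * x_u."""
    r = {i: 1.0}                    # sparse residual
    estimate = 0.0
    for _ in range(max_updates):
        if sum(abs(w) for w in r.values()) < eps:
            break
        # Pick the coordinate carrying the largest residual mass.
        u = max(r, key=lambda v: abs(r[v]))
        ru = r.pop(u)
        # Substitute x_u = z_u + sum_v G[u][v] x_v into the invariant.
        estimate += ru * z[u]
        for v, g in enumerate(G[u]):
            if g != 0.0:
                r[v] = r.get(v, 0.0) + ru * g
    return estimate
```

Because each update touches a single coordinate and its out-neighborhood, updates can in principle be applied in any order, without synchronizing a full pass over the residual.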

Asynchronous Guarantee. Under the stated conditions, the algorithm terminates with (a) an estimate satisfying the target error bound, and (b) with probability > 1 - δ, a total number of multiplications bounded by a quantity that is independent of the size of the matrix.

Simulations: compare different choices of update rule. Bonacich centrality on an Erdős–Rényi graph with n = 1000 and p = 0.0276.

Summary. (1) Compute the stationary probability by sampling local random walks: a new Monte Carlo method which samples local truncated returning random walks; convergence is a function of the mixing time; the estimate is always guaranteed to be an upper bound; we suggest and analyze specific termination criteria which give multiplicative error bounds for high π_i and threshold low π_i. [L., Ozdaglar, Shah, NIPS 2013] (2) Solve Ax = b by expanding the computation within a neighborhood and utilizing sparsity: can be implemented asynchronously; maintains sparse intermediate vectors; provides invariants which track the progress of the method. [L., Ozdaglar, Shah, arXiv:1411.2647, 2014]