ORIE 6334 Spectral Graph Theory    October 13, 2016
Lecture 15
Lecturer: David P. Williamson    Scribe: Shijin Rajakrishnan

1 Iterative Methods

We have seen in previous lectures that given an electrical network, to obtain the potentials of an electrical flow for a supply vector b, we need to solve the system L_G p = b, where L_G is the Laplacian of the graph. Last time, we saw how to construct a low-stretch tree. This lecture is an attempt to explain why low-stretch trees are useful in solving this system.

One method we could use to solve this system is to compute the pseudoinverse L_G^+ and set p = L_G^+ b. However, computing the (pseudo)inverse can be a very slow operation: just writing the inverse down takes O(n^2) time. This is costly when the input matrix is sparse and has only a few non-zero entries; we would ideally want an algorithm for solving the system whose running time depends on the sparsity of the matrix. This motivates us to explore iterative methods for solving linear systems of equations. To solve a linear system Ax = b, iterative algorithms only require multiplication by the matrix A, and for a matrix A with m non-zero entries this can be done in O(m) time. One disadvantage of such methods is that, unlike direct methods such as Gaussian elimination, they return only an approximate solution, with the gap shrinking the longer the algorithm runs. However, they are quite fast and require little space.

The basic idea behind iterative methods is the following: to solve a system of linear equations Ax = b, where A is symmetric and positive definite, we start with a vector x_0, apply the linear operator A to it (along with some vector additions) to get x_1, and iteratively keep performing these operations to obtain the sequence x_0, x_1, ..., x_t, stopping when x_t is sufficiently close to the vector x that satisfies Ax = b. The exact details are outlined below, but we can already see that the only expensive operation is multiplication by the matrix A, which is fast if A is sparse.

Before we dive into the algorithm, we note that if Ax = b, then for any scalar α we have αAx = αb, and rearranging gives us

    x = (I − αA)x + αb.

(This lecture is derived from one of Spielman's spectral graph theory lectures, http://www.cs.yale.edu/homes/spielman/561/, and from Spielman and Woo 2009, https://arxiv.org/pdf/0903.2816v1.pdf.)
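To make the setup concrete, here is a small numerical sketch (ours, not from the notes) of the slow baseline described above: form the Laplacian of a tiny graph explicitly and solve L_G p = b through the pseudoinverse. The example graph and the numpy calls are illustrative choices.

    import numpy as np

    # A small example graph on 4 vertices (a path plus one extra edge).
    edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
    n = 4
    L_G = np.zeros((n, n))
    for i, j in edges:
        L_G[i, i] += 1; L_G[j, j] += 1
        L_G[i, j] -= 1; L_G[j, i] -= 1

    # Supply vector b: one unit of current enters at vertex 0 and leaves at vertex 3.
    b = np.array([1.0, 0.0, 0.0, -1.0])

    # The dense baseline: compute the pseudoinverse, ignoring sparsity entirely.
    p = np.linalg.pinv(L_G) @ b
    print(p)    # potentials, determined up to an additive constant

The iterative methods below avoid ever forming L_G^+ and instead only multiply vectors by the (sparse) matrix.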

This tells us that x is a fixed point of the affine transformation indicated by the equation, and it naturally leads to an iterative algorithm, called the Richardson iteration.

Algorithm 1: Richardson Iteration
    x_0 ← 0
    for t ← 1 to k do
        x_t ← (I − αA)x_{t−1} + αb

We now analyze the above algorithm and prove that it converges to a vector close to the solution vector x. We will see that the convergence of this algorithm depends on the spectral norm ‖I − αA‖, which is the maximum absolute value of the eigenvalues of I − αA. Suppose that λ_1 ≤ λ_2 ≤ ... ≤ λ_n are the eigenvalues of the matrix A. Then the eigenvalues of I − αA are 1 − αλ_1 ≥ 1 − αλ_2 ≥ ... ≥ 1 − αλ_n, and thus

    ‖I − αA‖ = max_i |1 − αλ_i| = max(|1 − αλ_1|, |1 − αλ_n|).

This is minimized when we take α = 2/(λ_1 + λ_n), yielding

    ‖I − αA‖ = 1 − 2λ_1/(λ_1 + λ_n).

Now we turn to the analysis of the convergence of the Richardson iteration:

    x − x_t = [(I − αA)x + αb] − [(I − αA)x_{t−1} + αb]
            = (I − αA)(x − x_{t−1})
            = (I − αA)^2 (x − x_{t−2})
            = ...
            = (I − αA)^t (x − x_0)
            = (I − αA)^t x.

We define x_t to be close to x when the norm of their difference is a small fraction of the norm of x. Then

    ‖x − x_t‖ = ‖(I − αA)^t x‖
              ≤ ‖I − αA‖^t ‖x‖
              = (1 − 2λ_1/(λ_1 + λ_n))^t ‖x‖
              ≤ exp(−2λ_1 t/(λ_1 + λ_n)) ‖x‖,

where the final step used the fact that 1 − x ≤ e^{−x}. To make this difference ε-small, we set

    t = ((λ_1 + λ_n)/(2λ_1)) ln(1/ε) = (λ_n/(2λ_1) + 1/2) ln(1/ε),

so that we obtain ‖x − x_t‖ ≤ ε‖x‖.
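As a concrete illustration, here is a minimal numpy sketch of the Richardson iteration (ours, not from the notes). It picks α = 2/(λ_1 + λ_n) by computing the eigenvalues of A directly, which is only sensible for a toy example, and it stops on the residual ‖Ax_t − b‖ rather than the (unknown) error ‖x − x_t‖.

    import numpy as np

    def richardson(A, b, eps=1e-8, max_iters=100000):
        # Richardson iteration x_t = (I - alpha*A) x_{t-1} + alpha*b for symmetric PD A.
        lam = np.linalg.eigvalsh(A)          # eigenvalues in ascending order (toy-sized A only)
        alpha = 2.0 / (lam[0] + lam[-1])     # the optimal step size from the analysis above
        x = np.zeros_like(b, dtype=float)
        for _ in range(max_iters):
            x = x - alpha * (A @ x - b)      # equivalently (I - alpha*A) x + alpha*b
            if np.linalg.norm(A @ x - b) <= eps * np.linalg.norm(b):
                break
        return x

    # Example: a random symmetric positive definite system.
    rng = np.random.default_rng(0)
    M = rng.standard_normal((5, 5))
    A = M @ M.T + 5 * np.eye(5)
    b = rng.standard_normal(5)
    print(np.linalg.norm(richardson(A, b) - np.linalg.solve(A, b)))

Each pass of the loop costs one multiplication by A, which is O(m) when A has m non-zero entries; this is the property the rest of the lecture exploits.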

We can see that the speed of convergence, i.e., the number of iterations required to get close to the solution of Ax = b, depends on the ratio of the largest and smallest eigenvalues, λ_n/λ_1, and that the larger this ratio is, the longer the algorithm takes to converge to an approximate solution.

Definition 1  For a symmetric, positive definite matrix A with eigenvalues λ_1 ≤ λ_2 ≤ ... ≤ λ_n, its condition number is defined as κ(A) = λ_n/λ_1.

This algorithm was just one example of an iterative method for finding an approximate solution to a linear system. There are other, faster methods (such as the Chebyshev method and conjugate gradient) that find an ε-approximate solution in O(√(κ(A)) ln(1/ε)) iterations.

2 Preconditioning

In any case, we can see that if we modify the initial problem so that the condition number decreases, then these algorithms will run faster. Of course, since we change the problem, we need to worry about how far the new solution is from the old one, and how the algorithm changes when we change the initial matrix. One such idea is to precondition the matrix: for a symmetric matrix B ⪰ 0 with the same nullspace as A, instead of solving Ax = b, we solve B^+ A x = B^+ b, and we apply the iterative methods to the matrix B^+ A. This provides an improvement because, as we will prove, for a careful choice of B we can reduce the condition number of the new matrix, and thus approximate the solution faster.

For solving L_G p = b, we precondition by L_H^+, where H is a subgraph of G. In particular, we precondition by L_T^+, where T is a spanning tree of G. This idea is attributed to Vaidya [1]. Now the relevant condition number is λ_n(L_T^+ L_G)/λ_2(L_T^+ L_G): we know that the smallest eigenvalue is zero, and so for the condition number we look at the smallest positive eigenvalue, which, assuming the graph is connected, is λ_2.

Definition 2  A ⪯ B iff B − A ⪰ 0.

Claim 1  L_T ⪯ L_G, where T is a spanning tree of G.

Proof:  For all x,

    x^T L_T x = Σ_{(i,j)∈T} (x(i) − x(j))^2 ≤ Σ_{(i,j)∈E} (x(i) − x(j))^2 = x^T L_G x,

and thus x^T (L_G − L_T) x ≥ 0, so L_G − L_T ⪰ 0, i.e., L_T ⪯ L_G. ∎

Note that this proof also holds for any subgraph of G, not just a spanning tree.
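To see the effect of preconditioning numerically, the following sketch (ours; the small "cycle plus chords" graph and helper names are arbitrary choices) compares the condition number of L_G with that of L_T^+ L_G for a spanning tree T. Since L_T^+ L_G is not symmetric, it uses the equivalent symmetric form from Claim 2 below.

    import numpy as np

    def laplacian(n, edges):
        L = np.zeros((n, n))
        for i, j in edges:
            L[i, i] += 1; L[j, j] += 1
            L[i, j] -= 1; L[j, i] -= 1
        return L

    n = 6
    cycle = [(i, (i + 1) % n) for i in range(n)]
    chords = [(0, 3), (1, 4)]
    L_G = laplacian(n, cycle + chords)
    L_T = laplacian(n, [(i, i + 1) for i in range(n - 1)])   # a path: one spanning tree of G

    # Claim 1: L_T <= L_G, i.e. L_G - L_T is positive semidefinite.
    print(np.linalg.eigvalsh(L_G - L_T).min() >= -1e-9)

    # Condition number over the space orthogonal to the all-ones vector
    # (drop the zero eigenvalue, since the graphs are connected).
    def kappa(M):
        lam = np.sort(np.linalg.eigvalsh(M))
        pos = lam[lam > 1e-9]
        return pos[-1] / pos[0]

    # L_T^{+/2}, built from the eigendecomposition of L_T.
    lam, X = np.linalg.eigh(L_T)
    inv_sqrt = np.array([1.0 / np.sqrt(v) if v > 1e-9 else 0.0 for v in lam])
    S = X @ np.diag(inv_sqrt) @ X.T

    print(kappa(L_G), kappa(S @ L_G @ S))   # condition numbers before and after preconditioning

The preconditioned condition number printed here is bounded by the total stretch of T in G, which is the content of Lemmas 3 and 4 below.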

Claim 2  L_T^+ L_G has the same spectrum as L_T^{+/2} L_G L_T^{+/2}, where

    L_T^{+/2} = Σ_{i: λ_i ≠ 0} (1/√λ_i) x_i x_i^T,

and the λ_i, x_i are the eigenvalues and corresponding eigenvectors of L_T.

Proof:  Consider an eigenvector x of L_T^+ L_G of eigenvalue λ such that ⟨x, e⟩ = 0. Then since L_T^+ L_G x = λx, on setting x = L_T^{+/2} y we get L_T^+ L_G L_T^{+/2} y = λ L_T^{+/2} y. Premultiplying both sides by L_T^{1/2} = Σ_{i: λ_i ≠ 0} √λ_i x_i x_i^T, we obtain L_T^{+/2} L_G L_T^{+/2} y = λy, implying that λ is an eigenvalue of L_T^{+/2} L_G L_T^{+/2} as well. ∎

Using these results, we can prove a bound on the smallest positive eigenvalue of L_T^+ L_G.

Lemma 3  λ_2(L_T^+ L_G) ≥ 1.

Proof:

    λ_2(L_T^+ L_G) = λ_2(L_T^{+/2} L_G L_T^{+/2})
                   = min_{x: ⟨x,e⟩=0} (x^T L_T^{+/2} L_G L_T^{+/2} x)/(x^T x)
                   = min_{y = L_T^{+/2} x, ⟨x,e⟩=0} (y^T L_G y)/(y^T L_T y)
                   ≥ 1,

where the final step used the fact that L_T ⪯ L_G. ∎

So we have bounded the denominator of the condition number of L_T^+ L_G, and we now turn to upper-bounding the numerator. Suppose that G is a weighted graph, with weights 1/w(i,j) ≥ 0 for each edge (i,j) ∈ E (so w(i,j) plays the role of a resistance). The above proof bounding λ_2(L_T^+ L_G) can be used to prove the same bound in the weighted case as well. From the last lecture, recall that for a spanning tree T of G and an edge e = (k,l) ∈ E, the stretch of e is defined as

    st_T(e) = (1/w(k,l)) Σ_{(i,j) on the k-l path in T} w(i,j),

and the total stretch of the graph is

    st_T(G) = Σ_{e∈E} st_T(e).

Lemma 4 (Spielman, Woo '09)  tr(L_T^+ L_G) = st_T(G).
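Before turning to the proof, here is a quick numerical sanity check of Lemma 4 (a sketch; the weighted 4-cycle is an arbitrary example of ours). Edge (i,j) carries resistance w(i,j), so it contributes (1/w(i,j))(e_i − e_j)(e_i − e_j)^T to the Laplacian.

    import numpy as np

    n = 4
    # Edge resistances w(i,j); the Laplacian uses conductances 1/w(i,j).
    w = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (3, 0): 4.0}
    tree = [(0, 1), (1, 2), (2, 3)]                    # spanning tree T (a path)

    def laplacian(edge_resistances):
        L = np.zeros((n, n))
        for (i, j), r in edge_resistances.items():
            c = 1.0 / r
            L[i, i] += c; L[j, j] += c
            L[i, j] -= c; L[j, i] -= c
        return L

    L_G = laplacian(w)
    L_T = laplacian({e: w[e] for e in tree})

    # Total stretch: each tree edge has stretch 1, and the non-tree edge (3,0)
    # has stretch (w(0,1) + w(1,2) + w(2,3)) / w(3,0) = (1 + 2 + 1) / 4 = 1.
    st = 3 * 1 + (1.0 + 2.0 + 1.0) / 4.0
    print(np.trace(np.linalg.pinv(L_T) @ L_G), st)     # both are 4 (up to round-off)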

From this lemma we can arrive at the required bound on the largest eigenvalue: since all the eigenvalues of L_T^+ L_G are nonnegative, λ_n(L_T^+ L_G) ≤ tr(L_T^+ L_G) = st_T(G).

Proof of Lemma 4:

    tr(L_T^+ L_G) = tr( L_T^+ Σ_{(k,l)∈E} (1/w(k,l)) (e_k − e_l)(e_k − e_l)^T )
                  = Σ_{(k,l)∈E} (1/w(k,l)) tr( L_T^+ (e_k − e_l)(e_k − e_l)^T )
                  =^{(a)} Σ_{(k,l)∈E} (1/w(k,l)) tr( (e_k − e_l)^T L_T^+ (e_k − e_l) )
                  = Σ_{(k,l)∈E} (1/w(k,l)) r_eff(k,l)
                  = Σ_{(k,l)∈E} (1/w(k,l)) Σ_{(i,j) on the k-l path in T} w(i,j)
                  = st_T(G),

where (a) used the cyclic property of the trace (the trace is invariant under cyclic permutations, so tr(ABC) = tr(BCA) = tr(CAB)), and r_eff(k,l) is the effective resistance in the tree T for sending one unit of current from k to l, with conductances 1/w(i,j). ∎

Thus, from the previous two lemmas, we see that the condition number of L_T^+ L_G is at most st_T(G), and thus the linear system L_T^+ L_G p = L_T^+ b can be ε-approximately solved for p in O(√(st_T(G)) ln(1/ε)) iterations.

But now each iteration consists of multiplying by the matrix L_T^+ L_G, and initially we need to compute L_T^+ b as well. Thus we need to be able to compute the product of a vector with L_T^+ efficiently. Suppose that we have to compute z = L_T^+ y, or equivalently, solve L_T z = y. It turns out that since T is not just any subgraph but a spanning tree, this computation can be done in O(n) time. To see this, we write down the equations of the system L_T z = y:

    d_T(i) z(i) − Σ_{j: {i,j}∈T} z(j) = y(i)    for all i ∈ V.

Suppose that i is a leaf of T, with incident edge (i,j). Then the relevant equation for this node is z(i) − z(j) = y(i), i.e., z(i) = z(j) + y(i). Note that since i is a leaf, the only equations in which the variable z(i) appears are this one and the equation for z(j). Thus we can substitute z(j) + y(i) for z(i) and recurse on the smaller tree excluding the vertex i. This recursion continues until we end up with a single edge (k,l). In this case, we set z(k) = 0 and back-substitute to find the values of z for all the other vertices.
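Here is a sketch of this leaf-elimination procedure (ours; unweighted, matching the displayed equations, with hypothetical helper names): it peels off leaves, folding each eliminated equation into its neighbor's, and then back-substitutes.

    import numpy as np
    from collections import defaultdict

    def tree_solve(n, tree_edges, y):
        # Solve L_T z = y on a tree: peel leaves, then back-substitute. O(n) total work.
        adj = defaultdict(set)
        for i, j in tree_edges:
            adj[i].add(j); adj[j].add(i)
        y = np.array(y, dtype=float)               # copy; y must sum to zero (b in range of L_T)
        leaves = [v for v in range(n) if len(adj[v]) == 1]
        order = []                                 # (leaf, neighbor) pairs, in elimination order
        while len(order) < n - 1:
            v = leaves.pop()
            u = next(iter(adj[v]))                 # v's unique remaining neighbor
            order.append((v, u))                   # records z(v) = z(u) + y(v)
            y[u] += y[v]                           # fold v's equation into u's
            adj[u].remove(v); adj[v].remove(u)
            if len(adj[u]) == 1:
                leaves.append(u)
        z = np.zeros(n)                            # the last remaining vertex gets z = 0
        for v, u in reversed(order):               # back-substitution
            z[v] = z[u] + y[v]
        return z

    # Verify on a small path; y sums to zero, and L_T z reproduces y exactly.
    edges = [(0, 1), (1, 2), (2, 3)]
    y = [1.0, -2.0, 0.5, 0.5]
    z = tree_solve(4, edges, y)
    L = np.zeros((4, 4))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1; L[i, j] -= 1; L[j, i] -= 1
    print(np.allclose(L @ z, y))                   # True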

It can be seen that this process takes O(n) time, since in each step of the recursion we do constant work and there are n recursive steps. Thus we can compute the matrix-vector product with L_T^+ L_G in O(m) time, and recalling that for a graph G we can find a low-stretch spanning tree T of stretch st_T(G) = O(m log n log log n) in O(m log n log log n) time, we see that given the system L_G p = b, we can find an ε-approximate solution in

    O(m log n log log n + m √(st_T(G)) ln(1/ε)) = Õ(m^{3/2} ln(1/ε))

time. Remember that in finding an upper bound on the largest eigenvalue of L_T^+ L_G, we bounded it by its trace. Spielman and Woo improved upon this running time bound by using the following result.

Theorem 5 (Axelsson, Lindskog '86; as cited in Spielman, Woo '09)  For matrices A, B ⪰ 0 with the same nullspace, let all but q of the eigenvalues of B^+ A lie in the interval [l, u], with the remaining eigenvalues larger than u. Then for a vector b in the range space of A, using the preconditioned conjugate gradient algorithm, an ε-approximate solution x, i.e., one with ‖x − A^+ b‖_A ≤ ε ‖A^+ b‖_A, can be found in

    q + (1/2) √(u/l) ln(2/ε)

iterations, where ‖x‖_A = √(x^T A x).

We can use this theorem as follows: since we have a bound on the trace, we can bound the number of large eigenvalues. Set l = 1 and u = (st_T(G))^{2/3}; then at most q = st_T(G)/u = (st_T(G))^{1/3} eigenvalues can have value more than u. Now √(u/l) = (st_T(G))^{1/3} = q, and thus the number of iterations required to solve the system approximately is O((st_T(G))^{1/3} ln(1/ε)), giving an overall running time of Õ(m^{4/3} ln(1/ε)).

References

[1] Pravin M. Vaidya. Solving linear equations with symmetric diagonally dominant matrices by constructing good preconditioners. Unpublished manuscript, UIUC, 1990. A talk based on the manuscript was presented at the IMA Workshop on Graph Theory and Sparse Matrix Computation, Minneapolis, October 1991.