SVD, discrepancy, and regular structure of contingency tables


1 SVD, discrepancy, and regular structure of contingency tables
Marianna Bolla, Institute of Mathematics, Technical University of Budapest
(Research supported by the TÁMOP B-10/ project)
European Meeting of Statisticians, Budapest, July 21, 2013

2 Outline
Motivation: to find a relatively small number of groups of the objects belonging to the rows and columns of a contingency table that exhibit homogeneous behavior with respect to each other and do not differ significantly in size; to make inferences on the separation that can be achieved for a given number of clusters, minimum normalized bicuts are investigated and related to the SVD of the correspondence matrix.
Topics: singular value decomposition (SVD) of a correspondence matrix; SVD and normalized bicuts of the contingency table; volume-regular row-column cluster pairs; application and a possible extension to directed graphs.

3 References
We partly extend the result of Butler, S., Using discrepancy to control singular values for nonnegative matrices, Lin. Alg. Appl. (2006), for estimating the discrepancy of a contingency table by the second largest singular value of the normalized table (one-cluster, rectangular case), and the result of Bolla, M., Modularity spectra, eigen-subspaces, and structure of weighted graphs, European Journal of Combinatorics (2013), for estimating the constant of volume-regularity by the structural eigenvalues and the distances of the corresponding eigen-subspaces of the normalized modularity matrix of an edge-weighted graph.

4 SVD of contingency tables and correspondence matrices
C = (c_{ij}): n \times m contingency table, c_{ij} \ge 0.
Row set: Row = {1, ..., n}; column set: Col = {1, ..., m}.

    d_{row,i} = \sum_{j=1}^m c_{ij} (i = 1, ..., n),    d_{col,j} = \sum_{i=1}^n c_{ij} (j = 1, ..., m),

    D_{row} = diag(d_{row,1}, ..., d_{row,n}),    D_{col} = diag(d_{col,1}, ..., d_{col,m}).

5 Representation
For a given integer 1 \le k \le \min\{n, m\}, we are looking for k-dimensional representatives r_1, ..., r_n of the rows and c_1, ..., c_m of the columns such that they minimize the objective function

    Q_k = \sum_{i=1}^n \sum_{j=1}^m c_{ij} \| r_i - c_j \|^2    (1)

subject to

    \sum_{i=1}^n d_{row,i} r_i r_i^T = I_k,    \sum_{j=1}^m d_{col,j} c_j c_j^T = I_k.    (2)

When minimized, the objective function Q_k favors k-dimensional placement of the rows and columns such that representatives of highly associated rows and columns are forced to be close to each other. As we will see, this is equivalent to the problem of correspondence analysis.
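
A hedged numpy sketch of how the objective (1) can be evaluated for candidate representatives (the function name is an illustrative choice, not from the talk; X and Y hold the r_i and c_j as rows):

    import numpy as np

    def Q_k(C, X, Y):
        # objective (1): sum_{i,j} c_{ij} ||r_i - c_j||^2
        C = np.asarray(C, dtype=float)
        diff2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)   # ||r_i - c_j||^2
        return (C * diff2).sum()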

6 Solution
X := (x_1, ..., x_k) = (r_1^T, ..., r_n^T)^T: n \times k, Y := (y_1, ..., y_k) = (c_1^T, ..., c_m^T)^T: m \times k. Then

    Q_k = 2k - 2 tr[(D_{row}^{1/2} X)^T C_{corr} (D_{col}^{1/2} Y)] \to \min

subject to X^T D_{row} X = I_k, Y^T D_{col} Y = I_k, where C_{corr} = D_{row}^{-1/2} C D_{col}^{-1/2} is the correspondence matrix (normalized contingency table) belonging to the table C.
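
A minimal numpy sketch of this normalization (the function name and the dense-array representation are illustrative choices, not from the talk):

    import numpy as np

    def correspondence_matrix(C):
        # C: nonnegative n x m contingency table with no identically zero row or column
        C = np.asarray(C, dtype=float)
        d_row = C.sum(axis=1)                          # d_{row,i} from slide 4
        d_col = C.sum(axis=0)                          # d_{col,j} from slide 4
        C_corr = C / np.sqrt(np.outer(d_row, d_col))   # D_{row}^{-1/2} C D_{col}^{-1/2}
        return C_corr, d_row, d_col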

7 Representation theorem
Let C_{corr} = \sum_{i=1}^r s_i v_i u_i^T be the SVD, where r \le \min\{n, m\} is the rank of C_{corr}, or equivalently (since there are no identically zero rows or columns) the rank of C, and 1 = s_1 \ge s_2 \ge ... \ge s_r > 0. Here v_1 = (\sqrt{d_{row,1}}, ..., \sqrt{d_{row,n}})^T and u_1 = (\sqrt{d_{col,1}}, ..., \sqrt{d_{col,m}})^T. Let k \le r be a positive integer such that s_k > s_{k+1}. Then

    \min Q_k = 2k - 2 \sum_{i=1}^k s_i,

and it is attained with X = D_{row}^{-1/2} (v_1, ..., v_k) and Y = D_{col}^{-1/2} (u_1, ..., u_k).
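
A hedged sketch of this construction, reusing correspondence_matrix from the previous sketch (the function name is illustrative; ties s_k = s_{k+1} are not handled):

    import numpy as np

    def optimal_representatives(C, k):
        # X = D_{row}^{-1/2}(v_1,...,v_k), Y = D_{col}^{-1/2}(u_1,...,u_k)
        C_corr, d_row, d_col = correspondence_matrix(C)
        V, s, Ut = np.linalg.svd(C_corr, full_matrices=False)   # C_corr = V diag(s) Ut
        X = V[:, :k] / np.sqrt(d_row)[:, None]                  # rows of X are the r_i
        Y = Ut[:k, :].T / np.sqrt(d_col)[:, None]               # rows of Y are the c_j
        return X, Y, s

On such output, s[0] equals 1 up to rounding, the first columns of X and Y are constant (the trivial pair v_1, u_1 above), and Q_k(C, X, Y) from the earlier sketch agrees, up to rounding, with 2k - 2 \sum_{i=1}^k s_i.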

8 Normalized bicuts of contingency tables
Given an integer k (0 < k \le r), we want to simultaneously partition the rows and columns into disjoint, nonempty subsets Row = R_1 \cup ... \cup R_k, Col = C_1 \cup ... \cup C_k such that the cuts c(R_a, C_b) = \sum_{i \in R_a} \sum_{j \in C_b} c_{ij} (a, b = 1, ..., k) between the row-column cluster pairs are as homogeneous as possible. The normalized two-way cut of C, given the above k-partitions P_{row} = (R_1, ..., R_k) and P_{col} = (C_1, ..., C_k) and the collection of signs σ, is

    ν_k(P_{row}, P_{col}, σ) = \sum_{a=1}^k \sum_{b=1}^k ( 1/Vol(R_a) + 1/Vol(C_b) + 2 σ_{ab} δ_{ab} / \sqrt{Vol(R_a) Vol(C_b)} ) c(R_a, C_b),

9 where Vol(R_a) = \sum_{i \in R_a} d_{row,i}, Vol(C_b) = \sum_{j \in C_b} d_{col,j}, δ_{ab} is the Kronecker delta, σ_{ab} = ±1, and σ = (σ_{11}, ..., σ_{kk}). For the normalized bicut of C:

    \min_{P_{row}, P_{col}, σ} ν_k(P_{row}, P_{col}, σ) \ge 2k - 2 \sum_{i=1}^k s_i.
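
For concreteness, a small sketch (the names and the 0..k-1 label encoding are mine) that evaluates ν_k for given partitions and signs, under the sign convention of the formula as reconstructed above:

    import numpy as np

    def normalized_bicut(C, row_labels, col_labels, sigma):
        # row_labels (length n), col_labels (length m): cluster indices in 0..k-1, all clusters nonempty;
        # sigma: length-k vector of diagonal signs +1/-1
        C = np.asarray(C, dtype=float)
        row_labels, col_labels = np.asarray(row_labels), np.asarray(col_labels)
        d_row, d_col = C.sum(axis=1), C.sum(axis=0)
        nu = 0.0
        for a in range(len(sigma)):
            for b in range(len(sigma)):
                cut = C[row_labels == a][:, col_labels == b].sum()   # c(R_a, C_b)
                vol_R = d_row[row_labels == a].sum()                 # Vol(R_a)
                vol_C = d_col[col_labels == b].sum()                 # Vol(C_b)
                coeff = 1.0 / vol_R + 1.0 / vol_C
                if a == b:
                    coeff += 2.0 * sigma[a] / np.sqrt(vol_R * vol_C)
                nu += coeff * cut
        return nu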

10 Special case: edge-weighted graph
W: symmetric n \times n edge-weight matrix. D = D_{row} = D_{col}; W_{corr} = D^{-1/2} W D^{-1/2}: normalized modularity matrix, I - W_{corr}: normalized Laplacian. When the k - 1 largest absolute value eigenvalues of W_{corr} are all positive: r_i = c_i (i = 1, ..., n = m). With the choice σ_{bb} = -1 (b = 1, ..., k), ν_k(P_{row}, P_{col}, σ) is twice the normalized cut of the graph with respect to the k-partition P_{row} = P_{col} of the vertices. The normalized bicut favors k-partitions with low inter-cluster edge densities: community structure. When the k - 1 largest absolute value eigenvalues of the normalized modularity matrix are all negative: r_i = -c_i, and the minimum of the normalized bicut is attained with the choice σ_{bb} = 1 (b = 1, ..., k). The normalized bicut favors k-partitions with low intra-cluster edge densities: anticommunity structure.

11 Regular row-column cluster pairs
The Expander Mixing Lemma for edge-weighted graphs naturally extends to this situation (Butler): for all R \subset Row and C \subset Col,

    | c(R, C) - Vol(R) Vol(C) | \le s_2 \sqrt{Vol(R) Vol(C)},

where s_2 is the largest-but-one singular value of C_{corr}. Since the spectral gap of C_{corr} is 1 - s_2, in view of the above Expander Mixing Lemma a large spectral gap is an indication of small discrepancy: the weighted cut between any row and column subset of the contingency table is near to what is expected in a random table.
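
A quick numerical sanity check of this bound on a random table (the sizes, the random model, and the subset sampling are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(0)
    C = rng.random((40, 60))
    C = C / C.sum()                                     # total volume 1
    d_row, d_col = C.sum(axis=1), C.sum(axis=0)
    s = np.linalg.svd(C / np.sqrt(np.outer(d_row, d_col)), compute_uv=False)
    s2 = s[1]                                           # largest-but-one singular value

    worst = 0.0
    for _ in range(2000):                               # random row and column subsets
        R = rng.random(40) < 0.5
        Y = rng.random(60) < 0.5
        if R.any() and Y.any():
            vol_R, vol_Y = d_row[R].sum(), d_col[Y].sum()
            dev = abs(C[R][:, Y].sum() - vol_R * vol_Y) / np.sqrt(vol_R * vol_Y)
            worst = max(worst, dev)
    print(worst, "<=", s2)                              # the observed discrepancy stays below the spectral bound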

12 Volume-regular cluster pairs
We extend the notion of discrepancy to volume-regular pairs.
Definition. The row-column cluster pair R \subset Row, C \subset Col of the contingency table C of total volume 1 is γ-volume regular if for all X \subset R and Y \subset C the relation

    | c(X, Y) - ρ(R, C) Vol(X) Vol(Y) | \le γ \sqrt{Vol(R) Vol(C)}

holds, where ρ(R, C) = c(R, C) / (Vol(R) Vol(C)) is the relative inter-cluster density of the row-column pair R, C.
We will show that for a given k, if the clusters are formed by applying the weighted k-means algorithm to the optimal row and column representatives, respectively, then the row-column cluster pairs so obtained are homogeneous in the sense that they form equally dense parts of the contingency table.

13 Weighted k-variance
The weighted k-variance of the k-dimensional row representatives is defined by

    S_k^2(X) = \min_{(R_1,...,R_k)} \sum_{a=1}^k \sum_{j \in R_a} d_{row,j} \| r_j - \bar{r}_a \|^2,

where \bar{r}_a = (1/Vol(R_a)) \sum_{j \in R_a} d_{row,j} r_j is the weighted center of cluster R_a (a = 1, ..., k). Similarly, the weighted k-variance of the k-dimensional column representatives is

    S_k^2(Y) = \min_{(C_1,...,C_k)} \sum_{a=1}^k \sum_{j \in C_a} d_{col,j} \| c_j - \bar{c}_a \|^2,

where \bar{c}_a = (1/Vol(C_a)) \sum_{j \in C_a} d_{col,j} c_j is the weighted center of cluster C_a (a = 1, ..., k). Observe that the trivial vector components can be omitted, and the k-variance of the so obtained (k-1)-dimensional representatives will be the same.
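
Below is a hedged sketch of a Lloyd-type weighted k-means that approximately minimizes this weighted k-variance (the initialization and stopping rule are deliberately naive; reps holds the representatives as rows, weights the corresponding d_{row,j} or d_{col,j}):

    import numpy as np

    def weighted_kmeans(reps, weights, k, n_iter=50, seed=0):
        rng = np.random.default_rng(seed)
        centers = reps[rng.choice(len(reps), size=k, replace=False)].copy()
        for _ in range(n_iter):
            # assignment step: nearest center (the weights do not affect which center is nearest)
            labels = ((reps[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1).argmin(axis=1)
            # update step: weighted cluster centers, as in the definition of the weighted k-variance
            for a in range(k):
                mask = labels == a
                if mask.any():
                    centers[a] = np.average(reps[mask], axis=0, weights=weights[mask])
        S2 = sum(weights[labels == a] @ ((reps[labels == a] - centers[a]) ** 2).sum(axis=1)
                 for a in range(k))
        return labels, S2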

14 Theorem
Theorem. Let C be a contingency table with n-element row set Row and m-element column set Col, with row and column sums d_{row,1}, ..., d_{row,n} and d_{col,1}, ..., d_{col,m}, respectively. Suppose that \sum_{i=1}^n \sum_{j=1}^m c_{ij} = 1 and there are no dominant rows and columns: d_{row,i} = Θ(1/n) (i = 1, ..., n) and d_{col,j} = Θ(1/m) (j = 1, ..., m) as n, m \to \infty. Let the singular values of C_{corr} be

    1 = s_1 > s_2 \ge ... \ge s_k > ε \ge s_i,    i \ge k + 1.

The partitions (R_1, ..., R_k) of Row and (C_1, ..., C_k) of Col are defined so that they minimize the weighted k-variances S_k^2(X) and S_k^2(Y) of the row and column representatives. Suppose that there are constants 0 < K_1, K_2 \le 1/k such that |R_i| \ge K_1 n and |C_i| \ge K_2 m (i = 1, ..., k), respectively. Then the R_i, C_j pairs are O(\sqrt{2k} (S_k(X) + S_k(Y)) + ε)-volume regular (i, j = 1, ..., k).
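
To connect the theorem with the preceding slides, here is a hedged sketch that computes the quantities entering the bound, reusing optimal_representatives and weighted_kmeans from the earlier sketches (C and k are assumed to be given; the exact form of the O(.) term follows the reconstruction above):

    # C: contingency table (numpy array) with total sum 1; k: number of clusters
    X, Y, s = optimal_representatives(C, k)
    d_row, d_col = C.sum(axis=1), C.sum(axis=0)
    row_labels, SX2 = weighted_kmeans(X[:, 1:], d_row, k)   # trivial first component dropped
    col_labels, SY2 = weighted_kmeans(Y[:, 1:], d_col, k)
    eps = s[k] if len(s) > k else 0.0                        # bound on the tail singular values
    # the theorem then bounds the volume-regularity constant of every (R_i, C_j) pair
    # by a term of order sqrt(2k) * (sqrt(SX2) + sqrt(SY2)) + eps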

15 Application
We applied the biclustering algorithm to find clusters of stores and products simultaneously, based on their consumption in TESCO stores. We found 3 clusters of the stores in which the consumption of the products belonging to the same cluster was homogeneous, with consumption density c(R_a, C_b) / (Vol(R_a) Vol(C_b)) between store cluster R_a and product cluster C_b (a, b = 1, ..., 3). After sorting the rows and columns according to their cluster memberships, we plotted the entries c_{ij} / (d_{row,i} d_{col,j}). (There was one exceptional store cluster which contained only 3 stores, but the others could be identified with groups of smaller and larger stores associated with product groups of high consumption density.)

16 Directed graphs
W: n \times n (not symmetric) weight matrix of a directed graph, where w_{ij} is the weight of the i \to j edge (i, j = 1, ..., n; i \ne j), and w_{ii} = 0 (i = 1, ..., n). The generalized in- and out-degrees:

    d_{out,i} = \sum_{j=1}^n w_{ij} (i = 1, ..., n),    d_{in,j} = \sum_{i=1}^n w_{ij} (j = 1, ..., n).

D_{in} = diag(d_{in,1}, ..., d_{in,n}) and D_{out} = diag(d_{out,1}, ..., d_{out,n}) are the in- and out-degree matrices. Suppose that there are no sources and sinks (i.e., no zero out- or in-degrees). W_{corr} = D_{out}^{-1/2} W D_{in}^{-1/2}, and its SVD is used to minimize the normalized bicut of W as a contingency table.

17 Regular in- and out-vertex cluster pairs
The V_{in}, V_{out} in- and out-vertex cluster pair of the directed graph (with the sum of the weights of the directed edges equal to 1) is γ-volume regular if for all X \subset V_{out} and Y \subset V_{in} the relation

    | w(X, Y) - ρ(V_{out}, V_{in}) Vol_{out}(X) Vol_{in}(Y) | \le γ \sqrt{Vol_{out}(V_{out}) Vol_{in}(V_{in})}

holds, where the directed cut w(X, Y) is the sum of the weights of the X \to Y edges, Vol_{out}(X) = \sum_{i \in X} d_{out,i}, Vol_{in}(Y) = \sum_{j \in Y} d_{in,j}, and ρ(V_{out}, V_{in}) = w(V_{out}, V_{in}) / (Vol_{out}(V_{out}) Vol_{in}(V_{in})) is the relative inter-cluster density of the out-in cluster pair V_{out}, V_{in}. The clusterings (V_{in,1}, ..., V_{in,k}) and (V_{out,1}, ..., V_{out,k}) of the columns and rows guaranteed by the above theorem correspond to in- and out-clusters of the same vertex set such that the directed information flow V_{out,a} \to V_{in,b} is as homogeneous as possible for all pairs a, b = 1, ..., k. Example: emigration-immigration patterns of countries.
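
A hedged sketch of how the in- and out-clusters could be obtained in practice, treating W as a contingency table and reusing the functions from the earlier sketches (W and k are assumed to be given; no sources or sinks, as above):

    # W: n x n nonnegative weight matrix of the directed graph, zero diagonal
    X, Y, s = optimal_representatives(W, k)              # row side: out-, column side: in-representatives
    d_out, d_in = W.sum(axis=1), W.sum(axis=0)
    out_labels, _ = weighted_kmeans(X[:, 1:], d_out, k)  # out-clusters V_{out,1}, ..., V_{out,k}
    in_labels, _ = weighted_kmeans(Y[:, 1:], d_in, k)    # in-clusters V_{in,1}, ..., V_{in,k}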

18 Thank you for your attention
