CMSE 820: Math. Foundations of Data Sci.


Lecture 17

8.4 Weighted path graphs

Taken from [10, Lecture 3].

As alluded to at the end of the previous section, we now analyze weighted path graphs. To that end, we prove the following:

Theorem 6 (Fiedler). Let $P = (V, E, w)$ be a weighted path graph on $n$ vertices, let $L_P$ have eigenvalues $0 = \lambda_1 < \lambda_2 \le \cdots \le \lambda_n$, and let $\varphi_k$ be an eigenvector with eigenvalue $\lambda_k$. Then $\varphi_k$ changes sign $k - 1$ times.

We will need a few lemmas in order to prove Theorem 6. The first of these is Sylvester's Law of Inertia:

Theorem 7 (Sylvester's Law of Inertia). Let $A$ be any symmetric matrix and let $B$ be any non-singular matrix (that is, $B$ has no zero singular values). Then the matrix $BAB^T$ has the same number of positive, negative and zero eigenvalues as $A$.

Proof. We first recall three facts from linear algebra.

1. The first is that $BAB^{-1}$ has the same eigenvalues as $A$, since
$$A\varphi = \lambda\varphi \iff BAB^{-1}(B\varphi) = \lambda(B\varphi).$$
2. The second fact is that $\operatorname{rank}(A) = \operatorname{rank}(BAB^T)$.
3. The third is that every non-singular matrix $B$ can be written $B = QR$, where $Q$ is an orthonormal matrix (meaning $Q^T Q = QQ^T = I$) and $R$ is an upper-triangular matrix with positive diagonals (this is the so-called QR factorization).

We begin by using a slight variation of the last fact, and write $B = RQ$. Since $Q^T = Q^{-1}$, by the first fact we know that $QAQ^T$ has exactly the same eigenvalues as $A$. Define
$$\forall\, t \in [0, 1], \quad R_t = tR + (1 - t)I,$$
and consider the family of matrices
$$\forall\, t \in [0, 1], \quad M_t = R_t\, QAQ^T R_t^T.$$
At $t = 0$ we have $R_0 = I$, and so $M_0 = QAQ^T$ has the same eigenvalues as $A$. For $t = 1$ we have $M_1 = BAB^T$. Since all of the matrices $M_t$ are symmetric, they all have real eigenvalues (by the Spectral Theorem). Additionally, the eigenvalues of a symmetric matrix are continuous functions of the entries of the matrix. Therefore, if the number of positive, negative, or zero eigenvalues of $BAB^T$ differs from that of $A$, then there must be some $t$ for which $M_t$ has more zero eigenvalues than does $A$. But the matrices $R_t$ are upper triangular with positive diagonal entries, and hence are non-singular (since the eigenvalues of $R_t$ are the diagonal entries). Thus $R_t Q$ is non-singular, and by the second fact the rank of $M_t = (R_t Q) A (R_t Q)^T$ must equal the rank of $A$, which means this cannot happen.

Fiedler's Theorem will follow from an analysis of the eigenvalues of tri-diagonal matrices with zero row-sums. These may be viewed as Laplacians of weighted path graphs in which some edges are allowed to have negative weights.

Lemma 2. Let $M$ be a symmetric matrix such that $M\mathbf{1} = 0$. Then:
$$M = \sum_{i < j} -M_{ij}\, L_{G_{i,j}}. \tag{53}$$

Proof. Equation (53) is an equality between two matrices. Let $A$ denote the right hand side matrix. On the off-diagonal it is clear that $M$ (the LHS) and $A$ (the RHS) are equal. Notice as well that the right hand side satisfies:
$$A\mathbf{1} = \sum_{i < j} -M_{ij}\, L_{G_{i,j}} \mathbf{1} = \sum_{i < j} -M_{ij}\, 0 = 0.$$
Thus $M\mathbf{1} = 0$ and $A\mathbf{1} = 0$. Notice that these are sets of $n$ equations in $n$ unknowns (i.e., the diagonal entries), which have unique solutions. Since the off-diagonal entries of $M$ and $A$ are identical, the equations are the same, and thus the solutions are as well, meaning that the diagonals of $M$ and $A$ are the same.
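Both Theorem 7 and Lemma 2 can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `inertia` and `edge_laplacian`, the test matrices, and the tolerance are all illustrative assumptions. It also assumes that $L_{G_{i,j}}$ denotes the Laplacian of the graph whose only edge is $(i, j)$ with unit weight.

```python
import numpy as np

def inertia(S, tol=1e-9):
    """Return (#positive, #negative, #zero) eigenvalues of a symmetric matrix S."""
    eig = np.linalg.eigvalsh(S)
    scale = max(1.0, float(np.max(np.abs(eig))))
    pos = int(np.sum(eig > tol * scale))
    neg = int(np.sum(eig < -tol * scale))
    return pos, neg, len(eig) - pos - neg

def edge_laplacian(n, i, j):
    """Laplacian of the graph on n vertices whose only edge is (i, j), weight 1 (assumed meaning of L_{G_{i,j}})."""
    L = np.zeros((n, n))
    L[i, i] = L[j, j] = 1.0
    L[i, j] = L[j, i] = -1.0
    return L

rng = np.random.default_rng(0)
n = 6

# --- Theorem 7: inertia is preserved along the path M_t used in the proof ---
# A symmetric matrix with 2 positive, 2 negative and 2 zero eigenvalues.
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q0 @ np.diag([3.0, 1.0, -2.0, -0.5, 0.0, 0.0]) @ Q0.T

# B = R Q with Q orthogonal and R upper triangular with positive diagonal.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
R = 0.5 * np.triu(rng.standard_normal((n, n)), k=1) + np.diag(rng.uniform(1.0, 2.0, n))
B = R @ Q

inertia_A = inertia(A)
inertias_along_path = []
for t in np.linspace(0.0, 1.0, 11):
    R_t = t * R + (1.0 - t) * np.eye(n)
    M_t = R_t @ Q @ A @ Q.T @ R_t.T
    inertias_along_path.append(inertia(M_t))
inertia_BABt = inertia(B @ A @ B.T)

# --- Lemma 2: a symmetric M with M @ 1 = 0 equals sum_{i<j} -M_ij L_{G_ij} ---
S = rng.standard_normal((n, n))
M = S + S.T
np.fill_diagonal(M, 0.0)
np.fill_diagonal(M, -M.sum(axis=1))  # choose the diagonal so every row sums to zero
rhs = sum(-M[i, j] * edge_laplacian(n, i, j)
          for i in range(n) for j in range(i + 1, n))

print("inertia of A:      ", inertia_A)
print("inertia of B A B^T:", inertia_BABt)
print("Lemma 2 holds:     ", np.allclose(M, rhs))
```

The loop mirrors the proof: the inertia is evaluated at several points of the path from $QAQ^T$ to $BAB^T$, and it should never change because every $R_t$ is non-singular.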

Lemma 3. Let $M$ be a symmetric tri-diagonal matrix with $2q$ positive off-diagonal entries such that $M\mathbf{1} = 0$. Then $M$ has $q$ negative eigenvalues.

Proof. By Lemma 2 and the fact that $M$ is symmetric and tri-diagonal, we may write:
$$M = \sum_{i=2}^{n} -M_{i-1,i}\, L_{G_{i-1,i}}.$$
Thus for $v \in \mathbb{R}^n$,
$$v^T M v = \sum_{i=2}^{n} -M_{i-1,i}\, (v[i-1] - v[i])^2.$$
Now we perform a change of variables that will diagonalize the matrix $M$. Let $d[1] = v[1]$ and set $d[i] = v[i] - v[i-1]$ for $i \ge 2$, so that:
$$v[i] = d[1] + d[2] + \cdots + d[i].$$
Notice that if we define the lower triangular matrix $T$ as:
$$T = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 1 \end{pmatrix},$$
then $v = Td$. By Sylvester's Law of Inertia (Theorem 7), we know that $A = T^T M T$ has the same number of positive, negative and zero eigenvalues as $M$. On the other hand,
$$d^T A d = d^T T^T M T d = v^T M v = \sum_{i=2}^{n} -M_{i-1,i}\, d[i]^2.$$
Thus $A$ has one zero eigenvalue (with eigenvector $d[1] = 1$, $d[j] = 0$ for all $j \ge 2$) and a negative eigenvalue $-M_{i-1,i}$ for each $M_{i-1,i} > 0$ (with eigenvector $d[i] = 1$, $d[j] = 0$ for all $j \ne i$), of which there are $q$.
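The change of variables in this proof is easy to check on a small example. The sketch below is an illustration under assumptions rather than part of the notes: the helper `zero_row_sum_tridiagonal` and the entries in `off` are hypothetical, chosen so that $q = 3$ entries above the diagonal are positive. It verifies that $T^T M T$ is the diagonal matrix with entries $0, -M_{1,2}, \ldots, -M_{n-1,n}$ and that $M$ has exactly $q$ negative eigenvalues.

```python
import numpy as np

def zero_row_sum_tridiagonal(off):
    """Symmetric tri-diagonal M with M[i, i+1] = off[i] and M @ 1 = 0."""
    off = np.asarray(off, dtype=float)
    M = np.diag(off, k=1) + np.diag(off, k=-1)
    M -= np.diag(M.sum(axis=1))  # set the diagonal so each row sums to zero
    return M

# Hypothetical off-diagonal entries; three of them are positive, so q = 3.
off = np.array([2.0, -1.0, 0.5, -3.0, 1.5])
M = zero_row_sum_tridiagonal(off)
n = M.shape[0]

# Lower triangular matrix of ones: v = T d gives v[i] = d[1] + ... + d[i].
T = np.tril(np.ones((n, n)))
A = T.T @ M @ T

# The proof predicts A = diag(0, -M[0,1], -M[1,2], ...).
expected_A = np.diag(np.concatenate(([0.0], -off)))

eigenvalues = np.linalg.eigvalsh(M)
num_negative = int(np.sum(eigenvalues < -1e-9))
q = int(np.sum(off > 0))

print("T^T M T is the predicted diagonal matrix:", np.allclose(A, expected_A))
print("negative eigenvalues of M:", num_negative, " q:", q)
```

Note that $A$ and $M$ do not share eigenvalues, only the counts of positive, negative and zero eigenvalues, which is all that Lemma 3 needs.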

Proof of Theorem 6. We consider the case when $\varphi_k$ has no zero entries. The proof for the general case may be obtained by splitting the graph by removing the vertices with zero entries. For simplicity, we also assume that $\lambda_k$ has multiplicity 1.

Recall we wish to show that $\varphi_k$ changes sign $k - 1$ times. This is equivalent to showing that:
$$\#\{i = 1, \ldots, n - 1 : \varphi_k[i]\,\varphi_k[i+1] < 0\} = k - 1.$$
Let $V_k$ denote the diagonal matrix with $\varphi_k$ on the diagonal. Consider the matrix:
$$M = V_k^T (L_P - \lambda_k I) V_k.$$
The inner matrix $L_P - \lambda_k I$ has one zero eigenvalue and $k - 1$ negative eigenvalues, derived from the eigenvalues and eigenvectors of $L_P$. So, by Sylvester's Law of Inertia (Theorem 7), $M$ has $k - 1$ negative eigenvalues, one zero eigenvalue, and $n - k$ positive eigenvalues.

We are now going to use Lemma 3. The matrix $M$ is clearly symmetric and tri-diagonal, and additionally:
$$M\mathbf{1} = V_k^T (L_P - \lambda_k I) V_k \mathbf{1} = V_k^T (L_P - \lambda_k I) \varphi_k = V_k^T 0 = 0.$$
Thus we can apply Lemma 3 to $M$. We note additionally that
$$M_{i,i+1} = -w(i, i+1)\, \varphi_k[i]\, \varphi_k[i+1],$$
and thus we see that $M_{i,i+1}$ is positive precisely when $\varphi_k[i]\,\varphi_k[i+1] < 0$. Since $M$ has $k - 1$ negative eigenvalues, by Lemma 3 it must have $k - 1$ positive entries on the upper diagonal, which means that $\varphi_k[i]\,\varphi_k[i+1] < 0$ must occur for exactly $k - 1$ indices.
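Theorem 6 itself can be observed numerically on a random weighted path. This is a minimal sketch under stated assumptions: the helpers `path_laplacian` and `sign_changes`, the random weights, and the choice $k = 4$ are illustrative and not from the notes. It relies on the weights being generic, so that no eigenvector has a zero entry, matching the case treated in the proof.

```python
import numpy as np

def path_laplacian(w):
    """Laplacian of the weighted path graph with edge weights w[i] = w(i, i+1) > 0."""
    w = np.asarray(w, dtype=float)
    L = -np.diag(w, k=1) - np.diag(w, k=-1)
    L -= np.diag(L.sum(axis=1))  # diagonal = weighted degree, so L @ 1 = 0
    return L

def sign_changes(phi):
    """Number of indices i with phi[i] * phi[i+1] < 0."""
    return int(np.sum(phi[:-1] * phi[1:] < 0))

rng = np.random.default_rng(1)
w = rng.uniform(0.5, 2.0, size=9)  # hypothetical weights; path on n = 10 vertices
L_P = path_laplacian(w)
n = L_P.shape[0]

# eigh returns eigenvalues in ascending order, so column k-1 is phi_k.
lam, Phi = np.linalg.eigh(L_P)
changes = [sign_changes(Phi[:, j]) for j in range(n)]

# The matrix M from the proof, for one (1-indexed) choice of k.
k = 4
V_k = np.diag(Phi[:, k - 1])
M = V_k @ (L_P - lam[k - 1] * np.eye(n)) @ V_k
positive_upper = int(np.sum(np.diag(M, k=1) > 0))

print("sign changes of phi_1, ..., phi_n:", changes)
print("positive upper-diagonal entries of M for k = 4:", positive_upper)
```

If the theorem applies as expected, the list of sign changes reads $0, 1, \ldots, n - 1$, and the number of positive upper-diagonal entries of $M$ equals $k - 1$.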

References

[1] Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Adaptive Computation and Machine Learning. The MIT Press, 2002.
[2] Afonso S. Bandeira. Ten lectures and forty-two open problems in the mathematics of data science. MIT course Topics in Mathematics of Data Science, 2015.
[3] Jon Shlens. A tutorial on principal component analysis. arXiv:1404.1100, 2014.
[4] Karl Pearson. On lines and planes of closest fit to systems of points in space. Philosophical Magazine, Series 6, 2(11):559-572, 1901.
[5] V. A. Marchenko and L. A. Pastur. Distribution of eigenvalues in certain sets of random matrices. Mat. Sb. (N.S.), 72(114):507-536, 1967.
[6] J. Baik, G. Ben-Arous, and S. Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643-1697, 2005.
[7] Debashis Paul. Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, pages 1617-1642, 2007.
[8] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer-Verlag New York, 2nd edition, 2009.
[9] Athanasios Tsanas and Angeliki Xifara. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49:560-567, 2012.
[10] Daniel A. Spielman. Spectral graph theory. Yale Course Notes, Fall, 2009.