arxiv: v1 [math.st] 26 Jan 2015

Similar documents
Exercise Sheet 1.

MATH 431: FIRST MIDTERM. Thursday, October 3, 2013.

Chapter 3 Transformations

MA 1B ANALYTIC - HOMEWORK SET 7 SOLUTIONS

ANOVA: Analysis of Variance - Part I

Linear Algebra Formulas. Ben Lee

Introduction to Matrix Algebra

Matrix Algebra, part 2

Principal Component Analysis (PCA) Our starting point consists of T observations from N variables, which will be arranged in an T N matrix R,

Linear Algebra Review. Vectors

Linear algebra I Homework #1 due Thursday, Oct Show that the diagonals of a square are orthogonal to one another.

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

MATRIX ALGEBRA. or x = (x 1,..., x n ) R n. y 1 y 2. x 2. x m. y m. y = cos θ 1 = x 1 L x. sin θ 1 = x 2. cos θ 2 = y 1 L y.

STAT200C: Review of Linear Algebra

Duke University, Department of Electrical and Computer Engineering Optimization for Scientists and Engineers c Alex Bronstein, 2014

Differential Topology Final Exam With Solutions

Math Matrix Algebra

Conceptual Questions for Review

Linear Algebra: Characteristic Value Problem

Mathematical Methods wk 2: Linear Operators

1. Diagonalize the matrix A if possible, that is, find an invertible matrix P and a diagonal

Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε,

Lecture 1 and 2: Random Spanning Trees

1 Principal component analysis and dimensional reduction

MP 472 Quantum Information and Computation

Least-Squares Rigid Motion Using SVD

7. Symmetric Matrices and Quadratic Forms

Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then

Problem # Max points possible Actual score Total 120

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

MATH 304 Linear Algebra Lecture 20: The Gram-Schmidt process (continued). Eigenvalues and eigenvectors.

Announce Statistical Motivation Properties Spectral Theorem Other ODE Theory Spectral Embedding. Eigenproblems I

Exercise Set 7.2. Skills

Maximum Likelihood Estimation

A Tropical Extremal Problem with Nonlinear Objective Function and Linear Inequality Constraints

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models

Quadratic forms. Here. Thus symmetric matrices are diagonalizable, and the diagonalization can be performed by means of an orthogonal matrix.

Massachusetts Institute of Technology Department of Economics Statistics. Lecture Notes on Matrix Algebra

Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane.

A A x i x j i j (i, j) (j, i) Let. Compute the value of for and

Math Bootcamp An p-dimensional vector is p numbers put together. Written as. x 1 x =. x p

Assignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition. Name:

MATH 31 - ADDITIONAL PRACTICE PROBLEMS FOR FINAL

Introduction to quantum information processing

Review of Linear Algebra

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math 443 Differential Geometry Spring Handout 3: Bilinear and Quadratic Forms This handout should be read just before Chapter 4 of the textbook.

Symmetric Matrices and Eigendecomposition

(a) If A is a 3 by 4 matrix, what does this tell us about its nullspace? Solution: dim N(A) 1, since rank(a) 3. Ax =

Math 520 Exam 2 Topic Outline Sections 1 3 (Xiao/Dumas/Liaw) Spring 2008

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October

Math 489AB Exercises for Chapter 1 Fall Section 1.0

Seminar on Linear Algebra

Quantum Computing Lecture 2. Review of Linear Algebra

MATRICES ARE SIMILAR TO TRIANGULAR MATRICES

ACM 104. Homework Set 4 Solutions February 14, 2001

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Problem Set 1. Homeworks will graded based on content and clarity. Please show your work clearly for full credit.

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

Announce Statistical Motivation ODE Theory Spectral Embedding Properties Spectral Theorem Other. Eigenproblems I

Degenerate Perturbation Theory

Review problems for MA 54, Fall 2004.

Chapter 7: Symmetric Matrices and Quadratic Forms

Practice Exam. 2x 1 + 4x 2 + 2x 3 = 4 x 1 + 2x 2 + 3x 3 = 1 2x 1 + 3x 2 + 4x 3 = 5

CLASSIFICATION OF COMPLETELY POSITIVE MAPS 1. INTRODUCTION

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Lecture 12 : Graph Laplacians and Cheeger s Inequality

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT

Multivariate Statistical Analysis

Boolean Inner-Product Spaces and Boolean Matrices

Family Feud Review. Linear Algebra. October 22, 2013

Exercise Set Suppose that A, B, C, D, and E are matrices with the following sizes: A B C D E

Deep Learning Book Notes Chapter 2: Linear Algebra

Linear Algebra Review

CONCENTRATION OF THE MIXED DISCRIMINANT OF WELL-CONDITIONED MATRICES. Alexander Barvinok

TBP MATH33A Review Sheet. November 24, 2018

Final Exam, Linear Algebra, Fall, 2003, W. Stephen Wilson

Linear Algebra for Machine Learning. Sargur N. Srihari

Jordan normal form notes (version date: 11/21/07)

Singular Value Decomposition

Lecture 12 Eigenvalue Problem. Review of Eigenvalues Some properties Power method Shift method Inverse power method Deflation QR Method

Here each term has degree 2 (the sum of exponents is 2 for all summands). A quadratic form of three variables looks as

Problems in Linear Algebra and Representation Theory

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM

1 Singular Value Decomposition and Principal Component

Regularized Discriminant Analysis and Reduced-Rank LDA

SOLUTION KEY TO THE LINEAR ALGEBRA FINAL EXAM 1 2 ( 2) ( 1) c a = 1 0

Spectral Theorem for Self-adjoint Linear Operators

Example Linear Algebra Competency Test

The Singular Value Decomposition

MATH 583A REVIEW SESSION #1

5. Discriminant analysis

Transpose & Dot Product

Matrix Theory, Math6304 Lecture Notes from October 25, 2012

The Singular Value Decomposition

Part I: Preliminary Results. Pak K. Chan, Martine Schlag and Jason Zien. Computer Engineering Board of Studies. University of California, Santa Cruz

Data Mining and Analysis: Fundamental Concepts and Algorithms

CP3 REVISION LECTURES VECTORS AND MATRICES Lecture 1. Prof. N. Harnew University of Oxford TT 2013

The Singular Value Decomposition (SVD) and Principal Component Analysis (PCA)

Transcription:

An alternative to Moran s I for spatial autocorrelation arxiv:1501.06260v1 [math.st] 26 Jan 2015 Yuzo Maruyama Center for Spatial Information Science, University of Tokyo e-mail: maruyama@csis.u-tokyo.ac.jp Abstract: Moran s I statistic, a popular measure of spatial autocorrelation, is revisited. The exact range of Moran s I is given as a function of spatial weights matrix. We demonstrate that some spatial weights matrices lead the absolute value of upper (lower) bound larger than 1 and that others lead the lower bound larger than 0.5. Thus Moran s I is unlike Pearson s correlation coefficient. It is also pointed out that some spatial weights matrices do not allow Moran s I to take positive values regardless of observations. An alternative measure with exact range [ 1, 1] is proposed through a monotone transformation of Moran s I. AMS 2000 subject classifications: Primary 62H11; secondary 62H20. Keywords and phrases: spatial correlation, Moran s I. 1. Introduction Spatial autocorrelation statistics measure and analyze the degree of dependency among observations in a geographic space. In this paper, we are interested in Moran s (1950) I statistic, which seems the most popular measure of spatial autocorrelation. To the best of my knowledge, however, the property has not been thoroughly investigated. To be worse, there are some misunderstandings with respect to similarity to Pearson s correlation coefficient. In some books as well as Wikipedia, it is claimed that the range of Moran s I is exactly [ 1, 1] like Pearson s correlation coefficient. However it is not true. In Section 2 of this paper, we give the exact range of Moran s I as a function of spatial weights matrix. In our numerical experiment, some spatial weights matrices lead the absolute value of upper (lower) bound larger than 1. Others lead the lower bound larger than 0.5. Thus Moran s I is quite different from Pearson s correlation coefficient with exact range [ 1, 1]. It is also pointed out that some spatial weights matrices do not allow Moran s I to take positive values regardless of observations. In Section 3, we propose an alternative measure with the exact range [ 1, 1], which is a monotone transformation of Moran s I. Actually, this is not the first paper which points that the range of Moran s I is not [ 1, 1]. To the best of my knowledge, Cliff and Ord (1981), Goodchild (1988), de Jong, Sprenger and van Veen (1984) and Waller and Gotway (2004) did so. From mathematical viewpoint, Theorem 2.1 in Section 2 is essentially the same as the results by de Jong, Sprenger and van Veen (1984). However any This work was partially supported by KAKENHI #25330035. 1

Y. Maruyama/An alternative to Moran s I 2 alternative with exact range [ 1, 1], which I hope, helps better understanding of spatial autocorrelation, has not yet been proposed. Furthermore it is notable that since the alternative is proposed by a monotone transformation of Moran s I, the permutation test of spatial independence based on Moran s I, which has been implemented in earlier studies, can be saved after the standard measure is replaced by the alternative. 2. Moran s I In this section, we give the exact range of Moran s I. Suppose that, in n spatial units s 1,..., s n, we observe y 1 = y(s 1 ),..., y n = y(s n ). We also suppose that spatial weights w ij, the weights between each pair of spatial units s i and s j, which satisfy w ij 0 for any i j, w ii = 0 for any i, (2.1) are given by some preset rules. Then Moran s I statistic is given by I = n n n i=1 j=1 w ij(y i ȳ)(y j ȳ) n i,j=1 w ij} n i=1 (y. i ȳ) 2 Let y = (y 1,..., y n ) T and ỹ = y ȳ1 n = (I n 1 n 1 T n/n)y where T denotes the transpose and 1 n denotes the n-dimensional vector of ones. With the spatial weights matrix W = (w ij ), Moran s I statistic is rewritten as I = n ỹ T W ỹ n i,j=1 w ij ỹ T ỹ. When W is not symmetric, the symmetric alternative (W + W T )/2 can be identified to W since n ỹ T W ỹ = ỹ T (W + W T )/2}ỹ, for any ỹ, w ij = 1 T nw 1 n = 1 T n(w + W T )/2}1 n = n i,j=1 i,j=1 w ij + w ji }/2. Thus, in this paper, we assume that W is symmetric. Let V be the (n 1)-dimensional subspace of R n orthogonal to 1 n. Then I n 1 n 1 T n/n is the matrix for projection onto V. The spectral decomposition of I n 1 n 1 T n/n is given by I n 1 n 1 T n/n = n 1 i=1 h ih T i = HH T (2.2) where h i R n, h T i h i = 1 and h T i 1 n = 0 for any 1 i n 1. Further, in (2.2), H = (h 1,..., h n 1 ) is the n (n 1) matrix where h 1,..., h n 1 } span the subspace V as the orthonormal bases. Let v = H T y R n 1. Then Moran s I statistic is rewritten as I = ỹt W ỹ n wỹ T ỹ = HH yt T W HH T y W v = vt n wy T HH T y v T v (2.3)

Y. Maruyama/An alternative to Moran s I 3 where w = n i,j=1 w ij/n 2 and W is the (n 1) (n 1) matrix given by W = (n w) 1 H T W H. (2.4) For (2.3), the standard result of maximization and minimization of quadratic form is available. Let λ (1) and λ (n 1) be the smallest and largest eigenvalue, among n 1 eigenvalues of W, respectively. Then, for any v R n 1, I [ λ (1), λ (n 1) ]. The lower and upper bound of I is attained by λ (1) v j (1) I = λ (n 1) v j (n 1), where j (1) and j (n 1) are the eigenvectors corresponding to λ (1) and λ (n 1) respectively. By the following equivalence, v j H T y j HH T y Hj (I 1 n 1 T n/n)y Hj we have a following result. y = c 1 1 n + c 2 Hj for any c 1 and nonzero c 2, Theorem 2.1. The exact range of Moran s I is I [ λ (1), λ (n 1) ] where λ (1) and λ (n 1) be the smallest and largest eigenvalue of among n 1 eigenvalues of W given by (2.4). The lower and upper bound is attained as follows; λ (1) y = c 1 1 n + c 2 Hj (1) I = λ (n 1) y = c 3 1 n + c 4 Hj (n 1), where any c 1, c 3 and any c 2, c 4 0. Remark 2.1. In (2.2), the choice of H or h 1,..., h n 1 is arbitrary. Clearly λ (1) and λ (n 1) are also expressed by λ (1) = z T W z min z T 1 n=0 n wz T z, λ z T W z (n 1) = max z T 1 n=0 n wz T z which do depend on W not on H. In practice, when λ (1) and λ (n 1) are calculated, the choice of H = (h 1,..., h n 1 ) based on Helmert orthogonal matrix is available, that is, h i for i = 1,..., n 1 given by h T i = (1 T i, i, 0 T n i 1)/ i(i + 1).

Y. Maruyama/An alternative to Moran s I 4 Table 1 attainable lower & upper bound of Moran s I n\q 1 2 3 lower upper lower upper lower upper 10-1.066 0.935-0.541 0.831-0.482 0.746 20-1.041 1.006-0.526 0.981-0.457 0.955 30-1.029 1.013-0.519 1.005-0.449 0.995 40-1.023 1.014-0.514 1.011-0.444 1.006 50-1.018 1.013-0.512 1.012-0.441 1.010 Example 2.1. For simplicity, suppose s 1,..., s n are on a line with equal spacing and spatial weights for s i, s j is 2 i j +1 1 i j q w ij = 0 i = j, i j > q. Table 1 gives lower and upper bound of Moran s I for some n and q. For any n and q, the range of I is not [ 1, 1]. It is demonstrated that some spatial weights matrices lead the absolute value of upper (lower) bound larger than 1 and that others lead the lower bound larger than 0.5, which is unexpected. Thus Moran s I is unlike Pearson s correlation coefficient with exact range [ 1, 1]. Remark 2.2. In all cases in Table 1, both λ (1) < 0 and λ (n 1) > 0 are satisfied. Researchers and practitioners would like measure of spatial correlation to take both positive and negative values depending upon positive and negative spatial correlation. It is clear that indefiniteness of W is equivalent to λ(1) < 0 and λ (n 1) > 0 simultaneously. Actually, as in below, some spatial weights matrices lead λ (n 1) < 0, or negative definiteness of W. In such a case, Moran s I cannot take positive values for any y, which is not desirable at all. In order to investigate definiteness of W, the following properties are useful on eigenvalues, trace or sum of all diagonal elements and their relationship; P1 When AB and BA are square matrices, tr(ab) = tr(ba), P2 When both A and B are square matrices, tr(a + B) = tra + trb, P3 Trace is equal to sum of all eigenvalues. From P1, P2 and w ii = 0, we have tr W = (n w) 1 trh T W H = (n w) 1 trw HH T = (n w) 1 trw (I 1 n 1 T n/n) = (n w) 1 trw (n 2 w) 1 tr1 T nw 1 n = (n w) 1 n w ii (n 2 w) 1 n w ij = 1. i=1 i,j=1 (2.5) From P3 together with tr W = 1, λ (1) < 0 are guaranteed whereas λ (n 1) > 0 are not. In fact, if w ij = 1 for any i j or W = 1 n 1 T n I, we have n w = n 1 and W = (n w) 1 H T W H = (n 1) 1 H T (1 n 1 T n I)H

Y. Maruyama/An alternative to Moran s I 5 = (n 1) 1 H T H = (n 1) 1 I n 1 whose eigenvalues are all equal to (n 1) 1. This implies that W is negative definite and that Moran s I is negative for any y. The negative definiteness with w ij = 1 for any i j suggests possible negative definiteness for w ij with small variability. Suppose, for 0 < a < 1, we observe n(n 1) uniform random variables on (1 a, 1 + a), which are assigned to w ij for i j. We are interested in definiteness of W = (n w) 1 H T (W + W T )/2} H. From numerical experiment, we have negative definiteness and indefiniteness of W for a (0, a ) and a (a, 1), respectively, where a is approximately equal to 0.3 for n = 25; 0.2 for n = 50; 0.14 for n = 75; 0.12 for n = 100. 3. An alternative to Moran s I In section 2, we pointed out some disadvantages of the original Moran s I. In this section, we propose an alternative to Moran s I, the range of which is exactly [ 1, 1] and which is a monotone transformation of the original Moran s I. There are two steps of modification. The one is a linear transformation for acquirement of indefiniteness. The other is an adequate normalization for the linear transformation. As pointed out in (2.5), the sum of eigenvalues of W = (n w) 1 H T W H is 1, which seems related to the bias of the original Moran s I, that is, E[I ] = 1 n 1 < 0 under spatial independence, as in Cliff and Ord (1981). In order to get a kind of unbiasedness, we start with a linear transformation of Moran s I, (n 1)I + 1 = (n 1) vt W v v T v + 1 = vt Qv v T v, where v = H T y R n 1 and Q = (n 1) W + I n 1. Since we have trq = (n 1)tr W + tri n 1 = (n 1) + (n 1) = 0, the (n 1) (n 1) matrix Q is indefinite, that is, (n 1)I + 1 can take both positive and negative values depending on y. We are now in position to give the adequate normalization of (n 1)I +1. The smallest and largest eigenvalue of Q is (n 1)λ (1) +1 and (n 1)λ (n 1) +1, which are guaranteed to be negative and positive respectively, by the indefiniteness of Q or trq = 0. Hence, for v which satisfies v T Qv < 0 or equivalently (n 1)I + 1 < 0, we have v T Qv (n 1)λ(1) + 1 v T v

Y. Maruyama/An alternative to Moran s I 6 and the equality is attained by I = λ (1). For v which satisfies v T Qv > 0 or equivalently (n 1)I + 1 > 0, we have v T Qv v T v (n 1)λ (n 1) + 1 and the equality is attained by I = λ (n 1). Let I M be an alternative to Moran s I given by I M = (n 1)I + 1, C = C (n 1)λ (1) + 1 (n 1)I + 1 < 0 (n 1)λ (n 1) + 1 (n 1)I + 1 0 where λ (1) and λ (n 1) are the smallest and largest eigenvalue of the (n 1) (n 1) matrix W given by (2.4). Then we have a following result. Theorem 3.1. The exact range of I M is I M [ 1, 1]. The lower and upper bound is attained as follows; 1 I = λ (1) I M = 1 I = λ (n 1). As far as we know, this is the first measure with exact range [ 1, 1], which I hope, helps better understanding of spatial autocorrelation. Furthermore it is notable that since the alternative I M is a monotone transformation of Moran s I, the permutation test of spatial independence based on Moran s I, which has been implemented in earlier studies, can be saved after the standard measure is replaced by I M. References Cliff, A. D. and Ord, J. K. (1981). Spatial processes: models & applications. Pion Ltd., London. MR632256 de Jong, P., Sprenger, C. and van Veen, F. (1984). On Extreme Values of Moran s I and Geary s c. Geographical Analysis 16 17 24. Goodchild, M. F. (1988). Spatial Autocorrelation. Concepts and Techniques in Modern Geography 47. Norwich, Geobooks. Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika 37 17 23. MR0035933 Waller, L. A. and Gotway, C. A. (2004). Applied spatial statistics for public health data. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ. MR2075123