The line, the circle, and the ray: nonlinear spaces with efficient linearizations. Science is linear, isn't it? But behaviors take place in nonlinear spaces: the line, the circle, and the ray.


Science is linear, isn't it? The line, the circle, and the ray: nonlinear spaces with efficient linearizations. R. Sepulchre, University of Cambridge. Francqui Chair, UCL, 2015.

The PageRank algorithm, consensus algorithms, and the power method all share the same linear iteration x_{k+1} = A x_k. But behaviors take place in nonlinear spaces: the line (R), the circle (S^1), and the ray (R_+). The power algorithm is an iteration on the projective space (orthogonality constraint, i.e. the circle); the Perron-Frobenius theorem describes a fixed point in the projective space (positivity constraints, i.e. the ray).
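The shared linear iteration can be made concrete. A minimal Python sketch (the matrix and iteration count are illustrative, not from the lecture) of the power method as an iteration on the sphere, with the orthogonality constraint enforced by renormalization:

```python
import math

# Illustrative entrywise-positive matrix, so Perron-Frobenius applies.
A = [[2.0, 1.0],
     [1.0, 3.0]]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def normalize(x):
    # project the iterate back onto the unit sphere (the "circle")
    n = math.sqrt(sum(xi * xi for xi in x))
    return [xi / n for xi in x]

x = normalize([1.0, 0.0])
for _ in range(100):
    x = normalize(matvec(A, x))   # x_{k+1} = A x_k, then renormalize

# Rayleigh quotient approximates the dominant eigenvalue
lam = sum(xi * yi for xi, yi in zip(x, matvec(A, x)))
```

Because A is entrywise positive, the normalized iterate converges to the positive Perron eigenvector, illustrating the fixed point on the ray.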

Part I: Homogeneity is essential to nonlinear behaviors.

The line: linear spaces. Homogeneity is the next best thing to linearity: it is necessary to account for the nonlinear nature of data, and it is sufficient to make local analysis efficient. Linear combinations x, y → αx + βy are the basis of calculus. Examples: R, R^n, R^{n×n}, C, C^n, Sym(n), Skew(n), ...

The circle: phase and rotation spaces. Embedding: S^1 → C : θ → e^{iθ}. Projection: C → S^1 : z → arg(z). Examples: S^1, S^n, SO(n), SU(n), ... (phase, rotation, attitude, orthogonal matrices, unitary matrices, ...). Representation: linear spaces + orthogonality constraints.

The ray: intensity spaces. Embedding: R_+ → R : r → log r. Examples: R_+, S_+(n), GL_+(n), ... (radius, intensity, concentration, probability, density, ...). Representation: linear spaces + positivity constraints.
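The two embeddings can be exercised numerically. A hedged Python sketch (the sample phases and intensities are illustrative): averaging angles through the embedding θ → e^{iθ} avoids the wrap-around failure of naive averaging, and averaging intensities through r → log r yields the scale-invariant geometric mean.

```python
import cmath
import math

# Circle: naive averaging of phases near the wrap-around fails...
phases = [0.1, -0.1 + 2 * math.pi]   # both angles are close to 0
naive = sum(phases) / len(phases)    # ~ pi: wrong

# ...but averaging the embedded points e^{i theta} and projecting back works.
z = sum(cmath.exp(1j * t) for t in phases) / len(phases)
circular_mean = cmath.phase(z)       # ~ 0: correct

# Ray: averaging intensities in log coordinates gives the geometric mean,
# which is invariant under rescaling r -> a r.
rays = [1.0, 100.0]
geo_mean = math.exp(sum(math.log(r) for r in rays) / len(rays))
```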

Polar coordinates: z = r e^{iθ} ∈ C. Linear objects (vectors) have intensities and orientation. A fundamental result of linear matrix analysis: A = QR, and A = U Σ V^T. Any linear transformation can be decomposed as two rotations and one diagonal scaling; linear transformations (matrices) likewise have intensities and orientation.

Nonlinear data: examples of nonlinear measurements. Most sensors have a preference for phase or intensity, for good reasons: our ears favor amplitude, our eyes favor phase, our nose favors concentrations (concentration signals, phase & intensity signals, intensity signals).
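The scalar case of this decomposition is easy to demonstrate. A small Python sketch (the number z is illustrative): a complex number, viewed as a 1×1 linear transformation, splits into intensity and orientation exactly as A = U Σ V^T splits a matrix into rotations and a scaling.

```python
import cmath

z = complex(3.0, 4.0)
r, theta = cmath.polar(z)       # intensity r = |z|, orientation theta = arg(z)
z_back = cmath.rect(r, theta)   # recompose scaling * rotation: r * e^{i theta}
```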

Behaviors (Willems): a pair (V, B). The universum V is the space in which we observe/measure/collect the data; the behavior B is the mathematical law that governs the data. E.g. dynamical systems: V is a signal space, i.e. maps T → W : (t, x) → w(t, x), and B is a local law in (t, x): F(w, ẇ, ..., w^(m)) = 0.

Linear behaviors (a mature theory): V = W^T with W a vector space, and B a linear subspace.

Linearization = local (and efficient) calculus: filtering, interpolating, optimizing (least squares, gradients), averaging, ... (lecture 4).

Linearization principle (Newton): linear behaviors capture local behaviors near a nominal solution w* ∈ B: w = w* + δw with δw ∈ B(w*), the linearized behavior around w*. From Taylor's expansion: DF(w*) δw = 0. Note: this requires W to be a linear space...
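The linearization principle can be illustrated with a scalar toy law (F and the starting point are illustrative, not from the lecture): near a nominal w, replace F(w) = 0 by its linearization DF(w) δw = -F(w) and iterate, which is Newton's method.

```python
# Toy law F(w) = 0 with F(w) = w^2 - 2; the linear model captures the
# local behavior near each iterate.
def F(w):
    return w * w - 2.0

def DF(w):
    return 2.0 * w

w = 1.0                    # nominal solution guess
for _ in range(20):
    dw = -F(w) / DF(w)     # solve the linearized law for the update
    w = w + dw
# w converges to sqrt(2)
```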

Linearized phase behavior: local analysis of nonlinear behaviors. Patching linearized behaviors B(w*_1), B(w*_2), B(w*_3), ... is intractable.

Instead, embed the space W in a linear space and make the embedding part of the behavior: W = S^1 with B : F(θ) = 0 is equivalent to W = C with B : F(θ) = 0, z = e^{iθ}. Note: conceptually, the embedding trick works for arbitrary differentiable manifolds. What is special about the phase constraint?

Invariance and nonlinear data: we like local laws to be invariant with respect to data phases (rotation) and intensities (scaling). Homogeneity: the linearization should look the same everywhere... The Moon phase is the same whether measured in Tokyo or Paris; temperature laws are the same whether measured in °C or °F...

Behaviors in invariant spaces: laws independent of the locality of our data are invariant to specific transformations of the space; the law is the same everywhere and at every time. W = R: invariance to translation of the data. W = S^1: invariance to rotation of the data. W = R_+: invariance to scaling of the data. Shift-invariant in T, translation-invariant in W. Note: those are the exact assumptions under which behavioral theory is mature and efficient. LT(S)I behaviors are maximally invariant.

The key property: phase and rotation spaces are homogeneous spaces. In those spaces, linearization (and hence local calculus) can be made the same everywhere.

The geometry of scaling and rotating: the line, the circle, and the ray share a common mathematical structure, a transitive group action reaching any point from any point: x → x + δ on the line, z → e^{iδ} z on the circle, r → e^δ r on the ray.
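Transitivity is easy to check numerically: on each of the three spaces, a single group element carries any chosen point to any other. A Python sketch with arbitrary illustrative points:

```python
import cmath
import math

# Line (translations): x -> x + d
x0, x1 = -3.0, 7.5
d = x1 - x0                  # the unique translation mapping x0 to x1

# Circle (rotations): z -> e^{i delta} z for unit-modulus z
z0, z1 = cmath.exp(1j * 0.3), cmath.exp(1j * 2.1)
rot = z1 / z0                # the unique rotation mapping z0 to z1

# Ray (scalings): r -> e^{delta} r for r > 0
r0, r1 = 0.25, 40.0
s = math.log(r1 / r0)        # log of the scaling mapping r0 to r1
```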

Lie groups. From Wikipedia: three major themes in 19th-century mathematics were combined by Lie in creating his new theory: the idea of symmetry, as exemplified by Galois through the algebraic notion of a group; geometric theory and the explicit solutions of differential equations of mechanics, worked out by Poisson and Jacobi; and the new understanding of geometry that emerged in the works of Plücker, Möbius, Grassmann and others, and culminated in Riemann's revolutionary vision of the subject.

Matrix Lie groups: R^{n×n} (matrix translation), SO(n) (matrix rotation), GL(n) (matrix scaling). Pioneers in engineering: e.g. A. S. Willsky, Dynamical Systems Defined on Groups: Structural Properties and Estimation, Ph.D. thesis, MIT Dept. of Aeronautics and Astronautics, May 1973; thesis advisors R. W. Brockett and Wallace E. Vander Velde.

Homogeneity is essential to nonlinear modeling: the local description of the law can be made independent of the locality of the data window only if W is homogeneous. Homogeneity is key to tractability: local coordinates can be made the same everywhere...

Part II: Examples of behaviors on homogeneous spaces. A homogeneous space is a space with a transitive group action by a Lie group. (The sphere is not a Lie group, but it is a homogeneous space.)

Homogeneous spaces: a homogeneous space M is a space with a transitive group action by a Lie group G. Notation: M ≅ G/H, where H is the stabilizer H = {g ∈ G : g·e = e}. Example: the sphere S^2 ≅ O(3)/O(2).

Two important examples: S_+(n), the symmetric positive definite matrices (behaviors in homogeneous spaces; positivity constraints), and Gr(p, n), the set of p-dimensional subspaces of R^n (homogeneous behaviors in linear spaces; orthogonality and rank constraints).

Behaviors on the space of positive definite matrices: in Diffusion Tensor Imaging, each voxel datum is a local measure of the diffusion of water molecules, with both phase and intensity content, calling for homogeneous data processing (filtering, interpolating, registering, ...).

Quadratic forms on linear data: zero-mean Gaussian distributions are characterized by covariance matrices X = E(xx^T) ∈ S_+(n). A linear transformation x → Ax of the data points results in the group action X → AXA^T: the group GL(n) acts transitively on S_+(n) by congruence, and S_+(n) ≅ GL(n)/O(n). Other examples of quadratic forms: kernels, distance matrices, ...

Engineering impact of the affine-invariant geometry of the cone. Statistical engineering: S. T. Smith, Covariance, subspace, and intrinsic Cramér-Rao bounds, IEEE Trans. Signal Process., 53 (2005), pp. 1610-1630. Convex optimization: Yu. E. Nesterov and M. J. Todd, On the Riemannian geometry defined by self-concordant barriers and interior-point methods, Found. Comput. Math., 2 (2002), pp. 333-361. DTI imaging: X. Pennec, P. Fillard, and N. Ayache, A Riemannian framework for tensor computing, International Journal of Computer Vision, 66 (2006), pp. 41-66. Behaviors on quadratic forms: Gaussian processes, kernel optimization, ...

Different ways to make a space homogeneous: X = AA^T = U Λ U^T = exp(Z). Affine-invariant geometry (intrinsic): X ∈ GL(n)/O(n). Group embedding (extrinsic): the pair (U, Λ) with U ∈ O(n). Linear embedding (extrinsic): Z ∈ Sym(n). Issues: computation, singularities, invariance properties (PhD thesis of Anne Collard, 2013: anisotropy-preserving midpoints).

Diffusion Tensor Imaging: filtering and interpolating "smarties".
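As a hedged illustration of the affine-invariant geometry (a sketch, not code from the cited papers): for commuting, here diagonal, SPD matrices, the affine-invariant distance d(X, Y) = ||log(X^{-1/2} Y X^{-1/2})||_F reduces to the Euclidean distance between the logarithms of the eigenvalues, and it is unchanged under the congruence action X → AXA^T.

```python
import math

# SPD matrices represented by their diagonals; values are illustrative.
def affine_invariant_dist(xdiag, ydiag):
    return math.sqrt(sum(math.log(y / x) ** 2 for x, y in zip(xdiag, ydiag)))

X = [1.0, 4.0]
Y = [2.0, 8.0]
d0 = affine_invariant_dist(X, Y)

# Congruence by an invertible diagonal A, X -> A X A^T, leaves d unchanged:
A = [3.0, 0.5]
Xc = [a * a * x for a, x in zip(A, X)]
Yc = [a * a * y for a, y in zip(A, Y)]
d1 = affine_invariant_dist(Xc, Yc)
```

This is the matrix version of the ray: for n = 1 the formula is d(x, y) = |log(y/x)|, the invariant distance on R_+.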

Matrix completion: a popular benchmark. ~10^7 known ratings (0.01% - 0.1% of the entries) in a matrix of ~10^5 items by ~10^6 users. [figure: a sparsely observed rating matrix]

Big-data behaviors, a recurrent theme: the scarcity of data points in huge-dimensional spaces makes behaviors ill-posed. Remedy: rank and orthogonality constraints; matrix completion with a low-rank prior.

Statistics with scarce data: make the search-space dimension consistent with the number of data points. In a gene-expression matrix, spots are gene expression levels, each row is an experiment (~10), and each column is a gene (~10^4). DNA → mRNA → protein.
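A toy sketch of matrix completion with a low-rank prior (the data, the rank-1 model, and the iteration count are illustrative, not the benchmark above): alternating least squares fitted only on the observed entries recovers the missing ones when the data is exactly low rank and the observation pattern is connected.

```python
# Ground-truth rank-1 matrix M = u v^T (for generating the observations).
u_true = [1.0, 2.0, 3.0]
v_true = [1.0, 2.0, 4.0]
M = [[ui * vj for vj in v_true] for ui in u_true]

# Observed entry set (row, col); the rest are "missing ratings".
omega = {(0, 0), (0, 2), (1, 1), (2, 0), (2, 1), (2, 2)}

u = [1.0] * 3
v = [1.0] * 3
for _ in range(200):
    # fix v, solve least squares for each u_i over its observed entries
    for i in range(3):
        cols = [j for (r, j) in omega if r == i]
        u[i] = sum(M[i][j] * v[j] for j in cols) / sum(v[j] ** 2 for j in cols)
    # fix u, solve least squares for each v_j
    for j in range(3):
        rows = [r for (r, c) in omega if c == j]
        v[j] = sum(M[i][j] * u[i] for i in rows) / sum(u[i] ** 2 for i in rows)

pred = u[1] * v[2]   # held-out entry; the true rank-1 value is M[1][2] = 8
```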

Statistics with a low-rank/sparsity prior: the expression of a component across all experiments can be tested for correlation with clinical data; the gene signature of a component can be tested for overlap with pathways and regulatory modules.

The Grassmann manifold Gr(p, n) is the set of p-dimensional subspaces of R^n. A subspace is determined by the first p columns of an orthogonal matrix: Gr(p, n) ≅ O(n)/stab. A key homogeneous space of behavioral theory.

Rank constraints: M(p, m × n), the space of m × n matrices of rank p. Transitive group action: X → AXB^T with (A, B) ∈ GL(m) × GL(n), so M(p, m × n) ≅ GL(m) × GL(n)/stab.

Mixing rank and positivity constraints: the set of positive semidefinite matrices of size n and rank p, S_+(p, n) = {X ∈ R^{n×n} : X = X^T ⪰ 0, rank(X) = p}. Transitive group action: X → AXA^T, so S_+(p, n) ≅ GL(n)/stab.
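A small sketch of working with S_+(p, n) in the simplest case p = 1, n = 2 (the input matrix is illustrative): the best rank-1 positive semidefinite approximation of a symmetric 2×2 matrix keeps its top eigenpair, computed here in closed form.

```python
import math

a, b, c = 2.0, 1.0, 2.0          # X = [[a, b], [b, c]], symmetric, b != 0

# eigenvalues of a 2x2 symmetric matrix
tr = a + c
disc = math.sqrt((a - c) ** 2 / 4 + b * b)
lam_max = tr / 2 + disc

# unit eigenvector for lam_max (formula assumes b != 0, nondegenerate case)
vx, vy = b, lam_max - a
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# best rank-1 PSD approximation: lam_max * v v^T, an element of S_+(1, 2)
X1 = [[lam_max * vx * vx, lam_max * vx * vy],
      [lam_max * vx * vy, lam_max * vy * vy]]
```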

The line, the circle, and the ray: nonlinear spaces with efficient linearizations. Nonlinear data is a source of nonlinear behaviors. Phase and intensity spaces are homogeneous; rank, orthogonality, and positivity constraints are homogeneous. Behaviors on homogeneous spaces can be made independent of the locality of the data, and calculus on homogeneous spaces can be made invariant (lecture 3). Even when they are not invariant, behaviors on homogeneous spaces may have invariant properties (lecture 6). Behaviors defined on non-homogeneous spaces are ill-posed; behaviors with invariant properties are tractable.