Linear Algebra. Maths Preliminaries. Wei, CSE, UNSW. September 9, 2018.


Introduction

This review focuses on linear algebra, in the context of COMP6714.

Key take-away point: matrices as linear mappings/functions.

Note

You've probably learned linear algebra via matrices, systems of linear equations, etc. Here we will review the key concepts in LA from the perspective of linear transformations (think of them as functions for now). This perspective provides semantics and intuition for most ML models and operations.

We emphasize intuition: we deliberately skip many concepts and present some of the content informally. It is a great exercise for you to view the related maths and ML models/operations from this perspective throughout this course!

A Common Trick in Maths I

Question: calculate $2^{10}$, $2^{-1}$, $2^{\ln 5}$ and $2^{4-3i}$.

Properties:
- $f_a(n) = f_a(n-1) \cdot a$ for $n > 1$; $f_a(1) = a$.
- $f(u) \cdot f(v) = f(u+v)$.
- $f(x) = y \iff \ln(y) = x \ln(a)$, so $f(x) = \exp\{x \ln a\}$.
- $e^{ix} = \cos(x) + i \sin(x)$.

The trick: define an operation on the easy cases, then extend it to the harder ones while preserving its key properties. The same trick is used in linear algebra.
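A quick numerical sanity check of the trick (a minimal NumPy sketch; only standard NumPy functions are used, the rest is illustration):

```python
import numpy as np

# a^x extended beyond positive integers via a^x = exp(x * ln a),
# which preserves f(u) * f(v) = f(u + v).
a = 2.0
print(np.exp(10 * np.log(a)))         # 1024.0, agrees with repeated multiplication
print(np.exp(np.log(5) * np.log(a)))  # 2^{ln 5}
print(np.exp((4 - 3j) * np.log(a)))   # 2^{4-3i}, complex via e^{ix} = cos x + i sin x
```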

Objects and Their Representations

Goal: we need to study the objects themselves.
- On one side: a good representation helps (a lot)!
- On the other side: properties of the objects should be independent of the representation!

Basic Concepts I

Algebra:
- a set of objects
- two operations and their identity objects (aka identity elements):
  - addition (+); its identity is 0.
  - scalar multiplication ($\cdot$); its identity is 1.
- constraints: closed under both operations.

Some nice properties of these operations:
- Commutative: $a + b = b + a$.
- Associative: $(a + b) + c = a + (b + c)$.
- Distributive: $\lambda(a + b) = \lambda a + \lambda b$.

Basic Concepts II

Think: what about subtraction and division?

Tips: always use the analogy with the algebra on integers ($\mathbb{Z}$) and the algebra on polynomials ($\mathbb{P}$). Why are these constraints natural and useful?

Basic Concepts III

Representation matters? Consider even geometric vectors: $c = a + b$.
- What if we represent vectors by a column of their coordinates?
- What if by their polar coordinates?

Notes: informally, the objects we are concerned with in this course are (column) vectors. The set of all $n$-dimensional real vectors is called $\mathbb{R}^n$.

(Column) Vector

Vector: an $n$-dimensional vector $v$ is an $n \times 1$ matrix. We can emphasize its shape by calling it a column vector. A row vector is a transposed column vector: $v^\top$.

Operations (defined componentwise):
- Addition: $(v_1 + v_2)_i = (v_1)_i + (v_2)_i$.
- (Scalar) multiplication: $(\lambda v_1)_i = \lambda (v_1)_i$.
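These operations map directly onto NumPy arrays (a minimal sketch):

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])   # a 3-dimensional vector
v2 = np.array([4.0, 0.0, -1.0])

print(v1 + v2)    # componentwise addition: [5. 2. 2.]
print(2.5 * v1)   # scalar multiplication:  [2.5 5. 7.5]
```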

Linearity I

Linear combination (a generalization of univariate linear functions): let $\lambda_i \in \mathbb{R}$. Given a set of $k$ vectors $v_i$ ($i \in [k]$), a linear combination of them is
$$\lambda_1 v_1 + \lambda_2 v_2 + \ldots + \lambda_k v_k = \sum_{i \in [k]} \lambda_i v_i.$$
Later, we will see this is just $V\lambda$, where
$$V = \begin{bmatrix} v_1 & v_2 & \ldots & v_k \end{bmatrix}, \qquad \lambda = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_k \end{bmatrix}.$$

Span: the set of all linear combinations of a set of vectors is their span.

Basis: a minimal set of vectors whose span is exactly the whole $\mathbb{R}^n$.
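A linear combination and the equivalent matrix-vector product $V\lambda$, side by side (NumPy sketch):

```python
import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])
lam = np.array([3.0, -2.0])

# Explicit linear combination ...
combo = lam[0] * v1 + lam[1] * v2
# ... equals V @ lam, where the v_i are the columns of V.
V = np.column_stack([v1, v2])
print(combo, V @ lam)   # both are [1. -2.]
```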

Linearity II

Benefit: every vector has a unique decomposition with respect to a basis. Think: why is uniqueness desirable?

Examples:
- The span of $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ is $\mathbb{R}^2$. They also form a basis.
- The span of $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$ is $\mathbb{R}^2$, but one of them is redundant. Think: which one?
- Decompose $\begin{bmatrix} 4 \\ 6 \end{bmatrix}$.
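Finding the coefficients of a vector in a given basis amounts to solving a linear system (a sketch; here we decompose [4, 6] in the basis {[1, 0], [2, 3]}, one possible non-redundant choice from the example above):

```python
import numpy as np

B = np.column_stack([[1.0, 0.0], [2.0, 3.0]])  # basis vectors as columns
y = np.array([4.0, 6.0])

coeffs = np.linalg.solve(B, y)  # solve B @ coeffs = y
print(coeffs)                   # [0. 2.]: [4,6] = 0*[1,0] + 2*[2,3]
```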

Linearity III

Exercises:
- What is the (natural) basis of all (univariate) polynomials of degree up to $d$?
- Decompose $3x^2 + 4x - 7$ into a linear combination of $2$, $2x - 3$, and $x^2 + 1$:
$$3x^2 + 4x - 7 = 3(x^2 + 1) + 2(2x - 3) + (-2)(2).$$

The same polynomial is mapped to two different vectors under two different bases. Think: any analogy?
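The decomposition is again a linear solve over coefficient vectors in the monomial basis $(1, x, x^2)$ (sketch):

```python
import numpy as np

# Columns: the coefficient vectors of 2, 2x - 3, x^2 + 1 in the basis (1, x, x^2).
B = np.column_stack([[2.0, 0.0, 0.0], [-3.0, 2.0, 0.0], [1.0, 0.0, 1.0]])
p = np.array([-7.0, 4.0, 3.0])  # 3x^2 + 4x - 7

print(np.linalg.solve(B, p))    # [-2. 2. 3.], matching the decomposition above
```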

Matrix I

A linear transformation is a nice (linear) function that maps a vector in $\mathbb{R}^n$ to another vector in $\mathbb{R}^m$:
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \overset{f}{\longmapsto} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$
The general form:
$$\begin{aligned} y_1 &= M_{11} x_1 + M_{12} x_2 \\ y_2 &= M_{21} x_1 + M_{22} x_2 \\ y_3 &= M_{31} x_1 + M_{32} x_2 \end{aligned}$$

Matrix II

Nonexample:
$$\begin{aligned} y_1 &= \alpha x_1^2 + \beta x_2 \\ y_2 &= \gamma x_1^2 + \theta x_1 + \tau x_2 \\ y_3 &= \cos(x_1) + e^{x_2} \end{aligned}$$

Matrix III

Why only linear transformations? They are simple and have nice properties:
- $(f_1 + f_2)(x) = f_1(x) + f_2(x)$
- $(\lambda f)(x) = \lambda f(x)$
- What about $f(g(x))$?

And they are useful.

Matrix I

Definition: an $m \times n$ matrix corresponds to a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$:
$$f(x) = y \iff Mx = y,$$
where matrix-vector multiplication is defined as $y_i = \sum_k M_{ik} x_k$. A handy way to remember the shape: $M_{\text{outdim} \times \text{indim}}$.

"Transformation" or "mapping" emphasizes the mapping between two sets, rather than the detailed specification of the mapping; the latter is more or less the elementary understanding of a function. These are all specific instances of morphisms in category theory.
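The definition $y_i = \sum_k M_{ik} x_k$, spelled out explicitly next to NumPy's built-in product (sketch, using the example matrix from the next slides):

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [-4.0, 9.0],
              [25.0, 1.0]])   # a 3x2 matrix: maps R^2 -> R^3
x = np.array([1.0, 10.0])

# y_i = sum_k M_ik x_k, written out ...
y = np.array([sum(M[i, k] * x[k] for k in range(M.shape[1]))
              for i in range(M.shape[0])])
print(y)        # [21. 86. 35.]
print(M @ x)    # same result via matrix-vector multiplication
```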

Matrix II

Semantic interpretation: $Mx$ is a linear combination of the columns of $M$:
$$Mx = \begin{bmatrix} M_1 & M_2 & \ldots & M_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = y, \qquad y = x_1 M_1 + \ldots + x_n M_n.$$
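The column view in code (sketch):

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [-4.0, 9.0],
              [25.0, 1.0]])
x = np.array([1.0, 10.0])

# y = x_1 * M_1 + x_2 * M_2: a linear combination of M's columns.
y = x[0] * M[:, 0] + x[1] * M[:, 1]
print(y)                      # [21. 86. 35.]
print(np.allclose(y, M @ x))  # True
```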

Matrix III

Example:
$$\begin{bmatrix} 1 & 2 \\ -4 & 9 \\ 25 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 10 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ -4 \\ 25 \end{bmatrix} + 10 \begin{bmatrix} 2 \\ 9 \\ 1 \end{bmatrix} = \begin{bmatrix} 21 \\ 86 \\ 35 \end{bmatrix}$$
$$\begin{bmatrix} 1 & 2 \\ -4 & 9 \end{bmatrix} \begin{bmatrix} 1 \\ 10 \end{bmatrix} = 1 \begin{bmatrix} 1 \\ -4 \end{bmatrix} + 10 \begin{bmatrix} 2 \\ 9 \end{bmatrix} = \begin{bmatrix} 21 \\ 86 \end{bmatrix}$$
Think: what does $M$ do in the last example? Rotation and scaling.

When $x$ is also a matrix:
$$\begin{bmatrix} 1 & 2 \\ -4 & 9 \\ 25 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 10 & 20 \end{bmatrix} = \begin{bmatrix} 21 & 42 \\ 86 & 172 \\ 35 & 70 \end{bmatrix}$$
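A NumPy check of the matrix-matrix case (sketch; note each column of the result is just $M$ applied to the corresponding column of $X$):

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [-4.0, 9.0],
              [25.0, 1.0]])
X = np.array([[1.0, 2.0],
              [10.0, 20.0]])

print(M @ X[:, 0])  # [21. 86. 35.]: first column transformed
print(M @ X)        # [[21. 42.] [86. 172.] [35. 70.]]: all columns at once
```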

System of Linear Equations I

$$\begin{aligned} y_1 &= M_{11} x_1 + M_{12} x_2 \\ y_2 &= M_{21} x_1 + M_{22} x_2 \\ y_3 &= M_{31} x_1 + M_{32} x_2 \end{aligned} \iff \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \\ M_{31} & M_{32} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \iff y = Mx$$

Interpretation: find a vector $x \in \mathbb{R}^2$ such that its image (under $M$) is exactly the given vector $y \in \mathbb{R}^3$. How do we solve it?
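In practice a solver does the work (sketch: `np.linalg.solve` handles square systems; for an overdetermined one like this 3-equations/2-unknowns case, `np.linalg.lstsq` finds the best $x$ in the least-squares sense, which is the exact preimage whenever $y$ lies in the span of $M$'s columns):

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [-4.0, 9.0],
              [25.0, 1.0]])
y = np.array([21.0, 86.0, 35.0])   # an image vector we know is reachable

x, residuals, rank, sv = np.linalg.lstsq(M, y, rcond=None)
print(x)   # [1. 10.], the preimage of y under M
```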

System of Linear Equations II x 1 y 1 x 2 y 2 x 3 y 3 Domain M Range The above transformation is injective, but not surjective. 20/29

A Matrix Also Specifies a (Generalized) Coordinate System I

Yet another interpretation: $y = Mx \iff Iy = Mx$. The vector $y$ w.r.t. the standard coordinate system $I$ is the same as $x$ w.r.t. the coordinate system defined by the column vectors of $M$. Think: why the columns of $M$?

A Matrix Also Specifies a (Generalized) Coordinate System II

Example for polynomials: $I$ is the identity matrix; its columns represent $1$, $x$, and $x^2$. Let
$$M = \begin{bmatrix} 3 & 1 & 4 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix},$$
whose columns represent the polynomials $3$, $x + 1$, and $2x^2 + 5x + 4$.

Let $x = \begin{bmatrix} 1 \\ 2 \\ -3 \end{bmatrix}$. Then $Mx = I \begin{bmatrix} -7 \\ -13 \\ -6 \end{bmatrix}$: the polynomial with coordinates $(1, 2, -3)$ in the basis $\{3,\ x+1,\ 2x^2+5x+4\}$ is $-6x^2 - 13x - 7$ in the standard basis.
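Checking the change of coordinates numerically (sketch):

```python
import numpy as np

# Columns of M: the polynomials 3, x+1, 2x^2+5x+4 in the basis (1, x, x^2).
M = np.array([[3.0, 1.0, 4.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 2.0]])
x = np.array([1.0, 2.0, -3.0])   # coordinates w.r.t. the columns of M

print(M @ x)   # [-7. -13. -6.]: the same polynomial, -6x^2 - 13x - 7,
               # now expressed w.r.t. the standard basis (1, x, x^2)
```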

Exercise I

What if $y$ is given in the above example instead? What does the following mean?
$$\begin{bmatrix} 3 & 1 & 4 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 1 & 4 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix}$$
Think about representing polynomials using the basis $(x-1)^2$, $x^2 - 1$, $x^2 + 1$.

Inner Product

THE binary operator; some kind of similarity.
- Type signature: vector $\times$ vector $\to$ scalar: $\langle x, y \rangle$.
- In $\mathbb{R}^n$, usually called the dot product: $x \cdot y \overset{\text{def}}{=} x^\top y = \sum_i x_i y_i$.
- For certain functions: $\langle f, g \rangle = \int_a^b f(t) g(t)\,dt$. This leads to a Hilbert space.

Properties/definitions (over $\mathbb{R}$):
- conjugate symmetry: $\langle x, y \rangle = \langle y, x \rangle$
- linearity in the first argument: $\langle ax + y, z \rangle = a \langle x, z \rangle + \langle y, z \rangle$
- positive definiteness: $\langle x, x \rangle \ge 0$; $\langle x, x \rangle = 0 \iff x = 0$.

Generalizes many geometric concepts to vector spaces: angle (orthogonality), projection, norm.
- $\langle \sin nt, \sin mt \rangle = 0$ on $[-\pi, \pi]$ for $m \neq n$: they are orthogonal to each other.
- $C = A^\top B$: $C_{ij} = \langle A_i, B_j \rangle$, where $A_i$ and $B_j$ are columns. Special case: $A^\top A$.
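Dot products and the orthogonality claim, checked numerically (sketch; the integral is approximated on a grid rather than computed exactly):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, -2.0, 1.0])
print(np.dot(x, y))   # 0.0: x and y are orthogonal

# <sin nt, sin mt> over [-pi, pi], approximated by a Riemann sum.
t = np.linspace(-np.pi, np.pi, 10001)
n, m = 2, 5
dt = t[1] - t[0]
print(np.sum(np.sin(n * t) * np.sin(m * t)) * dt)  # ~0 for m != n
```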

Eigenvalues/vectors and Eigen Decomposition

"Eigen" means "characteristic of" (German).
- A (right) eigenvector of a square matrix $A$ is a vector $u$ such that $Au = \lambda u$.
- Not all matrices have (real) eigenvalues. Here, we only consider well-behaved matrices.
- Not all eigenvalues need to be distinct.
- Traditionally, we normalize $u$ (such that $u^\top u = 1$).

We can use all the eigenvectors of $A$ to construct a matrix $U$ (as its columns). Then $AU = U\Lambda$, or equivalently, $A = U \Lambda U^{-1}$. This is the eigen decomposition.
- We can interpret $U$ as a transformation between two coordinate systems. Note that the vectors in $U$ are not necessarily orthogonal.
- $\Lambda$ is the scaling along each of the directions in the new coordinate system.
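Eigen decomposition in NumPy (sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

evals, U = np.linalg.eig(A)   # columns of U are (normalized) eigenvectors
Lam = np.diag(evals)

print(evals)                                       # [3. 1.]
print(np.allclose(A, U @ Lam @ np.linalg.inv(U)))  # True: A = U Lambda U^{-1}
```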

Applications

Compute $A^n$: apply the eigen decomposition $A = U \Lambda U^{-1}$. Then
$$A^n = \left( U \Lambda U^{-1} \right) \left( U \Lambda U^{-1} \right) \cdots = U \Lambda^n U^{-1}.$$
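Since $\Lambda$ is diagonal, $\Lambda^n$ is just elementwise powers of the eigenvalues, which avoids repeated matrix multiplication (sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
n = 8

evals, U = np.linalg.eig(A)
A_n = U @ np.diag(evals ** n) @ np.linalg.inv(U)   # U Lambda^n U^{-1}

print(np.allclose(A_n, np.linalg.matrix_power(A, n)))  # True
```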

Exercises I

- Rewrite $\sum_{i=1}^n a_i b_i$ in vector/matrix operations.
- Write a matrix operation such that, given $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, it returns $\begin{bmatrix} x_1 \\ 2x_2 \\ 3x_3 \end{bmatrix}$.
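One way to check your answers (sketch; it assumes the intended answers are $a^\top b$ and the diagonal matrix $\mathrm{diag}(1, 2, 3)$):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(a @ b)             # 32.0 = sum_i a_i b_i, i.e., a^T b

x = np.array([7.0, 8.0, 9.0])
D = np.diag([1.0, 2.0, 3.0])
print(D @ x)             # [7. 16. 27.] = [x1, 2*x2, 3*x3]
```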

Exercises II

- Suppose we want to apply the linear mapping $W$ to more than one $x$ vector. Draw a schematic diagram to show how this can be done using matrix operations (rather than a loop); see the sketch after this list.
- In machine learning, we usually store training data as a data matrix: if it is $n \times m$, then it has $n$ training samples (as rows), and each sample is characterized by its $m$ feature values. How can we apply the above linear mapping to all the training samples?
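A sketch of one common convention (assuming $W$ is $k \times m$ and the rows of the data matrix $X$ are samples): stack the inputs and use a single matrix product.

```python
import numpy as np

W = np.random.randn(4, 3)   # a linear map R^3 -> R^4
X = np.random.randn(10, 3)  # data matrix: 10 samples (rows), 3 features each

# Apply W to every sample at once: transform the rows via X @ W.T.
Y = X @ W.T                 # shape (10, 4): row i is W @ X[i]

print(np.allclose(Y[0], W @ X[0]))  # True
```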

References and Further Reading I

- Gaussian Quadrature: https://www.youtube.com/watch?v=k-yudqrxijo
- Linear Algebra Review and Reference: http://cs229.stanford.edu/section/cs229-linalg.pdf
- Scipy linear algebra tutorial: https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html
- We Recommend a Singular Value Decomposition: http://www.ams.org/samplings/feature-column/fcarc-svd