Lecture 3: Review of Linear Algebra

ECE 830 Statistical Signal Processing, instructor: R. Nowak

Very often in this course we will represent signals as vectors and operators (e.g., filters, transforms, etc.) as matrices. This lecture reviews basic concepts from linear algebra that will be useful.

1 Linear Vector Space

Definition 1. A linear vector space $X$ is a collection of elements satisfying the following properties.

Addition: for all $x, y, z \in X$,
a) $x + y \in X$
b) $x + y = y + x$
c) $(x + y) + z = x + (y + z)$
d) there exists $0 \in X$ such that $x + 0 = x$
e) for every $x \in X$ there exists $-x \in X$ such that $x + (-x) = 0$

Scalar multiplication: for all $x, y \in X$ and $a, b \in \mathbb{R}$,
a) $a x \in X$
b) $a (b x) = (ab) x$
c) $1 \cdot x = x$ and $0 \cdot x = 0$
d) $a (x + y) = a x + a y$

Example 1. Here are two examples of linear vector spaces: the familiar $d$-dimensional Euclidean space $\mathbb{R}^d$, and the space of finite-energy signals/functions supported on the interval $[0, T]$,
$$ L^2([0,T]) := \Big\{ x : \int_0^T x^2(t)\, dt < \infty \Big\}. $$
It is easy to verify the properties above for both examples.

Definition 2. A subset $M \subseteq X$ is a subspace if $x, y \in M$ implies $a x + b y \in M$ for all scalars $a, b$.

Definition 3. An inner product is a mapping from $X \times X$ to $\mathbb{R}$. The inner product between any $x, y \in X$ is denoted by $\langle x, y \rangle$, and it satisfies the following properties for all $x, y, z \in X$:
a) $\langle x, y \rangle = \langle y, x \rangle$
b) $\langle a x, y \rangle = a \langle x, y \rangle$ for all scalars $a$
c) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$
d) $\langle x, x \rangle \ge 0$

A space $X$ equipped with an inner product is called an inner product space.
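To make the two example spaces concrete, here is a minimal numerical sketch in Python with NumPy (the language, the grid size, and the particular test signals are my own illustrative choices, not part of the notes). It evaluates the standard inner product on $\mathbb{R}^d$ and approximates the $L^2([0,T])$ inner product by a Riemann sum.

```python
import numpy as np

# Inner product on R^d: <x, y> = x^T y
x = np.array([1.0, 2.0, -1.0])
y = np.array([0.5, 0.0, 3.0])
print(np.dot(x, y))           # -2.5

# Approximate inner product on L^2([0, T]) via a Riemann sum:
# <f, g> = int_0^T f(t) g(t) dt  ~  sum_k f(t_k) g(t_k) * dt
T, n = 1.0, 10000
t = np.linspace(0.0, T, n, endpoint=False)
dt = T / n
f = np.sin(2 * np.pi * t)     # a finite-energy signal on [0, T]
g = np.cos(2 * np.pi * t)
print(np.sum(f * g) * dt)     # ~ 0: sin and cos are orthogonal on [0, 1]
print(np.sum(f * f) * dt)     # ~ 0.5: the squared norm ||f||^2
```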

Example 2. Let $X = \mathbb{R}^n$. Then $\langle x, y \rangle := x^T y = \sum_{i=1}^n x_i y_i$.

Example 3. Let $X = L^2([0,1])$. Then $\langle x, y \rangle := \int_0^1 x(t) y(t)\, dt$.

The inner product measures the alignment of the two vectors. The inner product induces a norm defined as $\|x\| := \sqrt{\langle x, x \rangle}$. The norm measures the length/size of $x$. The inner product satisfies $\langle x, y \rangle = \|x\| \|y\| \cos(\theta)$, where $\theta$ is the angle between $x$ and $y$. Thus, in general, for every $x, y \in X$ we have
$$ |\langle x, y \rangle| \le \|x\| \|y\|, $$
with equality if and only if $x$ and $y$ are linearly dependent or "parallel", i.e., $\theta = 0$. This is called the Cauchy-Schwarz inequality. Two vectors $x, y$ are orthogonal if $\langle x, y \rangle = 0$.

Example 4. Let $X = \mathbb{R}^2$. Then $x = [1\ 0]^T$ and $y = [0\ 1]^T$ are orthogonal, as are $u = [1\ 1]^T$ and $v = [1\ {-1}]^T$.

Definition 4. An inner product space that contains all its limits is called a Hilbert space, and in this case we often denote the space by $H$; i.e., if $x_1, x_2, \ldots$ are in $H$ and $\lim_{n \to \infty} x_n$ exists, then the limit is also in $H$. It is easy to verify that $\mathbb{R}^n$, $L^2([0,T])$, and $\ell^2(\mathbb{Z})$, the set of all finite-energy sequences (e.g., discrete-time signals), are all Hilbert spaces.

2 Bases and Representations

Definition 5. A collection of vectors $\{x_1, \ldots, x_k\}$ is said to be linearly independent if none of them can be represented as a linear combination of the others. That is, for any $x_i$ and every set of scalar weights $\{\theta_j\}$ we have $x_i \ne \sum_{j \ne i} \theta_j x_j$.

Definition 6. The set of all vectors that can be generated by taking linear combinations of $\{x_1, \ldots, x_k\}$, i.e., all vectors of the form
$$ v = \sum_{i=1}^k \theta_i x_i, $$
is called the span of $\{x_1, \ldots, x_k\}$, denoted $\mathrm{span}(x_1, \ldots, x_k)$.

Definition 7. A set of linearly independent vectors $\{\phi_i\}_i$ is a basis for $H$ if every $x \in H$ can be represented as a unique linear combination of $\{\phi_i\}$. That is, every $x \in H$ can be expressed as $x = \sum_i \theta_i \phi_i$ for a certain unique set of scalar weights $\{\theta_i\}$.

Example 5. Let $H = \mathbb{R}^2$. Then $[1\ 0]^T$ and $[0\ 1]^T$ are a basis (since they are orthogonal). Also, $[1\ 0]^T$ and $[1\ 1]^T$ are a basis because they are linearly independent (although not orthogonal).

Definition 8. An orthonormal basis is one satisfying
$$ \langle \phi_i, \phi_j \rangle = \delta_{ij} := \begin{cases} 1, & i = j \\ 0, & i \ne j. \end{cases} $$
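The following short sketch (Python/NumPy, again my own choice since the notes contain no code) checks the orthogonality claims of Example 4, recovers the angle between two vectors from $\langle x, y \rangle = \|x\|\|y\|\cos(\theta)$, and spot-checks the Cauchy-Schwarz inequality on random vectors.

```python
import numpy as np

def angle_between(x, y):
    """Angle theta defined by <x, y> = ||x|| ||y|| cos(theta)."""
    cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
u, v = np.array([1.0, 1.0]), np.array([1.0, -1.0])

# Orthogonality: inner product is zero, angle is pi/2
print(np.dot(x, y), np.dot(u, v))            # 0.0 0.0
print(angle_between(u, v))                   # ~ 1.5708 (pi / 2)

# Cauchy-Schwarz: |<a, b>| <= ||a|| ||b|| for random vectors
rng = np.random.default_rng(0)
a, b = rng.standard_normal(5), rng.standard_normal(5)
assert abs(np.dot(a, b)) <= np.linalg.norm(a) * np.linalg.norm(b)
```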

Every $x \in H$ can be represented in terms of an orthonormal basis $\{\phi_i\}_i$ (or "orthobasis" for short) according to
$$ x = \sum_i \langle x, \phi_i \rangle\, \phi_i. $$
This is easy to see as follows. Suppose $x$ has a representation $\sum_i \theta_i \phi_i$. Then
$$ \langle x, \phi_j \rangle = \Big\langle \sum_i \theta_i \phi_i, \phi_j \Big\rangle = \sum_i \theta_i \langle \phi_i, \phi_j \rangle = \sum_i \theta_i \delta_{ij} = \theta_j. $$

Example 6. Here is an orthobasis for $L^2([0,1])$: for $i = 1, 2, \ldots$,
$$ \phi_{2i-1}(t) := \sqrt{2} \cos(2\pi (i-1) t), \qquad \phi_{2i}(t) := \sqrt{2} \sin(2\pi i t). $$
Doesn't it look familiar?

Any basis can be converted into an orthonormal basis using Gram-Schmidt orthogonalization. Let $\{\phi_i\}$ be a basis for a vector space $X$. An orthobasis $\{\psi_i\}$ for $X$ can be constructed as follows.

Gram-Schmidt Orthogonalization
1. $\psi_1 := \phi_1 / \|\phi_1\|$
2. $\psi_2' := \phi_2 - \langle \psi_1, \phi_2 \rangle \psi_1$;  $\psi_2 = \psi_2' / \|\psi_2'\|$
$\vdots$
k. $\psi_k' := \phi_k - \sum_{i=1}^{k-1} \langle \psi_i, \phi_k \rangle \psi_i$;  $\psi_k = \psi_k' / \|\psi_k'\|$

3 Orthogonal Projections and Filters

One of the most important tools that we will use from linear algebra is the notion of an orthogonal projection. Most linear filters used in signal processing (e.g., bandpass filters, averaging filters, etc.) may be interpreted as, or related to, orthogonal projections.

Let $H$ be a Hilbert space and let $M \subseteq H$ be a subspace. Every $x \in H$ can be written as $x = y + z$, where $y \in M$ and $z \perp M$, which is shorthand for "$z$ orthogonal to $M$"; that is, $\langle v, z \rangle = 0$ for all $v \in M$. The vector $y$ is the optimal approximation to $x$ in terms of vectors in $M$ in the following sense:
$$ \|x - y\| = \min_{v \in M} \|x - v\|. $$
The vector $y$ is called the projection of $x$ onto $M$.

Here is an application of this concept. Let $M \subseteq H$ and let $\{\phi_i\}_{i=1}^r$ be an orthobasis for $M$. We say that the subspace $M$ is spanned by $\{\phi_i\}_{i=1}^r$. Note that this implies that $M$ is an $r$-dimensional subspace of $H$ (and it is isometric to $\mathbb{R}^r$). For any $x \in H$, the projection of $x$ onto $M$ is given by
$$ y = \sum_{i=1}^r \langle \phi_i, x \rangle\, \phi_i, $$
and this projection can be viewed as a sort of filter that removes all components of the signal that are orthogonal to $M$.
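The Gram-Schmidt recursion above translates directly into code. Below is a minimal sketch in Python with NumPy (the language and the tolerance parameter are my own choices, not part of the notes); it orthonormalizes the columns of a matrix exactly as in the steps above and is run on the non-orthogonal basis of Example 5.

```python
import numpy as np

def gram_schmidt(Phi, tol=1e-12):
    """Orthonormalize the columns of Phi (assumed linearly independent).

    Implements psi_k' = phi_k - sum_{i<k} <psi_i, phi_k> psi_i,
    followed by psi_k = psi_k' / ||psi_k'||.
    """
    n, k = Phi.shape
    Psi = np.zeros((n, k))
    for j in range(k):
        v = Phi[:, j].astype(float).copy()
        for i in range(j):
            v -= np.dot(Psi[:, i], Phi[:, j]) * Psi[:, i]
        norm = np.linalg.norm(v)
        if norm < tol:
            raise ValueError("columns are linearly dependent")
        Psi[:, j] = v / norm
    return Psi

# The non-orthogonal basis of Example 5: [1, 0]^T and [1, 1]^T
Phi = np.array([[1.0, 1.0],
                [0.0, 1.0]])
Psi = gram_schmidt(Phi)
print(Psi)                      # orthonormal columns
print(Psi.T @ Psi)              # ~ identity matrix
```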

Example 7. Let $H = \mathbb{R}^2$. Consider the canonical coordinate system $\phi_1 = [1\ 0]^T$ and $\phi_2 = [0\ 1]^T$, and consider the subspace spanned by $\phi_1$. The projection of any $x = [x_1\ x_2]^T \in \mathbb{R}^2$ onto this subspace is
$$ P x = \langle x, \phi_1 \rangle\, \phi_1 = \left( [x_1\ x_2] \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 \\ 0 \end{bmatrix}. $$
The projection operator $P$ is just a matrix, and it is given by
$$ P := \phi_1 \phi_1^T = \begin{bmatrix} 1 \\ 0 \end{bmatrix} [1\ 0] = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. $$
It is also easy to check that $\phi_1 = [1/\sqrt{2}\ \ 1/\sqrt{2}]^T$ and $\phi_2 = [1/\sqrt{2}\ \ {-1/\sqrt{2}}]^T$ is an orthobasis for $\mathbb{R}^2$. What is the projection operator onto the span of $\phi_1$ in this case?

More generally, suppose we are considering $\mathbb{R}^n$ and we have an orthobasis $\{\phi_i\}_{i=1}^r$ for some $r$-dimensional ($r < n$) subspace $M$ of $\mathbb{R}^n$. Then the projection matrix is given by $P_M = \sum_{i=1}^r \phi_i \phi_i^T$. Moreover, if $\{\phi_i\}_{i=1}^r$ is a basis for $M$, but not necessarily orthonormal, then $P_M = \Phi (\Phi^T \Phi)^{-1} \Phi^T$, where $\Phi = [\phi_1, \ldots, \phi_r]$ is a matrix whose columns are the basis vectors.

Example 8. Let $H = L^2([0,1])$ and let $M = \{\text{linear functions on } [0,1]\}$. Since all linear functions have the form $a t + b$ for $t \in [0,1]$, here is a basis for $M$: $\phi_1(t) = 1$, $\phi_2(t) = t$. Note that this means that $M$ is two-dimensional. That makes sense, since every line is defined by its slope and intercept (two real numbers). Using the Gram-Schmidt procedure we can construct the orthobasis $\psi_1(t) = 1$, $\psi_2(t) = \sqrt{12}\,(t - 1/2)$. Now consider any signal/function $x \in L^2([0,1])$. The projection of $x$ onto $M$ is
$$ P_M x = \langle x, \psi_1 \rangle\, \psi_1 + \langle x, \psi_2 \rangle\, \psi_2 = \int_0^1 x(\tau)\, d\tau \; + \; 12\,(t - 1/2) \int_0^1 (\tau - 1/2)\, x(\tau)\, d\tau. $$

4 Eigenanalysis

Suppose $A$ is an $m \times n$ matrix with entries from a field (e.g., $\mathbb{R}$ or $\mathbb{C}$, the latter being the complex numbers). Then there exists a factorization of the form
$$ A = U D V^*, $$
where $U = [u_1 \cdots u_m]$ is $m \times m$ with orthonormal columns, $V = [v_1 \cdots v_n]$ is $n \times n$ with orthonormal columns, the superscript $*$ denotes transposition and conjugation (if complex-valued), and the matrix $D$ is $m \times n$ with the form
$$ D = \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_m \end{bmatrix}, $$
i.e., the values $\sigma_1, \ldots, \sigma_m$ appear on its main diagonal and all other entries are zero. The values $\sigma_1, \ldots, \sigma_m$ are called the singular values of $A$, and the factorization is called the singular value decomposition (SVD). Because of the orthonormality of the columns of $U$ and $V$ we have
$$ A v_i = \sigma_i u_i \quad \text{and} \quad A^* u_i = \sigma_i v_i, \qquad i = 1, \ldots, m. $$

A vector $u$ is called an eigenvector of $A$ if $A u = \lambda u$ for some scalar $\lambda$. The scalar $\lambda$ is called the eigenvalue associated with $u$. Symmetric matrices (which are always square) always have real eigenvalues and have an eigendecomposition of the form $A = U D U^*$, where the columns of $U$ are the orthonormal eigenvectors of $A$ and $D$ is a diagonal matrix, written $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, whose diagonal entries are the eigenvalues. This is just a special case of the SVD. A symmetric positive-semidefinite matrix satisfies $v^T A v \ge 0$ for all $v$. This implies that the eigenvalues of symmetric positive-semidefinite matrices are non-negative.
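The finite-dimensional projection formulas are easy to check numerically. Here is a small sketch (Python/NumPy; the subspace basis and the test vector are arbitrary illustrative choices) that forms $P_M = \Phi(\Phi^T\Phi)^{-1}\Phi^T$ for a non-orthonormal basis and verifies the defining properties of an orthogonal projection. In practice one would typically use a QR factorization or a least-squares solve rather than an explicit matrix inverse.

```python
import numpy as np

# Basis (not orthonormal) for a 2-dimensional subspace M of R^3;
# the particular vectors are arbitrary, chosen only for illustration.
Phi = np.array([[1.0, 1.0],
                [0.0, 1.0],
                [1.0, 0.0]])

# Projection matrix P_M = Phi (Phi^T Phi)^{-1} Phi^T
P = Phi @ np.linalg.inv(Phi.T @ Phi) @ Phi.T

x = np.array([1.0, 2.0, 3.0])
y = P @ x                      # projection of x onto M
z = x - y                      # residual, orthogonal to M

print(np.round(Phi.T @ z, 10)) # ~ [0, 0]: z is orthogonal to every basis vector
print(np.allclose(P @ P, P))   # True: projections are idempotent
print(np.allclose(P, P.T))     # True: orthogonal projections are symmetric
```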

Example 9. Let $X$ be a random vector taking values in $\mathbb{R}^n$ and recall the definition of the covariance matrix: $\Sigma := E[(X - \mu)(X - \mu)^T]$, where $\mu := E[X]$. It is easy to see that $v^T \Sigma v \ge 0$ for all $v$, and of course $\Sigma$ is symmetric. Therefore, every covariance matrix has an eigendecomposition of the form $\Sigma = U D U^T$, where $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $\lambda_i \ge 0$ for $i = 1, \ldots, n$.

The Karhunen-Loève transform (KLT), which is also called principal component analysis (PCA), is based on transforming a random vector $X$ into the coordinate system associated with the eigendecomposition of the covariance of $X$. Let $X$ be an $n$-dimensional random vector with covariance matrix $\Sigma = U D U^T$, and let $u_1, \ldots, u_n$ be the eigenvectors. Assume that the eigenvectors and eigenvalues are ordered such that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. The KLT or PCA representation of the random vector $X$ is given by
$$ X = \sum_{i=1}^n (u_i^T X)\, u_i. $$
The coefficients in this representation can be arranged in a vector as $\theta = U^T X$, where $U$ is as defined above. The vector $\theta$ is called the KLT or PCA of $X$. Using this representation we can define the approximation to $X$ in the span of the first $r < n$ eigenvectors,
$$ X_r = \sum_{i=1}^r (u_i^T X)\, u_i. $$
Note that this approximation involves only $r$ scalar random variables $\{u_i^T X\}_{i=1}^r$ rather than $n$. In fact, it is easy to show that among all $r$-term linear approximations of $X$ in terms of $r$ random variables, $X_r$ has the smallest mean square error; that is, if we let $S_r$ denote the set of all $r$-term linear approximations to $X$, then
$$ E\big[\|X - X_r\|^2\big] = \min_{Y_r \in S_r} E\big[\|X - Y_r\|^2\big]. $$
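To connect the eigendecomposition of a covariance matrix with the KLT/PCA approximation, here is a small simulation sketch (Python/NumPy; the dimensions, sample size, and choice of $r$ are arbitrary, and the data are generated zero-mean so that the expansion $X = \sum_i (u_i^T X)\, u_i$ applies without mean subtraction).

```python
import numpy as np

rng = np.random.default_rng(0)

# N realizations of a zero-mean random vector X in R^n with correlated coordinates
n, N, r = 5, 2000, 2
A = rng.standard_normal((n, n))
X = rng.standard_normal((N, n)) @ A.T           # each row is one realization of X

# Sample covariance and its eigendecomposition Sigma = U diag(lambda) U^T
Sigma = (X.T @ X) / N
lam, U = np.linalg.eigh(Sigma)                  # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]                   # reorder so lambda_1 >= ... >= lambda_n
lam, U = lam[order], U[:, order]

# KLT/PCA coefficients theta = U^T X and the r-term approximation
theta = X @ U                                   # row i holds U^T applied to sample i
X_r = theta[:, :r] @ U[:, :r].T                 # X_r = sum_{i=1}^r (u_i^T X) u_i

mse = np.mean(np.sum((X - X_r) ** 2, axis=1))
print(mse, lam[r:].sum())                       # empirical MSE ~ sum of discarded eigenvalues
```

The printed empirical mean squared error matches the sum of the discarded eigenvalues of the sample covariance, which illustrates the optimality property stated above.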